
Gnarly Data Waves Episode
Overview of Dremio’s Data Lakehouse
In the first episode of Gnarly Data Waves, Read Maloney provides an overview of getting started with Dremio's data lakehouse and showcases Dremio use cases and advantages.
CUSTOMER STORY
returning in seconds versus hours
governance and security
improves productivity
NewDay is transforming its data infrastructure from legacy systems to a secure, cloud-based, PCI DSS-compliant data platform on AWS, leveraging the Dremio data lake engine for self-service data access.
NewDay is one of the UK’s leading consumer credit finance companies, with over five million customers and receivables of over £3 billion. Founded in 2001, they provide credit card solutions across the breadth of the market, with a clear aim to help customers be better with credit. NewDay’s Ownbrand business offers near-prime revolving credit and unsecured loans to customers who may not have easy access to mainstream lenders. The NewDay Co-brand business partners with online and traditional retailers such as Amazon and UK-based AO to offer credit to their customers together with loyalty and other reward programs. NewDay’s access to and understanding of data allow them to generate in-depth customer insights and provide valued support to their retail partners.
NewDay has been going through a rapid digital transformation, and as part of this effort, they recognized the need to build and enhance advanced analytics and data science capabilities.
NewDay’s legacy systems were inefficient and relied on third parties to allow the business to access and analyze the data they needed. NewDay’s data users range from report consumers to advanced data science and model creators. The existing system could not scale to meet the needs of diverse users, which resulted in the creation of data silos and different data definitions across the business.
To support their users and implement a future-proofed capability, NewDay chose a cloud-based AWS data platform utilizing a data lake centralized on Amazon S3. One of the immediate challenges was how to enable less technical users to leverage the scale and performance of Apache Parquet on S3 using familiar tools like Microsoft Excel and SQL, while also empowering advanced developers and data scientists to access and utilize the same data. It was important to avoid creating data silos to ensure the entire business was working from a common data definition.
NewDay tried a number of methods to abstract and present the Parquet files on S3 so users could access them as virtual tables. Unfortunately, the user experience was less than optimal and the tools were inflexible. Role-based access control was slow and difficult to manage as users and roles had to be managed per tool, rather than via the central directory. They were not able to leverage their Tableau investment because the integration with cloud data sources did not function well and was difficult to use, preventing them from successfully rolling out Tableau to all users.
The NewDay Data Platforms team was an avid proponent of open source tools, and the Data Science team was already leveraging Apache Arrow as part of their data preparation and model development. Consequently, as active participants in the Apache Arrow project, NewDay had been following the progress of Dremio. Dremio seemed to solve the challenges NewDay was facing, and a successful proof of concept on NewDay’s cloud led to the adoption of Dremio.
Dremio is now a core part of NewDay’s Nexus data platform. It sits over NewDay’s core transactional, credit bureau, and partner data in the data lake on S3 and enables the broad range of NewDay employees to use whichever tools they are comfortable with to interface with the data. NewDay’s users are able to leverage Dremio’s performance with Apache Airflow to develop their own ETL workflows to produce custom reports and extracts.
Dremio allows NewDay to improve productivity for business users and data scientists through self-service access to data on S3. The solution is considerably faster, with most queries returning in seconds versus minutes or hours on the legacy systems. The virtual data sets enable users to quickly find required fields, which is especially important when migrating from a legacy system.
With Dremio, NewDay is able to reduce costs compared to the old system by consolidating data onto one platform and removing the need to pay third parties to host data warehouse solutions.
Dremio gives the S3 Nexus platform a secure, self-service data access layer. The fine-grained access control coupled with transparent and visible logging enables NewDay to ensure data is appropriately controlled while enabling self-service and flexibility.
NewDay wanted to improve its capabilities in data science, data warehousing and business intelligence solutions across the organization. Dremio is helping to enable machine learning, analytical and reporting solutions. Data analysts and engineers can now explore and prep data at scale, data scientists can quickly train and deploy new models using features that would not have been possible with the legacy system, and developers and report creators can utilize Tableau or the Dremio API to create real-time custom front ends and dashboards.