Gnarly Data Waves Episode
Overview of Dremio’s Data Lakehouse
On our 1st episode of Gnarly Data Waves, Read Maloney provides an overview of getting started with Dremio's data lakehouse and showcases Dremio use cases and their advantages.
CUSTOMER STORY
Hungary’s OTP Bank uses Dremio to consolidate and streamline data management across multiple data domains and provide fast access to customer data from diverse sources.
OTP Group is one of the leading banking groups in Central and Eastern Europe. OTP Bank and its subsidiaries offer banking services to more than 16 million enterprises, retail, commercial and government customers in 11 countries. Established more than 70 years ago in Budapest, Hungary, OTP Group has gradually acquired banks in many countries across the region and continually works toward increasing its market share through organic growth and further acquisitions.
OTP Bank adapted its operating model to have shared services and a centralized data management platform to increase efficiency, so all subsidiaries could standardize reporting across the enterprise. The bank embarked on an IT transformation program to improve data management across its infrastructure and core banking systems. An agile transformation has also been under way in the group; this method helps OTP Bank adapt to rapidly changing digitalization trends.
There were several challenges in the development of this new data management platform. Many of the bank’s shared support business units, like risk, compliance and internal security, are required to provide accurate and timely data to comply with regulations. However, their legacy data management solutions made it difficult to access data and were very expensive to manage. The result was a set of redundant data silos that were hard to maintain.
They also needed better business intelligence to recognize what potential customers needed. When OTP Bank wanted to sell a credit card to a customer online, the old system fell short: the bank needed a big data solution to consolidate and manage the flood of data from social media and microservices, so that it could give customers more personalized offers and communications.
OTP Bank learned about Dremio when they were looking into Denodo and saw a blog post that mentioned Dremio. “We did not want a vendor that would lock us in. We thought the Apache Arrow-based solution would bring big data closer to the business and give us better access to it,” says Lotar Schin, Big Data Team Lead, OTP Bank Hungary. “Based on Dremio’s leadership team and Apache Arrow, we thought those guys should know what they are talking about.”
The primary driver for using Dremio was to get better insights into activity happening across the bank. Schin says they did not want to replace their entire ecosystem, which would have been overwhelming for IT. “Implementing Dremio was quite quick and only took about one week, and a few more weeks to fine-tune the connections,” he says. “Dremio was a great fit. It was quite easy to join data in the data lake to data warehouses.”
Dremio is integrated into OTP Bank’s Cloudera cluster (currently 1.2 petabytes) and runs on the YARN resource manager. The bank has redirected its data sources (HDFS, MS SQL and Oracle) so that all the data required for reporting lands in HDFS, which is more efficient. Implementing security was also one of the first requirements. For business intelligence, they use Power BI and Tableau.
In the past, OTP Bank did not have such easy connectivity among its various data sources. The Dremio solution gives OTP Bank visibility into all activity across the bank. Connecting the CRM with HDFS gives them fast insight into the business without having to make a request to IT for data access. “That’s a huge achievement because to see data in the past, we had to request extractions and develop ETLs. It was very costly,” Schin says. “Business units could reduce their data requests to IT; more and more often they can just look at it and use it. This makes the core business, IT, compliance, risk and internal security all very satisfied.”
OTP Bank’s storage needs are higher than its compute needs and are increasing every day, so storage is a high and growing cost. Today, scaling storage horizontally means buying more servers. OTP Bank plans to use the Dremio query engine to decouple storage from compute, which would let it manage the same amount of data at a significantly lower cost.
Faster access to more data means OTP Bank can be more agile and relevant in responding to new opportunities. In the past, business users were not able to effectively use the data in a timely manner. If a user wanted to connect a new data source to the existing datasets, they had to request a new system integration to the data warehouse, and this could take months. With Dremio, they just upload the raw files to HDFS and publish them to users who can immediately get answers to questions or decide what is necessary to copy and transfer into the data warehouse. Dremio’s query speed enables faster responsiveness and time to market. Now OTP Bank has immediate access to information about who they are selling to and what to offer them.
Initially, the semantic layer was not a major reason OTP Bank chose Dremio because they didn’t believe that the virtual layer would be enough for their needs. But now that is changing. “We are running into more and more use cases that require a virtual layer. We are now starting to go down this road, and we’re pretty impressed,” Schin says.
OTP Bank’s first BI implementation is a data mart using Dremio’s semantic layer. The data mart gathers data from Microsoft SQL databases. They don’t need to build a third consolidated database; they can just use the data where it is. If there are any performance issues, they can use Dremio’s reflections capability, which provides physically optimized representations of source data.
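The idea of a data mart defined purely as a virtual layer can be illustrated with a small, self-contained sketch. This is not Dremio's API: it uses Python's built-in sqlite3 module, with two attached in-memory databases standing in for separate Microsoft SQL source systems. The schema names (crm, cards), tables, and rows are all hypothetical and chosen only for illustration.

```python
import sqlite3

# Two attached in-memory databases stand in for two separate source systems.
# (Assumption: in the story these would be Microsoft SQL databases reached
# through the query engine; sqlite3 is only a stand-in.)
conn = sqlite3.connect(":memory:")
conn.execute("ATTACH DATABASE ':memory:' AS crm")
conn.execute("ATTACH DATABASE ':memory:' AS cards")

# Hypothetical tables and rows, for illustration only.
conn.execute("CREATE TABLE crm.customers (customer_id INTEGER PRIMARY KEY, segment TEXT)")
conn.execute("CREATE TABLE cards.transactions (customer_id INTEGER, amount REAL)")
conn.executemany(
    "INSERT INTO crm.customers VALUES (?, ?)",
    [(1, "retail"), (2, "commercial")],
)
conn.executemany(
    "INSERT INTO cards.transactions VALUES (?, ?)",
    [(1, 20.0), (1, 30.0), (2, 100.0)],
)

# The "data mart" is only a view: no third, consolidated database is built.
# The join across both sources is evaluated when the view is queried.
conn.execute("""
    CREATE TEMP VIEW spend_by_segment AS
    SELECT c.segment AS segment, SUM(t.amount) AS total_spend
    FROM crm.customers AS c
    JOIN cards.transactions AS t ON t.customer_id = c.customer_id
    GROUP BY c.segment
""")


def query_mart(connection):
    """Read from the virtual data mart as if it were an ordinary table."""
    cursor = connection.execute(
        "SELECT segment, total_spend FROM spend_by_segment ORDER BY segment"
    )
    return cursor.fetchall()


rows = query_mart(conn)
print(rows)
```

Reporting tools would query only the view, so the underlying tables can change without touching the reports. Persisting the view's result as a precomputed table would be a rough analogue of the performance role the story attributes to reflections, although the real mechanism differs.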
Currently, OTP Bank has a Cloudera ecosystem but is looking into public cloud data systems. They appreciate that they can go to the AWS Marketplace and install Dremio on top of their data there, so they don’t need to introduce a new tool to the business. In the future, they may also reduce the number of databases and move to an object store approach.
“This is a huge advantage that Dremio offers. When you decide to change the infrastructure of your data, you don’t need to impact your reporting layer. Having an independent query engine enables us to play with different solutions, and we can optimize for cost, time to market and easily move data back and forth,” Schin says. “All this can be done in the backend without affecting the business.”
OTP Bank wanted to avoid lock-in with a single vendor. They felt comfortable with Dremio because they shared the same vision. “We believe that the future is software as a service, a self-service solution that leverages the public cloud,” he says. “We wanted to focus on our business, not on IT. These guys understand the technology, so we can focus on data management and business enablement instead of IT.”
The ability for business users to run self-service, cross-platform queries across multiple data storage solutions (Oracle, MS SQL, HDFS, NFS, etc.) not only improves time to market but also enables IT resources to be used more efficiently. Each year, OTP Bank can shift hundreds of days of IT manpower from just moving data around to work that really adds business value.