The data lakehouse has captured the imagination of modern enterprises looking to streamline their architectures, reduce costs and simplify the governance of self-service analytics.
From supporting a data mesh to providing a unified access layer for analytics and enabling data modernisation for the hybrid cloud, it offers plenty of business cases, but many organisations are unsure where to start building one.
We spoke to Jonny Dixon, Senior Product Manager at open data lakehouse platform Dremio, to discuss the benefits of data lakehouses and how to make the most of them.
BN: Where did the concept of the data lakehouse originate?
JD: Data is the compass that guides decision-making, helping leaders navigate choppy waters with confidence. In reality, though, data is siloed across a business -- in myriad sources, in multiple clouds and on-premises. When a company can't put its data into the hands of those who need it quickly enough, it fails to realise the data's value, and critical decisions get made without a complete picture.
However, breaking down silos is expensive, time-consuming and risky. The most common workaround is making data copies, but this degrades data quality and reliability -- there is no longer a single source of truth -- and the data is often stale by the time it has been integrated from the data lake into analytics and BI tools. Further, in complex regulatory environments, leaders worry they'll lose control of their data if it's shared, copied or duplicated across platforms and the business. Securing and governing data at scale is also complex: the work is manual, error-prone and often applied inconsistently, exposing businesses to compliance risk.
The data lakehouse is designed to solve these issues. It brings together the structure and performance of a data warehouse with the flexibility of a data lake, allowing for high-speed data transformation and querying, and the consolidation of multi-structured data in flexible object stores. While still early in its adoption, many businesses see data lakehouses as an efficient way to cut costs, facilitate the governance of self-service analytics and streamline their architectures.
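As an illustration of that pattern -- warehouse-style SQL run directly over open-format files sitting in an object store -- here is a minimal sketch using DuckDB as the query engine. It is not Dremio-specific, and the bucket path and column names are hypothetical placeholders.

```python
# Minimal lakehouse-style query: SQL over Parquet files in object storage,
# with no copy into a separate warehouse. Assumes the `duckdb` package is
# installed and S3 credentials are available in the environment.
import duckdb

con = duckdb.connect()
con.execute("INSTALL httpfs")  # one-time install of DuckDB's S3/HTTP extension
con.execute("LOAD httpfs")

# Query the raw files in the lake as if they were a warehouse table.
# 's3://example-bucket/sales/' and the columns are hypothetical.
rows = con.execute("""
    SELECT region, SUM(amount) AS revenue
    FROM read_parquet('s3://example-bucket/sales/*.parquet')
    GROUP BY region
    ORDER BY revenue DESC
""").fetchall()

for region, revenue in rows:
    print(region, revenue)
```

The same idea scales up with open table formats such as Apache Iceberg or Delta Lake, which layer transactions and schema evolution over those files so that multiple engines can query a single copy of the data.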