Apache Iceberg – An Architectural Look Under the Covers

   
  • Jason Hughes, Director of Technical Advocacy

Session Abstract

Data lakes have been built with a desire to democratize data: to allow more and more people, tools, and applications to make use of data. A key capability needed to achieve this is hiding the complexity of underlying data structures and physical data storage from users. The de facto standard has been the Hive table format, released by Facebook in 2009, which addresses some of these problems but falls short at data, user, and application scale. So what is the answer? Apache Iceberg. The Apache Iceberg table format is now in use at, and contributed to by, many leading tech companies such as Netflix, Apple, Airbnb, LinkedIn, Dremio, Expedia, and AWS.

Join Jason Hughes, Technical Director at Dremio, for this webinar to learn the architectural details of why the Hive table format falls short and how the Iceberg table format resolves those shortcomings, as well as the benefits that stem from Iceberg’s approach.

You will learn:

  • The issues that arise when using the Hive table format at scale, and why we need a new table format
  • How a straightforward, elegant change in table format structure has enormous positive effects
  • The underlying architecture of an Apache Iceberg table, how a query against an Iceberg table works, and how the table’s underlying structure changes as CRUD operations are performed on it
  • The resulting benefits of this architectural design
