Table Formats: Apache Iceberg | Subsurface LIVE Sessions

Video Session

Netflix created Apache Iceberg to address the performance issues, inherent design issues, and usability challenges of using Apache Hive tables in large and demanding data lake environments. In this session, join Ted Gooch, Database Architect at Netflix, as he explains:
· How Iceberg was developed by Netflix to solve some of the inherent issues in the Hive table format
· How increased expressive partitioning capability allows for highly selective filters over large data
· How data marts were previously required to use data directly in a data lake, and how migrating to Iceberg allows those workloads to run with low latency
· How the flexible format of Iceberg allows for evolution of both schema and partitioning
· How the separation of logical table from physical layout allows for background optimization of storage

Video Transcript

Ted Gooch: Thanks. Welcome, everyone. This is my first virtual conference, so I’ll try to do my best here. I’ll try to speak at a nice pace because I’m a little bit over-caffeinated, but I think it will all be okay. I’m going to talk to you all a little bit about why and how Netflix created [00:00:30] and migrated to a new table format, which is Iceberg.

Just to give you a quick overview of what we’re going to talk about: I’m going to give you a brief history of Netflix’s data platform. I’m going to talk a little bit about how we got involved using Hive tables and our experience there, our experience [00:01:00] migrating off of Hive tables and some of the use cases that drove that, and then talk just a little bit about the good things that are in the pipeline and what’s coming next for us.

So in the beginning there was EDW. I think a lot of times when you talk about solutions that you’ve come up with, it helps to give some context about where you started out. So I’m going to start at the very, very beginning, at least from my [00:01:30] perspective on the Netflix data platform. I’m going to talk about our migration from an on-prem [inaudible 00:01:39] architecture to a cloud architecture, and talk a little bit about our choice of S3 versus HDFS for our Hadoop clusters and our storage.

When I started at Netflix, a very long time ago it seems like now, back in 2011, [00:02:00] we had a fully on-prem stack: Oracle RAC servers feeding an Ab Initio ETL into a Teradata data warehouse that served BI from MicroStrategy. This slide is from a presentation done in 2013.
As you can all imagine, our architecture looks quite a bit different now, and I’m going to talk a little bit about how that happened and what we learned from this.

[00:02:30] So at that time streaming was really starting to pick up, and the data volumes we were getting were just not scaling on the architecture we had on-prem. Additionally, our organization as a whole was pushing really aggressively to get 100% cloud in the U.S. So on the data platform side we decided that it was a great time to re-architect [00:03:00] our data warehouse from scratch and move to a cloud-oriented architecture. We started that migration around 2011, 2012. At that time, we started looking at the infrastructure we wanted to run on top of, and we found that S3 gave us a couple of capabilities that we thought were really important versus running on an HDFS [00:03:30] cluster itself.

First of all, it allowed us to run multiple clusters on top of the same storage, so you can effectively have a good separation of compute and storage. And the cost profile of S3 versus on-instance HDFS was significant. But despite the fact that we saw a lot of benefits, there were some trade-offs. There are a lot of operations that are easier in HDFS that might not be quite [00:04:00] so straightforward in S3. For example, at the time you could not get a consistent listing out of S3, since it was only [inaudible 00:04:08] consistent, and that introduced some issues. Because of that, there were some unintended consequences when using some of the cloud query engines; some we realized at the time, but they also gave us some bumps that weren’t fully expected.

So [00:04:30] following the cloud migration, our data platform looked something like this. This is probably a lot more familiar to what you all might be expecting our architecture to look like, although it is still quite a bit outdated because this is a slide from a 2015 presentation.
You can see we have data flowing into S3 from Kafka and Cassandra, and then being served out to Hive, Hadoop, Presto, Spark. We still [inaudible 00:05:00] at that time, as well [00:05:00] as Redshift, pushing some BI, and then some other analytic data marts on the side as well. So we felt that the cloud migration was a big success. We were able to keep up with our scaling data. So everything was solved then, right?

Well, not quite exactly. Let me tell you a little bit about Hive tables and our love/hate relationship with them. First of all, I [00:05:30] think there’s a lot of good to be said about Hive tables, and I got this from a presentation that Ryan Blue had done quite a while back: before you start getting into the problems of Hive tables, it’s useful to recognize that they’re good for a lot of things. It is a very easily understood format. Because it was so easily understood, it was able to be broadly supported across [00:06:00] a large number of engines, and it was quite flexible, supporting basically every type of data format out there. So that was good, and it was very easy to get up and running. We had schema enforced on read, and that made things simple on the write side.

However, there were some bad things. With [00:06:30] the Hive table architecture, you end up doing a lot of directory listing. In the metastore, you have a list of all the partitions, which end up just being directory locations. Each partition that you need to read has to be listed to get the list of files that are there. And in the case where your number of partitions is very high, that is a huge, huge number of listings to do, which can end up being slow on S3. [00:07:00] And then additionally, if you have a really high number of partitions, you start to get into trouble with the Hive metastore when you’re reading and writing and pulling back all those partitions for a given query.
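The planning cost Ted describes can be sketched in a few lines. This is a purely illustrative toy in Python, not real Hive or Iceberg code: it contrasts listing every partition directory at plan time with reading a precomputed manifest of data files.

```python
# Toy comparison of query planning strategies (illustrative only).

def hive_plan(partitions, list_directory):
    """Hive-style planning: one listing call per partition.

    Each call is a slow S3 LIST round trip, so cost grows with the
    number of partitions."""
    files = []
    for p in partitions:
        files.extend(list_directory(p))
    return files

def iceberg_plan(manifest):
    """Iceberg-style planning: the manifest already names every data
    file, so planning cost is independent of partition count."""
    return list(manifest)

# A table with three partitions of two files each:
layout = {"d=1": ["a", "b"], "d=2": ["c", "d"], "d=3": ["e", "f"]}

calls = []  # record every LIST the Hive-style planner issues
files = hive_plan(layout, lambda p: (calls.append(p), layout[p])[1])

# The manifest is built once at write time, not at plan time.
manifest = [f for fs in layout.values() for f in fs]
```

With three partitions, `hive_plan` issues three LIST calls; `iceberg_plan` issues none, because the writers already recorded the file list.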
Because of that, we created a federated metastore… I’m sorry, I should have said a distributed metastore, spanning multiple instances. But despite that, we still [00:07:30] struggled often with the stability of the metastore and its ability to handle large, large tables.

And probably even worse than those performance-type issues, there are some very surprising behaviors. Because it’s not a well-defined format, different engines supported things in different ways. So for example, if you were to run an alter and [00:08:00] rename a column or add a column or drop a column, depending on what query engine is accessing that data, you might get different behaviors. In Presto, or Trino now, we were seeing that the column resolution was position-based, so if you renamed a column, things still pretty much worked. But in Spark, resolution was name-based, so you were no longer able to project the renamed column. [00:08:30] And in CSV or other kinds of formats, the column might not even be found; you might get different behaviors depending on how the file is laid out.

And then there was the way we were handling consistent listing, by always writing to a new batch ID, which was added on at the end of a partition path so that each partition updated atomically. In Presto… or sorry, all appends [00:09:00] in Spark were actually overwrites, because they had to overwrite an entire partition atomically. However, in Presto, that wasn’t the case. So it just got very confusing, and we would get Slack messages very often from data engineers; even despite it being documented and explained, people just didn’t quite understand it.

So this person’s like, “Hi, I deleted some columns from my table, and apparently, it breaks Presto as there’s a mismatch between the Hive schema and Presto schema.
Is there a way [00:09:30] I can revert it back?” Or someone responding to a thread saying, “Well, if you rename a column and someone uses that column, they may get results, or they may not. In this case, it looks like you want to deprecate that column, which may work, but it has the same problem.” And on and on, where someone might even say, “Well, you can support it but then you have to use the slower code paths or accept it’s random. Spark [inaudible 00:09:58] to Hive, convert metastore parquet to false [inaudible 00:10:01] [00:10:00] if that works.” We didn’t like giving this kind of guidance to data engineers. They didn’t like having this kind of confusing setup. They just wanted to be able to create tables, rename columns, query them, and have everything work like it did back when we had a more traditional relational database. They don’t really care about the underlying mechanics of the actual data.

So that’s how we [00:10:30] came to the conclusion that what we really needed was not just improvements to the Hive table format but an entirely new table format. And there are a couple of goals that we came up with that we wanted to enforce in that new table format. First, it should be end-user friendly. Our data engineers should have a good intuition about how things work without having to know the exact details of [00:11:00] the underlying implementation. So for example, schema evolution should just work. You should be able to add columns, drop columns, and rename columns, and it shouldn’t break anything.

Partitioning should be something that a data engineer thinks about once, but then the people querying the table don’t necessarily have to know the exact details of how something’s partitioned. For example, if you’re partitioning on a timestamp column, you should [00:11:30] be able to query against that column and have those filters automatically pushed down effectively.
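That timestamp example is what Iceberg calls a hidden partition transform. Here is a minimal sketch of the idea in Python, assuming a toy `day` transform; this is illustrative only, not the Iceberg implementation.

```python
from datetime import datetime, timezone

# Toy "day" partition transform: the table is physically partitioned by
# day(event_ts), but the user only ever filters on event_ts. The planner
# derives the matching partition values automatically.

SECONDS_PER_DAY = 86400

def day_transform(ts: datetime) -> int:
    """Map a timestamp to its partition value: days since the Unix epoch."""
    return int(ts.replace(tzinfo=timezone.utc).timestamp()) // SECONDS_PER_DAY

def partitions_for_range(start: datetime, end: datetime) -> range:
    """Partition values a planner would scan for start <= event_ts <= end."""
    return range(day_transform(start), day_transform(end) + 1)

# A filter spanning 2021-01-01 23:00 to 2021-01-02 04:00 prunes the table
# down to exactly two day-partitions, with no partition column in the query.
hit = partitions_for_range(datetime(2021, 1, 1, 23), datetime(2021, 1, 2, 4))
```

The point is that the transform is part of the table's metadata, so every engine derives the same partition predicate from the same raw-column filter.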
And our goal was that, given that we are on S3, we should have native support for the cloud object stores and work in ways that are advantageous in our environment. We should just not do listings or file renames. They’re slow. They’re prone to error. So [00:12:00] come up with a spec that doesn’t require that. And then possibly we wanted to be able to inject our own abstractions for how we deal with the file system. The next goal was for it to be reliable. We wanted to have serializable isolation, we wanted to support many concurrent writers, and we wanted to have some advanced filtering techniques. I’m going to talk about each of these a little bit on its own.

So [00:12:30] schema evolution: as I mentioned, alter table should just work, and the way that we accomplish that is that instead of relying on names or positions, we decided to use an ID that’s consistent across evolutions. If you rename a column, the name changes but that underlying ID remains the same, so it can still be tracked across files. You can have files written with the old schema and still read them with the new schema, interpret them correctly, [00:13:00] and project them in the way that the user expects. Hidden partitioning, as I mentioned: users should not have to have in-depth knowledge of the actual physical layout of the table to take advantage of the partitioning and the filtering that it provides.

And furthermore, we always had problems with high-cardinality columns, where you want to partition on, or filter on, a high-cardinality column like an account ID. You’re talking hundreds of millions of account IDs. So that’s just not really feasible in a Hive schema.
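One way to tame a column like that is a bucket transform: hash each ID into a fixed number of bins with a hash function every engine computes identically. The sketch below is hedged: it uses `crc32` purely as a stand-in for Iceberg's actual cross-engine (Murmur3-based) bucket hash, and the bucket count is made up.

```python
import zlib

NUM_BUCKETS = 16  # illustrative; a real table might use far more

def bucket(account_id: int, n: int = NUM_BUCKETS) -> int:
    """Deterministically hash an ID into one of n bins.

    crc32 stands in for Iceberg's Murmur3-based transform; the property
    that matters is that every engine computes the same bucket for the
    same value."""
    return zlib.crc32(str(account_id).encode()) % n

# Planning a filter like `account_id = 123456789` now reduces to reading
# a single bucket's files rather than scanning the whole table:
target = bucket(123456789)
```

Hundreds of millions of distinct IDs collapse into a small, fixed set of partitions, yet a point lookup still prunes to one bucket.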
So we wanted to be able to support something that could transform that into lower granularity but still be able to use it to filter, and in the ID case [inaudible 00:13:46] bucketing, where, having assigned a cross-engine-compatible hash function, we can assign groups of IDs evenly across a set number of bins.

[00:14:00] Additionally, in the case of partitioning on timestamps, instead of having high granularity there, because those are a continuous set of values, the timestamp can be truncated to the appropriate level of granularity. So you can have it as a day timestamp, or an hour, or however much granularity makes sense for your query patterns. And that should be transparent to the user actually querying [00:14:30] the table; it should be translated by the engines that are accessing the data using the Iceberg spec.

Going on to the support for cloud stores: as we said, listing is expensive. Don’t do it. Come up with a spec that just doesn’t require you to be listing constantly. Have something maintained within the spec that tells you the files that are needed. And then the other goal was abstraction from the [00:15:00] underlying file system: provide a pluggable I/O layer that can be configured to your needs and can have multiple implementations depending on what your setup is.

And then lastly, reliability. You want to provide serializable isolation so that the state of a table can be tracked through a linear history of snapshots, so that the state of the table from one [00:15:30] commit to the next is clear and there is a clear point where the data changed. Unlike the setup we had with Hive, where the underlying data could be changed underneath and the Hive metastore knew nothing about it, you have a way of knowing when data has actually changed without checking the involved files yourself.

We wanted to be able to support many concurrent writers.
So the spec had to be able [00:16:00] to account for optimistic concurrency, where writers all write concurrently, assume there are no other writers, and check at commit time whether they need to retry. And retries should be intelligent, so that new data that’s not conflicting… if it doesn’t have any conflicts, it should still be able to commit. And then we wanted the ability to do some advanced filtering by storing some of the metadata upfront. For example, [00:16:30] being able to store min and max values for a partition range per file, so that files can be pruned at planning time rather than at scan time. One of the things that we found is that we would have jobs with thousands and thousands of files, even though in the end, maybe only a small fraction of those files actually even had [inaudible 00:16:52].

Even though with Parquet, engines can go look at the footer and see if they need to scan a file, they still need to have a task for that file, open it up, [00:17:00] read the footer, and then say, “Okay, I have nothing to do.” So you have thousands of tasks that just do nothing. Because of that, we wanted to be able to support low-latency queries from the query engines in our ecosystem. So that’s the spec that was developed for Iceberg: a table format for slow-moving, or slow-evolving, data.

[00:17:30] And that’s been in place here at Netflix for a little while now. We have some streaming use cases, some use cases that use the advanced filtering, and some that rely on the snapshot isolation. I’m going to talk about each of these a little independently.

So the streaming use cases: we have, as you can probably imagine, a large number of streaming inputs into our data warehouse. We have Kafka streams feeding into Flink jobs [00:18:00] that use Iceberg’s ability to checkpoint the data and make consistent commits to it.
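Backing up a moment, the per-file min/max pruning described above can be sketched as follows. This is an illustrative toy, not the Iceberg metadata format: the key idea is that column bounds live in table metadata, so no Parquet footer ever needs to be opened for a skipped file.

```python
from dataclasses import dataclass

@dataclass
class DataFile:
    """Per-file column statistics kept in table metadata (toy version)."""
    path: str
    min_id: int  # lower bound of the column's values in this file
    max_id: int  # upper bound

def prune(data_files, wanted_id):
    """Plan a point lookup: only files whose [min, max] range can
    contain the value get a scan task."""
    return [f.path for f in data_files if f.min_id <= wanted_id <= f.max_id]

files = [
    DataFile("f1.parquet", 0, 999),
    DataFile("f2.parquet", 1000, 1999),
    DataFile("f3.parquet", 2000, 2999),
]

# A lookup for ID 1500 plans a single file instead of a task per file.
plan = prune(files, 1500)
```

With thousands of files, this is the difference between thousands of open-the-footer-and-do-nothing tasks and a handful of tasks that actually read data.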
And we support having Iceberg as a source, because Kafka is not always practical as a source for Flink for large backfills and things of that nature, where the storage on Kafka would be prohibitive. We of course support having Iceberg as a sink from Flink, [00:18:30] as well as a side input.

And then one of the use cases that we were pretty excited about once we had Iceberg was Data Mesh, which is our ability to do change data capture from some of our other relational systems in our [inaudible 00:18:46] environment. One of the nice things about it is that because Iceberg supports schema evolution, we can match the schema evolution of our source systems in predictable ways, whereas before that was just very difficult to support. And then [00:19:00] lastly, we have an internal streaming infrastructure called Mantis, for which we also support Iceberg sinks.

Additionally, because some of those streaming infrastructures don’t necessarily land data in ways that are optimal for our query patterns, with Iceberg we were also able to develop a tool called Auto Optimize. In the case of Flink, where we [00:19:30] may have jobs landing data in multiple regions, it can do an auto-lift and move that data into the buckets that are co-located with the rest of our big data infrastructure. There is also the optimization side of things, which looks at the compression types, the file sizes, and the query patterns, and tries to do some optimizations by merging and rewriting files and things like that. [00:20:00] Because Iceberg provides an atomic file-level commit instead of just a partition-level one, that made the tooling a lot easier to develop. There are actually great posts on Auto Optimize on the Netflix tech blog if anyone wants to read them.

And then for high-cardinality filters, there are a number of surprising use cases that we came up with. For one, we have a system called Atlas that does all of our [00:20:30] service monitoring.
Think of all the microservices in Netflix; they’re all reporting all the metrics they have to a system called Atlas. The data is just enormous. Previously, before Iceberg, it had millions and millions of files, so even just the plan time for queries was taking tens of minutes to do all the listing. Once we were able to put it on an Iceberg table, we were able to get the run time [00:21:00] to be less than the plan time on the [inaudible 00:21:02] table because of that.

The advanced filtering also helped to support use cases around governance, like GDPR, and allows us to look up highly granular data, down to something like the account ID level, whereas that used to be an extremely expensive computation to do.

Another use case where we’re using advanced filtering is one [00:21:30] called Sizzle, which allows us to look up show-level data at a very low-latency SLA, feeding into custom JavaScript apps. It has allowed us to remove some of the more complicated architectures we had, where we were looking at extremely expensive search clusters and at moving data between [00:22:00] different data marts. We were able to support very similar use cases by just putting Trino on top of Iceberg and putting that in front of a custom JavaScript app, and still get response times in the low seconds, enough that it can be interacted with on a website.

The snapshotting has been something that’s really been transformative in our infrastructure. We’ve [00:22:30] always had this concept of WAP, which is write, audit, publish. It’s really common across our data warehouse, and we had kind of hacked Hive tables to make that work, moving around pointers and looking at high watermarks, and it was just a lot of additional infrastructure.
With Iceberg, it became very simple, because now we are able to stage snapshots, reference them in queries, and do any kind of auditing on [00:23:00] that data, and everything has to pass audits before it can actually be published for the commit.

Another use case that was opened up was materialized views. Because now we can look at all the dependencies of a view, we can see which tables have been updated, and we can know whether the view is fresh or not based on whether the snapshot IDs match [00:23:30] the current snapshots. And then we also have an infrastructure for doing Snowflake external table syncing, which was also difficult before on the Hive infrastructure, because Snowflake’s external table format, at the time, was a directory-based thing, whereas now we can provide the manifest of Iceberg files in [inaudible 00:23:53] table and keep that in sync as snapshots change.

Lastly, [00:24:00] because I think I’m starting to come up on time here: what’s next? What’s the dawn of a new day? We have Iceberg. What are we working on now? One thing that we’re pretty excited about is using Iceberg for secure tables. We have a nice way in Iceberg to inject a file system implementation that allows us to do some authorization on the side, so we [00:24:30] can do file-level authorization. That’s currently something we’re trying to push out. There are also row-level deletes, which are coming in the Iceberg V2 spec. That’s obviously a very important feature for change data capture and for any kind of high-granularity deletes.
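The write-audit-publish flow described a moment ago can be sketched as a tiny state machine. Every name below is hypothetical, not Iceberg's API; the point is that staged snapshots are invisible to readers until one atomic pointer swap publishes them.

```python
class Table:
    """Toy table with a published pointer and staged snapshots."""

    def __init__(self):
        self.current = None   # published snapshot id; what readers see
        self.snapshots = {}   # snapshot id -> rows (staged or published)

    def stage(self, snapshot_id, rows):
        """Write: the snapshot exists, but readers still see `current`."""
        self.snapshots[snapshot_id] = rows

    def audit(self, snapshot_id, check):
        """Audit: run a validation against the staged, unpublished data."""
        return check(self.snapshots[snapshot_id])

    def publish(self, snapshot_id):
        """Publish: one atomic pointer swap makes the data visible."""
        self.current = snapshot_id

t = Table()
t.stage("s1", [{"id": 1}, {"id": 2}])
if t.audit("s1", lambda rows: len(rows) > 0):  # e.g. a row-count audit
    t.publish("s1")
```

If the audit fails, nothing is published and readers never observe the bad data; with Hive-era tables, that guarantee required the extra pointer-moving infrastructure Ted mentions.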
Another thing that we commonly find in our data warehouse: [00:25:00] we’ve always had a hard time [inaudible 00:25:02] the reporting structure supporting large fact-to-fact joins on Trino, because we didn’t have a good way of doing bucketed joins. So with Iceberg, there’s a lot of work being done internally to do bucketing on our Iceberg tables so that we can support those large joins.

And then one thing that I personally did some work on that I’m excited about is the open-source native Python client for [00:25:30] Iceberg. We have an implementation internally, and we’re starting to push this out as open source, but it’s currently just myself and a few other contributors. It’s a great tool in the Iceberg stack because, depending on what your infrastructure is, it requires very little to get running. You can interact directly with tables in Jupyter notebooks or in Python [inaudible 00:25:59]. And [00:26:00] we’d really like to see more people contributing to it. So if that’s something you’re interested in, please reach out to me. There’s a lot of work to be done there still. It’s very early on, but we’re excited about the potential for companies that might not be running infrastructure as large as a company like Netflix, but still want to leverage Iceberg and the features it provides.

So that’s basically all I had. [00:26:30] Thank you for your time. I hope that was informative, and I hope you enjoyed it. I’d love to answer any questions that you have, so I’ll hand over to Pritty to moderate, and thank you.

Pritty: Thank you so much, Ted. That was a very informative session. Thanks for your time. Now we would like to open up the session for questions, and it looks like we already have a couple of them.

So there’s a question from [Frotz 00:27:00].
It [00:27:00] says, “Does Iceberg allow for schema on read like Hive, or is it a particular format so that the data needs to be transformed to be available for Iceberg?”

Ted Gooch: Iceberg is schema on write, because we do the resolution of fields by ID. So the IDs that are written to the Parquet files or [inaudible 00:27:26], depending on what format your fields [00:27:30] are in, must line up with the Iceberg schema, otherwise it will not be intelligible.

Are you going to keep reading the questions, or should I just go through the ones that are here? You’re on mute, Pritty.

Pritty: Yeah, there’s one more question. It says, “How does Iceberg [00:28:00] fit into a native AWS environment/stack, or is it mainly if you’re moving from Hive?”

Ted Gooch: Iceberg fits into any stack that has a system using a tabular format of data. In the AWS stack, there is actually a Glue catalog implementation for Iceberg in Java. There’s also a [00:28:30] DynamoDB catalog implementation, as well as just a generic JDBC one. So there is not necessarily any requirement to use Hive. All that Iceberg actually requires is something that can provide an atomic swap of the table metadata. In our case, we use a service called Metacat that acts like a Hive metastore, but the full capabilities of a Hive metastore aren’t actually even required.

And then I noticed someone [00:29:00] also had a question asking what query engines we are using. We use Trino and Spark SQL primarily on our data side. And then we use Snowflake as a BI acceleration layer. The use case that I mentioned on Snowflake is specifically for very large tables, where we don’t want to move the table on-cluster on Snowflake. We’re able to use [00:29:30] the Iceberg metadata to project an external table [inaudible 00:29:37] if needed. It’s not super complimentary but it’s there.

And then, how would you compare Iceberg to Hudi? I’m going to be honest.
I’m not super familiar with the capabilities of Hudi, but my understanding was that it’s really oriented around streaming, [00:30:00] a table format for handling streaming data. There is definitely some overlap, but I think the projects have somewhat different goals.

Pritty: Sounds good. We are actually up on time, so I just wanted to mention that there’s a Slack channel you can use to interact with Ted. If you have any questions after the session, you can connect with him on the Subsurface [00:30:30] Slack that we have. You can look him up by name, Ted Gooch, and connect with him.

And, Ted, since you’re not kicked out of the session, I guess we can still ask people if they have any more questions. If not, the session might actually end any minute now. In the meantime, we’ll wait for… if there are any questions.

People are congratulating you on a great session, Ted, so thank you. [00:31:00] And someone’s just said that there’s “No such state as over-caffeinated.”

Looks like we addressed all the questions. So thank you so much, Ted. Thanks for your time. We’ll end the session now, but you can always connect with Ted on Slack. Thank you very much.

Ted Gooch: I’ll be on Slack for a while, so if you have any questions, please reach out. I’m happy to answer anything.

Pritty: Thank you, Ted.