Why and How Netflix Created and Migrated to a New Table Format: Iceberg

Thursday, July 22, 2021

Netflix created Apache Iceberg to address the performance problems, inherent limitations, and usability challenges of using Apache Hive tables in large, demanding data lake environments.

In this session, join Ted Gooch, Database Architect at Netflix, as he explains:

· How Iceberg was developed by Netflix to solve some of the inherent issues in the Hive table format
· How more expressive partitioning allows highly selective filters over large data sets
· How data marts used to be required for low-latency workloads, and how migrating to Iceberg allows those workloads to run directly on the data lake
· How the flexible format of Iceberg allows for evolution of both schema and partitioning
· How the separation of logical table from physical layout allows for background optimization of storage