December 10, 2025

Iceberg and Chill: Because You and Your Data Are Meant to Be Together

This is the story of how we modernized our data platform by introducing Apache Iceberg without breaking everything. We set up Iceberg tables in parallel with Hive, rewired our ETL pipelines to write to both, and gradually shifted our entire workload over. Along the way, we implemented a Medallion architecture, adjusted our Spark jobs (both on EMR and serverless), and discovered a few icy surprises, such as EMR version bugs that broke Iceberg metadata.

We’ll share what worked, what didn’t, and why Iceberg is absolutely worth the chill. Come for the war stories. Stay for the Iceberg love story.
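
To make the "write to both" step concrete, here is a minimal sketch of a dual-write pattern in plain Python. It is not the speaker's pipeline, which ran on Spark and is not shown in this abstract. The names `InMemoryTable`, `dual_write`, and `row_counts_match` are hypothetical stand-ins, and the in-memory tables only imitate a Hive table and an Iceberg table. The assumption illustrated is that the legacy write stays authoritative while a failure on the new table is recorded rather than failing the job.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, Iterable, List

Row = Dict[str, Any]


@dataclass
class InMemoryTable:
    """Hypothetical stand-in for a Hive or Iceberg table."""
    name: str
    rows: List[Row] = field(default_factory=list)

    def append(self, batch: Iterable[Row]) -> None:
        self.rows.extend(batch)


def dual_write(batch: Iterable[Row], legacy: InMemoryTable,
               candidate: InMemoryTable, errors: List[str]) -> bool:
    """Write one batch to both tables; return True if both writes succeeded.

    Assumption: the legacy table is still the source of truth, so an
    exception there propagates, while a failure on the candidate table
    is only recorded in `errors`.
    """
    batch = list(batch)  # materialize once so both sinks see the same rows
    legacy.append(batch)
    try:
        candidate.append(batch)
    except Exception as exc:
        errors.append(f"{candidate.name}: {exc}")
        return False
    return True


def row_counts_match(legacy: InMemoryTable, candidate: InMemoryTable) -> bool:
    """Cheap parity check to run before shifting readers to the new table."""
    return len(legacy.rows) == len(candidate.rows)


# Example run with hypothetical table names.
hive = InMemoryTable("hive.events")
iceberg = InMemoryTable("iceberg.events")
errors: List[str] = []

both_ok = dual_write([{"id": 1}, {"id": 2}], hive, iceberg, errors)
```

A row-count comparison is only the weakest useful parity signal. A real migration would likely also compare checksums or sampled rows before cutting readers over.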

Topics Covered

Apache Iceberg

Data lakehouse

Lakehouse

Modernization and Migration

Table Formats

Sign up to watch all Subsurface 2025 sessions

Speaker

Chaim Sender

Data Engineer