December 10, 2025

Breaking the Warehouse Walls: Embracing the Flexibility of Lakehouse with Apache Iceberg

Traditional SQL data warehouses, while powerful for structured analytics, suffer from rigid schemas, high costs, and poor support for real-time or semi-structured data. The shift to a Lakehouse architecture, powered by Apache Iceberg, overcomes these limitations by merging the best of data lakes and warehouses into a unified, open, and scalable platform.

Iceberg revolutionizes data management by delivering ACID transactions on open file formats, enabling schema evolution without downtime, and providing time-travel capabilities for auditing and rollback. Unlike traditional warehouses, Iceberg supports fine-grained, hidden partitioning and metadata-based file pruning, dramatically improving query performance and cost efficiency. Critically, it reduces the need for complex ETL pipelines by allowing direct, consistent access from multiple engines (Spark, Trino, Flink) while ensuring data integrity at petabyte scale.
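As a concrete sketch, the capabilities above (hidden partitioning, schema evolution, time travel) map to a few Spark SQL statements against an Iceberg table; the catalog, table, column names, and snapshot ID below are illustrative:

```sql
-- Hidden partitioning: data is laid out by day(event_ts),
-- but queries never reference a separate partition column
CREATE TABLE demo.db.events (
  id       BIGINT,
  event_ts TIMESTAMP,
  payload  STRING
) USING iceberg
PARTITIONED BY (days(event_ts));

-- Schema evolution: add a column as a metadata-only change,
-- with no table rewrite or downtime
ALTER TABLE demo.db.events ADD COLUMN country STRING;

-- Time travel: query the table as of an earlier point in time
-- or a specific snapshot ID, for auditing or rollback checks
SELECT * FROM demo.db.events TIMESTAMP AS OF '2025-12-01 00:00:00';
SELECT * FROM demo.db.events VERSION AS OF 4348484348434;
```

Because this metadata lives in the table format rather than in any one engine, the same table can be read and written consistently from Spark, Trino, or Flink.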

Beyond performance, Iceberg fosters interoperability, breaking vendor lock-in and enabling seamless data sharing across analytics, ML, and real-time applications. Organizations adopting Iceberg gain a future-proof foundation that scales with evolving business needs—whether for batch processing, streaming analytics, or AI/ML pipelines. This talk explores real-world implementations, benchmarks, and best practices for transitioning from legacy warehouses to an open, performant, and cost-effective Lakehouse powered by Iceberg.

Topics Covered

Modernization and Migration

Sign up to watch all Subsurface 2025 sessions