November 13, 2025

Streaming with Iceberg: From Zero to Hero

Streaming data into Iceberg is gaining traction in modern data platforms, but it brings its own set of challenges that go beyond the usual batch-processing problems. In this talk, we’ll dive into best practices and advanced tips for building reliable and efficient streaming pipelines with Iceberg.

We’ll cover some of the trickier aspects of streaming, like the constant creation of small files and how Iceberg’s architecture can amplify their impact on performance and storage. You’ll learn practical ways to address these issues, such as optimizing partitioning and sorting, fine-tuning write configurations, managing the cost and complexity of compaction in high-throughput scenarios, and handling late-arriving data. We’ll also look at challenges such as working with large numbers of manifests and keeping query planning times and performance consistent.

Topics Covered

Data Lake
Data Optimization
Lakehouse
Streaming Analytics

Sign up to watch all Subsurface 2025 sessions