13 minute read · September 12, 2025

What’s New in Apache Iceberg 1.10.0, and what comes next!

Alex Merced · Head of DevRel, Dremio

The Apache Iceberg project has steadily evolved into the backbone of modern open data lakehouses, delivering performance, reliability, and interoperability across a wide range of engines. With the release of Apache Iceberg 1.10.0, the community takes a major step forward by bringing the long-anticipated format-version 3 (V3) features into general availability. This release isn’t just about incremental fixes; it introduces fundamental capabilities that make data lakes more efficient, governable, and adaptable to real-world workloads.

From binary deletion vectors that streamline row-level deletes, to default column values that simplify schema evolution, to row-level lineage that unlocks transparency and regulatory compliance, version 1.10.0 redefines what’s possible in large-scale data management. On top of that, support for new data types like variant and geospatial types widens the scope of analytics Iceberg can power. And this is just the beginning: the roadmap ahead includes features like bulk delete support, enhanced partition transforms, and tighter ecosystem integrations.

In this blog, we’ll break down what’s new in Iceberg 1.10.0, why these changes matter, and what’s on the horizon for future releases.

Key Highlights of Apache Iceberg 1.10.0

Apache Iceberg 1.10.0 is the first release to fully embrace format-version 3 (V3), and with it comes a wave of innovations that significantly improve how data is stored, evolved, and managed. Let’s explore some of the most impactful changes:

Binary Deletion Vectors

Traditional row-level deletes in Iceberg relied on position delete files, which could become bulky and slow when handling high-churn workloads. With 1.10.0, Iceberg introduces binary deletion vectors, compact Roaring bitmaps that attach to data files and track which rows are deleted. This drastically reduces metadata overhead and speeds up queries, especially in change-data-capture (CDC) and streaming scenarios.
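To make the idea concrete, here is a minimal Python sketch of how a deletion vector works, a bitmap of deleted row positions attached to a single data file. This is an illustration only: the real Iceberg implementation stores compressed Roaring bitmaps, not a raw Python integer, and the class and function names here are hypothetical.

```python
class DeletionVector:
    """Simplified stand-in for a V3 deletion vector: one bit per
    row position in a single data file. (Iceberg actually uses
    compressed Roaring bitmaps for this.)"""

    def __init__(self):
        self.bits = 0

    def delete(self, pos: int) -> None:
        self.bits |= 1 << pos           # mark the row position as deleted

    def is_deleted(self, pos: int) -> bool:
        return bool(self.bits >> pos & 1)


def scan(rows, dv):
    """Yield only live rows, skipping positions marked in the vector."""
    return [row for pos, row in enumerate(rows) if not dv.is_deleted(pos)]


rows = ["a", "b", "c", "d"]
dv = DeletionVector()
dv.delete(1)
dv.delete(3)
print(scan(rows, dv))  # ['a', 'c']
```

Because the vector is tiny compared to a pile of position delete files, a reader can apply it in memory while scanning, which is where the I/O savings come from in CDC and streaming workloads.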

Default Column Values

Schema evolution just got easier. Instead of rewriting data when adding a new non-nullable column, Iceberg tables can now specify default values for both historical and future rows. This means organizations can introduce new fields without heavy rewrites, making schema evolution seamless and less disruptive to production pipelines.
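Here is a small sketch of the mechanism, assuming a simplified row model: old data files written before the column existed simply lack the field, and the reader fills it in from the column's default at scan time, so no data file is rewritten. The schema and function here are hypothetical illustrations, not Iceberg's actual reader code.

```python
# Current table schema, after adding a "region" column with a default.
NEW_SCHEMA = ["id", "name", "region"]
DEFAULTS = {"region": "unknown"}  # default applied to pre-existing rows


def read_row(raw: dict, schema, defaults):
    """Shape a stored row to the current schema; any column missing
    from an old data file is filled from that column's default."""
    return {col: raw.get(col, defaults.get(col)) for col in schema}


old_file_row = {"id": 1, "name": "alex"}  # written before "region" existed
new_file_row = {"id": 2, "name": "sam", "region": "us-east"}

print(read_row(old_file_row, NEW_SCHEMA, DEFAULTS))
# {'id': 1, 'name': 'alex', 'region': 'unknown'}
print(read_row(new_file_row, NEW_SCHEMA, DEFAULTS))
# {'id': 2, 'name': 'sam', 'region': 'us-east'}
```

The key point is that the default is metadata, not data: adding the column is a metadata-only commit, and historical files stay byte-for-byte unchanged.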

Row-Level Lineage

Transparency and traceability are becoming increasingly important for compliance and governance. Iceberg 1.10.0 introduces row-level lineage with built-in row IDs and update sequence numbers. These identifiers make it possible to track the lifecycle of individual rows across snapshots, enabling features like incremental queries, fine-grained auditing, and more trustworthy CDC pipelines.
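The bookkeeping behind this can be sketched in a few lines: every row gets a stable row ID when it is first written, and every change stamps the row with the sequence number of the commit that touched it. The class below is a toy model of that idea, with illustrative names rather than the spec's actual metadata column names.

```python
class LineageTable:
    """Toy model of row-level lineage: each row carries a stable
    row id plus the sequence number of the commit that last
    changed it. (Names are illustrative, not the V3 spec's.)"""

    def __init__(self):
        self.rows = {}        # row_id -> (data, last_updated_seq)
        self.next_row_id = 0
        self.seq = 0          # monotonically increasing commit number

    def commit(self, inserts=(), updates=()):
        self.seq += 1
        for data in inserts:              # new rows get fresh, stable ids
            self.rows[self.next_row_id] = (data, self.seq)
            self.next_row_id += 1
        for row_id, data in updates:      # updates keep the id, bump the seq
            self.rows[row_id] = (data, self.seq)


t = LineageTable()
t.commit(inserts=["v1", "v1"])   # commit 1 creates rows 0 and 1
t.commit(updates=[(0, "v2")])    # commit 2 rewrites row 0
print(t.rows)  # {0: ('v2', 2), 1: ('v1', 1)}
```

With this in place, "which commit last touched row 0?" is a metadata lookup (sequence 2 here), and an incremental consumer can fetch exactly the rows whose sequence number exceeds its last checkpoint.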

Expanded Data Types

With the release of the V3 spec, Iceberg tables can now handle richer data natively:

  • Variant type for semi-structured data like JSON.
  • Geospatial types (geometry, geography) for advanced spatial analytics.
  • Nanosecond-precision timestamps and an unknown type for evolving schemas without forcing immediate resolution.
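To show what the variant type buys you in practice, here is a plain-Python sketch: a variant column holds semi-structured values whose shape can differ row to row, sitting next to conventionally typed columns, and engines can still project into it with a path expression. The `variant_get` helper below is a hypothetical illustration of that access pattern, not an actual Iceberg or engine API.

```python
# A "variant" column ("payload") next to a typed column ("id");
# each row's payload can have a different shape.
rows = [
    {"id": 1, "payload": {"device": "sensor-a", "temp": 21.5}},
    {"id": 2, "payload": {"device": "sensor-b", "readings": [3, 4]}},
]


def variant_get(row, path, default=None):
    """Walk a dotted path into the variant value, returning a
    default when the path is absent in this particular row."""
    value = row["payload"]
    for key in path.split("."):
        if not isinstance(value, dict) or key not in value:
            return default
        value = value[key]
    return value


print([variant_get(r, "device") for r in rows])  # ['sensor-a', 'sensor-b']
print(variant_get(rows[0], "temp"))              # 21.5
print(variant_get(rows[1], "temp"))              # None (absent in this row)
```

That tolerance for per-row shape differences is exactly what makes variant a fit for API payloads, event logs, and IoT telemetry that would otherwise force constant schema churn.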

Stronger Ecosystem Support

Iceberg 1.10.0 also broadens its compatibility with query engines and ecosystems. Spark 4.0 and Flink 2.0 are now supported out of the box, ensuring that teams using cutting-edge engines can fully leverage the benefits of V3 tables.
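As a rough sketch of what picking up the new runtime looks like, the invocation below wires the Iceberg Spark runtime into a `spark-sql` session with a local Hadoop catalog. Treat the artifact coordinates, catalog name, and warehouse path as examples to adapt: verify the exact runtime artifact for your Spark and Scala versions on Maven Central before using them.

```shell
# Example only: confirm the iceberg-spark-runtime artifact matching
# your Spark/Scala versions on Maven Central. Catalog name ("local")
# and warehouse path are placeholders.
spark-sql \
  --packages org.apache.iceberg:iceberg-spark-runtime-4.0_2.13:1.10.0 \
  --conf spark.sql.extensions=org.apache.iceberg.spark.extensions.IcebergSparkSessionExtensions \
  --conf spark.sql.catalog.local=org.apache.iceberg.spark.SparkCatalog \
  --conf spark.sql.catalog.local.type=hadoop \
  --conf spark.sql.catalog.local.warehouse=/tmp/iceberg-warehouse
```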

Why These Features Matter

The advancements in Iceberg 1.10.0 aren’t just technical niceties; they directly address some of the toughest challenges enterprises face when building and operating data lakehouses.

Handling Change Without Heavy Lifts

Data is rarely static. Whether you’re adding new attributes, handling deletions, or updating records, legacy table formats often force painful rewrites or workarounds. With binary deletion vectors and default column values, Iceberg minimizes the need for full rewrites, making evolving datasets cheaper, faster, and less error-prone.

Trust and Transparency

Organizations need more than fast queries; they need auditability. The new row-level lineage capabilities allow businesses to answer critical questions like: “When was this record last updated?” or “Which snapshot introduced this row?” This kind of lineage is essential for regulatory compliance, data quality monitoring, and incremental ETL pipelines.

Expanding the Scope of Analytics

By adding variant and geospatial data types, Iceberg extends its reach into domains that traditionally required specialized systems. Semi-structured data from APIs, event logs, and IoT devices can now live alongside structured relational data in the same Iceberg lakehouse. Similarly, geospatial analytics for logistics, mobility, or environmental studies can be done without resorting to niche databases.

Keeping Pace with the Ecosystem

Finally, the extended engine support in Spark 4.0 and Flink 2.0 ensures Iceberg remains a first-class citizen in the rapidly evolving data stack. This means teams can confidently upgrade to modern compute engines while still relying on Iceberg for consistent table management.

A Closer Look at Format Version 3

Apache Iceberg 1.10.0 is significant because it delivers the general availability of format-version 3 (V3), the most ambitious evolution of the Iceberg spec since its inception. V3 is designed to handle the realities of today’s data lakes, where data volumes are massive, workloads are dynamic, and governance requirements are increasingly strict.

Efficiency Through Smarter Deletes

With deletion vectors, Iceberg eliminates the inefficiencies of legacy row-level deletes by storing row deletions as compact bitmaps. Instead of maintaining sprawling position delete files, queries can now check a bitmap attached to each data file, which drastically reduces I/O and speeds up analytics.

Evolving Without Pain

The addition of default column values means schema changes no longer come with costly rewrites. For data teams, this makes evolving datasets a background operation rather than a disruptive project. It’s a big win for agility, particularly in enterprises where schema changes are frequent.

Lineage for Governance and Incremental Workloads

V3 introduces built-in row-level lineage, making it possible to identify precisely when a row was created or last updated. This is invaluable for regulatory compliance, incremental ETL, and data auditing. It also lays the groundwork for more advanced CDC pipelines.

Expanding the Language of Data

New data types like variant, geometry, and geography reflect Iceberg’s expanding role in diverse domains, from semi-structured API payloads to geospatial analytics. With nanosecond-precision timestamps and an unknown type for schema placeholders, V3 ensures Iceberg tables can support everything from financial tick data to flexible schema evolution.

Partitioning and Beyond

Finally, the move toward multi-argument partition transforms and support for new constructs like source-ids points to a future where partitioning strategies can become far more expressive. This means better performance tuning without sacrificing simplicity.

What’s Coming Next in Iceberg

The release of Apache Iceberg 1.10.0 is a milestone, but it’s far from the finish line. The community is already deep into work for Iceberg 1.11 and beyond, tackling performance optimizations, ecosystem integrations, and even more flexible table capabilities. Here are some of the most exciting developments in progress:

Multi-Argument Partition Transforms

While the V3 spec lays the groundwork, upcoming releases will bring full support for multi-argument partition transforms. This includes the introduction of source-ids, allowing partition specs to reference multiple columns at once. The result: more expressive partitioning strategies that can better balance query performance and storage efficiency.
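A quick sketch shows why multi-argument transforms matter: a bucket transform over two source columns hashes the combined key, so rows are distributed by the pair rather than by a single column. The function below is a hypothetical illustration of the concept only; Iceberg's real bucket transform uses 32-bit Murmur3 hashing, not the SHA-1 used here for convenience.

```python
import hashlib


def bucket2(num_buckets: int, *values) -> int:
    """Illustrative multi-argument bucket transform: hash several
    source column values together into one partition bucket.
    (Iceberg's real transform uses Murmur3, not SHA-1.)"""
    digest = hashlib.sha1("|".join(map(str, values)).encode()).digest()
    return int.from_bytes(digest[:4], "big") % num_buckets


row = {"region": "us-east", "device": "sensor-a"}
partition = bucket2(16, row["region"], row["device"])
print(partition)  # deterministic bucket in [0, 16)
```

Because the hash covers both columns, queries that filter on the (region, device) pair can prune to a single bucket, something a single-column bucket spec cannot express.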

Bulk Delete Support in HadoopFileIO

Deleting thousands (or millions) of files one by one can create serious bottlenecks. Future versions of Iceberg will introduce bulk delete operations in HadoopFileIO, enabling mass file removals through a single API call. This is especially valuable in cloud object stores like S3, where reducing per-request overhead can translate into major cost and speed improvements.
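The arithmetic behind that claim is simple: if each request carries fixed overhead, deleting n files one at a time costs n requests, while a bulk API that accepts up to B keys per call costs only ceil(n/B). The sketch below illustrates the count (the batch size of 1,000 mirrors S3's DeleteObjects per-request cap); the function names are illustrative, not the HadoopFileIO API.

```python
import math


def requests_single(n_files: int) -> int:
    """One request per file: the pre-bulk-delete pattern."""
    return n_files


def requests_bulk(n_files: int, batch_size: int = 1000) -> int:
    """Batched deletes; S3's DeleteObjects accepts up to 1,000 keys."""
    return math.ceil(n_files / batch_size)


print(requests_single(250_000))  # 250000 requests
print(requests_bulk(250_000))    # 250 requests
```

A thousandfold drop in request count translates directly into lower per-request cost and less time spent waiting on the object store during snapshot expiration and orphan-file cleanup.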

Kafka Connect Visibility

Iceberg already has a powerful Kafka Connect sink, but plans are underway to publish the connector runtime to Confluent Hub. This will give the Iceberg community a more official, discoverable, and widely accessible streaming option, lowering the barrier for streaming data ingestion into Iceberg tables.

PyIceberg and Catalog Improvements

Python has become a central tool for data engineers and scientists, and the Iceberg community is evolving to meet that need. Work is being done to publish the REST Catalog spec as a Python package, aligning Java and Python APIs more closely. There are also ongoing discussions about slimming down or modularizing PyIceberg’s optional dependencies to improve maintainability.

S3 Analytics Accelerator Integration

Performance tuning in cloud environments continues to be a focus. Integration of the S3 Analytics Accelerator library is on the roadmap, with early investigations showing promise for reducing read times and memory usage. This could make analytical workloads over large Iceberg tables in S3 even faster and more cost-efficient.

Stronger Governance and Security

Finally, efforts are underway to integrate external authentication and role management systems, like Dremio’s Auth Manager, with Iceberg catalogs. This will make it easier to apply consistent governance policies across teams and tools.

The Bigger Picture

Apache Iceberg’s 1.10.0 release is more than just a collection of new features; it signals the maturity of the project as the foundation for the open lakehouse ecosystem. By delivering deletion vectors, default values, lineage, and richer data types, Iceberg makes it possible for organizations to treat their data lake with the same reliability and flexibility as a data warehouse, but without the vendor lock-in or high costs.

At the same time, the active development of upcoming capabilities, like bulk delete support, multi-argument partitioning, and S3 accelerator integration, shows the community’s focus on performance, governance, and interoperability. This constant innovation ensures Iceberg remains future-proof as workloads evolve, whether that means serving real-time analytics, supporting AI pipelines, or managing petabyte-scale compliance data.

In short, Iceberg 1.10.0 lays a strong foundation, and the roadmap ahead promises to make open lakehouses faster, smarter, and more accessible than ever before.

Conclusion

Apache Iceberg 1.10.0 represents a turning point in the evolution of the open lakehouse. With the general availability of format-version 3, Iceberg now offers a more complete solution for organizations seeking the flexibility of data lakes combined with the reliability of data warehouses. Features like binary deletion vectors, default column values, and row-level lineage aren’t just incremental improvements; they redefine what’s possible in managing massive, ever-changing datasets.

And the story doesn’t end here. The community is actively working on multi-argument partitioning, bulk delete support, Kafka Connect publication, and S3 analytics accelerator integration, all of which signal that Iceberg’s future will be even more performant, interoperable, and enterprise-ready.

If you’re building modern analytics, streaming, or AI workloads, Iceberg offers a foundation that evolves with you. Now is the perfect time to explore Iceberg 1.10.0, experiment with its new capabilities, and get involved in the community shaping its future.
