Every major data platform now lists Apache Iceberg somewhere on its feature page. Snowflake has Iceberg Tables. Databricks has UniForm. BigQuery has BigLake. This is a genuinely good thing for the ecosystem because it gives users more choice and more portability.
But "supports Iceberg" and "Iceberg native" are not the same thing. The distinction matters when you are deciding where your data gravity should live: where your primary analytical data is stored, optimized, and governed. A platform that bolts Iceberg support onto a proprietary core makes different engineering tradeoffs than one built on Iceberg from the ground up.
Dremio claims to be Iceberg native. Here are the three things that back up that claim.
Iceberg by Default, Not by Request
There is no "Dremio Table" format. When you create a table in Dremio, it is an Apache Iceberg table. There is no proprietary alternative, no internal format that gets better performance, no checkbox to opt into openness. Iceberg is the only option because it is the foundation of the platform.
This is a bigger deal than it sounds. Engineering priority follows the default format. When a platform's default is a proprietary format, that is where the optimization budget goes: faster writes, smarter compaction, deeper query optimizer integration. Iceberg support becomes a secondary project staffed by a smaller team, maintained to a "good enough" standard.
Dremio's engineers spend their time on one thing: making the Dremio query engine faster on Apache Iceberg tables and operations. Every query optimizer improvement, every table management feature, every caching enhancement targets Iceberg directly. There is no internal format competing for attention.
The practical result: you don't face a choice between "fast but proprietary" and "open but slower." In Dremio, Iceberg IS the fast format.
Iceberg-Native Acceleration
Speed is where the "supports Iceberg" vs. "Iceberg native" distinction gets concrete.
Most platforms that added Iceberg support did not build their query engines for it. Their engines are tuned for their proprietary format. So when customers ask how to make Iceberg queries faster, the answer is often: create materialized views stored in the platform's proprietary format, in the platform's managed storage.
Think about what that means. The whole point of adopting Iceberg was to avoid writing data into proprietary formats. But the performance optimization path pushes you right back into one.
Dremio does the opposite.
Reflections: Iceberg as the Faster Format
Dremio's Reflections are pre-computed, optimized materializations stored as Apache Iceberg tables. When you query a federated source like PostgreSQL or MongoDB through Dremio, the engine can create a Reflection in Iceberg format that serves future queries faster.
Read that again: Dremio uses Iceberg to speed up non-Iceberg sources. The industry pattern is "use a proprietary format to speed up Iceberg." Dremio flips it: "use Iceberg to speed up everything."
Autonomous Reflections take this further. Dremio analyzes your query patterns over a 7-day window and automatically creates, manages, and drops Reflections without human intervention. Reflections on Iceberg tables support incremental updates, so only changed data gets reprocessed. No full rebuilds.
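The incremental-update idea is worth making concrete. Because Iceberg tracks changes as snapshots, a materialization only needs to process snapshots created since its last refresh. The sketch below illustrates that mechanic with hypothetical names (`Snapshot`, `Reflection`, `incremental_refresh`); it is a toy model of the concept, not Dremio's implementation.

```python
from dataclasses import dataclass, field

@dataclass
class Snapshot:
    """Toy stand-in for an Iceberg snapshot: an id plus the rows it added."""
    snapshot_id: int
    added_rows: list

@dataclass
class IcebergTableStub:
    """Minimal model of a table's snapshot log (illustrative only)."""
    snapshots: list = field(default_factory=list)

@dataclass
class Reflection:
    """A materialization that remembers the last snapshot it processed."""
    last_refreshed_snapshot: int = 0
    materialized_rows: list = field(default_factory=list)

    def incremental_refresh(self, table: IcebergTableStub) -> int:
        """Reprocess only snapshots newer than the last refresh.

        Returns the number of snapshots processed; a full rebuild is
        never needed because Iceberg's snapshot log tells us exactly
        what changed.
        """
        new = [s for s in table.snapshots
               if s.snapshot_id > self.last_refreshed_snapshot]
        for s in new:
            self.materialized_rows.extend(s.added_rows)
            self.last_refreshed_snapshot = s.snapshot_id
        return len(new)
```

After an initial refresh over the whole snapshot log, each subsequent refresh touches only the snapshots appended since, which is why changed data, not the whole table, is what gets reprocessed.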
Why the Engine Is Fast on Iceberg Natively
Dremio's speed on Iceberg is not just about caching tricks. It is architectural:
Apache Arrow as the native in-memory format: Dremio co-created Apache Arrow. Data stays in columnar Arrow format in memory, eliminating the serialization overhead traditional engines pay when converting between formats.
Columnar Cloud Cache (C3): Stores frequently accessed data on local NVMe drives at executor nodes. Cloud storage latency becomes local-disk speed.
Results and Query Plan Cache: Identical queries return instantly.
The combination means Dremio doesn't need a proprietary format to deliver fast analytics. Iceberg plus Arrow plus aggressive caching is the performance strategy.
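To see why a local block cache like C3 turns cloud-storage latency into local-disk speed, consider a read-through cache: the first read of a block pays the cloud round trip, every repeat read is served locally, and least-recently-used blocks are evicted when space runs out. This is a deliberately simplified sketch of the general pattern, with hypothetical names; it is not Dremio's C3 implementation.

```python
from collections import OrderedDict

class ReadThroughCache:
    """Toy read-through LRU block cache in the spirit of a local
    NVMe cache in front of cloud object storage (illustrative only)."""

    def __init__(self, capacity: int, fetch_from_cloud):
        self.capacity = capacity
        self.fetch = fetch_from_cloud  # slow path: cloud object storage
        self.blocks = OrderedDict()    # (path, block_no) -> bytes
        self.hits = 0
        self.misses = 0

    def read_block(self, path: str, block_no: int) -> bytes:
        key = (path, block_no)
        if key in self.blocks:
            self.hits += 1
            self.blocks.move_to_end(key)   # mark as recently used
            return self.blocks[key]
        self.misses += 1
        data = self.fetch(path, block_no)  # pay cloud latency once
        self.blocks[key] = data
        if len(self.blocks) > self.capacity:
            self.blocks.popitem(last=False)  # evict least recently used
        return data
```

Any block of a Parquet data file that is read twice costs the network only once, which is the whole economic argument for caching hot columnar data at the executor nodes.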
Operationalizing Iceberg
Apache Iceberg is a table format specification. It is not, by itself, a managed analytics experience. The gap between "I have Iceberg tables" and "I have a production lakehouse" is filled with operational work that never ends: choosing a catalog, configuring compaction jobs, monitoring small file accumulation, managing schema evolution, vacuuming obsolete snapshots, and building alerting for all of it.
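One small taste of that never-ending work: snapshot expiration. A DIY lakehouse team has to decide a retention policy and script it themselves. The sketch below shows the kind of logic involved, using a hypothetical function name and a simple policy (expire snapshots older than a retention window, but always protect the most recent few); real Iceberg engines expose this as an expire-snapshots procedure with their own semantics.

```python
from datetime import timedelta

def select_snapshots_to_expire(snapshots, now, retain_days=7, keep_last=3):
    """Pick snapshot ids that are safe to expire: older than the
    retention window, while always keeping the `keep_last` most
    recent snapshots for time travel and rollback.

    `snapshots` is a list of {"id": int, "ts": datetime} dicts.
    A DIY-maintenance sketch, not a production policy.
    """
    cutoff = now - timedelta(days=retain_days)
    by_age = sorted(snapshots, key=lambda s: s["ts"], reverse=True)
    protected = {s["id"] for s in by_age[:keep_last]}
    return [s["id"] for s in by_age
            if s["ts"] < cutoff and s["id"] not in protected]
```

And this is only one chore: the same team still has to schedule it, monitor it, and repeat the exercise for compaction, clustering, and orphan-file cleanup on every table.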
Dremio closes this gap so that working with Iceberg feels like working with a traditional data warehouse: you onboard datasets, write SQL, use AI agents, and run analytics. The open format stays open underneath.
Built-In Apache Polaris Catalog
Dremio's Open Catalog is built on Apache Polaris, the open Iceberg REST catalog standard. It tracks and governs your lakehouse assets with Role-Based Access Control (RBAC) at the folder, dataset, and column level, plus Fine-Grained Access Control (FGAC) via UDFs for row-level security and column-level masking.
Dremio can also connect to your external Iceberg catalogs (AWS Glue, Unity Catalog, and others) alongside the built-in one, so you govern everything from a single namespace.
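The shape of fine-grained access control is easier to see with a concrete rule. The sketch below models a column-masking rule and a row-level filter as plain functions; the names and policies are hypothetical, and in Dremio these rules would actually be expressed as SQL UDFs attached to datasets, not Python.

```python
def mask_email(email: str, can_see_pii: bool) -> str:
    """Column-masking rule in the spirit of an FGAC masking policy:
    privileged users see the full value, everyone else a redacted one."""
    if can_see_pii:
        return email
    local, _, domain = email.partition("@")
    return local[:1] + "***@" + domain

def row_filter(row: dict, user_region: str) -> bool:
    """Row-level security rule: users see only rows for their region."""
    return row["region"] == user_region
```

The point is that the policy lives with the governed dataset and is evaluated per query, so every engine path through the catalog sees the same masked and filtered view.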
Iceberg V3 Spec Support
Dremio tracks the latest Apache Iceberg spec features, including Variant type for semi-structured data and Geospatial types. Being Iceberg native means adopting new spec capabilities as they land, not waiting for them to be translated through a compatibility layer.
Automated Table Maintenance
Background jobs handle the operational work that makes DIY lakehouses "perpetually a work-in-progress":
| Maintenance Task    | What Dremio Automates                                                 |
| ------------------- | --------------------------------------------------------------------- |
| Compaction          | Consolidates small files into optimally sized files                    |
| Clustering          | Sorts data by frequently filtered columns for faster queries           |
| Snapshot management | Vacuums obsolete snapshots and orphan files to control storage costs   |
| Manifest rewriting  | Optimizes metadata structure for faster query planning                 |
These jobs run based on actual table usage patterns. No cron jobs. No custom scripts. No on-call rotations for your data lake.
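To make the compaction row concrete: the core of any compaction planner is grouping small files into rewrite jobs of roughly a target size. Here is a first-fit sketch of that idea with hypothetical parameter values; Dremio's actual policy, thresholds, and trigger conditions are internal to the platform.

```python
def plan_compaction(file_sizes_mb, target_mb=128, small_threshold_mb=32):
    """Group small data files into compaction jobs of roughly
    `target_mb` each. Each job would rewrite its group into one
    larger file. A first-fit sketch of the idea, not Dremio's policy."""
    small = [s for s in file_sizes_mb if s < small_threshold_mb]
    jobs, current, total = [], [], 0
    for size in sorted(small, reverse=True):
        if total + size > target_mb and current:
            jobs.append(current)      # close the current job
            current, total = [], 0
        current.append(size)
        total += size
    if current:
        jobs.append(current)
    return jobs
```

Files already near the target size are left alone; only the long tail of small files, the usual byproduct of streaming or frequent small writes, gets rewritten.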
The Tradeoffs
Being Iceberg native is a deliberate architectural choice, and it comes with boundaries.
Dremio is an analytical engine. OLTP workloads (high-frequency transactional writes) belong in purpose-built databases like PostgreSQL or MongoDB. You can query those systems through Dremio's federation, but you should not try to replace them.
Dremio's opinionated commitment to Iceberg also means it is not the right fit if your organization's strategy centers on a different format. If Delta Lake is your standard and you have no plans to move, a Delta-native platform is the better match, though Dremio can still read UniForm tables in Unity Catalog for data unification use cases.
And Iceberg itself is still evolving. V3 features like Variant and Geospatial are new, and the tooling ecosystem around them is maturing. Being on the leading edge of the spec means occasional rough edges.
Where Your Data Gravity Should Live
It is a great thing that so many platforms now support Apache Iceberg. More support means more flexibility for everyone. But if your intention is to make Iceberg your primary analytics format, then "supports Iceberg" and "built for Iceberg" lead to very different outcomes.
A platform where Iceberg is the default format, where acceleration stays in Iceberg, and where the operational complexity of running a lakehouse is abstracted away is a platform built for that Iceberg-native use case. That is what Dremio is designed to be.