Metadata management and agent discoverability
Agentic Lakehouse
Built on Apache Polaris, one open catalog where every engine, agent, and knowledge worker can discover and access your data, with governance defined once and enforced everywhere

Built on Apache Polaris, Open Catalog gives every engine an entry point to your Iceberg tables, with autonomous table maintenance, unified governance, and Iceberg Clustering built on top, without locking you in.

Any Iceberg REST-compatible engine, tool, or agent (Spark, Flink, Trino, DuckDB, PyIceberg, and more) connects through the same interface. Open by standard, fully governed.

Fine-grained access control is enforced consistently across every connected engine, source, and agent. Role-based permissions, row filters, column masking, and credential vending, configured once at the catalog layer and applied everywhere.
HOW IT WORKS
Open Catalog sits at the center of your lakehouse. Every engine connects through the Iceberg REST API: Dremio, Spark, Flink, Trino, DuckDB, and more. Tables live in open format on your object storage. Governance and table maintenance are managed once at the catalog layer, and every dataset is discoverable by any connected engine or agent.

CAPABILITIES
COMPARE
See how Dremio's Intelligent Query Engine compares across the capabilities that matter most.
| Capability | Dremio | Snowflake | Databricks |
|---|---|---|---|
| Query Engine Architecture | Arrow-native. LLVM codegen. Iceberg-native. No format conversion required. | Proprietary columnar format with query compilation layer. | Photon engine with Delta Lake optimization. |
| Autonomous Query Acceleration | Reflections auto-optimize queries. ML-driven acceleration without manual tuning. | Materialized views require manual definition. | Delta caching with cluster-level optimization. |
| Federated Query | Query across any source without data movement. True federation with the semantic layer. | Limited federation via external tables. | Unity Catalog with lakehouse federation. |
| Open Catalog | Native Iceberg, Hive, Delta and Hudi support. Open catalog with no vendor lock-in. | Proprietary catalog with limited export. | Unity Catalog with Delta Lake focus. |
| Native Iceberg Support | Built for Iceberg from the ground up. Full spec compliance with time travel and schema evolution. | Iceberg tables are supported via external tables. | UniForm for Iceberg interoperability. |
| Query Caching | Multi-layer intelligent caching. Result, metadata, and reflection caching with predictive prefetch. | Result set caching at the warehouse level. | Delta cache and Spark cache layers. |
FAQs
Common questions about Dremio's Apache Polaris-based catalog, multi-engine interoperability, governance, and table management.
Open Catalog is Dremio’s robust catalog built on Apache Polaris, the open standard for Iceberg catalog interoperability. Dremio co-created Polaris and donated it to the Apache Software Foundation to keep it vendor-neutral.
Open Catalog delivers core Polaris metadata management and adds autonomous table maintenance, Iceberg Clustering, fine-grained governance, and deep integration with Dremio’s Intelligent Query Engine and AI Agent.
Any engine that implements the Iceberg REST Catalog specification can read from and write to Open Catalog: Spark, Flink, Trino, DuckDB, PyIceberg, StarRocks, and others. No proprietary connectors or SDKs required.
Dremio also connects to external catalogs as data sources, including AWS Glue, Databricks Unity Catalog, Snowflake Open Catalog, Nessie, S3 Tables, and Confluent Tableflow, all through the same Iceberg REST interface.
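As a rough illustration of what "connects through the Iceberg REST interface" can look like from a client, here is a minimal PyIceberg-style sketch. The URI, warehouse name, and token are placeholders, not real Open Catalog values, and the helper names `rest_catalog_properties` and `connect` are hypothetical. The property keys (`type`, `uri`, `warehouse`, `token`, and the `header.` prefix) follow PyIceberg's REST catalog configuration; whether your deployment needs the access-delegation header is an assumption to verify.

```python
from typing import Dict, Optional


def rest_catalog_properties(
    uri: str, warehouse: str, token: Optional[str] = None
) -> Dict[str, str]:
    """Build PyIceberg properties for an Iceberg REST catalog (hypothetical helper)."""
    props = {
        "type": "rest",          # use PyIceberg's REST catalog implementation
        "uri": uri,              # placeholder endpoint, supplied by the caller
        "warehouse": warehouse,  # placeholder warehouse / catalog name
        # Assumption: ask the catalog to vend temporary storage credentials.
        "header.X-Iceberg-Access-Delegation": "vended-credentials",
    }
    if token:
        props["token"] = token   # bearer token, if the deployment uses one
    return props


def connect(name: str, props: Dict[str, str]):
    """Open the catalog; pyiceberg is imported lazily so the sketch loads without it."""
    from pyiceberg.catalog import load_catalog

    return load_catalog(name, **props)


# Usage (needs pyiceberg installed and a reachable catalog):
# catalog = connect("open_catalog", rest_catalog_properties(
#     "https://catalog.example.com/api/catalog", "my_warehouse", token="..."))
# print(catalog.list_namespaces())
```

Because the configuration is plain Iceberg REST, the same endpoint and credentials should in principle work from any of the engines listed above, each using its own REST catalog settings.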
Governance is enforced through Dremio’s Open Catalog across all connected engines. It applies four layers of control: storage-level via credential vending (temporary, scoped credentials per query), table-level via role-based access control (RBAC) privilege grants, external-source access via impersonation, and column- and row-level via column masking and row-access policies.
Because policies live in the open source data catalog, not in any individual engine, one configuration covers Dremio, Spark, Flink, Trino, DuckDB, AI agents, and any other connected engine or source.
For queries running through Dremio’s engine or AI agents via MCP, all four layers apply. For external engines accessing tables via Iceberg REST directly, credential vending governs storage access; row and column policies are not yet enforced across the wire, which is a current industry-wide limitation.
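The split described above can be summarized as a small lookup. This is only a model of the statements in this answer, not Dremio code: the `Layer` and `AccessPath` names are hypothetical, and layers the text does not address for external engines are recorded as `None` (not stated) rather than guessed.

```python
from enum import Enum
from typing import Dict, Optional


class Layer(Enum):
    STORAGE = "credential vending"
    TABLE = "RBAC privilege grants"
    EXTERNAL_SOURCE = "impersonated access"
    COLUMN_ROW = "column masking and row-access policies"


class AccessPath(Enum):
    DREMIO_ENGINE = "query through Dremio's engine"
    MCP_AGENT = "AI agent via MCP"
    EXTERNAL_REST = "external engine via Iceberg REST"


def enforcement(path: AccessPath) -> Dict[Layer, Optional[bool]]:
    """Return True/False per layer, or None where the text makes no statement."""
    if path in (AccessPath.DREMIO_ENGINE, AccessPath.MCP_AGENT):
        # All four layers apply on these paths.
        return {layer: True for layer in Layer}
    return {
        Layer.STORAGE: True,           # credential vending governs storage access
        Layer.TABLE: None,             # not stated for this path
        Layer.EXTERNAL_SOURCE: None,   # not stated for this path
        Layer.COLUMN_ROW: False,       # not yet enforced across the wire
    }


for path in AccessPath:
    print(path.name, {layer.name: ok for layer, ok in enforcement(path).items()})
```

A check like this can help decide which workloads should be routed through Dremio's engine when row filters or column masks are mandatory.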
Auto-compaction and auto-cleanup run continuously, in real time, without any scheduling: small files are merged, orphaned files are removed, and old snapshots are expired within your retention window. Iceberg Clustering, which uses Z-order via space-filling curves to optimize query performance, is initiated by the user selecting clustering keys and running the operation.
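To give a feel for what a Z-order space-filling curve does, here is a generic two-column sketch. It is a textbook bit-interleaving illustration, not Dremio's clustering implementation; the function name `z_order` and the 16-bit width are assumptions made for the example.

```python
def z_order(x: int, y: int, bits: int = 16) -> int:
    """Interleave the low `bits` bits of x and y into one Z-order (Morton) value."""
    z = 0
    for i in range(bits):
        z |= ((x >> i) & 1) << (2 * i)       # x bits go to even positions
        z |= ((y >> i) & 1) << (2 * i + 1)   # y bits go to odd positions
    return z


# Rows keyed by two clustering columns, in arbitrary arrival order.
points = [(3, 5), (1, 1), (0, 1), (2, 0), (1, 0), (0, 0)]

# Sorting by the interleaved value keeps rows that are close in BOTH
# columns close together on disk, which helps file pruning on either key.
ordered = sorted(points, key=lambda p: z_order(*p))
print(ordered)
```

Sorting on a single column would cluster well for that column only; interleaving trades some locality on each key for usable locality on all of them.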
Iceberg-native time travel via AT SNAPSHOT or AT TIMESTAMP remains available throughout the snapshot retention period.
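As a sketch of the two time-travel forms named above, the helpers below only build SQL strings. The helper names and the table name are hypothetical, the table identifier is assumed to be trusted input, and the exact literal formats should be checked against Dremio's SQL reference before use.

```python
from datetime import datetime


def at_snapshot(table: str, snapshot_id: int) -> str:
    """Query a table as of a specific Iceberg snapshot ID."""
    # int() guards against anything other than a numeric snapshot ID.
    return f"SELECT * FROM {table} AT SNAPSHOT '{int(snapshot_id)}'"


def at_timestamp(table: str, ts: datetime) -> str:
    """Query a table as of a point in time (millisecond precision assumed)."""
    stamp = ts.strftime("%Y-%m-%d %H:%M:%S") + f".{ts.microsecond // 1000:03d}"
    return f"SELECT * FROM {table} AT TIMESTAMP '{stamp}'"


print(at_snapshot("sales.orders", 5393090506354317772))
print(at_timestamp("sales.orders", datetime(2024, 1, 15, 8, 30, 0)))
```

Either query only succeeds while the referenced snapshot is still inside the retention window, since expired snapshots are removed by auto-cleanup.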
Open Catalog exposes every cataloged dataset, including AI-generated wikis, labels, data lineage and semantic descriptions, through Dremio’s MCP Server. Any MCP-compatible agent (Claude, ChatGPT, Cursor, or a custom agent) can discover data assets by description, inspect schemas, trace lineage, and execute governed SQL.
The semantic metadata in the catalog tells agents not just what a table contains but what it means in business terms, which is the difference between a hallucinated answer and a correct one.
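For readers curious what an agent actually sends, MCP is built on JSON-RPC 2.0, and a tool invocation uses the `tools/call` method. The sketch below builds such a message. The tool name `run_sql` and its `query` argument are hypothetical placeholders, not confirmed names from Dremio's MCP Server; a real agent would first discover the available tools with `tools/list`.

```python
import json
from typing import Any, Dict


def list_tools_request(request_id: int) -> Dict[str, Any]:
    """JSON-RPC message asking an MCP server which tools it offers."""
    return {"jsonrpc": "2.0", "id": request_id, "method": "tools/list"}


def call_tool_request(request_id: int, tool: str, arguments: Dict[str, Any]) -> Dict[str, Any]:
    """JSON-RPC message invoking one MCP tool with named arguments."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool, "arguments": arguments},
    }


# Hypothetical tool and argument names, for illustration only.
request = call_tool_request(
    2, "run_sql", {"query": "SELECT COUNT(*) FROM sales.orders"}
)
print(json.dumps(request, indent=2))
```

In practice an MCP client library handles this framing; the point is that the agent's SQL reaches the catalog through a governed server-side tool rather than through direct storage access.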
RESOURCES
Apache Polaris provides the open foundation, but enterprises running enterprise-wide, AI-driven platforms need more than a standards-compliant catalog service. They need enterprise governance, automation, and business context that…
The Dremio Open Catalog is the vital bridge between open-source flexibility and enterprise-grade AI readiness. By moving from a "static mirror" of Iceberg files to a semantic, agent-ready architecture, Dremio allows organizations to finally realize the promise of a self-managing data lakehouse.
Dremio Catalog fundamentally changes the game for data lakehouse adoption. By providing a free, fully managed, and feature-rich service built on the open Apache Polaris project, Dremio removes the significant barriers to standing up a universal Iceberg catalog.