Metadata Management and Agent Discoverability
Agentic Lakehouse
One open catalog, built on Apache Polaris, where every engine, agent, and knowledge worker can discover and access your data, with governance defined once and enforced everywhere.

Built on Apache Polaris, Open Catalog gives every engine an entry point to your Iceberg tables, with autonomous table maintenance, unified governance, and Iceberg Clustering built on top, without locking you in.

Any Iceberg REST-compatible engine, tool, or agent (Spark, Flink, Trino, DuckDB, PyIceberg, and more) connects through the same governed interface. Open by standard, fully governed.

Fine-grained access control enforced consistently across every connected engine, source, and agent. Role-based permissions, row filters, column masking, and credential vending, configured once at the catalog layer and applied everywhere.
HOW IT WORKS
Open Catalog sits at the center of your lakehouse. Every engine connects through the Iceberg REST API: Dremio, Spark, Flink, Trino, DuckDB, and more. Tables live in open format on your object storage. Governance and table maintenance are managed once at the catalog layer, and every dataset is discoverable by any connected engine or agent.
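To make the "every engine connects through the Iceberg REST API" idea concrete, here is a minimal sketch of how a client might build the request URLs and headers for listing and loading tables. The path shapes and the `X-Iceberg-Access-Delegation` header follow the Iceberg REST Catalog specification; the base URL, prefix, and token are placeholders, not actual Open Catalog values.

```python
from urllib.parse import quote

# Hypothetical base URL; substitute the REST endpoint of your own catalog.
BASE_URL = "https://catalog.example.com/api/catalog"

# The Iceberg REST spec joins multi-level namespace parts with the
# unit separator character (0x1F) before URL-encoding them.
NAMESPACE_SEPARATOR = "\x1f"


def namespace_path(parts):
    """Encode a namespace such as ("sales", "emea") for use in a URL path."""
    return quote(NAMESPACE_SEPARATOR.join(parts), safe="")


def list_tables_url(prefix, namespace):
    """URL for GET /v1/{prefix}/namespaces/{namespace}/tables."""
    return (
        f"{BASE_URL}/v1/{quote(prefix, safe='')}"
        f"/namespaces/{namespace_path(namespace)}/tables"
    )


def load_table_url(prefix, namespace, table):
    """URL for GET /v1/{prefix}/namespaces/{namespace}/tables/{table}."""
    return f"{list_tables_url(prefix, namespace)}/{quote(table, safe='')}"


def request_headers(token):
    """Headers sent with each request; the token value is a placeholder."""
    return {
        "Authorization": f"Bearer {token}",
        # Ask the catalog to vend temporary storage credentials with the table.
        "X-Iceberg-Access-Delegation": "vended-credentials",
    }
```

In practice an engine or a library such as PyIceberg issues these requests for you; the sketch only shows that the interface is plain HTTP defined by an open specification.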

COMPARE
See how Dremio's Intelligent Query Engine compares across the capabilities that matter most.
| Capability | Dremio | Snowflake | Databricks |
|---|---|---|---|
| Query Engine Architecture | Arrow-native. LLVM codegen. Iceberg-native. No format conversion required. | Proprietary columnar format with query compilation layer. | Photon engine with Delta Lake optimization. |
| Autonomous Query Acceleration | Reflections auto-optimize queries. ML-driven acceleration without manual tuning. | Materialized views require manual definition. | Delta caching with cluster-level optimization. |
| Federated Query | Query across any source without data movement. True federation with semantic layer. | Limited federation via external tables. | Unity Catalog with lakehouse federation. |
| Open Catalog | Native Iceberg, Hive, Delta, Hudi support. Open catalog with no vendor lock-in. | Proprietary catalog with limited export. | Unity Catalog with Delta Lake focus. |
| Native Iceberg Support | Built for Iceberg from the ground up. Full spec compliance with time travel and schema evolution. | Iceberg tables supported via external tables. | UniForm for Iceberg interoperability. |
| Query Caching | Multi-layer intelligent caching. Result, metadata, and reflection caching with predictive prefetch. | Result set caching at warehouse level. | Delta cache and Spark cache layers. |
FAQs
Common questions about Dremio's Apache Polaris-based catalog, multi-engine interoperability, governance, and table management.
Open Catalog is Dremio’s catalog built on Apache Polaris, the open standard for Iceberg catalog interoperability. Dremio co-created Polaris and donated it to the Apache Software Foundation to keep it vendor-neutral. Open Catalog delivers core Polaris metadata management and adds its own features on top: autonomous table maintenance, Iceberg Clustering, fine-grained governance, and deep integration with Dremio’s query engine and AI Agent.
Any engine that implements the Iceberg REST Catalog specification can read from and write to Open Catalog: Spark, Flink, Trino, DuckDB, PyIceberg, StarRocks, and others. No proprietary connectors or SDKs are required. Dremio also connects to external catalogs as sources, including AWS Glue, Databricks Unity Catalog, Snowflake Open Catalog, Nessie, S3 Tables, and Confluent Tableflow, all through the same Iceberg REST interface.
Governance is enforced through Dremio’s Open Catalog across all connected engines. Open Catalog applies four layers of control: storage-level via credential vending (temporary, scoped credentials per query), table-level via RBAC privilege grants, external source access via impersonated access, and column-and-row-level via column masking and row-access policies. Because policies live in the catalog, not in any individual engine, one configuration covers Dremio, Spark, Flink, Trino, DuckDB, AI agents, and any other connected engine or source. For queries running through Dremio’s engine or AI agents via MCP, all four layers apply. For external engines accessing tables via Iceberg REST directly, credential vending governs storage access; row and column policies are not yet enforced across the wire, which is a current industry-wide limitation.
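The difference between the two access paths described above can be illustrated with a toy model. This is a sketch under stated assumptions, not Dremio's implementation: the `AccessPath` names, the region-based row policy, and the email mask are all hypothetical, chosen only to show where row and column policies take effect.

```python
from enum import Enum


class AccessPath(Enum):
    DREMIO_ENGINE = "dremio_engine"   # SQL executed by Dremio's own engine
    MCP_AGENT = "mcp_agent"           # AI agent going through MCP
    EXTERNAL_REST = "external_rest"   # Spark, Flink, Trino, ... via Iceberg REST


# Paths on which the text says row and column policies are enforced.
POLICY_ENFORCING_PATHS = {AccessPath.DREMIO_ENGINE, AccessPath.MCP_AGENT}


def mask_email(value):
    """Toy column mask: keep the domain, hide the local part."""
    _, _, domain = value.partition("@")
    return "***@" + domain


def read_rows(rows, path, user_region):
    """Return rows as a reader on `path` would see them in this toy model."""
    if path not in POLICY_ENFORCING_PATHS:
        # External engines read data files directly with vended credentials,
        # so row and column policies are not applied on this path.
        return [dict(r) for r in rows]
    # Hypothetical row-access policy: users see only their own region.
    visible = [r for r in rows if r["region"] == user_region]
    # Hypothetical column mask applied to the email column.
    return [{**r, "email": mask_email(r["email"])} for r in visible]


ROWS = [
    {"id": 1, "region": "EU", "email": "ana@example.com"},
    {"id": 2, "region": "US", "email": "bob@example.com"},
]

print(read_rows(ROWS, AccessPath.MCP_AGENT, "EU"))
print(read_rows(ROWS, AccessPath.EXTERNAL_REST, "EU"))
```

The first call returns only the EU row with a masked email; the second returns both rows unmasked, which is why storage-level credential scoping matters for engines that bypass Dremio's query path.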
Auto-compaction and auto-cleanup run continuously without any scheduling: small files are merged, orphaned files removed, and old snapshots expired within your retention window. Iceberg Clustering, which uses Z-order via space-filling curves to optimize query performance, is initiated by the user selecting clustering keys and running the operation. Iceberg-native time travel via AT SNAPSHOT or AT TIMESTAMP remains available throughout the snapshot retention period.
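For readers unfamiliar with Z-ordering, the following is a generic sketch of a Z-order (Morton) space-filling curve for two integer clustering keys. It is not Dremio's clustering code, which the text does not describe; the function names and the 16-bit key width are assumptions made for illustration.

```python
def interleave_bits(x, y, bits=16):
    """Return the Z-order (Morton) code of two non-negative integers."""
    code = 0
    for i in range(bits):
        code |= ((x >> i) & 1) << (2 * i)       # x bits go to even positions
        code |= ((y >> i) & 1) << (2 * i + 1)   # y bits go to odd positions
    return code


def z_order_sort(rows, key_x, key_y):
    """Sort rows so that rows close in both keys tend to sit close together."""
    return sorted(rows, key=lambda r: interleave_bits(r[key_x], r[key_y]))


points = [
    {"x": 1, "y": 1},
    {"x": 0, "y": 1},
    {"x": 1, "y": 0},
    {"x": 0, "y": 0},
]
print(z_order_sort(points, "x", "y"))
```

Writing data files in this order means each file covers a compact range of both keys, so a filter on either key can skip more files than a sort on one key alone would allow.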
Open Catalog exposes every cataloged dataset, including AI-generated wikis, labels, lineage, and semantic descriptions, through Dremio’s MCP Server. Any MCP-compatible agent (Claude, ChatGPT, Cursor, or a custom agent) can discover datasets by description, inspect schemas, trace lineage, and execute governed SQL. The semantic metadata in the catalog tells agents not just what a table contains but what it means in business terms, which is the difference between a hallucinated answer and a correct one.
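As a rough illustration of what an MCP-compatible agent sends, the sketch below builds the JSON-RPC 2.0 messages that the Model Context Protocol defines for listing and calling tools. The tool name `run_sql` and its `sql` argument are hypothetical; the tools Dremio's MCP Server actually exposes should be discovered with `tools/list`.

```python
import json
from itertools import count

_request_ids = count(1)  # JSON-RPC requests need unique ids


def tools_list_request():
    """Ask the MCP server which tools it exposes."""
    return {"jsonrpc": "2.0", "id": next(_request_ids), "method": "tools/list"}


def tool_call_request(tool_name, arguments):
    """Invoke one tool by name with a dictionary of arguments."""
    return {
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }


# Hypothetical tool name and argument; the real names come from tools/list.
request = tool_call_request("run_sql", {"sql": "SELECT 1"})
print(json.dumps(request, indent=2))
```

Because the agent only ever speaks this protocol, the SQL it submits is executed by the server side, where the catalog's governance policies apply.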
RESOURCES
Apache Polaris provides the open foundation, but enterprises running enterprise-wide, AI-driven platforms need more than a standards-compliant catalog service. They need enterprise governance, automation, and business context that…
The Dremio Open Catalog is the vital bridge between open-source flexibility and enterprise-grade AI readiness. By moving from a "static mirror" of Iceberg files to a semantic, agent-ready architecture, Dremio allows organizations to finally realize the promise of a self-managing data lakehouse.
Dremio Catalog fundamentally changes the game for data lakehouse adoption. By providing a free, fully managed, and feature-rich service built on the open Apache Polaris project, Dremio removes the significant barriers to standing up a universal Iceberg catalog.