6 minute read · April 8, 2025

Introducing the Enterprise Catalog, Powered By Apache Polaris (Incubating)

Ben Hudson · Principal Product Manager, Dremio

Companies of all sizes now use lakehouse architectures to power their analytics and AI workloads. Lakehouses give companies a single, trusted source of data for analytics and AI tools to access, which eliminates data duplication and avoids vendor lock-in.

The catalog, or metastore, is an integral part of the lakehouse that enables tools to discover, read, and write data in a safe and governed way. While the lakehouse offers a more streamlined, open data architecture than legacy, warehouse-based architectures, the catalog has been a source of friction for companies that want to adopt a lakehouse. Beyond the headache of choosing among multiple catalog offerings, customers have had to procure, provision, and manage their catalog, and figure out how to get enterprise support for it, as part of their lakehouse architecture.

Today, we’re excited to announce Dremio’s Enterprise Catalog powered by Apache Polaris (incubating), making Dremio the easiest way for customers to build a data lakehouse on their own terms.

"Empowering our customers with robust data governance on STACKIT is crucial. Dremio Catalog provides a secure central platform to manage data and ensures full data sovereignty within the trusted STACKIT Cloud," Benjamin Schweizer, Senior Manager, Domain Product Owner, STACKIT

Open Foundation, Enterprise Scale

The catalog is the heart of the lakehouse architecture, and enables companies to build an interoperable foundation for analytics and AI. Apache Polaris (incubating) is a catalog that implements the Iceberg REST Catalog spec and provides centralized, secure access to Iceberg tables across different REST-compatible query engines. Since its release in June 2024, Polaris has been quickly embraced by the Iceberg community, and boasts a diverse group of contributors from companies like Dremio, Snowflake, and AWS.

Polaris provides foundational catalog capabilities for companies to build an open, interoperable lakehouse. However, the catalog also needs to provide the security, automation, and support to enable enterprises to run production workloads at scale.

With Dremio’s newest release, companies now get an enterprise-grade catalog out of the box, powered by Polaris, that enables them to:

Use any engine to read and write data

Read and write from the Enterprise Catalog using any engine or framework compatible with the Iceberg REST API. For example, use Spark or Flink to ingest data into the catalog, and then use Dremio to curate and serve data products built on that data.
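
As a rough sketch of what connecting an engine can look like, the snippet below builds the Spark properties that Apache Iceberg's Spark integration uses to register a REST catalog. The catalog name, endpoint URI, warehouse name, and token shown here are placeholders chosen for illustration, not values documented in this post; the real endpoint and authentication settings come from your own deployment.

```python
def rest_catalog_spark_conf(catalog_name, uri, warehouse, token=None):
    """Build Spark properties that register an Iceberg REST catalog."""
    prefix = f"spark.sql.catalog.{catalog_name}"
    conf = {
        # Iceberg's Spark catalog implementation class
        prefix: "org.apache.iceberg.spark.SparkCatalog",
        # Tell Iceberg to use its REST catalog client
        f"{prefix}.type": "rest",
        f"{prefix}.uri": uri,
        f"{prefix}.warehouse": warehouse,
    }
    if token is not None:
        # Bearer token passed to the REST catalog (placeholder value below)
        conf[f"{prefix}.token"] = token
    return conf


# Hypothetical values, for illustration only.
conf = rest_catalog_spark_conf(
    catalog_name="enterprise",
    uri="https://catalog.example.com/api/catalog",
    warehouse="my_warehouse",
    token="<bearer-token>",
)
for key, value in sorted(conf.items()):
    print(f"--conf {key}={value}")
```

The printed flags can be passed to spark-submit or spark-sql; other REST-compatible engines take the same kind of information (endpoint, warehouse, credentials) in their own configuration format.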

Meet compliance and security requirements

Secure data using Role-Based Access Control (RBAC) privileges, and ensure users only access the data they need with row filters and column masks. For example, create a column mask to obfuscate credit card numbers, or create a row filter on your employee details table that only returns rows with employees in your region.
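
To make the two examples above concrete, here is a small Python sketch of what a column mask and a row filter do to query results. It only illustrates the semantics; it is not Dremio's policy syntax, and the function names, field names, and sample rows are hypothetical.

```python
def mask_card_number(card_number):
    """Hide all but the last four digits of a card number."""
    digits = [ch for ch in card_number if ch.isdigit()]
    if len(digits) <= 4:
        return "*" * len(digits)
    return "*" * (len(digits) - 4) + "".join(digits[-4:])


def apply_policies(rows, user_region, can_see_cards=False):
    """Return rows in the user's region, masking card numbers unless permitted."""
    visible = []
    for row in rows:
        # Row filter: skip rows outside the user's region.
        if row["region"] != user_region:
            continue
        out = dict(row)  # copy so the stored data is never modified
        # Column mask: obfuscate the sensitive column for unprivileged users.
        if not can_see_cards:
            out["card_number"] = mask_card_number(row["card_number"])
        visible.append(out)
    return visible


# Hypothetical sample data.
employees = [
    {"name": "Ana", "region": "EMEA", "card_number": "4111 1111 1111 1111"},
    {"name": "Raj", "region": "APAC", "card_number": "5500 0000 0000 0004"},
]
print(apply_policies(employees, user_region="EMEA"))
```

In a catalog, the same logic is attached to the table as a policy, so every engine and user that reads the table gets the filtered and masked view without any change to the underlying data.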

Enable data analysts

Use natural language to discover AI-ready data products in Dremio’s Enterprise Catalog, and rely on built-in descriptions and labels to understand how to use them to answer business questions. Use built-in lineage graphs to see how data products are derived and transformed, and to assess the impact of changes on downstream datasets.

Automate maintenance operations

Dremio’s Enterprise Catalog automates Iceberg maintenance operations like compaction and vacuum, which maximizes query performance, minimizes storage costs, and eliminates the need to run manual data maintenance. The catalog also simplifies Iceberg table management and, with support for Iceberg clustering keys, eliminates the risk of poor performance from sub-optimal data layouts.
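
For readers new to compaction, the idea is to rewrite many small data files into fewer, larger ones so queries open fewer files. The sketch below is a simplified illustration of planning such a rewrite with a greedy grouping; it is not Dremio's algorithm, and the function name and the 256 MB target are assumptions made for the example.

```python
def plan_compaction(file_sizes_mb, target_mb=256):
    """Greedily group small files into batches of at most target_mb each."""
    # Only files below the target size are candidates for rewriting.
    small = sorted(size for size in file_sizes_mb if size < target_mb)
    groups, current, current_size = [], [], 0
    for size in small:
        # Start a new batch when adding this file would exceed the target.
        if current and current_size + size > target_mb:
            groups.append(current)
            current, current_size = [], 0
        current.append(size)
        current_size += size
    if current:
        groups.append(current)
    # A batch with a single file gains nothing from being rewritten.
    return [group for group in groups if len(group) > 1]


# Hypothetical file sizes in MB.
print(plan_compaction([10, 20, 300, 100, 200]))
```

Each returned group would be rewritten as one larger file. An automated service runs this kind of planning on a schedule, so nobody has to trigger it by hand.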

What's Next?

To learn more, register for our Spring 2025 Product Release virtual event on April 29th. For a deep dive into this topic, join us for Getting Started with Dremio’s Enterprise Catalog Powered by Apache Polaris (incubating) on May 20th.

Ready to get started? Try Dremio for free today or contact our team to schedule a personalized demo.
