Dremio Lakehouse Management

The easiest way to manage Iceberg data lakehouses

Dremio Lakehouse Management Service

Dremio delivers the only data lakehouse management service that features Git for Data, automatic optimization of Apache Iceberg tables, and a commitment to open data standards.

Engine Freedom

Git for Data

Automatic Data Optimization

Dremio Lakehouse Management makes it easy for data teams to manage and maintain their data lakehouse while enabling high-performance analytics based on a consistent view of the data.

Dremio provides a Lakehouse Catalog

Dremio provides an Iceberg-native data catalog that enables easy management of multiple domains and gives data teams centralized security and governance.

Iceberg-Native

- Support for Project Nessie (the catalog) is built into the open source Apache Iceberg project.
- Use a variety of Iceberg-compatible engines, including Dremio, Spark, and Flink.

Multiple Domains

- Create multiple isolated domains (catalogs) in an organization, each containing a folder hierarchy of tables and views.
- Designed to enable data mesh, including federated ownership and data sharing.

Access Control (Coming soon!)

- Table-, column-, and row-based access control.
- Custom roles and integration with existing user/group directories such as Microsoft Entra ID (formerly Azure Active Directory) and Okta.

Git for Data enables catalog-level data versioning

Dremio uses Git-inspired versioning capabilities such as branches, tags, and commits to simplify data management, eliminate expensive data copies, and give data consumers a consistent view of the data.

Isolation

Easily create zero-copy clones of production data with branching for ETL workloads, data science and experimentation, development and testing, and more. Thoroughly test changes before exposing them atomically to production users.
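
To make the zero-copy idea concrete, here is a minimal Python sketch of a toy catalog. It is not Dremio's or Nessie's implementation; the `Catalog` and `Commit` classes, the table name, and the metadata paths are all hypothetical. It only illustrates that a branch can be a named pointer to a commit, so creating one copies no data, and that publishing can be a single pointer update.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True, eq=False)
class Commit:
    """An immutable catalog state: table name -> metadata pointer (hypothetical model)."""
    tables: Dict[str, str]
    message: str
    parent: Optional["Commit"] = None


class Catalog:
    """Toy catalog in which a branch is only a name pointing at a commit."""

    def __init__(self) -> None:
        self.branches: Dict[str, Commit] = {"main": Commit({}, "init")}

    def create_branch(self, name: str, source: str = "main") -> None:
        # Zero-copy: the new branch shares the source branch's commit object.
        self.branches[name] = self.branches[source]

    def commit(self, branch: str, table: str, pointer: str, message: str) -> None:
        # A commit records a new table pointer on one branch only.
        head = self.branches[branch]
        self.branches[branch] = Commit({**head.tables, table: pointer}, message, head)

    def merge(self, source: str, target: str = "main") -> None:
        # Fast-forward only: the target must be an ancestor of the source.
        src, tgt = self.branches[source], self.branches[target]
        node: Optional[Commit] = src
        while node is not None and node is not tgt:
            node = node.parent
        if node is None:
            raise ValueError("target has diverged; not a fast-forward")
        # Publishing is one pointer update, so readers see all changes at once.
        self.branches[target] = src


catalog = Catalog()
catalog.commit("main", "sales.orders", "orders/v1.metadata.json", "initial load")
catalog.create_branch("etl")
catalog.commit("etl", "sales.orders", "orders/v2.metadata.json", "nightly ETL")
print(catalog.branches["main"].tables["sales.orders"])  # still the v1 pointer
catalog.merge("etl")
print(catalog.branches["main"].tables["sales.orders"])  # now the v2 pointer
```

Until the merge, readers of `main` keep seeing the old pointer, which is the isolation property described above.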

Version Control

Reproduce models and dashboards from historical views of the data catalog, and easily recover from mistakes by rolling back accidental data or metadata changes. Use tagging to identify key milestones in the history of your data.
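
As a rough illustration of tagging and rollback, the Python sketch below models a branch as a linear list of commits. The commit IDs, snapshot labels, tag name, and helper functions are hypothetical and are not part of any Dremio API; the point is only that a tag is a fixed name for one commit, and a rollback moves the branch head back to an earlier commit.

```python
from typing import Dict, List, Tuple

# Each history entry is (commit_id, {table name: snapshot label}).
History = List[Tuple[str, Dict[str, str]]]


def state_at(history: History, commit_id: str) -> Dict[str, str]:
    """Return the catalog state recorded by the given commit."""
    for cid, tables in history:
        if cid == commit_id:
            return dict(tables)
    raise KeyError(commit_id)


def roll_back(history: History, commit_id: str) -> History:
    """Return the history truncated so that commit_id is the new head."""
    for i, (cid, _) in enumerate(history):
        if cid == commit_id:
            return history[: i + 1]
    raise KeyError(commit_id)


history: History = [
    ("c1", {"orders": "snap-1"}),
    ("c2", {"orders": "snap-2"}),
    ("c3", {"orders": "snap-3-bad"}),  # an accidental change
]
tags = {"q4-close": "c2"}  # a tag is a fixed name for one commit

# Reproduce a report from the tagged, historical view of the catalog.
print(state_at(history, tags["q4-close"]))  # {'orders': 'snap-2'}

# Recover from the mistake by moving the head back to the tagged commit.
history = roll_back(history, tags["q4-close"])
print(history[-1][0])  # c2
```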

Governance

Track all changes to the data and metadata. Every branch includes a full audit of commits, so it's easy to identify who made changes and when.

Automatic Data Optimization

Dremio automates many tedious data management tasks so Iceberg tables are always efficient and performant.

Table Optimization

Dremio automatically compacts small files into larger ones and groups similar rows in a table together. Both steps accelerate queries, because the engine opens fewer files and can skip data that is irrelevant to a query.
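
The file-size half of this can be pictured as a bin-packing problem. The Python sketch below is one simple way to plan such a rewrite, assuming a greedy first-fit-decreasing strategy; it is not Dremio's actual algorithm, and the `plan_compaction` function and the 256 MB target are hypothetical choices made for illustration.

```python
from typing import List


def plan_compaction(file_sizes_mb: List[int], target_mb: int = 256) -> List[List[int]]:
    """Group small files into batches whose combined size stays within target_mb."""
    # Files already at or above the target are left alone.
    small = sorted((s for s in file_sizes_mb if s < target_mb), reverse=True)
    batches: List[List[int]] = []
    for size in small:
        # First fit: place the file in the first batch that still has room.
        for batch in batches:
            if sum(batch) + size <= target_mb:
                batch.append(size)
                break
        else:
            batches.append([size])
    # A batch holding a single file gains nothing from a rewrite.
    return [batch for batch in batches if len(batch) > 1]


print(plan_compaction([10, 200, 50, 300, 40, 6]))  # [[200, 50, 6], [40, 10]]
```

Each returned batch would be rewritten as one larger file. Grouping similar rows (sorting or clustering) is a separate step that this sketch does not cover.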

Table Cleanup

Dremio automatically removes unused manifest files, manifest lists, and data files. Table Cleanup ensures efficient use of storage resources.
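
Conceptually, cleanup is a reachability check. In Iceberg, a snapshot points to a manifest list, which lists manifest files, which in turn list data files. The Python sketch below, with hypothetical file names and helper functions, collects everything reachable from the snapshots that are still retained and treats the rest as removable. It is an illustration under those assumptions, not Dremio's implementation.

```python
from typing import Dict, Iterable, List, Set


def reachable_files(
    live_snapshots: Iterable[str],
    manifest_list_of: Dict[str, str],
    manifests_in: Dict[str, List[str]],
    data_files_in: Dict[str, List[str]],
) -> Set[str]:
    """Collect every file still referenced by a retained snapshot."""
    keep: Set[str] = set()
    for snapshot in live_snapshots:
        manifest_list = manifest_list_of[snapshot]
        keep.add(manifest_list)
        for manifest in manifests_in[manifest_list]:
            keep.add(manifest)
            keep.update(data_files_in[manifest])
    return keep


def unused_files(all_files: Iterable[str], keep: Set[str]) -> Set[str]:
    """Files in storage that no retained snapshot references."""
    return set(all_files) - keep


# Snapshot s1 has expired; s2 was written after a compaction.
manifest_list_of = {"s1": "ml1.avro", "s2": "ml2.avro"}
manifests_in = {"ml1.avro": ["m1.avro"], "ml2.avro": ["m2.avro"]}
data_files_in = {"m1.avro": ["a.parquet", "b.parquet"], "m2.avro": ["ab.parquet"]}
all_files = ["ml1.avro", "ml2.avro", "m1.avro", "m2.avro",
             "a.parquet", "b.parquet", "ab.parquet"]

keep = reachable_files(["s2"], manifest_list_of, manifests_in, data_files_in)
print(sorted(unused_files(all_files, keep)))
# ['a.parquet', 'b.parquet', 'm1.avro', 'ml1.avro']
```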

Ready to Get Started?

Bring your users closer to the data with organization-wide self-service analytics and lakehouse flexibility, scalability, and performance at a fraction of the cost. Run Dremio anywhere with self-managed software or Dremio Cloud.