Dremio Lakehouse Management Service
Dremio delivers the only data lakehouse management service that features Git for Data, automatic optimization of Apache Iceberg tables, and a commitment to open data standards.
Dremio provides a Lakehouse Catalog
Dremio provides an Iceberg-native data catalog that enables easy management of multiple domains and gives data teams centralized security and governance.
Access Control (Coming soon!)
Git for Data enables catalog-level data versioning
Dremio uses Git-inspired versioning capabilities such as branches, tags, and commits to simplify data management, eliminate expensive data copies, and give data consumers a consistent view of the data.
Easily create zero-copy clones of production data with branching for ETL workloads, data science and experimentation, development and testing, and more. Thoroughly test changes before exposing them atomically to production users.
Reproduce models and dashboards from historical views of the data catalog, and easily recover from mistakes by rolling back accidental data or metadata changes. Use tagging to identify key milestones in the history of your data.
Track all changes to the data and metadata. Every branch includes a full audit of commits, so it's easy to identify who made changes and when.
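To make the branch, tag, and commit ideas above concrete, here is a minimal in-memory sketch in Python. It is a conceptual model only, not Dremio's API: the `ToyCatalog` and `Commit` names, the method signatures, and the metadata pointer strings are all hypothetical. It assumes a commit is an immutable map from table names to metadata pointers, so a branch is just a named pointer to a commit (creating one copies no data), and a merge is modeled as a single fast-forward pointer swap.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class Commit:
    """One immutable catalog state: table name -> metadata pointer."""
    tables: Dict[str, str]
    author: str
    message: str
    parent: Optional["Commit"] = None


class ToyCatalog:
    """Hypothetical in-memory model of a versioned catalog (illustration only)."""

    def __init__(self) -> None:
        root = Commit(tables={}, author="system", message="init")
        self.branches: Dict[str, Commit] = {"main": root}
        self.tags: Dict[str, Commit] = {}

    def _resolve(self, ref: str) -> Commit:
        # A ref may name either a branch or a tag.
        return self.branches[ref] if ref in self.branches else self.tags[ref]

    def create_branch(self, name: str, source: str = "main") -> None:
        # Zero-copy: the new branch points at the same commit object.
        self.branches[name] = self._resolve(source)

    def tag(self, name: str, source: str = "main") -> None:
        # A tag marks a milestone and is never moved by later commits.
        self.tags[name] = self._resolve(source)

    def commit(self, branch: str, table: str, pointer: str,
               author: str, message: str) -> None:
        head = self.branches[branch]
        tables = {**head.tables, table: pointer}
        self.branches[branch] = Commit(tables, author, message, parent=head)

    def log(self, ref: str) -> List[Commit]:
        # Walk parent links to produce the audit trail, newest first.
        history, node = [], self._resolve(ref)
        while node is not None:
            history.append(node)
            node = node.parent
        return history

    def merge(self, source: str, target: str = "main") -> None:
        # Fast-forward only: the target head must be an ancestor of the source.
        if not any(c is self.branches[target] for c in self.log(source)):
            raise ValueError("target has diverged; fast-forward not possible")
        # One pointer swap exposes every change on the source at once.
        self.branches[target] = self.branches[source]

    def rollback(self, branch: str, to_ref: str) -> None:
        # Recover from a mistake by pointing the branch at an earlier state.
        self.branches[branch] = self._resolve(to_ref)

    def tables(self, ref: str) -> Dict[str, str]:
        return dict(self._resolve(ref).tables)
```

In this model, readers of `main` keep seeing the old table pointer while work happens on another branch, and only see the new one after `merge` runs. The audit trail is the list returned by `log`, where each entry carries an author and a message.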
Automatic Data Optimization
Dremio automates many tedious data management tasks so Iceberg tables are always efficient and performant.
Dremio automatically rewrites smaller files into larger files and groups similar rows in a table together. This table optimization accelerates query performance.
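The idea of rewriting small files into larger ones can be illustrated with a short Python sketch. The source does not describe the strategy Dremio actually uses, so this assumes a generic greedy first-fit bin-packing plan; the `plan_compaction` function and the 256 MB target are hypothetical choices for illustration.

```python
from typing import List


def plan_compaction(file_sizes_mb: List[int], target_mb: int = 256) -> List[List[int]]:
    """Group small files into rewrite batches of at most target_mb each.

    Hypothetical helper: first-fit decreasing bin packing, not Dremio's algorithm.
    """
    # Files already at or above the target size are left alone.
    small = sorted((s for s in file_sizes_mb if s < target_mb), reverse=True)
    groups: List[List[int]] = []
    for size in small:
        for group in groups:
            if sum(group) + size <= target_mb:
                group.append(size)  # fits in an existing batch
                break
        else:
            groups.append([size])  # start a new batch
    # A batch of one file would be rewritten into itself, so skip it.
    return [g for g in groups if len(g) > 1]
```

Each returned group stands for a set of files that would be read and rewritten as one larger file. Fewer, larger files mean fewer file opens and less metadata to scan per query.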
Dremio automatically removes unused manifest files, manifest lists, and data files. This table cleanup ensures efficient use of storage resources.
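Cleanup of this kind can be thought of as a reachability check: any file that no retained snapshot references is safe to delete. The Python sketch below illustrates that idea under stated assumptions. The `find_unreferenced` function, the snapshot IDs, and the file names are hypothetical, and the source does not say how Dremio decides which snapshots to retain.

```python
from typing import Dict, Iterable, Set


def find_unreferenced(all_files: Set[str],
                      snapshots: Dict[str, Set[str]],
                      retained: Iterable[str]) -> Set[str]:
    """Return files in storage that no retained snapshot references.

    Hypothetical helper for illustration; not Dremio's implementation.
    """
    live: Set[str] = set()
    for snapshot_id in retained:
        # Everything a retained snapshot points at must be kept.
        live |= snapshots[snapshot_id]
    return all_files - live


# Example inputs: snapshot s1 has expired, s2 is retained.
snapshots = {
    "s1": {"a.parquet", "m1.avro"},
    "s2": {"a.parquet", "b.parquet", "m2.avro"},
}
storage = {"a.parquet", "b.parquet", "m1.avro", "m2.avro", "old.parquet"}
unused = find_unreferenced(storage, snapshots, retained=["s2"])
```

Here `a.parquet` survives because the retained snapshot still references it, even though the expired snapshot referenced it too.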