What is a Data Lakehouse?
A data lakehouse combines the performance, functionality and governance of a data warehouse with the scalability and cost advantages of a data lake.
With a data lakehouse, engines can access and manipulate data directly in data lake storage, without first using ETL pipelines to copy it into expensive proprietary systems.
Meet Dremio
The Easy and Open Data Lakehouse
Dremio is an open data lakehouse, providing self-service analytics, data warehouse performance and functionality, and data lake flexibility across all of your data.
The Only Data Lakehouse with Self-Service SQL Analytics
Dremio is the only data lakehouse that empowers data engineers and analysts with easy-to-use self-service SQL analytics.
Unified View of Data
Easily connect to all of your data sources and expose the data through a unified business-friendly data model. Unified data improves data discovery, ensures consistent reporting and enables governed self-service data access.
Built For SQL
Query data from any SQL client (e.g., BI tools) and build data integration and consumption workflows using just SQL.
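The idea of a unified view queried with plain SQL can be illustrated with a minimal, generic sketch. It does not use Dremio or any Dremio API: it assumes Python's built-in sqlite3 module, with two attached in-memory databases standing in for separate data sources. The names crm, sales, customers, orders and customer_revenue are hypothetical and chosen only for illustration.

```python
import sqlite3


def build_unified_view() -> sqlite3.Connection:
    """Create two stand-in sources and one view that joins them."""
    conn = sqlite3.connect(":memory:")

    # Each attached in-memory database plays the role of a separate source
    # (hypothetical names; a real lakehouse would connect to actual systems).
    conn.execute("ATTACH DATABASE ':memory:' AS crm")
    conn.execute("ATTACH DATABASE ':memory:' AS sales")

    conn.execute("CREATE TABLE crm.customers (id INTEGER PRIMARY KEY, name TEXT)")
    conn.execute("CREATE TABLE sales.orders (customer_id INTEGER, amount REAL)")
    conn.executemany(
        "INSERT INTO crm.customers VALUES (?, ?)",
        [(1, "Acme"), (2, "Globex")],
    )
    conn.executemany(
        "INSERT INTO sales.orders VALUES (?, ?)",
        [(1, 120.0), (1, 80.0), (2, 50.0)],
    )

    # In SQLite, a TEMP view may reference tables in any attached database,
    # so it can act as a single business-friendly model over both sources.
    conn.execute(
        """
        CREATE TEMP VIEW customer_revenue AS
        SELECT c.name AS customer, SUM(o.amount) AS revenue
        FROM crm.customers AS c
        JOIN sales.orders AS o ON o.customer_id = c.id
        GROUP BY c.name
        """
    )
    conn.commit()
    return conn


def revenue_by_customer(conn: sqlite3.Connection) -> list:
    """Query the view the way any SQL client would: with plain SQL."""
    return conn.execute(
        "SELECT customer, revenue FROM customer_revenue ORDER BY customer"
    ).fetchall()


if __name__ == "__main__":
    for customer, revenue in revenue_by_customer(build_unified_view()):
        print(customer, revenue)
```

Consumers of the view only need to know the view name and its columns, not where the underlying tables live. That separation is what the unified data model described above is meant to provide.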
Modern and Intuitive User Interface
Create new data views, calculate new columns, see which datasets compose each view, and add descriptions and tags to datasets with just a few clicks.
Completely Open Data, No Lock-In
An open data lakehouse, based on community-driven standards like Apache Parquet, Apache Iceberg and Apache Arrow, enables organizations to use best-in-class processing engines and eliminates vendor lock‑in.
Comprehensive Support for Apache Iceberg
Apache Iceberg is the most broadly supported open table standard. Dremio Sonar and other popular engines can read and write Iceberg tables, and the project receives 2x the contributions of Delta Lake.
Co-creators of Apache Arrow
Apache Arrow is the standard in-memory data representation that every query engine can use. It enables high performance queries and makes data sharing seamless across platforms. Apache Arrow is downloaded 70 million times per month.
Open Data Projects We Actively Support
Sub-Second Performance, 1/10th the Cost of Cloud Data Warehouses
Dremio provides customers with sub-second SQL queries for BI on the data lakehouse, unmatched by other engines. Dremio exceeds the performance and scale requirements of the most demanding and largest enterprises in the world, including 5 of the Fortune 10.
Sub-Second Performance
Apache Arrow and Data Reflections, a patent-pending optimization technology, combine to deliver the fastest query engine performance.
High Concurrency
High concurrency and the ability to run multiple query patterns simultaneously are achieved with Workload Management (WLM) and Dremio's auto-scaling, multi-engine architecture.
1/10th the Cost of Data Warehouses
Dremio combines excellent price-performance with low-cost storage (e.g., Amazon S3) and eliminates the need to create and manage data copies. In total, Dremio is less than 1/10th the cost of cloud data warehouses such as Snowflake.
Flexible Deployment Options
Dremio Cloud
Open & fully managed data lakehouse
The best option if your data is on AWS. Free usage forever.
Dremio Software
Software for any environment
Download Dremio’s Community Edition