SQL Query Engine

A Lakehouse Query Engine optimized for high performance and efficiency on Apache Iceberg and all your data


The #1 SQL Engine for Apache Iceberg

Designed from the ground up to be the fastest and most powerful SQL query engine for Apache Iceberg, Dremio delivers superior query performance and optimized price-performance across all your data. Real-time memory management dynamically optimizes memory allocation, reducing memory usage while improving performance, scalability, and stability, so analytical queries succeed even on very large datasets.

As an active contributor to Apache Iceberg, Dremio delivers native optimizations that other query engines cannot match. Automatic Iceberg Clustering continuously optimizes file layouts, eliminating the manual tuning required by traditional systems and delivering 20× faster query performance at the lowest cost.

Optimized price-performance for every query

Execute every query with the optimal balance of speed and cost-effectiveness. Dremio's multi-engine architecture enables sophisticated workload management. It's simple to create multiple right-sized, physically isolated engines for the various workloads in your organization, ensuring market-leading concurrency and predictable performance, and making it easy to meet critical SLAs.

Intelligent autoscaling dynamically manages your query workload based on established engine parameters, ensuring you only pay for what you use with consumption-based pricing.

Cost-based optimization finds the fastest plan for every query by using detailed data statistics, including location, cardinality, and distribution, delivering industry-leading price-performance without manual tuning.

Bar chart comparing query times of Trino 443, two data cloud vendors, and Dremio; Dremio is the fastest at 22 seconds, highlighted as 20 times faster than the others.

Up to 100× faster performance with Reflections query acceleration

Attain near-instantaneous query performance with Autonomous Reflections: optimized relational caches that use algebraic matching to accelerate all or part of a query automatically. Reflections are completely transparent to data users and require no manual configuration or maintenance. During query execution, the Dremio optimizer automatically matches the best Reflections to the query, freeing platform teams from manual tuning while delivering sub-second performance.
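To make the idea concrete, here is a minimal sketch of how a precomputed aggregate can answer a query written against the raw table. It is an illustration only, not Dremio's implementation: it uses Python's built-in sqlite3 module, the `orders` and `orders_agg` tables are hypothetical, and the rewrite that Dremio's optimizer would perform automatically is written out by hand.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE orders (region TEXT, yr INTEGER, rev INTEGER);
INSERT INTO orders VALUES
  ('AMER', 2024, 120), ('AMER', 2024, 80), ('EMEA', 2024, 90),
  ('EMEA', 2023, 70),  ('APAC', 2024, 60), ('APAC', 2023, 40);

-- Hypothetical stand-in for an aggregate Reflection:
-- revenue pre-summed per (region, yr).
CREATE TABLE orders_agg AS
  SELECT region, yr, SUM(rev) AS rev_sum
  FROM orders GROUP BY region, yr;
""")

# The query as the user writes it, against the raw table.
user_query = """
SELECT region, SUM(rev) AS total FROM orders
WHERE yr = 2024 GROUP BY region ORDER BY total DESC, region
"""

# The same query rewritten by hand to read the smaller pre-aggregated table.
# SUM over partial sums equals SUM over the raw rows, so the results match.
rewritten_query = """
SELECT region, SUM(rev_sum) AS total FROM orders_agg
WHERE yr = 2024 GROUP BY region ORDER BY total DESC, region
"""

raw_rows = conn.execute(user_query).fetchall()
accelerated_rows = conn.execute(rewritten_query).fetchall()
print(raw_rows == accelerated_rows)
```

Both queries return the same rows, but the rewritten one scans the pre-aggregated table instead of every order. With Reflections, the user only ever writes the first query.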

Performance for all queries is also accelerated using Columnar Cloud Cache (C3). C3 selectively caches only the data required to satisfy your workloads, eliminating 90% of I/O costs and delivering the lowest total cost of ownership.

Flexible, fast, lightweight data transformation

Reduce reliance on costly ETL tools and brittle, complex data pipelines. Dremio's query engine makes it easy to apply last-mile data transformations, including filtering, sorting, aggregating, joining, and casting. These transformations can be quickly built as Dremio Views—governed virtual data sets with layered transformations—that can be flexibly shared and modified for downstream data projects.
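As a rough illustration of layered transformations, the sketch below stacks two views: one that filters and casts, and one that aggregates on top of it. It runs on Python's built-in sqlite3 module rather than Dremio, and the table and view names (`raw_orders`, `clean_orders`, `revenue_by_region`) are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE raw_orders (order_id INTEGER, region TEXT, amount TEXT, status TEXT);
INSERT INTO raw_orders VALUES
  (1, 'AMER', '100.50', 'complete'),
  (2, 'EMEA', '40.00',  'cancelled'),
  (3, 'AMER', '59.50',  'complete'),
  (4, 'APAC', '75.25',  'complete');

-- Layer 1: filter out cancelled orders and cast the text amount to a number.
CREATE VIEW clean_orders AS
  SELECT order_id, region, CAST(amount AS REAL) AS amount
  FROM raw_orders WHERE status = 'complete';

-- Layer 2: aggregate on top of layer 1.
CREATE VIEW revenue_by_region AS
  SELECT region, SUM(amount) AS revenue
  FROM clean_orders GROUP BY region;
""")

# Downstream users query the top-level view; no data is copied or moved.
rows = conn.execute(
    "SELECT region, revenue FROM revenue_by_region ORDER BY revenue DESC"
).fetchall()
print(rows)
```

Because each layer is a view, a change to the filtering rule in `clean_orders` flows through to `revenue_by_region` without rebuilding a pipeline.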

Views integrate seamlessly with the AI Semantic Layer, providing business context that enables AI agents to understand and leverage your transformations automatically. This eliminates the need for separate transformation pipelines while maintaining governance and lineage throughout the analytics workflow.

App interface displaying various data transformation options, including filtering, sorting, and aggregating features.

Federated querying for all your data, everywhere

Designed for interactive analytics and DML on the data lake, the Dremio SQL Query Engine makes it easy to analyze all of your data—whether on the data lake or in other data sources. Our connector ecosystem features dozens of integrations with an array of sources, including object storage, metastores, and databases in the cloud and on-premises. Because Dremio queries your data at the source with zero-copy architecture, there is no data movement, no data copies, and no complex ETL.
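The pattern of joining data from separate sources through one SQL interface can be sketched as follows. This is a simplified analogy, not Dremio's connector ecosystem: it uses Python's built-in sqlite3 module with two independent in-memory databases, and the `crm` source and its tables are hypothetical.

```python
import sqlite3

# One connection acts as the single query interface.
conn = sqlite3.connect(":memory:")

# Attach a second, independent database as a separate "source".
conn.execute("ATTACH DATABASE ':memory:' AS crm")

conn.executescript("""
CREATE TABLE main.orders (customer_id INTEGER, rev INTEGER);
INSERT INTO main.orders VALUES (1, 100), (2, 50), (1, 25);

CREATE TABLE crm.customers (customer_id INTEGER, name TEXT);
INSERT INTO crm.customers VALUES (1, 'Acme'), (2, 'Globex');
""")

# A single query joins both sources in place; nothing is copied between them.
rows = conn.execute("""
SELECT c.name, SUM(o.rev) AS total
FROM main.orders AS o
JOIN crm.customers AS c ON c.customer_id = o.customer_id
GROUP BY c.name
ORDER BY total DESC
""").fetchall()
print(rows)
```

The query addresses each source by its own name (`main`, `crm`) and the engine resolves the join, which is the same shape of query a federated engine runs across object storage, metastores, and databases.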

Federated querying enables unified analytics across your entire data landscape while maintaining governance policies and access controls. AI agents can leverage federated queries to access data from multiple sources through a single interface, accelerating insights without data duplication or operational overhead.

Built for cloud, multi-cloud, on-premises, and hybrid environments

Query your data where it lives, whether in the cloud, across clouds, on-premises, or in hybrid environments. Many organizations choose to distribute their workloads across different cloud platforms so they can leverage the strengths of each provider while avoiding vendor lock-in. With Dremio, you can access all of your data and put it to analytic work in seconds.

Built on open standards including Apache Iceberg, Polaris, and Arrow, Dremio ensures you're never locked into proprietary formats or vendor-specific architectures. This open approach enables interoperability across your entire data ecosystem while maintaining performance and governance.

Figure: Dremio connects your data everywhere. Input: SELECT region, SUM(rev) FROM orders WHERE yr=2024 GROUP BY region ORDER BY 2 DESC. Engine: Arrow-native, accelerated by Reflections. Output: revenue by region (AMER, EMEA, APAC, LATM, MENA) in 0.28s (↓ 99.4%).

Built on Apache Arrow for the fastest performance

Dremio's optimized SQL query engine, powered by Apache Arrow, is core to delivering the best price-performance for queries across all your data. Dremio delivers lightning-fast query performance as well as market-leading query concurrency for lakehouse analytical workloads. In fact, Dremio co-created Arrow and subsequently contributed it to the Apache Software Foundation.

Arrow Flight, Arrow's framework for high-performance data transfer, is designed from the ground up for modern analytical workloads. It works with columnar data structures and parallel processing, enabling the fast data transfer that powers sub-second query execution across distributed systems.

Frequently asked questions

A query engine is a specialized software component that processes and executes data queries, translating high-level query languages like SQL into optimized operations that retrieve and transform data from storage systems. Query engines handle query execution by parsing SQL statements, optimizing execution plans, coordinating distributed computing resources, and returning results to users or applications—all while managing memory, parallelism, and performance optimization automatically.

Dremio’s SQL query engine goes beyond traditional query execution by integrating directly with the Apache Iceberg table format, enabling advanced optimizations like Autonomous Reflections and Automatic Iceberg Clustering that deliver sub-second performance without manual tuning. Built on Apache Arrow, Dremio’s query engine provides the fastest path for SQL querying across data lakes, warehouses, and federated sources, delivering 20× faster performance at the lowest cost through autonomous operations that eliminate operational overhead.

Make data engineers and analysts 10x more productive

Boost efficiency with AI-powered agents, faster coding for engineers, and instant insights for analysts.