
Lightning-fast queries, directly on data lake storage

Dremio technologies like Data Reflections, Columnar Cloud Cache (C3), and Predictive Pipelining work alongside Apache Arrow to accelerate queries directly on your data lake storage, with no need to copy the data elsewhere first.


Accelerate reads with Predictive Pipelining and Columnar Cloud Cache

Dremio’s Predictive Pipelining technology fetches data just before the execution engine needs it, dramatically reducing the time the engine spends waiting for data. And our real-time Columnar Cloud Cache (C3) automatically caches data on local NVMe as it’s being accessed, enabling NVMe-level performance on your data lake storage.
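The read-ahead idea behind this can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not Dremio's implementation: `read_chunk`, `prefetching_reader`, and `total` are hypothetical names, and the sketch reads ahead sequentially rather than predicting access patterns. A background thread fetches the next chunks into a bounded buffer while the consumer is still working on the current one.

```python
import queue
import threading


def read_chunk(index):
    """Hypothetical stand-in for a slow read from data lake storage."""
    return [index * 10 + i for i in range(3)]


def prefetching_reader(num_chunks, depth=2):
    """Yield chunks in order while a background thread reads ahead.

    `depth` bounds how many fetched chunks may wait in the buffer.
    """
    buffer = queue.Queue(maxsize=depth)
    done = object()  # sentinel marking the end of the stream

    def producer():
        for index in range(num_chunks):
            buffer.put(read_chunk(index))  # blocks while the buffer is full
        buffer.put(done)

    threading.Thread(target=producer, daemon=True).start()
    while True:
        chunk = buffer.get()
        if chunk is done:
            return
        yield chunk


def total(num_chunks):
    """Consume the stream; upcoming reads overlap with this computation."""
    return sum(sum(chunk) for chunk in prefetching_reader(num_chunks))
```

Calling `total(3)` sums every value across three chunks. The bounded queue keeps memory use predictable: the reader never runs more than `depth` chunks ahead of the consumer.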

A modern execution engine, built for the cloud

Dremio’s execution engine is built on Apache Arrow, the standard for columnar, in-memory analytics, and uses Gandiva to compile query expressions to vectorized code that’s optimized for modern CPUs. A single Dremio cluster can scale elastically as data volumes and workloads grow, and you can even run multiple clusters with automatic query routing.

Data Reflections – the on switch for extreme speed

With a few clicks, Dremio lets you create a Data Reflection, a physically optimized data structure that can accelerate a variety of query patterns. Create as many or as few as you want; Dremio invisibly and automatically incorporates Reflections in query plans and keeps their data up to date.
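To make the idea concrete, here is a toy sketch of one kind of physically optimized structure: a pre-aggregated copy of a table that a query can be answered from when it covers the question. Everything in it is an assumption made for illustration (the `ROWS` data and the `build_aggregate` and `sum_by` helpers are hypothetical); it does not describe how Dremio stores or matches Reflections.

```python
from collections import defaultdict

# Hypothetical base table.
ROWS = [
    {"region": "east", "product": "a", "amount": 10},
    {"region": "east", "product": "b", "amount": 5},
    {"region": "west", "product": "a", "amount": 7},
]


def build_aggregate(rows, dimensions, measure):
    """Precompute SUM(measure) grouped by the given dimensions."""
    totals = defaultdict(int)
    for row in rows:
        key = tuple(row[d] for d in dimensions)
        totals[key] += row[measure]
    return {
        "dimensions": tuple(dimensions),
        "measure": measure,
        "totals": dict(totals),
    }


def sum_by(rows, group_col, measure, aggregate=None):
    """Answer SUM(measure) GROUP BY group_col.

    Uses the precomputed aggregate when it covers the query,
    otherwise falls back to scanning the base rows.
    """
    result = defaultdict(int)
    covers = (
        aggregate is not None
        and aggregate["measure"] == measure
        and group_col in aggregate["dimensions"]
    )
    if covers:
        pos = aggregate["dimensions"].index(group_col)
        for key, subtotal in aggregate["totals"].items():
            result[key[pos]] += subtotal  # roll up the finer-grained totals
        return dict(result), "aggregate"
    for row in rows:
        result[row[group_col]] += row[measure]
    return dict(result), "base table"
```

The caller writes the same query either way; only the second element of the returned tuple reveals which path produced the answer, which mirrors the idea of acceleration being invisible to the person running the query.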

Arrow Flight moves data 1,000x faster

ODBC and JDBC were designed in the 1990s for small data, and they require every record to be serialized and deserialized. Arrow Flight replaces them with a high-speed, distributed protocol designed to handle big data, providing a 1,000x increase in throughput between client applications and Dremio. You can now populate a client-side Python or R data frame with millions of records in seconds.
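The difference between per-record and columnar transfer can be illustrated with the Python standard library alone. This is a simplified sketch, not the Arrow Flight wire format: the function names and sample columns are hypothetical, and JSON stands in for a row-oriented encoding. In practice, a Python client would typically talk to a Flight endpoint through the `pyarrow.flight` module.

```python
import json
from array import array

# Hypothetical sample columns.
ids = list(range(5))
prices = [1.5, 2.0, 2.5, 3.0, 3.5]


def encode_rows(ids, prices):
    """Row-oriented transfer: each record is serialized on its own."""
    return [
        json.dumps({"id": i, "price": p}).encode()
        for i, p in zip(ids, prices)
    ]


def decode_rows(payloads):
    """Parse one payload per record, then pivot back into columns."""
    records = [json.loads(p) for p in payloads]
    return [r["id"] for r in records], [r["price"] for r in records]


def encode_columns(ids, prices):
    """Columnar transfer: one contiguous typed buffer per column."""
    return array("q", ids).tobytes(), array("d", prices).tobytes()


def decode_columns(id_bytes, price_bytes):
    """Rebuild each column with a single bulk copy, no per-record parsing."""
    id_col, price_col = array("q"), array("d")
    id_col.frombytes(id_bytes)
    price_col.frombytes(price_bytes)
    return list(id_col), list(price_col)
```

Both paths return the same columns, but the row-oriented path does work proportional to the number of records on each side of the connection, while the columnar path moves a fixed number of buffers regardless of row count.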
