Dremio Sonar vs. Databricks SQL

You want to run all BI workloads, from ad hoc to mission-critical and everything in between, directly against your data lake storage. Which SQL query technology should you use? Here’s what you should consider when comparing Dremio Sonar to Databricks SQL.

Dremio Sonar vs. Databricks SQL Feature Comparison

General

  • Environments
    • Dremio Sonar: Cloud and on-premises
    • Databricks SQL: Cloud only
  • Use case focus
    • Dremio Sonar: Platform supports all BI workloads, from ad hoc to mission-critical and everything in between, directly against the data lake.
    • Databricks SQL: Focused on providing SQL against the data lake, mainly serving only ad hoc queries in practice.

BI experience

  • BI acceleration
    • Dremio Sonar: Transparent reflections to accelerate queries. These are maintained and applied by Dremio Sonar across queries transparent to BI clients.
    • Databricks SQL: Not supported.
  • Time-to-production for dashboards
    • Dremio Sonar: Optimizes BI workloads for simple and fast interactivity with the help of reflection creation and virtual datasets.
    • Databricks SQL: Requires complex ETL pipelines, data copies, cubes, aggregates and/or BI extracts to optimize the performance.
  • Data exploration
    • Dremio Sonar: Semantic layer with virtual datasets that make data exploration and curation off the data lake simple and governed.
    • Databricks SQL: Limited support.

Data Management

  • ACID transactions
    • Dremio Sonar: Uses Apache Iceberg for ACID. No lock-in, Iceberg is open and community-driven.
    • Databricks SQL: Uses Delta Lake for ACID. Spark is the only engine that can write safely to Delta Lake.
  • Multi-statement transactions
    • Dremio Sonar: Multi-statement transactions supported with Project Nessie.
    • Databricks SQL: Delta Lake supports only single-statement transactions. Multi-statement transactions are not supported.
  • Data lineage
    • Dremio Sonar: Yes
    • Databricks SQL: No
  • Git-like semantics
    • Dremio Sonar: A separate data layer with a Git-like experience for tables and views via integration with Project Nessie to branch, tag, and time travel datasets all while automatically optimizing the files to ensure high-performance analytics.
    • Databricks SQL: Not supported.

Performance

  • Client throughput
    • Dremio Sonar: Standard throughput for ODBC client. High-performance JDBC transfers via Arrow Flight JDBC driver. High-speed data transfers with Arrow Flight to deliver up to 20x throughput increase between clients and Dremio Sonar.
    • Databricks SQL: Standard throughput for ODBC and JDBC clients. Supports mechanisms for fetching data in parallel via cloud storage such as AWS S3 and Azure Data Lake Storage to bring the data faster to BI tools.
  • Workload management
    • Dremio Sonar: Prioritizes jobs that matter with automated query routing.
    • Databricks SQL: Limited functionalities.
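
To make the Git-like semantics entry more concrete, here is a toy model in Python. It is not Project Nessie's API. `ToyCatalog`, `Commit`, and every method name are hypothetical, invented for illustration. The sketch assumes that each commit is an immutable snapshot mapping table names to a snapshot identifier, and that branches and tags are simply named pointers to commits.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass(frozen=True)
class Commit:
    """Immutable catalog state: table name -> snapshot identifier."""
    tables: Dict[str, str]
    parent: Optional[int]  # index of the parent commit; None for the root


@dataclass
class ToyCatalog:
    """Hypothetical in-memory catalog; illustrates the idea, not Nessie's API."""
    commits: List[Commit] = field(default_factory=lambda: [Commit({}, None)])
    branches: Dict[str, int] = field(default_factory=lambda: {"main": 0})
    tags: Dict[str, int] = field(default_factory=dict)

    def _resolve(self, ref: str) -> int:
        if ref in self.branches:
            return self.branches[ref]
        if ref in self.tags:
            return self.tags[ref]
        raise KeyError(f"unknown reference: {ref}")

    def create_branch(self, name: str, source: str) -> None:
        # A branch is a new movable pointer to an existing commit.
        self.branches[name] = self._resolve(source)

    def create_tag(self, name: str, source: str) -> None:
        # A tag is a fixed pointer, which is what allows reading old states.
        self.tags[name] = self._resolve(source)

    def commit(self, branch: str, table: str, snapshot: str) -> None:
        head = self.branches[branch]
        tables = dict(self.commits[head].tables)  # copy, never mutate history
        tables[table] = snapshot
        self.commits.append(Commit(tables, head))
        self.branches[branch] = len(self.commits) - 1

    def read(self, ref: str) -> Dict[str, str]:
        return dict(self.commits[self._resolve(ref)].tables)

    def fast_forward(self, target: str, source: str) -> None:
        # Allowed only when target's head is an ancestor of source's head.
        cursor: Optional[int] = self._resolve(source)
        while cursor is not None and cursor != self.branches[target]:
            cursor = self.commits[cursor].parent
        if cursor is None:
            raise ValueError("branches have diverged; a real merge is needed")
        self.branches[target] = self._resolve(source)


catalog = ToyCatalog()
catalog.commit("main", "sales", "snapshot-1")
catalog.create_tag("release-1", "main")
catalog.create_branch("etl", "main")
catalog.commit("etl", "sales", "snapshot-2")  # not visible to readers of main
before_merge = catalog.read("main")
catalog.fast_forward("main", "etl")           # publish the change in one step
```

In this sketch, readers of `main` keep seeing the old snapshot until the branch is fast-forwarded, and the tag continues to resolve to the earlier state afterwards.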

Dremio Overview

Dremio is the only lakehouse platform that is built for SQL, provides a Git-like experience, and is built on an open data architecture. It enables high-performing business intelligence (BI) and interactive analytics directly on data lake storage. Apache Arrow, an open source project co-created by Dremio engineers in 2016, is now downloaded over 60 million times per month.

Using Apache Arrow end to end, Dremio dramatically increases query performance. It simplifies data engineering and eliminates the need to copy and move data to proprietary data warehouses or to create cubes, aggregation tables, and BI extracts. This gives data architects and data engineers flexibility and control, and gives data consumers self-service access. Through seamless integrations with Tableau, Power BI, and other BI tools, Dremio enables interactive BI dashboards directly against data lakes.

Dremio meets you where you are. Most organizations don’t have 100% of their data in the data lake or in the cloud. Dremio allows you to derive business insights in your current state without any data movement.

Dremio Sonar: A Lakehouse Query Engine

Sonar is a lakehouse query engine that provides lightning-fast SQL queries directly on data lakes and a self-service user experience that makes data consumable, consistent, and collaborative. Sonar helps organizations access more data freely so they can make better business decisions. Sonar does this by combining a best-in-class query engine with a seamless, self-service user experience for data consumers. Here’s a quick rundown of the key technologies that make this possible:

  • Query Engine (powered by Apache Arrow): Sonar’s query engine delivers all the performance and functionality of a data warehouse directly on the data lake, including DML operations. The query engine is built to support all SQL workloads on the lakehouse, from ad hoc and exploratory queries to mission-critical BI dashboards. You can also connect to a variety of RDBMSs, enabling analysts to join data between the lake and other data sources, all with ANSI SQL compatibility.
  • Reflections: A query acceleration technology that speeds up queries behind the scenes, so data applications and analysts can interact seamlessly with data without needing to worry about optimizing their data and queries. Reflections enable sub-second query response times by automatically and transparently rewriting query plans to utilize different aggregations or layouts of tables and views.
  • Spaces: An integrated semantic layer that enables data teams to deliver a consistent and secure view of data to data consumers, and enables analysts to curate, analyze, and share datasets in a self-service manner. Spaces enable datasets across lakes and other sources to be exposed as reusable metrics.
  • SQL Runner and SQL Profiler: Sonar provides a best-in-class integrated experience for analysts who know and love SQL, including a feature-rich IDE (SQL Runner) and the world’s easiest and most advanced tool for understanding and troubleshooting query performance (SQL Profiler).
  • Arrow Flight and Arrow Flight SQL: A next-generation interface for interacting with databases that is 20x faster than ODBC and JDBC and supports a variety of programming languages.
  • Frictionless BI Tool Integrations: Native connectors in leading BI tools, including Power BI, Tableau, dbt, Hex, and Preset, enable users to quickly and easily visualize their data from their favorite BI tool.
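
The Reflections idea from the list above, answering a query from a pre-aggregated layout instead of the raw table, can be sketched in a few lines of Python. This is a simplified illustration, not Dremio's planner. SQLite stands in for the engine, and the `sales` and `agg_sales` tables, the `REFLECTIONS` registry, and `plan_sum_amount` are all hypothetical names. The sketch assumes a rewrite is valid whenever the aggregate keeps every column the query groups by.

```python
import sqlite3
from typing import List

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE sales (region TEXT, product TEXT, amount REAL);
INSERT INTO sales VALUES
  ('EMEA', 'A', 10.0), ('EMEA', 'B', 5.0), ('APAC', 'A', 7.5);
-- Stand-in for an aggregation reflection: a pre-aggregated copy of sales.
CREATE TABLE agg_sales AS
  SELECT region, product, SUM(amount) AS sum_amount
  FROM sales GROUP BY region, product;
""")

# Hypothetical registry: pre-aggregated table -> dimensions it preserves.
REFLECTIONS = {"agg_sales": {"region", "product"}}
KNOWN_COLUMNS = {"region", "product"}


def plan_sum_amount(group_by: List[str]) -> str:
    """Return SQL for SUM(amount) per group, preferring a covering aggregate."""
    if not group_by or not set(group_by) <= KNOWN_COLUMNS:
        raise ValueError(f"unsupported grouping columns: {group_by}")
    cols = ", ".join(group_by)
    for table, dims in REFLECTIONS.items():
        if set(group_by) <= dims:
            # A sum of partial sums equals the sum over the raw rows,
            # so reading the smaller table returns the same answer.
            return (f"SELECT {cols}, SUM(sum_amount) FROM {table} "
                    f"GROUP BY {cols} ORDER BY {cols}")
    return (f"SELECT {cols}, SUM(amount) FROM sales "
            f"GROUP BY {cols} ORDER BY {cols}")


sql = plan_sum_amount(["region"])
rows = conn.execute(sql).fetchall()
print(sql)
print(rows)
```

The caller asks the same logical question either way. Only the plan changes, which is the sense in which the acceleration is transparent to the BI client.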

Key Facts About Dremio Sonar Technology

  • Faster time to insight: provides much better self-service to data consumers while reducing the need for ETL pipelines and data copies, which makes data engineering more productive too.
  • Proven at high concurrency.
  • Supports all your BI workloads, from ad hoc to mission-critical and everything in between, without requiring a proliferation of ETL pipelines and data copies.
  • Built on an open data architecture, Dremio provides the flexibility to choose multiple best-of-breed engines on the same data and use cases, both today and tomorrow.
  • Query acceleration and Data Reflections power mission-critical BI dashboards directly against data lake storage.
  • Self-service semantic layer makes it easy to create and organize views for consistency or security purposes, and to collaborate on those views to derive valuable insights.
  • Integrates with multiple sources, including on-premises workloads that haven’t migrated to the cloud yet. Data doesn’t have to be in the cloud or even in the lake for you to take advantage of Dremio.
  • High performance via Arrow Flight and Arrow Flight SQL, accelerating results by 20x or more.
  • Support for Apache Iceberg (open source table format with no vendor lock-in) for ACID transactions.
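
A plain SQL view is the closest general analogue to the views that make up the semantic layer mentioned above: it stores a query definition rather than a copy of the data. The Python sketch below uses SQLite only as a stand-in engine. The table and view names (`raw_orders`, `orders_curated`, `revenue_by_region`) are hypothetical, and nothing here reflects how Dremio implements Spaces internally.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE raw_orders (
  order_id INTEGER, customer_email TEXT, region TEXT, amount REAL
);
INSERT INTO raw_orders VALUES
  (1, 'a@example.com', 'EMEA', 20.0),
  (2, 'b@example.com', 'EMEA', 30.0),
  (3, 'c@example.com', 'APAC', 12.5);

-- Curated view: hides the sensitive column from data consumers.
CREATE VIEW orders_curated AS
  SELECT order_id, region, amount FROM raw_orders;

-- Reusable metric defined once, on top of the curated view.
CREATE VIEW revenue_by_region AS
  SELECT region, SUM(amount) AS revenue
  FROM orders_curated GROUP BY region;
""")


def revenue_by_region(connection: sqlite3.Connection):
    """Read the shared metric; consumers never touch raw_orders directly."""
    return connection.execute(
        "SELECT region, revenue FROM revenue_by_region ORDER BY region"
    ).fetchall()


first = revenue_by_region(conn)

# New raw data shows up in the views immediately; nothing is copied.
conn.execute("INSERT INTO raw_orders VALUES (4, 'd@example.com', 'APAC', 7.5)")
second = revenue_by_region(conn)
```

Because both views are only definitions, the metric stays consistent for every consumer and picks up new rows without an ETL step.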

Databricks Overview

Databricks is a lakehouse platform developed by the same team that created Apache Spark. Designed to support use cases around data science, machine learning (ML), and data engineering, Databricks originated as a way to handle Spark-based ETL jobs for data science applications.

The platform can support ML-based use cases such as advanced analytics for ML and graph processing at scale, proactive threat detection, deep learning to power image interpretation, and managing an end-to-end ML environment, which includes experiment tracking, model training, feature development, model serving, and more. Databricks SQL allows users to run SQL queries and build visualization dashboards directly against data lakes, though with limited capabilities.

Key Facts About Databricks SQL Technology

  • Not proven for high-concurrency workloads
  • Not suitable for use cases that demand low-latency analytics
  • Lacks a self-service semantic layer
  • Photon (the query engine for Databricks SQL) comes with very limited capabilities:
    • Works on Delta Lake and Apache Parquet tables only, for both read and write (not ideal if users want to use the Apache Iceberg table format, other open source table formats, other sources, or other file types)
    • Does not support window and sort operators
    • Not expected to improve performance of short-running queries (<2 seconds), for example, queries against small amounts of data
    • Limited to accessing data only on the lake and doesn’t integrate with RDBMS platforms like Oracle, MS SQL, Postgres, etc.
  • Databricks SQL is a new offering, and not proven for mission-critical BI at scale
  • Subsecond query response is not in scope
  • No support for Git-like semantics
  • Cloud-only; not able to provide value for customers while they make the transition to the cloud

Ready to Get Started? Here Are Some Resources to Help

WHITEPAPER

The Path to Self-Service Analytics on the Data Lake

Download this white paper to get a step-by-step roadmap of Dremio adoption. At each step, you’ll learn about benefits gained, as well as the complexities and risks reduced, as workloads are migrated from traditional systems to Dremio.

WHITEPAPER

Ten Top of Mind Challenges for Data Engineering

Data engineers play a crucial role in designing, operating, and supporting the increasingly complex environments that power modern data analytics. What are their most important challenges and how can they solve them strategically?

DATASHEET

Intro to Dremio Cloud

Find out more about Dremio Cloud, the only data lakehouse platform built for SQL and built on open source technologies that both data engineers and data analysts love. Dremio powers BI dashboards and interactive analytics directly on data lake storage.
