Modernize to an Open Lakehouse

Data consumers need data for analytics to make business decisions. Data teams struggle with stale data, poor self-service, and slow delivery of new analytics into production. Learn how to solve these challenges with an open lakehouse.

Open Lakehouse Essentials

A lakehouse is a data analytics architecture that converges the data lake and data warehouse in the cloud. An open lakehouse built on an open data architecture enables organizations to use their cloud data lake as their data warehouse so that they can make full use of their data for analytics.

Steps to an Open Lakehouse

  • Store data in open formats

    Use open-source formats (for instance, Apache Parquet for files and Apache Iceberg for tables) rather than proprietary formats tied to specific vendors.

  • Treat data as its own tier

    With an open lakehouse, data exists as its own independent layer, eliminating the need to move or copy data to data warehouses, cubes, or extracts for analysis.

  • Use a SQL query engine to accelerate time to insight

    In an open lakehouse, data is accessed by decoupled and elastic compute engines (for example, Dremio Sonar) with query acceleration for BI and ad hoc workloads.

  • Support self-service through use of a semantic layer

    A business-friendly semantic layer provides consistent, shared access across all users and tools, along with centralized security and governance.

  • Automate data management

    Use an intelligent metastore, such as Dremio Arctic, to get the data management capabilities you are used to in a data warehouse, and more.
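The semantic-layer step above can be illustrated with a minimal sketch. It uses Python's built-in SQLite module purely as a stand-in for a lakehouse query engine, and the table, column, and view names (`raw_orders`, `completed_orders`, and so on) are hypothetical. The idea shown is only that one shared view can carry business-friendly names and rules so every consumer queries the same definition.

```python
import sqlite3

# In-memory SQLite database standing in for a query engine (assumption).
conn = sqlite3.connect(":memory:")

# A raw table with terse, source-system column names (hypothetical schema).
conn.execute(
    "CREATE TABLE raw_orders (o_id INTEGER, cust TEXT, amt_cents INTEGER, st TEXT)"
)
conn.executemany(
    "INSERT INTO raw_orders VALUES (?, ?, ?, ?)",
    [(1, "acme", 1250, "C"), (2, "acme", 800, "X"), (3, "globex", 4000, "C")],
)

# The "semantic layer": a shared view that renames columns, converts units,
# and applies the business rule that only completed orders ('C') count.
conn.execute(
    """
    CREATE VIEW completed_orders AS
    SELECT o_id AS order_id,
           cust AS customer,
           amt_cents / 100.0 AS amount_usd
    FROM raw_orders
    WHERE st = 'C'
    """
)

# Every tool queries the view, not the raw table, so definitions stay consistent.
revenue_by_customer = conn.execute(
    "SELECT customer, SUM(amount_usd) FROM completed_orders "
    "GROUP BY customer ORDER BY customer"
).fetchall()
```

Because the view is defined once, a change to the business rule (for example, which statuses count as completed) reaches every consumer without touching their queries.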

3 Reasons to Modernize to an Open Lakehouse

Today, many companies keep data in cloud object storage (such as Amazon S3, Azure Data Lake Storage, or Google Cloud Storage) but still move and copy subsets of it into proprietary data warehouses for analytics, and from there create aggregates, cubes, and extracts for better performance. This leads to three significant challenges.

Slow, Complex Architecture

Moving data through complex ETL pipelines creates backlogs for data requests and headaches for data teams.

Out-of-Control Costs

Expensive data warehouses (along with multiple data copies, extracts, and cubes) add up to a high total cost of ownership.

Risky Vendor Lock-In

Proprietary data warehouse formats prevent you from using multiple best-of-breed engines on the same data or easily adopting new engines.

Why move data from your cloud data storage if you don’t have to?

With an open lakehouse, you keep your data where it is and make all your data available for analytics.

Get Started with an Open Lakehouse

Dremio’s open lakehouse platform is available as a fully managed cloud service with a forever-free tier. Sign up now for a free account on Dremio Cloud.

The Open Source Technology Behind the Open Lakehouse

Dremio’s open lakehouse platform makes use of key open source technologies.

Apache Arrow

An in-memory columnar format that supports zero-copy reads for fast data access without serialization.
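The zero-copy idea can be illustrated with a minimal sketch. It does not use the Arrow library; it uses Python's standard `array` and `memoryview` types as a stand-in for a contiguous column buffer, and the variable names are illustrative only.

```python
from array import array

# A column of 64-bit floats stored contiguously, as a columnar layout would.
prices = array("d", [9.5, 12.0, 7.25, 30.0])

# memoryview exposes the same underlying buffer without copying it.
view = memoryview(prices)
window = view[1:3]  # slicing a memoryview yields another view, not a copy

# A write through the original buffer is visible through the view,
# which shows that both names refer to the same memory.
prices[1] = 99.0
shared = window[0] == 99.0

# list() materialises a copy, which no longer tracks the buffer.
copied = list(view[1:3])
prices[2] = -1.0
copy_is_stale = copied[1] == 7.25
```

Here `window` reflects later writes to `prices`, while `copied` keeps the values it had when it was created. Readers that share a buffer in this way avoid the cost of serializing and deserializing the data.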

Apache Arrow Flight

Open source data connectivity technology that provides data transfer rates 20x faster than JDBC and ODBC.

Apache Iceberg

An open-source table format for huge analytic datasets, Iceberg enables multiple applications to work on the same data in a transactionally consistent manner.
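One way to picture transactional consistency across multiple writers is an optimistic commit against an atomically swapped snapshot pointer. The sketch below is a toy in-memory model under that assumption. It does not use the Iceberg library or reflect its API, and `ToyTable`, `Snapshot`, and the file names are hypothetical.

```python
import threading
from dataclasses import dataclass


@dataclass(frozen=True)
class Snapshot:
    """Immutable list of data files that make up one table version."""
    version: int
    files: tuple


class ToyTable:
    """Hypothetical table whose current snapshot is replaced atomically."""

    def __init__(self):
        self._lock = threading.Lock()
        self._current = Snapshot(version=0, files=())

    def current(self):
        return self._current

    def commit(self, base, new_files):
        """Append files only if `base` is still the current snapshot."""
        with self._lock:
            if base.version != self._current.version:
                return False  # another writer committed first; caller retries
            self._current = Snapshot(base.version + 1, base.files + tuple(new_files))
            return True


table = ToyTable()
base_a = table.current()  # writer A reads version 0
base_b = table.current()  # writer B also reads version 0

a_ok = table.commit(base_a, ["part-a.parquet"])  # accepted
b_ok = table.commit(base_b, ["part-b.parquet"])  # rejected: base is stale

# Writer B retries against the latest snapshot and succeeds.
b_retry_ok = table.commit(table.current(), ["part-b.parquet"])
```

Readers always see a complete snapshot, either before or after a commit, and a writer working from a stale snapshot is rejected instead of silently overwriting another writer's changes.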

Project Nessie

Nessie is a lakehouse metastore that provides a Git-like experience on data lake storage.
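As a rough illustration of what a Git-like experience means for data, the following toy model tracks which snapshot each table points to on each branch. It is not the Nessie API. `ToyCatalog`, its methods, and the snapshot ids are hypothetical, and the merge is deliberately simplified.

```python
class ToyCatalog:
    """Hypothetical in-memory catalog: each branch maps table names to snapshot ids."""

    def __init__(self):
        self.branches = {"main": {}}

    def create_branch(self, name, source="main"):
        # A branch starts as a copy of the source's table pointers; no data is copied.
        self.branches[name] = dict(self.branches[source])

    def commit(self, branch, table, snapshot_id):
        # Point the table at a new snapshot on this branch only.
        self.branches[branch][table] = snapshot_id

    def merge(self, source, target="main"):
        # Simplified merge: the source's pointers overwrite the target's.
        self.branches[target].update(self.branches[source])


catalog = ToyCatalog()
catalog.commit("main", "sales", "snap-1")

catalog.create_branch("etl")
catalog.commit("etl", "sales", "snap-2")  # the change is isolated on the branch
main_before_merge = catalog.branches["main"]["sales"]

catalog.merge("etl")  # publish the branch's changes to main
main_after_merge = catalog.branches["main"]["sales"]
```

Work on the `etl` branch stays invisible to readers of `main` until the merge, which mirrors how a code change stays on a feature branch until it is merged.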

Get Started Free

No time limit and totally free, just the way you like it.

Sign Up Now

See Dremio in Action

Not ready to get started today? See the platform in action.

Watch Demo

Talk to an Expert

Not sure where to start? Get your questions answered fast.

Contact Us

Ready to Get Started?

Bring your users closer to the data with organization-wide self-service analytics and lakehouse flexibility, scalability, and performance at a fraction of the cost. Run Dremio anywhere with self-managed software or Dremio Cloud.