Dremio Blog

8 minute read · June 24, 2024

The Unified Apache Iceberg Lakehouse: Self Service & Ease of Use

Alex Merced, Head of DevRel, Dremio

Data Mesh, Data Lakehouse, Data Fabric, Data Virtualization: there are many buzzwords describing ways to build your data platform. Regardless of the terminology, everyone seeks the same core features in their data platform:

The ability to govern data in compliance with internal and external regulations.
The ability to access all data seamlessly.
The ability to query data and receive quick answers.
The assurance that the data is up-to-date.
The ability to achieve all of the above at minimal cost.

Many of these "Data X" concepts address different aspects of these goals. However, when you integrate solutions that cover all these needs, you often converge on a combination of a data lakehouse (treating your data lake as both a data warehouse and the primary data source) and data virtualization (connecting to multiple data sources and interacting with them through a unified interface). We’ll refer to this combination as the "Unified Apache Iceberg Lakehouse." This approach typically involves:

Storing most of your analytics data in Apache Iceberg tables within your data lake.
Enriching this data through data virtualization, drawing from a diverse array of databases and data warehouses.
Using Dremio as the unified access and governance layer for all this data.

This blog series explores the benefits of this architecture in depth. This installment focuses on self-service and ease of use.

The Value of a Self-Service Data Lakehouse

Empowering users to access and analyze data independently can lead to faster insights, more informed decisions, and a more agile organization. Self-service eliminates bottlenecks in data access and allows users to explore data and generate reports without relying on IT or data engineering teams. This democratization of data enhances productivity and ensures that data-driven decision-making is not restricted to a few experts.

How Dremio Enables Self-Service Data Access

Dremio facilitates self-service data access through several key features:

Unified Data Access: Dremio makes it easy to connect data from disparate sources into one centralized location. This unification allows users to seamlessly query and analyze data from multiple sources without needing to move the data.

Robust Governance: Dremio simplifies data access governance by providing granular controls based on user, role, column, and row levels for all your data, regardless of where it lives. This ensures that sensitive information is protected and that access is compliant with governance policies.
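To make the idea of row-level and column-level controls concrete, here is a toy Python model of what such policies do to a query result. It is only a conceptual sketch and not Dremio's implementation or API: the `User` class, the `ROWS` sample data, and the functions `row_visible`, `mask_ssn`, and `apply_policies` are all hypothetical names invented for this illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class User:
    """Hypothetical user record: a name, a set of roles, and a home region."""
    name: str
    roles: frozenset
    region: str


# Hypothetical sample data standing in for a governed table.
ROWS = [
    {"customer": "Ana", "region": "EU", "ssn": "123-45-6789"},
    {"customer": "Bo", "region": "US", "ssn": "987-65-4321"},
]


def row_visible(user, row):
    """Row-level rule: admins see every row, others only rows in their region."""
    return "admin" in user.roles or row["region"] == user.region


def mask_ssn(user, value):
    """Column-level rule: only admins see the full value; others see the last 4 digits."""
    return value if "admin" in user.roles else "***-**-" + value[-4:]


def apply_policies(user, rows):
    """Filter rows first, then mask the sensitive column in the rows that remain."""
    return [
        {**row, "ssn": mask_ssn(user, row["ssn"])}
        for row in rows
        if row_visible(user, row)
    ]


analyst = User("lee", frozenset({"analyst"}), "EU")
admin = User("kim", frozenset({"admin"}), "US")

print(apply_policies(analyst, ROWS))  # one EU row, with the ssn column masked
print(apply_policies(admin, ROWS))    # both rows, unmasked
```

The point of the sketch is that the same table yields different results depending on who asks, so the rule lives with the data rather than in each consuming tool.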

User-Friendly Interface: Dremio's web application UI is designed for ease of use, enabling users to explore and discover datasets effortlessly. Within the integrated semantic layer, users can craft SQL queries to create desired business metrics. The UI includes features like generative AI text-to-SQL, wizards for generating SQL to join datasets, changing column types, creating derived columns, and more.

Integrated Documentation: Dremio's semantic layer includes an integrated wiki to document datasets, providing better context and understanding. The generative AI wiki generation feature helps kickstart documentation, making it easier for users to understand and utilize the data.

Flexible Data Access: Outside the UI, data in Dremio can be accessed via common interfaces such as JDBC/ODBC, REST API, and Apache Arrow Flight. This flexibility allows users to connect their favorite BI tools or Python notebooks, facilitating a wide range of analytical workflows.
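As a rough illustration of the REST route, the sketch below builds (but does not send) an HTTP request that submits a SQL query. It assumes a self-hosted Dremio instance reachable at `http://localhost:9047`, the `/api/v3/sql` endpoint, and bearer-token authentication; verify all three against the documentation for your deployment, since Dremio Cloud uses different URLs. The helper name `build_sql_request`, the token placeholder, and the `sales.orders` dataset are hypothetical.

```python
import json
import urllib.request


def build_sql_request(base_url: str, token: str, sql: str) -> urllib.request.Request:
    """Build, without sending, a POST request that submits a SQL query.

    The endpoint path and auth scheme are assumptions; check them against
    the REST API documentation for your Dremio version.
    """
    payload = json.dumps({"sql": sql}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/api/v3/sql",
        data=payload,
        method="POST",
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
    )


request = build_sql_request(
    base_url="http://localhost:9047",           # assumed local instance
    token="YOUR_TOKEN",                         # placeholder, not a real token
    sql="SELECT * FROM sales.orders LIMIT 10",  # hypothetical dataset
)
print(request.get_method(), request.full_url)
```

Passing the request to `urllib.request.urlopen` would actually submit the query. For larger result sets, an Arrow Flight client is usually the better fit because it streams columnar data instead of JSON.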

Empowering Analysts: Dremio enables data analysts and analytics engineers to handle more transformation work in the final stages of data processing. This reduces the workload on data engineers, freeing them up to focus on new data projects instead of addressing endless data request tickets from downstream users.

Experimentation with Zero-Copy Environments: The Dremio integrated catalog, powered by open-source Nessie, allows for the creation of zero-copy environments for data experimentation. Analysts can model different scenarios without the need to duplicate or triplicate the data, promoting efficient and flexible analysis.
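The reason a branch can be "zero-copy" is that a catalog branch only needs to record which table snapshot it points to, not hold its own copy of the data files. The toy Python model below illustrates that idea. It is a conceptual sketch, not Nessie's or Dremio's actual API: the `ToyCatalog` class, its methods, and the snapshot IDs are hypothetical, and the merge shown is a naive overwrite with no conflict detection.

```python
class ToyCatalog:
    """Each branch maps table names to snapshot IDs; data files are never copied."""

    def __init__(self):
        self.branches = {"main": {}}

    def create_branch(self, name, source="main"):
        # Copies only the small table -> snapshot mapping, not any data.
        self.branches[name] = dict(self.branches[source])

    def commit(self, branch, table, snapshot_id):
        # A write on one branch moves only that branch's pointer.
        self.branches[branch][table] = snapshot_id

    def merge(self, source, target="main"):
        # Naive merge: the target adopts the source's pointers (no conflict checks).
        self.branches[target].update(self.branches[source])


catalog = ToyCatalog()
catalog.commit("main", "sales", "snap-1")

catalog.create_branch("experiment")
catalog.commit("experiment", "sales", "snap-2")  # analyst changes the table on the branch

main_before_merge = catalog.branches["main"]["sales"]
print(main_before_merge)                         # main is unaffected by the experiment

catalog.merge("experiment")
print(catalog.branches["main"]["sales"])         # main now points at the new snapshot
```

Until the merge, readers of `main` keep seeing the original snapshot, which is what lets analysts model scenarios in isolation without duplicating the underlying data.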

Conclusion

Self-service data access is a game-changer for modern organizations, fostering a data-driven culture and enabling faster, more informed decision-making. Dremio's comprehensive suite of features makes it an ideal platform for achieving self-service data access. By unifying data sources, simplifying governance, providing an intuitive UI, and supporting flexible data access methods, Dremio empowers users to independently explore and analyze data. This not only enhances productivity but also allows data engineers to focus on strategic projects, ultimately driving innovation and growth.

Want to begin the transition to a Unified Apache Iceberg Lakehouse? Contact Us

Here are Some Exercises for you to See Dremio’s Features at Work on Your Laptop

Explore Dremio University to learn more about Data Lakehouses and Apache Iceberg and the associated Enterprise Use Cases. You can even learn how to deploy Dremio via Docker and explore these technologies hands-on.
