Dremio Blog

7 minute read · February 3, 2022

Apache Iceberg Becomes Industry Open Standard with Ecosystem Adoption

Mark Lyons Vice President of Product Management, Dremio

Cloud data lakes are now the go-to architecture for data storage and analytics across organizations of all types and sizes because cloud storage is scalable, easy to use, and inexpensive. Digital experiences are ubiquitous, and every company needs to make its data accessible to unlock innovation and offset competitive threats.

Organizations are now able to use cloud data lakes for workloads that traditionally went to data warehouses (such as BI and analytics). This shift is possible in part because of Apache Iceberg, an open source table format that provides many of the same features and capabilities found in traditional databases and data warehouses, but within an open, flexible data lake environment.

Apache Iceberg continues to gain mindshare in the data ecosystem because it is a well-documented, engine-agnostic open standard. While Apache Parquet is the de facto standard file format for tracking the rows and columns of data, we need the next layer of abstraction, a table, to track the files so that each query can efficiently access only the minimum data it needs. In addition to a better user experience based upon SQL, Apache Iceberg tables also provide atomic transactions, data consistency guarantees, time travel, and versioning.
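To make the idea of a table layer that tracks files more concrete, here is a minimal Python sketch. It is a toy model, not Iceberg's actual API or metadata layout: the `DataFile` and `Snapshot` classes, the `order_date` column, and the file paths are all hypothetical names chosen for illustration. It assumes only that the table layer records per-file minimum and maximum values for a column and keeps older snapshots around.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class DataFile:
    """One data file plus the min/max of a hypothetical order_date column."""
    path: str
    min_order_date: str  # ISO-8601 dates compare correctly as plain strings
    max_order_date: str


@dataclass(frozen=True)
class Snapshot:
    """An immutable view of the table: the set of files it contained."""
    snapshot_id: int
    files: Tuple[DataFile, ...]


def files_to_scan(snapshot: Snapshot, lo: str, hi: str) -> List[str]:
    """Return only the files whose [min, max] range overlaps [lo, hi]."""
    return [
        f.path
        for f in snapshot.files
        if f.max_order_date >= lo and f.min_order_date <= hi
    ]


# Snapshot 1 holds two files; snapshot 2 is created by appending a third.
snap1 = Snapshot(1, (
    DataFile("data/a.parquet", "2021-01-01", "2021-06-30"),
    DataFile("data/b.parquet", "2021-07-01", "2021-12-31"),
))
snap2 = Snapshot(2, snap1.files + (
    DataFile("data/c.parquet", "2022-01-01", "2022-01-31"),
))
history: Dict[int, Snapshot] = {s.snapshot_id: s for s in (snap1, snap2)}

# Current snapshot: only one of three files needs to be read.
print(files_to_scan(history[2], "2022-01-10", "2022-01-20"))  # ['data/c.parquet']

# "Time travel" to snapshot 1: the January 2022 file did not exist yet.
print(files_to_scan(history[1], "2022-01-10", "2022-01-20"))  # []
```

In this sketch the query engine never opens the files that cannot match, and reading an older snapshot is simply a lookup in the retained history. A real table format stores far more detail than this, so treat the example as an intuition aid only.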

Signs of Apache Iceberg Growth and Adoption

In May 2021, Apache Iceberg graduated from incubation to become a top-level Apache Software Foundation project.

A project like this requires broad ecosystem adoption to become an industry standard. Let’s look at what has happened in the Iceberg ecosystem over the past few months that makes us at Dremio very bullish on its future:

At re:Invent, AWS announced Athena support for Apache Iceberg.
More recently, AWS announced EMR support for Apache Iceberg.
Adobe Experience Cloud adopts Apache Iceberg.
Ryan Blue, the creator of Apache Iceberg at Netflix, starts Tabular and raises Series A funding.
Snowflake announces support for querying Apache Iceberg external tables.
It is easy to create a data lake based upon Apache Iceberg.
- Getting started with Apache Iceberg using AWS Glue and Dremio
- Creating S3 data lake with Apache Iceberg

Over the past three years, code additions to the Apache Iceberg project have increased, and based upon the recent ecosystem announcements there are no signs of this slowing down.

Source: https://github.com/apache/iceberg/graphs/code-frequency

Top 10 Apache Iceberg Contributors and Influencers

There is a vibrant community of Apache Iceberg contributors and thought leaders that helps drive growth and continuing innovation. Based on GitHub, LinkedIn, and our own research, here is a top 10 list of people we think are worth following on the topic of Apache Iceberg.

Ryan Blue - Tabular.io, previously Netflix
Anton Okolnychyi - Apple
Kyle Bendickson - Tabular.io, previously Apple
Jack Ye - AWS Athena
Openinx - Alibaba
Russell Spitzer - Apple
Eduard Tudenhöfner - Dremio
Junjie Chen - Tencent
Fokko Driesprong - Datafold
Jun-he - Netflix

How to Get Involved and Learn More about Apache Iceberg

To learn more about Apache Iceberg, check out these other resources:

Register for Subsurface LIVE Winter 2022 to hear more from Ryan Blue, the co-creator of Apache Iceberg, as well as other companies contributing to the project, including Uber and Apple.

We have some exciting Iceberg sessions in the agenda, including:
