The Dremio Blog

Apache Iceberg 101 – Your Guide to Learning Apache Iceberg Concepts and Practices


Apache Iceberg is an open-source table format for data lakehouses that has been gaining rapid adoption across the big data analytics world.

In this article, you’ll find a 101 video course along with an aggregation of all the resources you’ll need to get up to speed on Apache Iceberg in concept and practice.

The Apache Iceberg 101 Course

Below we have a series of videos to educate you about Apache Iceberg and how to use Iceberg tables to enhance your data experience. After the course you’ll find an index of resources from around the web to continue expanding your Iceberg knowledge.

  1. Introduction to the course
  2. The Problem and the Solution (Iceberg’s Origin Story)
  3. Iceberg and the Data Lakehouse
  4. Overview of Apache Iceberg’s Architecture
  5. Iceberg Transactions Step by Step
  6. Iceberg Catalogs
  7. Copy-on-write and Merge-on-read
  8. Table Tuning with Table Properties
  9. Migrating to Iceberg
  10. Time-Travel
  11. Maintaining Iceberg Tables
  12. Hard-Deletions and GDPR

Directory of Additional Iceberg Resources

After watching the series of videos above you should have a pretty good understanding of Apache Iceberg and the concepts around it. 

Below is a list of additional resources to continue learning more about Apache Iceberg, including hands-on exercises and articles from companies detailing their usage of Apache Iceberg and more.

Apache Iceberg Core Concepts

Below are several resources for understanding what Apache Iceberg is and how it works at a conceptual level.

Apache Iceberg Features

Below are resources to learn more about the many features of Apache Iceberg.

Hands-on Apache Iceberg Exercises

The resources below walk you through exercises and tutorials for trying Apache Iceberg with different tools.

Iceberg Video Demos

Videos showing hands-on use of Apache Iceberg tables.

Comparison of Apache Iceberg to Other Table Formats

With the resources below, you can read about how Apache Iceberg compares to other table formats.

Companies Sharing Their Production Apache Iceberg Usage

Below are articles from companies that have documented their deployment of Apache Iceberg into production. You can read about their experiences and lessons learned.

Optimizing and Maintaining Apache Iceberg Tables

Once you have Apache Iceberg tables in place, you’ll want to optimize and maintain them. Below are articles that walk through different features for tuning tables for best performance.

Ingesting Data into Apache Iceberg Tables

How do we get data into our Iceberg tables? The following articles cover ingesting data into Iceberg tables from different sources.

Working with Cloud Object Storage

Object storage has become the standard for storing data in a data lakehouse, and the resources below highlight Apache Iceberg in the context of cloud object storage.

The Java and Python APIs

Below are articles on Apache Iceberg’s Java and Python APIs.

Streaming with Apache Iceberg

Streaming data can require many considerations that don’t exist in batch processing. Below are resources on using Apache Iceberg with streaming data.

Miscellaneous Blog Articles

Here is a list of other great Apache Iceberg articles you can learn from.

Iceberg Subsurface Conference Talks

Here is a list of Subsurface conference talks on Apache Iceberg.
