July 13, 2021

Deep Dive into Iceberg SQL Extensions

Apache Iceberg is an open table format that lets data engineers and data scientists build reliable, efficient data lakes with features normally found only in data warehouses. The project allows companies to substantially simplify their existing data lake use cases and to unlock fundamentally new ones.

This talk focuses on the Iceberg SQL extensions, a recent development in the Iceberg community for managing tables efficiently through SQL. In particular, the session covers how to snapshot or migrate an existing Hive or Spark table, perform table maintenance, and optimize metadata and data to fully benefit from Iceberg’s rich feature set. It also covers common pitfalls of running and managing Iceberg tables with tens of millions of files in production, and how the SQL extensions can address them.
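As a rough illustration of the operations named in the abstract, the sketch below builds the corresponding Iceberg stored-procedure calls as Spark SQL strings. It is not taken from the talk. The catalog name `my_catalog`, the table names `db.events` and `db.events_snapshot`, the timestamp, and the `call` helper are all hypothetical; the procedure names and argument names follow the Iceberg Spark procedures documentation as best understood here.

```python
# Sketch only: formats Iceberg procedure calls as Spark SQL text.
# "my_catalog", "db.events", "db.events_snapshot" and the timestamp are
# hypothetical placeholders, not values from the talk.

def call(catalog, procedure, **args):
    """Render CALL <catalog>.system.<procedure>(name => value, ...).

    Values must already be valid SQL literals (for example "'db.events'").
    """
    rendered = ", ".join(f"{name} => {value}" for name, value in args.items())
    return f"CALL {catalog}.system.{procedure}({rendered})"

statements = [
    # Create an Iceberg table that references an existing table's files,
    # leaving the source table untouched.
    call("my_catalog", "snapshot",
         source_table="'db.events'", table="'db.events_snapshot'"),
    # Replace an existing Hive or Spark table with an Iceberg table in place.
    call("my_catalog", "migrate", table="'db.events'"),
    # Table maintenance: drop old snapshots and files no longer referenced.
    call("my_catalog", "expire_snapshots",
         table="'db.events'", older_than="TIMESTAMP '2021-07-01 00:00:00'"),
    call("my_catalog", "remove_orphan_files", table="'db.events'"),
    # Metadata optimization: rewrite manifests for faster scan planning.
    call("my_catalog", "rewrite_manifests", table="'db.events'"),
]

for statement in statements:
    print(statement)
```

In a Spark session configured with the Iceberg extensions (`spark.sql.extensions` set to `org.apache.iceberg.spark.extensions.IcebergSparkSessionExtensions`), each string could presumably be passed to `spark.sql(...)`; the available procedures and arguments depend on the Iceberg version in use.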

Topics Covered

Table Formats

Unlocking Potential with Apache Iceberg

Speakers

Anton Okolnychyi

Anton is a committer and PMC member of Apache Iceberg as well as an Apache Spark contributor at Apple.
