March 1, 2023
12:20 pm - 12:50 pm PST
Apache Iceberg’s Best Secret: A Guide to Metadata Tables
Apache Iceberg’s rich metadata is its secret sauce, powering core features like time travel, query optimization, and optimistic concurrency control. But did you know that this metadata is accessible to everyone through easy-to-use system tables? This talk will walk through real-life examples of using metadata tables to get even more out of Iceberg and to answer questions such as: Which partition was updated last, and when? Why are there too many small files? Which Iceberg maintenance procedures can give us better query performance? Can we start building more advanced systems, such as data auditing and data quality checks? How many null values are being added per hour? What is the latency of data ingestion over time? We will also cover metadata table performance tips and tricks, as well as ongoing improvements in the community. Whether you are already using Iceberg metadata tables or are interested in getting started, attend this talk to learn how this underutilized feature can help you manage data tables more effectively than ever before.