December 10, 2025

JSON and Tell: The Variant Type in Parquet and Iceberg

The data landscape is evolving rapidly, with semi-structured data increasingly becoming the norm rather than the exception. This talk covers one of the recent big developments in the data ecosystem: the adoption of the Variant type in Apache Iceberg and Parquet. We’ll explore Variant from several angles, including the format spec itself and its impact on data architecture patterns and on the management of semi-structured data, particularly JSON.

We’ll cover:

• Why the Variant type was developed and the benefits it provides for semi-structured data
• A deep dive into the Variant encoding format
• How shredding can improve performance for various cases
• The current state of ecosystem integration support for the Variant type

This talk is ideal for technical leaders, data platform engineers, and open-source contributors interested in the cutting edge of data storage and processing technologies.

Topics Covered

Apache Iceberg
Data Analytics
ELT/ETL
Modernization and Migration
Open Source
Table Formats
Use Cases

Sign up to watch all Subsurface 2025 sessions