Optimized Row Columnar (ORC)

Apache ORC is a free, open-source, column-oriented data storage format from the Apache Hadoop ecosystem. It is similar to other columnar file formats in that ecosystem, such as RCFile and Parquet, and it is compatible with most data processing frameworks in the Hadoop environment.

Puffins and Icebergs: Additional Stats for Apache Iceberg Tables

A short introduction to Puffin, a new file format in Apache Iceberg that holds additional table statistics.