Data Lake Storage

A data lake is a storage repository that holds a vast amount of raw data in its native format until it is needed. Whereas a hierarchical data warehouse stores data in files or folders, a data lake uses a flat architecture to store data. In addition to S3, ADLS, and GCS, storage options include MinIO, Dell ECS, IBM, Alibaba, and other smaller cloud providers.
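
To make the flat architecture concrete, here is a minimal sketch in Python. It is a toy in-memory model, not the API of S3, ADLS, GCS, or any other service; the class `FlatObjectStore` and its methods are hypothetical names chosen for illustration. It assumes objects are addressed by a single string key and that folder-like grouping is only a naming convention applied when listing by prefix.

```python
class FlatObjectStore:
    """Toy in-memory object store (hypothetical, for illustration only).

    All objects live in one flat mapping of key -> bytes. There are no
    real directories; a "/" in a key is just another character.
    """

    def __init__(self):
        self._objects = {}

    def put(self, key, data):
        # Store raw bytes as-is, in their native format.
        self._objects[key] = data

    def get(self, key):
        return self._objects[key]

    def list(self, prefix=""):
        # "Folders" are emulated by filtering keys on a shared prefix.
        return sorted(k for k in self._objects if k.startswith(prefix))


store = FlatObjectStore()
store.put("raw/events/2024-01-01.json", b'{"id": 1}')
store.put("raw/events/2024-01-02.json", b'{"id": 2}')
store.put("raw/images/logo.png", b"\x89PNG")

# Looks like listing a folder, but it is only a prefix match on flat keys.
events = store.list("raw/events/")
```

Because nothing is nested, files of any type can sit side by side under arbitrary keys, and structure is imposed only when the data is read.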

The Future of Intelligent Storage in Big Data

Apache Iceberg Office Hours

Compaction in Apache Iceberg: Fine-Tuning Your Iceberg Table’s Data Files

Learn how to optimize the data files in your Apache Iceberg Table using compaction and its different strategies including z-order.