Data Lake Storage

A data lake is a storage repository that holds a vast amount of raw data in its native format until it is needed. Whereas a hierarchical data warehouse stores data in files or folders, a data lake uses a flat architecture. Beyond Amazon S3, ADLS, and GCS, storage options include MinIO, Dell ECS, IBM, Alibaba, and other smaller cloud providers.

The Future of Intelligent Storage in Big Data

Migrating to Apache Iceberg Tables

Ensuring High Performance at Any Scale with Apache Iceberg’s Object Store File Layout

Object storage can become a bottleneck when working with big data. Apache Iceberg’s architecture helps overcome these challenges, making it a scalable table format for object storage.

Introduction to Apache Iceberg Using Spark

Learn the basics of Iceberg’s many features and utilities by trying them out in a Spark sandbox.

How Z-Ordering in Apache Iceberg Helps Improve Performance

This tutorial introduces the Z-order clustering algorithm in Apache Iceberg and explains how it adds value to the file optimization strategy.
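
For readers new to Z-ordering, the core idea can be sketched in a few lines of Python. This is a minimal illustration of the general technique (interleaving the bits of two column values into a single sort key), not Apache Iceberg's actual implementation; the function names `interleave_bits` and `z_order_sort` are hypothetical, and the sketch assumes small non-negative integer columns.

```python
def interleave_bits(x: int, y: int, bits: int = 16) -> int:
    """Interleave the low `bits` bits of x and y into one Z-value (Morton code).

    Hypothetical helper for illustration; assumes x and y are non-negative
    integers that fit in `bits` bits.
    """
    z = 0
    for i in range(bits):
        z |= ((x >> i) & 1) << (2 * i)      # bits of x go to even positions
        z |= ((y >> i) & 1) << (2 * i + 1)  # bits of y go to odd positions
    return z


def z_order_sort(rows):
    """Sort (x, y) rows by Z-value so rows close in both columns end up near each other."""
    return sorted(rows, key=lambda row: interleave_bits(row[0], row[1]))


if __name__ == "__main__":
    rows = [(1, 1), (0, 0), (1, 0), (0, 1)]
    for row in z_order_sort(rows):
        print(row, interleave_bits(*row))
```

Because the sort key mixes bits from both columns, rows written in this order tend to land in files whose min/max ranges are narrow for both columns at once, which is what lets a query engine skip files when filtering on either column.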