March 1, 2023
9:35 am - 10:05 am PST
How Our Iceberg Migration Achieved 90% Cost Savings for Amazon S3
At Insider, we migrated a data lake holding hundreds of terabytes of data in Amazon S3 from Apache Hive to Apache Iceberg using Apache Spark, and reduced our Amazon S3 costs by 90%. During the migration, we also changed the column and partition structure of some tables, the file format, and the compression algorithm.
The session explains:
– Why we decided to migrate from Apache Hive
– Why we selected Apache Iceberg
– How we designed and executed the migration process
– The outcomes from a cost and performance perspective
Topics Covered
Open Source
Real-world implementation