March 1, 2023

10:10 am - 10:40 am PST

Scaling Row Level Deletions at Pinterest

With close to an exabyte of data at Pinterest and evolving business needs, the ability to perform row-level deletions efficiently across petabytes of data is important. This talk shares how we use Apache Iceberg to achieve this at Pinterest. We will discuss the challenges specific to row-level deletion, the solutions we considered, and their trade-offs. We will also share the bottlenecks that row-level deletions run into and the optimizations we added to resolve them. Given how important data deletion requirements have become, we hope the lessons and solutions shared in this session will help you reduce costs for your business while improving reliability.

Topics Covered

Open Source

Sign up to watch all Subsurface 2023 sessions