March 1, 2023
10:10 am - 10:40 am PST
Scaling Row Level Deletions at Pinterest
With close to exabyte-scale data at Pinterest and evolving business needs, the ability to perform row-level data deletions efficiently on petabytes of data is important. This talk shares how we use Apache Iceberg to achieve this goal. We will discuss the challenges specific to row-level deletion, the solutions we considered, and their trade-offs. We will also share the bottlenecks that row-level deletions run into and the optimizations we added to resolve them. Given how important data deletion requirements have become, we hope that the lessons and solutions shared in this session will help you reduce costs for your business while improving reliability.