Companies use Apache Iceberg to build reliable and efficient data lakes with features that are normally found only in data warehouses. As users adopt Apache Iceberg in a wider range of data processing scenarios, it becomes essential to support efficient, transactional delete/update/merge operations, even in read-mostly data lake environments.
This talk is a deep dive into the copy-on-write and merge-on-read approaches to executing row-level operations in Apache Iceberg, so that users can pick the right approach for a given use case. In addition, the presentation will help data engineers avoid common mistakes and tune delete/update/merge operations at scale.