Root Cause Analysis for Your Data Lake

Wednesday, July 21, 2021

From null values and duplicate rows to modeling errors and schema changes, data pipelines can break for millions of reasons. And once "data downtime" happens, we need to know what caused it so that we can fix it – fast.

It’s one thing to talk about root cause analysis in concept, but what does it look like in practice? In this talk, we pull back the curtain on how some of the best data teams tackle data downtime across their data lakes by walking through how to root cause a real-life incident across three main channels: your code, your operational environment, and the data itself.