Dagster: An Orchestrator for the Full Data Lifecycle

Building datasets in a data lake boils down to developing, executing and monitoring graphs of computation. Traditional orchestrators focus on sequencing computations in production, but the graph is equally important when developing and testing changes, as well as when monitoring and debugging production datasets. Dagster is an open source orchestrator built for the full data lifecycle.

In this talk, we’ll discuss how the orchestration graph helps answer some of the most important questions in data engineering. We’ll discuss how developing and testing with orchestration graphs enables early error detection on issues that are usually caught later in production. And we’ll talk about how the orchestration graph allows you to capture lineage in order to track all the runs and datasets that are upstream of a particular dataset.
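To make the lineage idea concrete, here is a minimal sketch in plain Python. It does not use Dagster's API. The dataset names, the UPSTREAM mapping and the upstream_of helper are all hypothetical, chosen only to show how a dependency graph lets you list everything upstream of one dataset.

```python
from typing import Dict, List, Set

# Hypothetical dataset graph (not Dagster's API): each dataset maps to
# the datasets it is computed from.
UPSTREAM: Dict[str, List[str]] = {
    "raw_events": [],
    "raw_users": [],
    "clean_events": ["raw_events"],
    "sessions": ["clean_events", "raw_users"],
    "daily_report": ["sessions"],
}


def upstream_of(dataset: str, graph: Dict[str, List[str]]) -> Set[str]:
    """Return every dataset that `dataset` depends on, directly or transitively."""
    seen: Set[str] = set()
    stack = list(graph[dataset])  # start from the direct parents
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(graph[current])  # walk further up the graph
    return seen


if __name__ == "__main__":
    print(sorted(upstream_of("daily_report", UPSTREAM)))
```

When a dataset such as daily_report looks wrong, a traversal like this narrows the investigation to the datasets that could have contributed to it. An orchestrator that records runs against the same graph can extend the lookup from datasets to the runs that produced them.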


Sandy Ryza

Sandy Ryza is a Software Engineer at Elementl. He previously led the freight ML engineering team at KeepTruckin. He authored O’Reilly’s Advanced Analytics with Spark and is a committer on Apache Spark and Apache Hadoop.
