July 22, 2021

Data Discovery at Lyft and Convoy

In this talk, we will introduce why it’s so crucial to solve data discovery, and discuss the learnings from addressing this problem at Lyft and Convoy. The leading open source data catalog is used by 750 users every week at Lyft and by 80% of Convoy’s employees every month. We will share what makes a successful data catalog and the latest improvements in Amundsen, including lineage and dbt integration. We will end with what’s still not working well and how we as a community could tackle it.Everyone has access to data but few know what exists, what’s trustworthy and how to use it. Humans solve this problem naturally through the gossip protocols of Slack and shoulder-tapping which doesn’t scale and comes at a huge productivity loss. But it gets worse. Wrong data leads to wrong conclusions.Mark and Chad saw this problem first hand at their respective organizations – Lyft & Convoy. Analysts and data scientists were spending more than 1/3rd of their time discovering and establishing trust in the data they use. Lyft has made its analysts and data scientists over 20% more productive by creating and using the leading open source data discovery and metadata engine, Amundsen. Convoy has ~80% of the company use Amundsen for data discovery and trust.

Topics Covered

Amundsen

Subsurface Data Catalog by Dremio: Deep Insights.

Speakers

Mark Grover

Mark Grover is the co-creator of the open source data catalog and metadata engine, Amundsen. Amundsen is used by data scientists and analysts to discover, understand and trust the data they use. At Lyft, Amundsen has 700+ active users every week, and outside of Lyft, Amundsen is used by 27 companies like Instacart, ING, Square and more.

Chad Sanderson

Chad Sanderson is the Product Lead for Convoy’s Data Platform team, which includes the data warehouse, streaming, BI and visualization, experimentation, machine learning and data discovery.

Try Dremio’s Interactive Demo

Explore this interactive demo and see how Dremio's Intelligent Lakehouse enables Agentic AI

Data Discovery at Lyft and Convoy

Speakers

Try Dremio’s Interactive Demo

Get Started Free

See Dremio in Action

Talk to an Expert

Make data engineers and analysts 10x more productive