Data Discovery at Lyft and Convoy

In this talk, we explain why solving data discovery is so crucial and discuss what we learned from addressing the problem at Lyft and Convoy. Amundsen, the leading open source data catalog, is used by 750 users every week at Lyft and by 80% of Convoy’s employees every month. We will share what makes a data catalog successful and the latest improvements in Amundsen, including lineage and dbt integration. We will end with what is still not working well and how we as a community could tackle it.

Everyone has access to data, but few know what exists, what is trustworthy, and how to use it. People solve this problem naturally through the gossip protocols of Slack and shoulder-tapping, which do not scale and come at a huge productivity cost. But it gets worse: wrong data leads to wrong conclusions.

Mark and Chad saw this problem firsthand at their respective organizations, Lyft and Convoy. Analysts and data scientists were spending more than a third of their time discovering and establishing trust in the data they use. Lyft has made its analysts and data scientists over 20% more productive by creating and using Amundsen, the leading open source data discovery and metadata engine. At Convoy, about 80% of the company uses Amundsen for data discovery and trust.

Topics Covered

Subsurface Data Catalog by Dremio: Deep Insights.
