Data Lineage

Understand how your data is used for analytics. All sources. All tools. Queries, transformations, sharing, and more.


Data Lineage Tells You The Full Story. Understand How Your Data Is Used For Analytics.

Data is at the heart of analytics, but understanding how it is accessed, transformed, and shared across your company is virtually impossible. Dremio gives you full visibility into the lineage of your data across all your BI tools, transformations, joins, sharing actions, and more. It’s a perspective on your data you’ve never had before.

Understand Which Datasets Really Matter

You know where all your data lives, but do you know how it’s being used? Which teams across the company? What queries? When it’s most important? How data from different systems is being joined together? You’ve made major investments to make data available, but it’s almost impossible to understand how the data is being used and which datasets really matter.

Error Remediation - Where Do You Even Begin?

Errors in data happen; that’s life. When it’s time to make a correction, how do you know which reports have been impacted by erroneous sources? With Dremio this is easy. You have full visibility into which systems and users have accessed specific datasets, down to the row level. You can easily detect when an error might affect downstream analysis and notify the people who rely on it.

Security Compromises - What Could Be Impacted?

If a data source is compromised by an internal or external agent, it can be incredibly difficult to identify everyone who might be impacted. With Dremio you can know in seconds exactly which downstream systems and users have access to specific datasets, and take appropriate measures to notify them or limit their access.

How Dremio Captures Your Data Lineage

Dremio connects all your BI and analytics tools to all of your data sources: relational databases, NoSQL, Hadoop, S3, and more. Your teams can query data sources with any BI tool or standard SQL. Dremio keeps a full history of every query, including the number of results, who issued the query, and much more.

In addition, Dremio gives your users a self-service experience for curating data for their particular needs, without waiting on IT. Dremio’s virtual datasets allow users to transform and join data from different sources using a powerful and intuitive interface. The relationships between your data sources, virtual datasets, and all your queries are maintained in Dremio’s Data Graph.
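To make the idea of a graph of relationships concrete, here is a minimal sketch in Python. It is not Dremio's implementation or API: the dataset names, the edge list, and the `downstream` helper are all hypothetical, chosen only to illustrate how relationships between sources, virtual datasets, and queries can answer "what is affected if this source is wrong?"

```python
from collections import defaultdict, deque

# Hypothetical lineage edges as (upstream, downstream) pairs.
# These names are illustrative only and do not come from Dremio.
EDGES = [
    ("s3.orders", "vds.clean_orders"),
    ("postgres.customers", "vds.customer_orders"),
    ("vds.clean_orders", "vds.customer_orders"),
    ("vds.customer_orders", "query.weekly_revenue_report"),
    ("postgres.customers", "query.churn_dashboard"),
]


def build_graph(edges):
    """Map each upstream node to the list of nodes derived from it."""
    graph = defaultdict(list)
    for upstream, derived in edges:
        graph[upstream].append(derived)
    return graph


def downstream(graph, start):
    """Return every node reachable from `start`, using breadth-first search."""
    seen = set()
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for child in graph.get(node, []):
            if child not in seen:
                seen.add(child)
                queue.append(child)
    return sorted(seen)


graph = build_graph(EDGES)
impacted = downstream(graph, "s3.orders")
print(impacted)
```

In this toy graph, an error in the hypothetical `s3.orders` source reaches two virtual datasets and one report, while `query.churn_dashboard` is unaffected because it reads only from `postgres.customers`.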

Figure: Dremio data lineage

You can query these relationships, along with the query history itself, using standard SQL.
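As a rough illustration of what such a SQL query could look like, the sketch below uses Python's built-in `sqlite3` module as a stand-in engine. The `lineage` and `query_history` tables, their columns, and the sample rows are assumptions made up for this example; they are not Dremio's actual schema, so consult the Data Lineage documentation for the real table names.

```python
import sqlite3

# In-memory SQLite database standing in for a lineage store.
# Table and column names are hypothetical, not Dremio's schema.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE lineage (parent TEXT, child TEXT);
CREATE TABLE query_history (
    query_id INTEGER, dataset TEXT, issued_by TEXT, row_count INTEGER
);
INSERT INTO lineage VALUES
  ('s3.orders', 'vds.clean_orders'),
  ('vds.clean_orders', 'vds.customer_orders'),
  ('postgres.customers', 'vds.customer_orders');
INSERT INTO query_history VALUES
  (1, 'vds.customer_orders', 'alice', 120),
  (2, 'vds.clean_orders', 'bob', 45),
  (3, 'vds.customer_orders', 'bob', 300),
  (4, 'postgres.customers', 'carol', 10);
""")

# Recursive CTE: start from one source, walk to every derived dataset,
# then count the queries each user issued against any affected dataset.
IMPACTED_USERS_SQL = """
WITH RECURSIVE affected(dataset) AS (
    SELECT ?
    UNION
    SELECT l.child FROM lineage l JOIN affected a ON l.parent = a.dataset
)
SELECT q.issued_by, COUNT(*) AS queries
FROM query_history q
JOIN affected a ON q.dataset = a.dataset
GROUP BY q.issued_by
ORDER BY q.issued_by
"""


def impacted_users(connection, source):
    """Return (user, query_count) rows for everything derived from `source`."""
    return connection.execute(IMPACTED_USERS_SQL, (source,)).fetchall()


for user, count in impacted_users(conn, "s3.orders"):
    print(user, count)
```

The same pattern covers both scenarios described earlier: for error remediation you start from the erroneous source, and for a security incident you start from the compromised one.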

Technical Details for Data Lineage

  • Supports many tools, including Tableau, Power BI, Qlik, Python, R, and any SQL-based tool.
  • Supports virtually any data source, including relational, NoSQL, Hadoop, Excel, and others.
  • Documentation for Data Lineage.