Dremio Blog

9 minute read · April 14, 2025

Syncing Documentation with Dremio + dbt

Will Martin Technical Evangelist


Join the Dremio/dbt community: sign up to the dbt Slack community and join the #db-dremio channel to meet other Dremio and dbt users and seek support.

Documentation is a critical component of any product, from food to furniture to software. Whether you are looking at a t-shirt or a supermarket chicken, it will come with something explaining what it is and how to use it. The same is true for software and code. You would (and should!) struggle to find a software product or git repository that does not come with a docs page or a README.

With Data Products, the operating model of treating your datasets as reliable, business-ready assets, documentation is just as critical. Rather than static assets locked within a single platform, data products are curated, governed, and accessible datasets that can be leveraged across tools, teams, and use cases. As such, these teams and tools need a clear understanding of each dataset, how it was made, and how it should be used.

However, documentation is only good when it is accessible. The examples I gave above come with their “documentation” attached, but this isn’t as easy to do with datasets. Hosting a data dictionary on an internal website or wiki is a poor solution, as your users have to leave their tool of choice to access or update this information.

The ideal is for your documentation to be readily accessible in multiple tools and convenient to write and maintain. By using Dremio and dbt, this is easily achievable. An Analytics Engineer who is developing a data model in their IDE can create descriptions and tags in dbt, and sync these with Dremio. Business Users and Data Analysts can then review and understand the datasets they are using for analytics work directly in the Dremio UI.

Syncing Descriptions and Tags

By using the dbt-dremio adapter, you can sync model descriptions and tags from your dbt project to generate wikis and labels in Dremio. This makes your data documentation accessible directly within Dremio's UI and keeps metadata consistent across tools. It is also a huge boon for the engineers creating data products, as they can write and maintain both the models and the documentation in one place using the same development tool.


Enabling the Sync

To enable this feature, add the persist_docs property to your dbt_project.yml file with the relation option set to True. This configuration ensures that the model descriptions and tags are persisted in Dremio. It can be applied project-wide or to individual layers, depending on where the property is nested within the configuration.

An example dbt_project.yml

Here is an example configuration, enabling documentation syncing for the "example" layer of a project called "tutorial":

models:
  tutorial:
    example:
      +materialized: view
      +persist_docs:
        relation: True
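
For comparison, a project-wide version might look like the sketch below. It assumes the same "tutorial" project and simply moves +persist_docs up one level, relying on dbt's usual behaviour of applying a config to everything nested beneath it, so that it covers every model in the project and not only the "example" layer.

```yaml
models:
  tutorial:
    # Assumption: nesting persist_docs directly under the project name
    # applies it to all models in the project.
    +persist_docs:
      relation: true
    example:
      +materialized: view
```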

How it Works

The description property in the schema.yml file for a model is added to the model wiki in Dremio. This provides a central place for users to understand the purpose and structure of each model directly in Dremio. It also works for doc blocks, which are descriptions written in markdown in dbt, so the wiki in the Dremio UI can display richer formatting such as tables and hyperlinks.

The tags defined in the schema.yml file within the model config are converted into labels in Dremio. These labels help organise and filter models within Dremio based on shared attributes or business domains.

An example schema.yml

Here is an example of how to define a description and tags for your models in a schema.yml:

models:
  - name: my_first_dbt_model
    description: "A starter dbt model"
    columns:
      - name: id
        description: "The primary key for this table"
        data_tests:
          - unique
          - not_null
    config:
      tags:
        - example
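
Doc blocks can be referenced from the same schema.yml. The following is a minimal sketch that assumes a hypothetical doc block named my_first_dbt_model_overview, defined in a markdown file inside the dbt project. The file name docs.md and the block's contents are illustrative and not taken from the tutorial project.

```yaml
# Hypothetical docs.md in the project's models directory:
#
#   {% docs my_first_dbt_model_overview %}
#   A starter dbt model.
#
#   | Column | Meaning                        |
#   |--------|--------------------------------|
#   | id     | The primary key for this table |
#   {% enddocs %}

models:
  - name: my_first_dbt_model
    # doc() looks up the doc block by name and uses its markdown
    # content as the model description.
    description: '{{ doc("my_first_dbt_model_overview") }}'
    config:
      tags:
        - example
```

Keeping longer descriptions in a doc block leaves the schema.yml short while still allowing tables and links in the text that ends up in the wiki.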

What Happens in Dremio

Wiki Sync

The description of my_first_dbt_model ("A starter dbt model") is added to the wiki of the corresponding view in Dremio.

*Model Wiki as displayed in the Dremio UI.*

Label Sync

The “example” tag is added as a label to the corresponding materialisation in Dremio.

*Dataset Overview in the Dremio UI displaying the label.*

Benefits of Persisted Documentation

Centralised Metadata: Descriptions and tags are automatically synchronised, reducing duplication of effort.
Improved Discoverability: Labels in Dremio make it easier to search, filter, and categorise views.
Enhanced Collaboration: With wikis synced, teams can quickly understand the context and purpose of models within Dremio.

By enabling persist_docs in your dbt_project.yml and leveraging descriptions and tags in your schema.yml, you can ensure that your Dremio environment is always enriched with up-to-date and meaningful documentation.

Learning How to Work with Dremio & dbt
