
Join the Dremio/dbt community: sign up for the dbt Slack community and join the #db-dremio channel to meet other dremio-dbt users and seek support.
Documentation is a critical component of any product, from food to furniture to software. Whether you are looking at a t-shirt or a supermarket chicken, it will come with something explaining what it is and how to use it. The same is true for software and code. You would (and should!) struggle to find an example of a software product or git repository that does not come with a docs page or a README.

With Data Products (the operating model of treating your datasets as reliable, business-ready assets), documentation is just as critical. Rather than static assets locked within a single platform, data products are curated, governed, and accessible datasets to be leveraged across tools, teams, and use cases. As such, these teams and tools need a clear understanding of each dataset, how it was made, and how it should be used.
However, documentation is only good when it is accessible. The examples I gave above come with their “documentation” attached, but this isn’t as easy to do with datasets. Hosting a data dictionary on an internal website or wiki is a poor solution as your users have to go outside of their tool of choice to access or update this information.
The ideal is to have your documentation be readily accessible in multiple tools and also be convenient to write and maintain. By using Dremio and dbt this is easily achievable. An Analytics Engineer who is developing a data model in their IDE can create descriptions and tags in dbt, and effortlessly sync these with Dremio. Then Business Users and Data Analysts in the Dremio UI can reliably review and understand the datasets they are using for analytics work.
Syncing Descriptions and Tags
By using the dbt-dremio adapter, you can sync model descriptions and tags from your dbt project to generate wikis and labels in Dremio. This makes your data documentation accessible directly within Dremio's UI and keeps metadata consistent across tools. It is also a huge boon for the engineers creating data products, as they can write and maintain both the models and the documentation in one place using the same development tool.
Enabling the Sync
To enable this feature, you need to turn on the persist_docs property in your dbt_project.yml file with the relation option set to True. This configuration ensures that the model descriptions and tags are persisted in Dremio. This can be implemented at the project-wide level or for individual layers depending on where this property is nested within the configuration.
An example dbt_project.yml
Here is an example configuration, enabling documentation syncing for the "example" layer of a project called "tutorial":
models:
  tutorial:
    example:
      +materialized: view
      +persist_docs:
        relation: True
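As a variation on the layer-level example above, the same property can be nested one level higher so that it applies to every model in the project. This is a minimal sketch that reuses the "tutorial" project name and the "example" layer from the configuration above; adapt the names to your own project.

```yaml
# dbt_project.yml (sketch): persist docs for every model in the "tutorial" project
models:
  tutorial:
    +persist_docs:
      relation: True   # inherited by all layers nested under "tutorial"
    example:
      +materialized: view
```

Nesting the property under a single layer keeps the sync limited to that layer, which can be useful if only some models are ready to be documented in Dremio.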
How it Works
The description property in the schema.yml file for a model is added to the model wiki in Dremio. This provides a central place for users to understand the purpose and structure of each model directly in Dremio. This functionality also works for doc blocks, which are descriptions written in Markdown in dbt. Because the Markdown is carried over, the Dremio UI can display richer content such as formatted text, tables, and hyperlinks.
The tags defined in the schema.yml file within the model config are converted into labels in Dremio. These labels help organise and filter models within Dremio based on shared attributes or business domains.
An example schema.yml
Here is an example of how to define a description and tags for your models in a schema.yml:
models:
  - name: my_first_dbt_model
    description: "A starter dbt model"
    columns:
      - name: id
        description: "The primary key for this table"
        data_tests:
          - unique
          - not_null
    config:
      tags:
        - example
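The doc block support mentioned earlier can be sketched as a variation on this schema.yml. The sketch assumes a hypothetical Markdown file, models/docs.md, containing a doc block named starter_model_overview; both names are made up for illustration. The doc block itself is shown in comments because it lives in a Markdown file rather than in YAML, and the model's description references it through dbt's doc function.

```yaml
# models/docs.md (hypothetical file; its contents are shown here as comments):
#
#   {% docs starter_model_overview %}
#   ## Starter model
#   A starter dbt model. The **id** column is the primary key.
#   {% enddocs %}

# schema.yml: point the description at the doc block instead of an inline string
models:
  - name: my_first_dbt_model
    description: '{{ doc("starter_model_overview") }}'
    config:
      tags:
        - example
```

Keeping longer descriptions in doc blocks lets the schema.yml stay short while the Markdown text is maintained in its own file.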
What Happens in Dremio
Wiki Sync
The description of my_first_dbt_model ("A starter dbt model") is added to the wiki of the corresponding view in Dremio.

Label Sync
The “example” tag is added as a label to the corresponding materialisation in Dremio.

Benefits of Persisted Documentation
- Centralised Metadata: Descriptions and tags are automatically synchronised, reducing duplication of effort.
- Improved Discoverability: Labels in Dremio make it easier to search, filter, and categorise views.
- Enhanced Collaboration: With wikis synced, teams can quickly understand the context and purpose of models within Dremio.

By enabling persist_docs in your dbt_project.yml and leveraging descriptions and tags in your schema.yml, you can keep your Dremio environment enriched with up-to-date and meaningful documentation.
Learning How to Work with Dremio & dbt
- Video Demonstration of dbt with Dremio Cloud
- Video Demonstration of dbt with Dremio Software
- Dremio dbt Documentation
- Dremio CI/CD with Dremio/dbt whitepaper
- Dremio Quick Guides dbt reference
- End-to-End Laptop Exercise with Dremio and dbt
- Video Playlist: Intro to dbt with Dremio
- Automating Running dbt-dremio with GitHub Actions
- Orchestrating Dremio with Airflow (can be used to trigger dbt after external data updates)