Connecting Tableau to Apache Iceberg Tables with Dremio
The Apache Iceberg format has taken the data lakehouse world by storm, becoming a cornerstone of many firms’ data infrastructure. This article shows how to connect your Apache Iceberg tables to tools like Tableau so you can build BI dashboards directly on those tables, without the need for cubes or extracts. Dremio makes this straightforward.
Dremio is a data lakehouse platform that enables performant queries on many data lake sources, such as Parquet files, Apache Iceberg tables, and even Delta Lake tables. Dremio also integrates with many BI tools, making it easy to build dashboards with Tableau, Power BI, Superset, Looker, Hex, and many other client tools.
Getting Set Up with Dremio
If you haven’t tried it yet, take Dremio out for a free test drive to see many of its features in action before creating your own free account. After your test drive, create your own Dremio Cloud account so you can follow along with this hands-on tutorial.
The first step is to sign up for a free Dremio Cloud account. The process should take only a few minutes. Follow the docs if you have any questions. Dremio also provides videos that tour the sign-up process and the UI.
(Note: If you use Azure, Google Cloud, or on-prem clusters for your data lake, get started with Dremio Community Edition.)
Once you’ve created your Dremio Sonar and Arctic projects, you’re ready to get started.
Promoting an Iceberg Table from Object Storage
When you first access the dashboard of your Dremio Cloud account, you’ll see a “Samples” source already connected, with sample data to try out. In it you’ll find a folder called “NYC-taxi-trips-iceberg.” This folder on object storage contains all the data and metadata needed to read it as an Iceberg table.
You can take preexisting Iceberg tables in object storage and “promote” them so that Dremio recognizes them as tables. To do so, hover your cursor over the folder and click the “Format Folder” option that appears on the right.
Select Iceberg as the format. Dremio now recognizes the folder as an Iceberg table, giving you full access to DML, time-travel, and more.
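As a rough illustration of the time-travel access mentioned above, here is a minimal Python sketch that builds the SQL you could paste into Dremio's SQL editor. It assumes Dremio's `AT SNAPSHOT` and `AT TIMESTAMP` clauses for Iceberg tables, and it assumes the sample table lives at the path `Samples."samples.dremio.com"."NYC-taxi-trips-iceberg"`; check both against your own environment. The helper names `quote_path` and `time_travel_query` are hypothetical and exist only for this example.

```python
from datetime import datetime


def quote_path(parts):
    """Double-quote each path component, escaping embedded double quotes."""
    return ".".join('"' + part.replace('"', '""') + '"' for part in parts)


def time_travel_query(parts, snapshot_id=None, as_of=None, limit=10):
    """Build a SELECT that reads a table as of a snapshot ID or a point in time.

    Assumption: the target engine accepts `AT SNAPSHOT '<id>'` and
    `AT TIMESTAMP '<yyyy-mm-dd hh:mm:ss>'` after the table name.
    """
    if (snapshot_id is None) == (as_of is None):
        raise ValueError("pass exactly one of snapshot_id or as_of")
    if snapshot_id is not None:
        # int() rejects anything that is not a plain numeric snapshot ID.
        clause = "AT SNAPSHOT '{}'".format(int(snapshot_id))
    else:
        stamp = as_of.strftime("%Y-%m-%d %H:%M:%S")
        clause = "AT TIMESTAMP '{}'".format(stamp)
    return "SELECT * FROM {} {} LIMIT {}".format(quote_path(parts), clause, int(limit))


# Assumed path of the promoted sample table; adjust to match your source.
SAMPLE_TABLE = ["Samples", "samples.dremio.com", "NYC-taxi-trips-iceberg"]

if __name__ == "__main__":
    print(time_travel_query(SAMPLE_TABLE, as_of=datetime(2023, 1, 1, 12, 0, 0)))
```

The snapshot ID or timestamp you pass in has to correspond to a snapshot that actually exists in the table's history; the sketch only formats the query and does not validate that.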
Enabling Reflections on the Iceberg Table
One of Dremio’s most innovative features is data reflections, which can transparently accelerate your queries to sub-second response times. That makes working with Tableau dashboards a much more pleasant experience.
For your existing Iceberg table, turn on aggregate reflections to accelerate aggregate queries, which are precisely the kind of queries your dashboard will rely on. Bring up the dataset on your Dremio dashboard and follow these steps:
- Click on the reflections tab.
- Toggle on aggregate reflections.
- Click save.
- Give it a moment to generate the reflections. Once it’s finished, you should see the footprint of the reflection.
This will optimize aggregate queries on the dimensions and measures listed. Feel free to remove any you don’t typically query and add those that you do.
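To build some intuition for why an aggregate reflection helps dashboard queries, the toy Python sketch below pre-aggregates rows by a set of dimensions and then answers a grouped total from that small rollup instead of rescanning the raw rows. This is only a conceptual illustration, not how Dremio implements reflections, and the column names (`passenger_count`, `pickup_year`, `fare_amount`) and values are made up for the example.

```python
from collections import defaultdict


def build_aggregate_rollup(rows, dimensions, measure):
    """Pre-aggregate SUM and COUNT of `measure` for each combination of dimensions."""
    rollup = defaultdict(lambda: {"sum": 0.0, "count": 0})
    for row in rows:
        key = tuple(row[dim] for dim in dimensions)
        rollup[key]["sum"] += row[measure]
        rollup[key]["count"] += 1
    return dict(rollup)


def total_by(rollup, dimensions, group_dim):
    """Answer 'SUM(measure) GROUP BY group_dim' using only the rollup."""
    idx = dimensions.index(group_dim)
    totals = defaultdict(float)
    for key, agg in rollup.items():
        totals[key[idx]] += agg["sum"]
    return dict(totals)


# Hypothetical trip rows; a real table would have millions of them.
TRIPS = [
    {"passenger_count": 1, "pickup_year": 2013, "fare_amount": 10.0},
    {"passenger_count": 2, "pickup_year": 2013, "fare_amount": 5.5},
    {"passenger_count": 1, "pickup_year": 2014, "fare_amount": 20.0},
    {"passenger_count": 1, "pickup_year": 2014, "fare_amount": 4.5},
]
DIMENSIONS = ["passenger_count", "pickup_year"]

if __name__ == "__main__":
    rollup = build_aggregate_rollup(TRIPS, DIMENSIONS, "fare_amount")
    print(total_by(rollup, DIMENSIONS, "pickup_year"))
```

The rollup has one entry per distinct dimension combination, so it is typically far smaller than the raw data. That is also why trimming the dimension list to the fields you actually query keeps the reflection compact.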
Now that your queries will be accelerated, it’s time to connect your data to Tableau!
Connecting to Tableau
After promoting the dataset, you should find yourself in the SQL editor, ready to query the Iceberg table. You can easily load this dataset into Tableau or Power BI using the buttons in the top right.
When you click the Tableau button, Dremio downloads a “tds” file. Opening it launches Tableau along with a browser window for SSO authentication. Once you’ve authenticated, you can build a dashboard from the Iceberg table directly in Tableau, thanks to Dremio.
Now you know how to create BI dashboards directly from the Apache Iceberg tables in your data lake using Dremio and Tableau. Keep in mind that this Tableau connectivity extends to any data source you have access to in your Dremio account, which means databases, Parquet tables, Delta Lake tables, and more.
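If you are curious about what the downloaded file contains, a Tableau data source (“tds”) file is an XML document, so you can inspect it with Python's standard library. The sketch below parses a hypothetical, heavily simplified example; the attribute names and values shown (`class`, `server`, `port`, `dbname`, and the host `example.invalid`) are assumptions for illustration and will differ from the file Dremio actually generates.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified stand-in for a downloaded .tds file.
SAMPLE_TDS = """<datasource formatted-name='NYC-taxi-trips-iceberg' inline='true'>
  <connection class='dremio' server='example.invalid' port='443' dbname='Samples'/>
</datasource>
"""


def connection_summaries(tds_text):
    """Return the attributes of every <connection> element found in the document."""
    root = ET.fromstring(tds_text)
    # iter() walks the whole tree, so nested connection elements are found too.
    return [dict(conn.attrib) for conn in root.iter("connection")]


if __name__ == "__main__":
    for summary in connection_summaries(SAMPLE_TDS):
        print(summary)
```

To inspect a real download, read the file's text and pass it to `connection_summaries` in place of `SAMPLE_TDS`.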