Dremio Blog

6 minute read · March 29, 2024

BI Dashboards with Apache Iceberg Using AWS Glue and Apache Superset

Alex Merced Head of DevRel, Dremio


Business Intelligence (BI) dashboards aggregate, visualize, and analyze data to provide actionable insights and support data-driven decision-making. Serving these dashboards directly from the data lake, especially with technologies like Apache Iceberg, offers substantial benefits, including real-time data access, cost efficiency, and the elimination of data silos. Dremio, as a data lakehouse platform, enhances this setup by providing high-performance query acceleration and an integrated analytics layer, so the dashboards stay timely and draw on rich, comprehensive data sources. This blog walks through using your AWS Glue catalog as a Dremio data source and Apache Superset as the BI tool to create and deliver dynamic, insightful BI dashboards. Combining these technologies makes the data in your lake more accessible and actionable for all stakeholders.

Setting Up Our Environment

This exercise will use Docker Compose to set up a local Dremio and Superset environment. We will use the official Dremio Docker image and a custom Superset image with the requisite Dremio libraries installed. Create a docker-compose.yml with the following:

version: "3"

services:
  # Dremio
  dremio:
    platform: linux/x86_64
    image: dremio/dremio-oss:latest
    ports:
      - 9047:9047
      - 31010:31010
      - 32010:32010
    container_name: dremio
    environment:
      - DREMIO_JAVA_SERVER_EXTRA_OPTS=-Dpaths.dist=file:///opt/dremio/data/dist
    networks:
      dremio-superset:
  #Superset
  superset:
    image: alexmerced/dremio-superset
    container_name: superset
    networks:
      dremio-superset:
    ports:
      - 8080:8088
networks:
  dremio-superset:

To spin up the environment, run the following command in a terminal opened in the same folder as the docker-compose.yml:

docker compose up

This starts Dremio and Superset. To finish initializing Superset, open another terminal and run:

docker exec -it superset superset init
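
Both containers can take a little while to start accepting connections. As a rough sketch, the following Python snippet polls the host ports published in the docker-compose.yml above (9047 and 32010 for Dremio, 8080 for Superset) until each accepts a TCP connection. The helper names `wait_for_port` and `check_stack` are made up for this illustration. An open port only shows that something is listening, not that the application has finished booting.

```python
import socket
import time


def wait_for_port(host: str, port: int, timeout: float = 120.0, interval: float = 2.0) -> bool:
    """Return True once a TCP connection to host:port succeeds, False after `timeout` seconds."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            # create_connection raises OSError if the port is closed or unreachable.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval)


def check_stack(timeout: float = 120.0) -> dict:
    """Check the host ports mapped in the docker-compose.yml (hypothetical helper)."""
    services = {
        "Dremio UI": 9047,
        "Dremio Arrow Flight": 32010,
        "Superset": 8080,  # host port 8080 maps to container port 8088
    }
    return {name: wait_for_port("localhost", port, timeout) for name, port in services.items()}
```

Calling `check_stack()` after `docker compose up` returns a dictionary that maps each service name to `True` or `False`, which is enough to tell whether it is worth opening the browser yet.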

Connecting Your AWS Glue Catalog to Dremio

Go to localhost:9047 in your browser and create your Dremio user. Add a new “AWS Glue” data source.
Name the source “glue”, select your preferred AWS region, and enter your AWS credentials. The simplest way is to use your access key and secret key, but IAM roles are also supported.
Under the advanced options tab, add a connection property named “hive.metastore.warehouse.dir” whose value is the location where Dremio should write data when it creates Iceberg tables in your Glue catalog.
Then click “save” to add the data source.

Connecting Superset to Dremio

Dremio can be used with most existing BI tools, with one-click integrations in the user interface for tools like Tableau and Power BI. We will use Superset, an open-source option, for this exercise, but the experience is similar with any BI tool.

To get started, head over to localhost:8080 and log in to Superset with the username “admin” and password “admin”. Once you are in, click on “Settings” and select “Database Connections”.

Add a New Database
Select “Other”
Use the following connection string (make sure to include your Dremio username and password in the URL): dremio+flight://USERNAME:PASSWORD@dremio:32010/?UseEncryption=false
* The URL looks different for Dremio Cloud; consult the Dremio documentation for the exact format.
Test connection
Save connection
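
Two details of that connection string are easy to trip over. The host is `dremio` rather than `localhost` because Superset connects from inside the Compose network, where the service name resolves to the Dremio container. Also, a username or password containing characters such as `@`, `:` or `/` has to be percent-encoded to keep the URL parseable. The snippet below is a minimal sketch of building the string safely. The function name `dremio_flight_url` is invented for this example, and it assumes `true` is accepted for `UseEncryption` in the same way as the `false` shown above.

```python
from urllib.parse import quote


def dremio_flight_url(username: str, password: str, host: str = "dremio",
                      port: int = 32010, use_encryption: bool = False) -> str:
    """Build the SQLAlchemy URL used above, percent-encoding the credentials."""
    # safe="" makes quote() also encode "/" so it cannot be mistaken for a path separator.
    user = quote(username, safe="")
    pwd = quote(password, safe="")
    flag = "true" if use_encryption else "false"
    return f"dremio+flight://{user}:{pwd}@{host}:{port}/?UseEncryption={flag}"


# Example with a password that would otherwise break the URL.
print(dremio_flight_url("admin", "p@ss:word/1"))
# dremio+flight://admin:p%40ss%3Aword%2F1@dremio:32010/?UseEncryption=false
```

Paste the printed value into the SQLAlchemy URI field in Superset.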

The next step is to add a dataset by clicking on the + icon in the upper right corner and selecting “create dataset”. From here, select the table you want to add to Superset, in this case, our sales_data table.
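
Before building charts, it can help to confirm the table is reachable by running a small preview query in Superset's SQL Lab. The sketch below assembles one with each part of the path double-quoted, which is standard SQL identifier quoting. The Glue database name `sales_db` is a placeholder, since only the source name “glue” and the table name sales_data appear in this walkthrough, and the helper names are made up for illustration.

```python
from typing import Sequence


def quote_identifier(name: str) -> str:
    """Wrap one identifier in double quotes, doubling any embedded double quote."""
    return '"' + name.replace('"', '""') + '"'


def preview_query(path: Sequence[str], limit: int = 10) -> str:
    """Return a SELECT that previews the table at source.database.table."""
    qualified = ".".join(quote_identifier(part) for part in path)
    return f"SELECT * FROM {qualified} LIMIT {int(limit)}"


# "sales_db" is a hypothetical Glue database name; replace it with your own.
print(preview_query(["glue", "sales_db", "sales_data"]))
# SELECT * FROM "glue"."sales_db"."sales_data" LIMIT 10
```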

We can then click the + to add charts based on the datasets we’ve added. Once we have created the charts we want, we can add them to a dashboard, and that’s it! To accelerate the dashboard further, you can enable aggregate reflections on the underlying datasets.

Consider deploying Dremio into production to make delivering data for analytics easier for your data engineering team.
