14 minute read · May 1, 2025

Introducing Dremio Auth Manager for Apache Iceberg

Alex Dutra · Senior Staff Software Engineer

Iceberg 1.9 includes the Auth Manager API, a new API enabling pluggable authentication for Iceberg REST. This enhancement allows for the integration of various authentication protocols and expanded OAuth2 functionalities.

Building on this API, Dremio has just open-sourced Dremio Auth Manager, a new OAuth2 manager for Apache Iceberg REST catalogs. Originally, this project was created due to limitations encountered while integrating Iceberg REST into our products. The implementation is now being open-sourced for the broader community.

Dremio Auth Manager is intended as an alternative to Iceberg’s built-in OAuth2 manager, offering greater functionality and flexibility while complying with the OAuth2 standards. Dremio Auth Manager streamlines authentication by handling token acquisition and renewal transparently, eliminating the need for users to deal with tokens directly, and avoiding failures due to token expiration.

This post demonstrates how to use Dremio Auth Manager to authenticate with three distinct catalog servers: Apache Polaris (incubating), Nessie, and Dremio Enterprise Catalog.

Before We Start

First of all, download the latest Dremio Auth Manager artifact. To run the examples in this post, you should use the “runtime” jar.

Dremio Auth Manager is highly configurable; please read the docs to familiarize yourself with its capabilities.

One last caveat: all examples below have numbers pinpointing some aspects of interest in the code, e.g. ①②③. Please make sure to remove those numbers if you execute the commands in your terminal.
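Removing the markers by hand is error-prone, so here is a minimal sketch of a filter that does it. The function name strip_markers and the file names in the usage comment are hypothetical; the filter only assumes that each marker is one of ① through ⑧, optionally preceded by a single space.

```shell
# Strip the circled callout markers (① to ⑧) and the single space before
# them, so that a command copied from this post can be pasted in a terminal.
strip_markers() {
  sed -E 's/ ?(①|②|③|④|⑤|⑥|⑦|⑧)//g'
}

# Usage (commands.txt is a hypothetical file holding a copied command):
#   strip_markers < commands.txt > commands.clean.txt
```

The line-continuation backslashes are left untouched, so the cleaned command stays a single multi-line invocation.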

Dremio Auth Manager with Apache Polaris (incubating)

Our first example will use Apache Polaris (incubating) to illustrate how Dremio Auth Manager can seamlessly replace Iceberg’s built-in manager, facilitating migration from the latter to the former.

  1. Run this getting-started guide from the Apache Polaris (incubating) repository.
  2. Execute the following command to launch a Spark SQL shell:
spark-sql \
  --jars authmgr-oauth2-runtime.jar ① \
  --packages org.apache.iceberg:iceberg-spark-runtime-3.5_2.12:1.9.0 \
  --conf spark.sql.extensions=org.apache.iceberg.spark.extensions.IcebergSparkSessionExtensions \
  --conf spark.sql.catalog.polaris=org.apache.iceberg.spark.SparkCatalog \
  --conf spark.sql.catalog.polaris.type=rest \
  --conf spark.sql.catalog.polaris.warehouse=quickstart_catalog \
  --conf spark.sql.catalog.polaris.uri=http://localhost:8181/api/catalog \
  --conf spark.sql.catalog.polaris.rest.auth.type=com.dremio.iceberg.authmgr.oauth2.OAuth2Manager ② \
  --conf spark.sql.catalog.polaris.rest.auth.oauth2.dialect=iceberg_rest ③ \
  --conf spark.sql.catalog.polaris.rest.auth.oauth2.client-id=root ④ \
  --conf spark.sql.catalog.polaris.rest.auth.oauth2.client-secret=s3cr3t \
  --conf spark.sql.catalog.polaris.rest.auth.oauth2.scope=PRINCIPAL_ROLE:ALL ⑤
  ① Add the Dremio Auth Manager runtime jar to the Spark classpath
  ② Enable the Dremio Auth Manager with the rest.auth.type property
  ③ Enable the “Iceberg REST” dialect
  ④ Provide the client credentials
  ⑤ Activate all roles for the authenticated principal

In this example Apache Polaris (incubating) acts as the identity provider, and since Dremio Auth Manager is configured with the “Iceberg REST” dialect, it will use Polaris’ internal token endpoint to authenticate, effectively behaving like Iceberg’s built-in manager.
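To make this flow more concrete, the sketch below shows a rough manual equivalent: a client credentials request sent to the catalog's own token endpoint. It assumes the endpoint lives at the v1/oauth/tokens path defined by the Iceberg REST specification, resolved against the catalog URI. The helper names token_endpoint and fetch_token are hypothetical, and the exact request that Dremio Auth Manager sends may differ (for example in how the client credentials are transmitted).

```shell
CATALOG_URI="http://localhost:8181/api/catalog"

# Assumption: the token endpoint is the Iceberg REST spec's v1/oauth/tokens
# path under the catalog URI. A trailing slash on the URI is tolerated.
token_endpoint() {
  printf '%s/v1/oauth/tokens\n' "${1%/}"
}

# Hypothetical helper: request an access token with the client credentials
# grant, using the same credentials and scope as the Spark configuration.
fetch_token() {
  curl -s "$(token_endpoint "$CATALOG_URI")" \
    --data-urlencode "grant_type=client_credentials" \
    --data-urlencode "client_id=root" \
    --data-urlencode "client_secret=s3cr3t" \
    --data-urlencode "scope=PRINCIPAL_ROLE:ALL"
}

# Usage, with the getting-started Polaris server running:
#   fetch_token
```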

At the prompt, type use polaris; to activate the catalog and trigger the authentication; then you are ready to go!

Dremio Auth Manager with Nessie

Our second example demonstrates using Dremio Auth Manager to authenticate against a Nessie catalog server, leveraging Keycloak as an external identity provider.

Unlike the initial scenario, in this example Dremio Auth Manager will adhere strictly to OAuth2 standards for Keycloak interactions, a key advantage over Iceberg’s built-in manager. This ensures automatic background token refreshing, addressing a known limitation of the built-in manager (which often fails to refresh tokens with external identity providers due to its non-compliance with OAuth2 standards).

To further illustrate other interesting features, we will utilize the “Authorization Code” grant type, a prevalent OAuth2 flow, not natively supported by the built-in Iceberg manager.

  1. Run this docker compose example from the Nessie repository.
  2. Now, run the following command to start the Spark SQL shell:
spark-sql \
  --jars authmgr-oauth2-runtime.jar \
  --packages org.apache.iceberg:iceberg-spark-runtime-3.5_2.12:1.9.0 \
  --conf spark.sql.extensions=org.apache.iceberg.spark.extensions.IcebergSparkSessionExtensions \
  --conf spark.sql.catalog.nessie=org.apache.iceberg.spark.SparkCatalog \
  --conf spark.sql.catalog.nessie.type=rest \
  --conf spark.sql.catalog.nessie.uri=http://localhost:19120/iceberg \
  --conf spark.sql.catalog.nessie.rest.auth.type=com.dremio.iceberg.authmgr.oauth2.OAuth2Manager \
  --conf spark.sql.catalog.nessie.rest.auth.oauth2.issuer-url=http://127.0.0.1:8080/realms/iceberg ① \
  --conf spark.sql.catalog.nessie.rest.auth.oauth2.grant-type=authorization_code ② \
  --conf spark.sql.catalog.nessie.rest.auth.oauth2.client-id=client1 ③ \
  --conf spark.sql.catalog.nessie.rest.auth.oauth2.client-secret=s3cr3t \
  --conf spark.sql.catalog.nessie.rest.auth.oauth2.scope=catalog ④
  ① The identity provider (issuer) URL; for Keycloak, this is the realm root URL
  ② Enable the “Authorization Code” grant
  ③ Provide the client credentials
  ④ The scopes to activate will depend on how Keycloak is configured (you may need to also request offline_access)

Note how the configuration includes the Keycloak realm root URL. Dremio Auth Manager uses this URL to automatically retrieve the OpenID Provider metadata from Keycloak and to identify necessary endpoints, such as the token and authorization endpoints, which simplifies configuration.
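For readers curious about what this discovery step looks like, here is a minimal sketch that fetches the metadata by hand. It assumes the standard OpenID Connect Discovery location, /.well-known/openid-configuration under the issuer URL; whether Dremio Auth Manager also tries other well-known paths is not covered here. The helper names discovery_url and show_metadata are hypothetical.

```shell
ISSUER_URL="http://127.0.0.1:8080/realms/iceberg"

# Build the OpenID Connect Discovery URL for an issuer.
# A trailing slash on the issuer URL is tolerated.
discovery_url() {
  printf '%s/.well-known/openid-configuration\n' "${1%/}"
}

# Hypothetical helper: print the provider metadata as JSON. The fields of
# interest are "authorization_endpoint" and "token_endpoint".
show_metadata() {
  curl -s "$(discovery_url "$ISSUER_URL")"
}

# Usage, with the Nessie docker compose example (and Keycloak) running:
#   show_metadata
```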

At the prompt, type use nessie; to activate the catalog. You should see output like the following:

[iceberg-auth-manager] ======== Authentication Required ========
[iceberg-auth-manager] Please open the following URL to continue:
[iceberg-auth-manager] http://127.0.0.1:8080/realms/iceberg/protocol/openid-connect/...

Authenticate with Keycloak by clicking the provided link and using the username “Alice” and password “s3cr3t”. Upon successful authentication, return to the Spark SQL shell to resume normal operations. You won’t need to do this again if the access token expires: Dremio Auth Manager will refresh it for you in the background!

Dremio Auth Manager with Dremio Enterprise Catalog

To interact with a Dremio Enterprise Catalog using Spark SQL, you need a Dremio Kubernetes cluster running version 26 or later with the Dremio Enterprise Catalog feature activated.

As a prerequisite, it is assumed that you have already generated a Dremio Personal Access Token (PAT) for the user account that Spark will use to connect to the Dremio cluster.

  1. Make sure you have saved your PAT in the DREMIO_PAT variable.
  2. Store the Dremio cluster IP address in the DREMIO_ADDRESS variable.
  3. Then run the following command to start the Spark SQL shell:
export DREMIO_PAT=… ①
export DREMIO_ADDRESS=…
export DREMIO_PAT_TYPE=urn:ietf:params:oauth:token-type:dremio:personal-access-token

spark-sql \
  --jars authmgr-oauth2-runtime.jar \
  --packages org.apache.iceberg:iceberg-spark-runtime-3.5_2.12:1.9.0 \
  --conf spark.sql.extensions=org.apache.iceberg.spark.extensions.IcebergSparkSessionExtensions \
  --conf spark.sql.catalog.dremio=org.apache.iceberg.spark.SparkCatalog \
  --conf spark.sql.catalog.dremio.type=rest \
  --conf spark.sql.catalog.dremio.warehouse=default ② \
  --conf spark.sql.catalog.dremio.uri=http://$DREMIO_ADDRESS:8181/api/catalog ③ \
  --conf spark.sql.catalog.dremio.rest.auth.type=com.dremio.iceberg.authmgr.oauth2.OAuth2Manager \
  --conf spark.sql.catalog.dremio.rest.auth.oauth2.token-endpoint=http://$DREMIO_ADDRESS:9047/oauth/token ④ \
  --conf spark.sql.catalog.dremio.rest.auth.oauth2.grant-type=token_exchange ⑤ \
  --conf spark.sql.catalog.dremio.rest.auth.oauth2.client-id=dremio-catalog-cli ⑥ \
  --conf spark.sql.catalog.dremio.rest.auth.oauth2.scope=dremio.all ⑦ \
  --conf spark.sql.catalog.dremio.rest.auth.oauth2.token-exchange.subject-token="$DREMIO_PAT" ⑧ \
  --conf spark.sql.catalog.dremio.rest.auth.oauth2.token-exchange.subject-token-type="$DREMIO_PAT_TYPE"
  ① Provide your Dremio PAT and Dremio cluster IP address
  ② The warehouse must be default
  ③ The address of the Dremio Enterprise Catalog service
  ④ The address of the Dremio cluster token endpoint
  ⑤ Enable the Token Exchange grant
  ⑥ Provide the client ID (no client secret required)
  ⑦ The scope must be dremio.all
  ⑧ Configure the subject token to use your Dremio PAT

In this demonstration, the Dremio cluster acts as the identity provider. The client exchanges the user’s Personal Access Token (PAT) for an access token issued by Dremio, then uses it to authenticate with Dremio Enterprise Catalog. This process illustrates one of the many ways to use the “Token Exchange” grant type.
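The exchange itself can be pictured as a single form-encoded POST to the token endpoint. The sketch below assembles such a request using the standard parameter names from RFC 8693 (OAuth 2.0 Token Exchange) and the values from the Spark configuration above. It is an approximation for illustration only: the helper names token_exchange_params and exchange_pat are hypothetical, and the request Dremio Auth Manager actually sends may carry additional parameters.

```shell
# Print the form parameters of an RFC 8693 token exchange request, one per
# line. $1 is the subject token (the PAT), $2 is the subject token type.
token_exchange_params() {
  printf '%s\n' \
    "grant_type=urn:ietf:params:oauth:grant-type:token-exchange" \
    "client_id=dremio-catalog-cli" \
    "scope=dremio.all" \
    "subject_token=$1" \
    "subject_token_type=$2"
}

# Hypothetical helper: POST the parameters to the Dremio token endpoint.
# Expects DREMIO_ADDRESS, DREMIO_PAT and DREMIO_PAT_TYPE to be exported.
exchange_pat() {
  set -- -s "http://$DREMIO_ADDRESS:9047/oauth/token"
  while IFS= read -r param; do
    # URL-encode each value and add it to the request body.
    set -- "$@" --data-urlencode "$param"
  done <<EOF
$(token_exchange_params "$DREMIO_PAT" "$DREMIO_PAT_TYPE")
EOF
  curl "$@"
}

# Usage, against a reachable Dremio cluster:
#   exchange_pat
```

Keeping the parameter list in its own function makes it easy to inspect what would be sent without contacting the cluster.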

Conclusion

The preceding examples offer only a brief overview of Dremio Auth Manager’s capabilities; the docs describe many more features.

And more features are coming, such as client JWT assertions. Stay updated on the GitHub repository for the latest additions and feel free to submit issues for feedback, feature suggestions, or bug reports!
