10 minute read · April 8, 2025
Credential Vending with Iceberg REST Catalogs in Dremio

· Lead Engineer

· Engineer

This blog covers features introduced in Dremio v26.
In modern data lakehouse architectures, one key innovation is credential vending for catalogs. It gives end users a simple, fast, and efficient way to access data while extending governance all the way down to the storage layer. This is especially important in banking, manufacturing, government, and other highly regulated sectors, where maintaining security from the application layer to the storage layer is critical. In this post, we discuss how Dremio supports credential vending for Iceberg tables and dive into the engineering challenges of making it work.
TL;DR
Dremio now supports credential vending for Iceberg REST catalogs, enabling secure, temporary access to cloud storage without managing long-term credentials. Instead of static access keys, Dremio requests and receives short-lived, table-scoped credentials from catalogs like Snowflake Open Catalog, Databricks Unity Catalog, and Apache Polaris.
This approach enhances security and access control, reducing the risk of credential leaks while simplifying integration with Iceberg tables. However, implementing it required major engineering changes, including:
- Transitioning from file-system-level to table-level credentials
- Adapting caching to handle expiring credentials
- Modifying file access logic to trust catalog metadata instead of listing storage locations
By embracing credential vending, Dremio ensures just-in-time, least-privilege access to Iceberg data, aligning with modern zero-trust security principles while making multi-platform data querying seamless.
What is Credential Vending?
Credential vending is a security mechanism where the catalog service issues short-lived, scoped credentials to the query engine for accessing table storage. Instead of requiring long-term cloud storage keys or broad access, the engine is “vended” a temporary credential just-in-time for the query. For example, Snowflake’s Open Catalog can hand out a temporary storage credential that lets Dremio read an Iceberg table’s files in S3 or Azure without managing cloud keys directly. Similarly, Unity Catalog vends short-lived credentials via its REST API, inheriting the privileges of the authorized Databricks principal, to grant Dremio access to the data. In short, credential vending centralizes access control in the catalog – if enabled, you no longer need to separately configure cloud storage credentials for each engine.
Credential Vending Process Overview
When Dremio connects to an Iceberg REST catalog that supports credential vending, the high-level process involves a sequence of interactions between Dremio, the catalog, and cloud object storage.
- Catalog Authentication: Dremio authenticates with the Iceberg REST catalog using a service credential (for example, a client ID/secret for Polaris or a personal access token for Unity). This establishes Dremio’s identity and permissions in the catalog.
- Metadata & Credential Request: When a query needs to access a table, Dremio sends a request to the catalog’s REST API (e.g. a “Get Table” call). The catalog checks that Dremio’s principal is authorized for that table and prepares the response.
- Catalog Response with Vended Credentials: The catalog returns the table’s metadata (schema, snapshot pointers, etc.) along with a temporary credential for the table’s storage location. This credential might be an AWS STS token, an Azure SAS token, or similar, scoped to the specific table’s data files. It typically has a short expiration time and minimal permissions (only read or write on that table’s file path).
- Data Access using Vended Credential: Dremio uses the vended cloud storage credential to directly read from (or write to) the table’s files in the object store. For example, Dremio can read Iceberg data and manifest files from S3 using the temporary IAM role credentials supplied by the catalog. Because the credential is scoped to the table’s directory, it cannot access other data. Once the query is done (or the credential expires), access is automatically revoked.

Figure: Sequence diagram of the credential vending process
This approach ensures just-in-time, least-privilege access. The query engine doesn’t hold long-term access to data; instead it gets what it needs on the fly. If credential vending is not enabled, one would have to manually configure static cloud credentials for Dremio – an approach that is less secure and more cumbersome.
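The request and response steps above can be sketched in a few lines of Python. This is an illustration only, not Dremio's implementation: it assumes the `X-Iceberg-Access-Delegation: vended-credentials` header and the load-table path from the Iceberg REST catalog specification, a single-level namespace, and S3-style property names (`s3.access-key-id` and friends) in the response's `config` map. The function names, the example URL, and the sample values are hypothetical, and a given catalog may return credentials under different keys.

```python
from urllib.parse import quote

# Header from the Iceberg REST catalog spec that asks the catalog to vend credentials.
ACCESS_DELEGATION_HEADER = "X-Iceberg-Access-Delegation"

# Assumed S3-style property names in the load-table response's "config" map.
S3_CREDENTIAL_KEYS = ("s3.access-key-id", "s3.secret-access-key", "s3.session-token")


def build_load_table_request(catalog_url, prefix, namespace, table, bearer_token):
    """Describe (but do not send) a load-table request that asks for vended credentials.

    Assumes a single-level namespace; the helper itself is hypothetical.
    """
    url = "{}/v1/{}/namespaces/{}/tables/{}".format(
        catalog_url.rstrip("/"),
        quote(prefix, safe=""),
        quote(namespace, safe=""),
        quote(table, safe=""),
    )
    headers = {
        "Authorization": f"Bearer {bearer_token}",
        ACCESS_DELEGATION_HEADER: "vended-credentials",
    }
    return {"method": "GET", "url": url, "headers": headers}


def extract_s3_credentials(load_table_response):
    """Pull the temporary S3 credential out of a parsed load-table response."""
    config = load_table_response.get("config", {})
    missing = [key for key in S3_CREDENTIAL_KEYS if key not in config]
    if missing:
        raise KeyError(f"catalog did not vend S3 credentials; missing {missing}")
    return {key: config[key] for key in S3_CREDENTIAL_KEYS}
```

Keeping request construction separate from response parsing makes each step easy to exercise without a live catalog. A real client would also read the credential's expiry and handle the other storage backends mentioned above.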
Engineering Challenges in Dremio’s Implementation
Enabling seamless credential vending in Dremio required significant engineering work. Dremio’s plugins for filesystems and catalogs had to be enhanced to handle table-scoped credentials, dynamic authentication, and new failure modes. Below we discuss key challenges and how they were addressed:
Transition from File-System-Level to Table-Level Credentials
Previous approach: Traditionally, Dremio data sources like S3 or ADLS were configured with file-system level credentials – e.g. an IAM role or access keys that allowed browsing an entire bucket or directory. The authentication was often broad, applying to all datasets under that source.
With credential vending: Dremio now receives credentials scoped to a specific Iceberg table (dataset). Each table may come with a different token or key, valid only for that table’s file path. This required re-architecting how Dremio manages credentials internally. Instead of one set of cloud credentials for an entire source, Dremio must apply per-dataset credentials when accessing files. For instance, if two Iceberg tables reside in the same bucket, each query uses the credential vended for that particular table’s path, not a general bucket credential. Dremio had to propagate these table-level credentials down to the filesystem layer for all operations on that table’s files.
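To make the per-table idea concrete, here is a minimal sketch of resolving a file path to the credential vended for its table. It is a toy model rather than Dremio's internal design: `TableCredential`, `TableCredentialRegistry`, and the bucket paths are all hypothetical, and the credential is treated as an opaque token.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TableCredential:
    table_location: str  # e.g. "s3://bucket/warehouse/orders"
    token: str           # opaque temporary credential vended by the catalog


class TableCredentialRegistry:
    """Maps table locations to the credential vended for each table (hypothetical)."""

    def __init__(self):
        self._by_location = {}

    def register(self, credential):
        # Normalize to a trailing slash so "orders/" never matches "orders_archive/".
        location = credential.table_location.rstrip("/") + "/"
        self._by_location[location] = credential

    def resolve(self, file_path):
        """Return the credential whose table location contains file_path."""
        matches = [loc for loc in self._by_location if file_path.startswith(loc)]
        if not matches:
            raise PermissionError(f"no vended credential covers {file_path}")
        # Prefer the most specific (longest) matching table location.
        return self._by_location[max(matches, key=len)]
```

With two tables registered under the same bucket, a file under `orders/` resolves to the orders credential and a file under `customers/` to the customers credential. A path outside every registered table is rejected instead of falling back to a bucket-wide key.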
Caching Strategies for Temporary Credentials
Dremio relies heavily on caching (of file metadata, connections, etc.) to improve performance, and it is designed to minimize round trips to the server by avoiding redundant requests. However, temporary credentials add an expiration factor that invalidates old assumptions. A vended credential might only be valid for, say, 1 hour. If Dremio cached a filesystem client or file handle with that credential, a query a few hours later could fail due to an expired token.
To address this, Dremio adapted its caching and filesystem layers to become credential-aware. Cache entries associated with vended credentials are tied to the credential’s lifespan. For example, Dremio still caches the dataset, but it must refresh the cached entry once the storage credential expires. In practice, Dremio’s job was to ensure that its own layers do not hold onto a credential longer than allowed. The result is a caching strategy that still yields fast performance but respects the ephemeral nature of vended credentials.
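One way to picture a credential-aware cache is an entry that carries its own expiry and is re-fetched shortly before that time. The sketch below is a simplified, single-threaded illustration under our own assumptions. The class name, the `fetch` callback signature, and the 60-second safety margin are hypothetical and do not describe Dremio's actual cache.

```python
import time


class ExpiringCredentialCache:
    """Caches one credential per table and refreshes it before it expires."""

    def __init__(self, fetch, clock=time.time, safety_margin_s=60):
        self._fetch = fetch            # callable: table_id -> (credential, expires_at_epoch_s)
        self._clock = clock            # injectable so expiry can be simulated
        self._margin = safety_margin_s
        self._entries = {}

    def get(self, table_id):
        entry = self._entries.get(table_id)
        # Refresh when there is no entry or the credential is within the margin of expiring.
        if entry is None or self._clock() >= entry[1] - self._margin:
            entry = self._fetch(table_id)
            self._entries[table_id] = entry
        return entry[0]
```

Refreshing slightly early, rather than exactly at expiry, leaves room for a read that starts just before the deadline to finish with a still-valid token.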
Limited Permissions and Trust-Based File Access
By design, vended credentials are down-scoped to minimum privileges. Often, they permit only object reads/writes on specific paths and disallow broader actions like listing directories at higher levels. This posed a challenge: parts of Dremio’s filesystem code (or underlying libraries) might attempt to list a folder or check for file existence as a safety measure. With limited credentials, such operations can fail even if the file is present (because the credential might not have ListFolder or equivalent permission).
Dremio’s solution is a trust-based approach for file discovery. Since the Iceberg catalog already knows what files make up a table (through manifest lists and metadata), Dremio does not need to list storage locations on its own. It trusts the catalog’s metadata. For example, if the metadata says a data file s3://bucket/table/part-00001.parquet exists, Dremio will directly attempt to read that file using the credential, rather than trying to list the s3://bucket/table/ directory. This avoids unnecessary permission errors. Essentially, Dremio delegates file existence and listing logic to the catalog. This required auditing Dremio’s filesystem calls to remove or modify any that assumed full listing capabilities. The engineering trade-off is relying on the catalog’s accuracy, but given that Iceberg metadata is the source of truth for table contents, this is a reasonable design. It also aligns with the zero-trust principle – only do what the credential explicitly allows and nothing more.
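The difference between listing and trusting metadata can be shown with a small in-memory stand-in for object storage. This is a sketch under stated assumptions, not Dremio's filesystem code: `ScopedObjectStore` and `read_table_files` are hypothetical names, and the store simply refuses every list call to mimic a down-scoped credential.

```python
class ScopedObjectStore:
    """In-memory stand-in for object storage reached with a down-scoped credential."""

    def __init__(self, objects):
        self._objects = dict(objects)

    def list_prefix(self, prefix):
        # A vended credential often lacks list permission, even on the table's own path.
        raise PermissionError("credential does not permit listing")

    def get_object(self, path):
        return self._objects[path]


def read_table_files(store, manifest_paths):
    """Read exactly the files named by catalog metadata, with no listing or existence probes."""
    return {path: store.get_object(path) for path in manifest_paths}
```

Because `read_table_files` never calls `list_prefix`, it works with the restricted credential. Any stray object under the table's directory that the metadata does not reference is never touched.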
Conclusion
Credential vending support in Dremio opens up a more secure and convenient way to query external Iceberg catalogs. By obtaining temporary, table-scoped credentials on the fly, Dremio minimizes long-lived secrets and ensures access is tightly controlled by the catalog’s policies. Implementing this capability required rethinking credential management from the ground up – shifting to dataset-level auth, adjusting caching, trusting catalog metadata, and integrating with new token-based auth flows. The payoff is a smoother integration with emerging Iceberg catalog services like Open Catalog, Unity Catalog, and Apache Polaris, empowering Dremio users to easily query data across platforms without sacrificing security or governance. With these engineering enhancements, Dremio continues to embrace the open data ecosystem, allowing data teams to leverage the best of multiple platforms in a unified way.