8 minute read · January 21, 2022

Announcing the Dremio 20.0 Release

Jason Hughes · Director of Technical Advocacy, Dremio

Today, we’re excited to announce the Dremio 20.0 release!

This month’s release introduces major enhancements across the platform, including extensive audit logging, support for Apache Iceberg tables that use the AWS Glue catalog, Apache Ranger row filter and column masking support, and SSO support in Power BI for Azure AD.

Extensive Audit Logging

Every organization and administrator needs to have visibility into who is doing what and when. This is even more important to organizations subject to compliance and regulation where auditing is regularly required.

Dremio has long let organizations see who queried what and when. With the 20.0 release, organizations get a fuller view of the actions taken in the system, showing exactly who did what and when in Dremio. This makes it possible to answer questions such as “who changed the definition of this reflection?” and “when did this virtual dataset change?”

Each time a user performs an altering action within Dremio, such as creating an object or running a query, the audit log captures the user’s ID and username, the object(s) affected, the action performed and event type, the SQL statements used, and more. Events are logged in a standard JSON structure, so the audit logs can be analyzed in Dremio using SQL or with a dashboarding tool.

Here’s an example of an event where a reflection was deleted:

{
  "timestamp": "2022-01-15 21:02:08,749",
  "userContext": {
    "userId": "41ee28c5-2f0a-4d7d-a46d-c92c0a71e977",
    "userName": "jason"
  },
  "status": "OK",
  "eventType": "REFLECTION",
  "action": "DELETE",
  "details": {
    "reflectionId": "cb7632a8-79df-46eb-ab9b-a0c2eb77dfa4",
    "name": "Aggregation Reflection",
    "dataset": "60c85be9-59dc-4aea-a4fb-759306915fb2",
    "type": "AGGREGATION",
    "sortColumns": [],
    "partitionColumns": [],
    "distributionColumns": [],
    "dimensions": [
      {
        "name": "pickup_datetime",
        "granularity": "DATE"
      }
    ],
    "measures": [
      {
        "name": "total_amount",
        "measureType": [
          "COUNT",
          "SUM"
        ]
      },
      {
        "name": "tip_amount",
        "measureType": [
          "COUNT",
          "SUM"
        ]
      }
    ],
    "displayColumns": [],
    "partitiondistributionstrategy": "CONSOLIDATED",
    "arrowCachingEnabled": false,
    "targetDataset": "",
    "state": "DELETED"
  }
}
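Because each event is plain JSON, it can also be processed outside Dremio. The following is a minimal Python sketch, not an official Dremio utility, that flattens an event like the one above into a who/what/when record. The field names come from the sample event; the helper names (`summarize_event`, `find_events`) and the idea of passing each event in as a separate JSON string are assumptions made for illustration.

```python
import json
from datetime import datetime

# Abbreviated copy of the reflection-delete event shown above.
SAMPLE_EVENT = """
{
  "timestamp": "2022-01-15 21:02:08,749",
  "userContext": {
    "userId": "41ee28c5-2f0a-4d7d-a46d-c92c0a71e977",
    "userName": "jason"
  },
  "status": "OK",
  "eventType": "REFLECTION",
  "action": "DELETE",
  "details": {
    "reflectionId": "cb7632a8-79df-46eb-ab9b-a0c2eb77dfa4",
    "name": "Aggregation Reflection"
  }
}
"""


def summarize_event(raw_event):
    """Flatten one audit event (a JSON string) into a who/what/when record."""
    event = json.loads(raw_event)
    return {
        # The sample timestamp uses a comma before the milliseconds.
        "when": datetime.strptime(event["timestamp"], "%Y-%m-%d %H:%M:%S,%f"),
        "who": event["userContext"]["userName"],
        "event_type": event["eventType"],
        "action": event["action"],
        # Not every event is assumed to carry a name, so fall back to None.
        "object_name": event.get("details", {}).get("name"),
    }


def find_events(raw_events, event_type, action):
    """Return summaries of the events matching an event type and action.

    Hypothetical helper: raw_events is any iterable of JSON strings,
    one event per string.
    """
    summaries = (summarize_event(raw) for raw in raw_events)
    return [
        s for s in summaries
        if s["event_type"] == event_type and s["action"] == action
    ]


if __name__ == "__main__":
    for hit in find_events([SAMPLE_EVENT], "REFLECTION", "DELETE"):
        print(hit["when"], hit["who"], hit["action"], hit["object_name"])
```

A filter like this is one way to answer “who deleted this reflection, and when?” from exported events; the same question can be asked in SQL once the logs are queryable in Dremio.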

Audit logging is an enterprise edition feature and is enabled by default when upgrading to 20.0+.

For more information, see the documentation page on audit logging.

Support for Apache Iceberg tables that leverage AWS Glue as catalog (preview)

Apache Iceberg is an open source table format being rapidly adopted for analytic datasets. Iceberg brings data warehouse-like capabilities and more, such as transactions and partition evolution, directly to the data lake for any engine to leverage.

The vast majority of an Iceberg table’s data is stored in the data lake. However, to provide consistency guarantees, a file path attribute must be kept in a data store that itself offers strong consistency guarantees, and most data lakes aren’t designed for that. The store that holds this attribute is called an Iceberg catalog. Hive Metastore and Project Nessie are two examples of Iceberg catalogs. One catalog that enables organizations to take advantage of the benefits of Iceberg without having to manage any infrastructure is AWS Glue.

With the 20.0 release, Dremio now has preview support for Iceberg tables that leverage AWS Glue as their catalog. Users in Dremio can leverage these Iceberg tables just like any other table in the configured AWS Glue source - just select the table in the Glue source, and query away. This capability is enabled by default.

Here’s a tutorial on getting started with Iceberg tables, AWS Glue, and Dremio.

If you’re interested in learning more about Iceberg or interacting with the Iceberg community, Subsurface LIVE Winter 2022, coming up on March 2-3, is a perfect opportunity. Register now to hear from the co-creator of Apache Iceberg and many others who develop it and leverage it at companies like Apple and Bloomberg.

Ranger Row Filter & Column Masking Support

Organizations often leverage Dremio’s Apache Ranger integration to manage table authorizations in a single central location, as well as define row filter and column masking policies in Dremio. 

Now with the 20.0 release, Dremio extends this table-level Ranger integration to also support row filter and column masking policies centrally defined in Apache Ranger. This support covers all of Ranger’s built-in column masking policies as well as custom policies.
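For readers new to these two policy types, here is a conceptual Python sketch of what they do to a result set. It does not use Ranger's or Dremio's APIs or policy syntax; the function names, the simplified "show last 4" mask, and the sample columns are all hypothetical and only illustrate the idea that a row filter removes rows while a column mask rewrites values.

```python
def mask_show_last_4(value):
    """Simplified mask: hide everything except the last four characters."""
    text = str(value)
    return "x" * max(len(text) - 4, 0) + text[-4:]


def apply_policies(rows, row_filter, column_masks):
    """Apply a row filter, then column masks, to a list of row dicts.

    row_filter:   function(row) -> bool, True if the user may see the row
    column_masks: dict mapping a column name to a masking function
    """
    visible = []
    for row in rows:
        if not row_filter(row):
            continue  # Row filter: the row is dropped entirely.
        masked = dict(row)
        for column, mask in column_masks.items():
            if column in masked:
                masked[column] = mask(masked[column])  # Column mask
        visible.append(masked)
    return visible


if __name__ == "__main__":
    # Hypothetical table contents, for illustration only.
    rows = [
        {"region": "EMEA", "card_number": "4111111111111111"},
        {"region": "APAC", "card_number": "5500000000000004"},
    ]
    result = apply_policies(
        rows,
        row_filter=lambda row: row["region"] == "EMEA",
        column_masks={"card_number": mask_show_last_4},
    )
    print(result)
```

In the actual integration, policies like these are defined in Apache Ranger and take effect when users query the affected tables through Dremio, so no client-side code of this kind is needed.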

For more information, see the documentation page on this capability.

SSO Support in Power BI for Azure AD

At Dremio, we strive to provide users with a delightful and frictionless BI experience, often partnering with companies like Microsoft to get there. As another step toward that goal, the Dremio 20.0 release provides users with a seamless SSO workflow when using Power BI against Dremio environments configured to use Azure Active Directory for authentication.

Simply connect Power BI to Dremio. Instead of asking you for a username and password, your browser automatically opens your organization’s Azure AD login page. Log in just as you do for your organization’s other applications, and you’re good to go: just start analyzing your data in Power BI.

This capability is supported starting with the December 2021 version of Power BI Desktop and Gateway, with support for Power BI Online currently targeted for February.

Learn More

For a complete list of new features, enhancements, changes and fixes, please review the release notes. As always, we look forward to your feedback. Please post any questions or comments on our community site.

P.S. – We recently opened registration for Subsurface LIVE Winter 2022 where you can hear speakers such as Bill Inmon, the father of data warehousing, speakers from companies like Apple, Uber, and Fannie Mae, and many more speakers to be announced. Register now — we’d love to see you there!
