Announcing Dremio March 2021

Apr 8, 2021
Anita Thomas

Today we are excited to announce Dremio’s March 2021 release. This release adds several features, such as query plan caching and improved runtime filtering, that enhance the platform’s performance and security.

Caching of Query Plans

Dremio now caches physical query plans to deliver up to 2x improvement in query performance.

When a query is run for the first time in Dremio, it is compiled and a query plan is generated. Every query requires a query plan before it is executed. With this new feature, the physical plan for a query is stored in a query plan cache after it is run for the first time. As a result, when the same query is run again, Dremio does not need to create another query plan and instead uses the cached query plan. This improves performance, particularly for queries that need to be run multiple times and have relatively long planning times. For example, BI dashboard queries are often executed many times as the dashboard is accessed and refreshed by multiple users.

The chart below shows improvement in overall query times for select TPC-H queries with plan caching enabled. In this benchmark, reflections are used to accelerate the queries, resulting in high-performance, low-latency execution. In this scenario, the planning time is a non-negligible component of the overall query runtime. The result is an improvement of 40-50% with plan caching enabled.

The chart below further breaks down query times to show only the plan component of the query time. As illustrated, with caching enabled, plan times for all queries improve by 70-90%. This improvement in plan times results in the 40-50% improvement in overall query times shown in the previous chart.
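As a rough sanity check on how the two ranges relate, the arithmetic can be sketched as follows. This assumes that "improvement" means a reduction in elapsed time and that execution time is unchanged by plan caching. Pairing the extremes of the two ranges is only an illustration, and `implied_planning_share` is a hypothetical helper, not anything from the benchmark itself.

```python
def implied_planning_share(overall_gain, planning_gain):
    """Fraction of baseline runtime spent planning.

    Assumes execution time is unchanged, so:
        overall_gain = planning_share * planning_gain
    """
    return overall_gain / planning_gain


# Lowest overall gain paired with highest planning gain, and vice versa.
low = implied_planning_share(0.40, 0.90)
high = implied_planning_share(0.50, 0.70)
print(f"planning share of baseline runtime: {low:.0%} to {high:.0%}")
```

Under these assumptions, planning would account for somewhere between roughly two fifths and two thirds of the baseline query time. That fits the description of reflection-accelerated queries where planning is a significant part of the total.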

How long a query plan stays in the cache depends on how often the query is executed. Frequently used query plans remain in the cache longer than infrequently used ones. Cached plans are automatically invalidated when the underlying dataset changes, its metadata is updated, or its reflection is refreshed.
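The caching and invalidation behavior described above can be illustrated with a minimal sketch. This is not Dremio's implementation: it swaps in a simple least-recently-used (LRU) policy as a stand-in for usage-based retention, keys plans by raw SQL text, and uses hypothetical names (`PlanCache`, `get_or_plan`, `invalidate`, `fake_planner`) throughout.

```python
from collections import OrderedDict


class PlanCache:
    """Toy LRU cache of query plans keyed by SQL text (illustrative only)."""

    def __init__(self, capacity=2):
        self.capacity = capacity
        self._plans = OrderedDict()  # sql -> (plan, datasets the plan reads)

    def get_or_plan(self, sql, datasets, planner):
        if sql in self._plans:
            # Cache hit: mark as recently used so it survives longer.
            self._plans.move_to_end(sql)
            return self._plans[sql][0]
        plan = planner(sql)  # the expensive step the cache avoids
        self._plans[sql] = (plan, frozenset(datasets))
        if len(self._plans) > self.capacity:
            self._plans.popitem(last=False)  # evict the least recently used plan
        return plan

    def invalidate(self, dataset):
        """Drop every cached plan that reads the changed dataset."""
        stale = [sql for sql, (_, deps) in self._plans.items() if dataset in deps]
        for sql in stale:
            del self._plans[sql]


calls = []


def fake_planner(sql):
    # Stand-in for real query planning; records each invocation.
    calls.append(sql)
    return f"PLAN({sql})"


cache = PlanCache(capacity=2)
query = "SELECT * FROM orders"
cache.get_or_plan(query, {"orders"}, fake_planner)  # planned
cache.get_or_plan(query, {"orders"}, fake_planner)  # served from cache
cache.invalidate("orders")                          # dataset changed
cache.get_or_plan(query, {"orders"}, fake_planner)  # planned again
print(len(calls))
```

In this sketch the planner runs twice for three executions: once on first use and once after the `orders` dataset is invalidated.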

Caching of query plans is currently a preview feature. Please contact Dremio support to enable this feature for your deployment.

Improved Runtime Filtering

With this release, Dremio enhances the performance of queries with complex joins by extending runtime filtering to support additional operators.

Runtime filtering enables Dremio to dynamically derive filters from the smaller table in a join and apply them to the larger table, reducing the amount of data read from the larger table. Dremio applies these filters to joins automatically, without any user involvement, and provides up to 100x improved performance when working with traditional star or snowflake schemas.

This release extends the benefits of runtime filtering to additional operators such as renames, aggregates, outer joins and unions to offer enhanced performance across these more complex joins. Runtime filtering is now also supported when working with Delta Lake tables.
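The basic idea behind runtime filtering can be sketched in a few lines. This is a simplified, in-memory illustration under stated assumptions, not Dremio's engine: the filter is an exact set of join keys (a real engine might use a compact structure instead), and the function and table names are hypothetical.

```python
def join_with_runtime_filter(dim_rows, fact_rows, key):
    """Hash join that pre-filters the large side using keys from the small side."""
    # Build side: index the small (dimension) table by join key.
    build = {}
    for row in dim_rows:
        build.setdefault(row[key], []).append(row)

    # Runtime filter: the set of keys that can possibly produce a match.
    runtime_filter = set(build)

    # Probe side: drop non-matching rows of the large (fact) table early,
    # before they reach the join.
    scanned = [row for row in fact_rows if row[key] in runtime_filter]

    joined = [{**fact, **dim} for fact in scanned for dim in build[fact[key]]]
    return joined, len(scanned)


dims = [{"region_id": 1, "region": "EMEA"}]
facts = [
    {"region_id": 1, "amount": 10},
    {"region_id": 2, "amount": 20},
    {"region_id": 1, "amount": 5},
    {"region_id": 3, "amount": 7},
]
joined, scanned = join_with_runtime_filter(dims, facts, "region_id")
print(scanned, len(joined))
```

Here only two of the four fact rows survive the filter and reach the join. The benefit grows with the size of the larger table and the selectivity of the smaller one.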

Enhanced Security

The Dremio JDBC driver now supports automatic detection of the system trust store when SSL/TLS is enabled between BI clients and Dremio. The new useSystemTrustStore property allows the JDBC driver to automatically pick the right trust store based on the operating system: Keychain on macOS or Windows-ROOT on Windows. This capability helps simplify client configuration when TLS is enabled between BI tools and Dremio.
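As a hedged illustration, a connection string that sets this property might be assembled as shown below. Only the `useSystemTrustStore` property name comes from this announcement. The `jdbc:dremio:direct=` URL form, the `ssl=true` property, the default port, and the host name are assumptions that should be checked against the JDBC driver documentation. `dremio_jdbc_url` is a hypothetical helper.

```python
def dremio_jdbc_url(host, port=31010, use_system_trust_store=True):
    """Build a JDBC URL string (assumed format; verify against the driver docs)."""
    props = {"ssl": "true"}  # assumption: property that turns on encryption
    if use_system_trust_store:
        # Property named in this release; lets the driver use the OS trust store.
        props["useSystemTrustStore"] = "true"
    suffix = ";".join(f"{name}={value}" for name, value in props.items())
    return f"jdbc:dremio:direct={host}:{port};{suffix}"


url = dremio_jdbc_url("dremio.example.com")  # placeholder host
print(url)
```

The resulting string would be pasted into the BI tool's JDBC connection settings, avoiding a manually configured trust store path on each client.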

With this release, Dremio also provides the ability to clear cached authorizations for LDAP users. Dremio caches information for LDAP users who have successfully authenticated using Active Directory so that it does not need to perform an LDAP lookup for every query. However, when a user is deleted from Active Directory, the user’s information stays in the cache until Dremio performs a resync with LDAP. With this new capability, admins can immediately purge a user’s information from Dremio’s cache after the user has been deleted, ensuring the user can no longer log in or run queries in Dremio. See the Dremio API guide for more information on how to use the invalidate LDAP user cache API.

Conclusion

We are very excited about our March release and the performance and security benefits it offers. Please review the release notes for a complete list of new features, enhancements, changes, and fixes.

As always, we look forward to your feedback. Please post any questions on our community forum and we’ll do our best to answer them there, along with other members of the Dremio community.
