Dremio Revolutionizes Data Lakehouse Engine with Cutting-Edge Features

June 29, 2023

Dremio announced robust new features that enhance the performance and versatility of its data platform. These new capabilities empower organizations to accelerate their data analytics and enable faster, more efficient decision-making. Dremio is ensuring easy self-service analytics, with data warehouse functionality and data lake flexibility, across customer data.

Among the key features unveiled by Dremio are querying, performance, and compatibility enhancements that include:

  • Effortless Iceberg table optimization: Data teams no longer need to be concerned about how a table is physically stored on object storage, including file counts, file sizes, statistics, repartitioning and more. Dremio now offers the SQL commands OPTIMIZE, ROLLBACK, and VACUUM to improve performance and streamline data lake management. The OPTIMIZE command improves query performance by optimizing data layout and statistics, the ROLLBACK command lets users revert unintended changes made to their data, and the VACUUM command reclaims storage space by removing unnecessary data files.
  • 40% better data compression: Dremio now supports native Zstandard (zstd) compression, offering an improvement of up to 40% on compression ratios and decompression speeds. This feature enables users to optimize storage utilization and improve query performance, all while reducing operational costs.
  • Tabular UDFs: Tabular User-Defined Functions enable users to extend the native capabilities of Dremio SQL and provide a layer of abstraction to simplify query construction. This allows users to create functions that can serve as native row and column policies, empowering data analysts and engineers to easily build complex calculations, transformations and advanced analytics that unlock new possibilities for data-driven insights.
  • New mapping SQL functions: CARDINALITY returns the number of elements in a map or list and helps customers moving array workloads from Presto and Athena; ST_GEOHASH returns the corresponding geohash for the given latitude and longitude coordinates; FROM_GEOHASH returns the latitude and longitude coordinates of the center of the given geohash. Both geohash functions help customers move workloads from Snowflake, Amazon Redshift, Databricks, and Vertica. With geohashing, the longer the prefix two geohashes share, the smaller the cell that contains both points and so the closer together they are; the converse does not always hold, because nearby points on opposite sides of a cell boundary can share only a short prefix.
  • Enhanced Delta Lake support: Dremio now supports multiple Delta Lake catalogs including Hive Metastore and AWS Glue. This allows seamless integration with existing Delta Lake-based workflows, providing a unified data lake experience across the organization.
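To make the geohash behavior described in the list above concrete, here is a minimal Python sketch of the standard public geohash algorithm. It is not Dremio's implementation of ST_GEOHASH or FROM_GEOHASH; the function names geohash_encode, geohash_decode_center, and shared_prefix_len are hypothetical and exist only for illustration.

```python
# Illustrative sketch of the standard geohash algorithm.
# These helper names are hypothetical; they are not Dremio functions.

_BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz"  # geohash alphabet (no a, i, l, o)


def geohash_encode(lat, lon, precision=9):
    """Return the geohash of (lat, lon) with `precision` characters."""
    lat_lo, lat_hi = -90.0, 90.0
    lon_lo, lon_hi = -180.0, 180.0
    chars = []
    value = 0        # bits collected for the current character
    bit_count = 0
    use_lon = True   # bits alternate, starting with longitude
    while len(chars) < precision:
        if use_lon:
            mid = (lon_lo + lon_hi) / 2
            if lon >= mid:
                value = (value << 1) | 1
                lon_lo = mid
            else:
                value <<= 1
                lon_hi = mid
        else:
            mid = (lat_lo + lat_hi) / 2
            if lat >= mid:
                value = (value << 1) | 1
                lat_lo = mid
            else:
                value <<= 1
                lat_hi = mid
        use_lon = not use_lon
        bit_count += 1
        if bit_count == 5:  # every 5 bits become one base-32 character
            chars.append(_BASE32[value])
            value = 0
            bit_count = 0
    return "".join(chars)


def geohash_decode_center(geohash):
    """Return (lat, lon) of the center of the cell named by `geohash`."""
    lat_lo, lat_hi = -90.0, 90.0
    lon_lo, lon_hi = -180.0, 180.0
    use_lon = True
    for ch in geohash:
        idx = _BASE32.index(ch)
        for shift in range(4, -1, -1):
            bit = (idx >> shift) & 1
            if use_lon:
                mid = (lon_lo + lon_hi) / 2
                if bit:
                    lon_lo = mid
                else:
                    lon_hi = mid
            else:
                mid = (lat_lo + lat_hi) / 2
                if bit:
                    lat_lo = mid
                else:
                    lat_hi = mid
            use_lon = not use_lon
    return (lat_lo + lat_hi) / 2, (lon_lo + lon_hi) / 2


def shared_prefix_len(a, b):
    """Count the leading characters that two geohashes have in common."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n
```

For example, geohash_encode(57.64911, 10.40744, 5) gives "u4pru", and decoding that string returns the center of the roughly 0.044-degree cell containing the point. Two points inside the same cell share at least that five-character prefix, which is what makes prefix matching a cheap first-pass proximity filter.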

"With these new key features, Dremio continues to provide the most powerful and flexible data lakehouse engine on the market," said Tomer Shiran, founder and CPO at Dremio. "We are excited to empower our customers with capabilities that make lakehouses easier than ever, and allow companies to replace their expensive and proprietary cloud data warehouses with modern and open data architectures."

These key features further solidify Dremio's position as a leader in the data lakehouse engine space, enabling organizations to efficiently analyze, transform, and derive insights from their data at scale.
