Browse All Blog Articles
Dremio Blog: Open Data Insights
Looking back the last year in Lakehouse OSS: Advances in Apache Arrow, Iceberg & Polaris (incubating)
The direction is clear: the open lakehouse is no longer about choosing between flexibility and performance, or between innovation and governance. With Arrow, Iceberg, and Polaris maturing side by side, and with Dremio leading the charge, the open lakehouse has become a complete, standards-driven foundation for modern analytics. For enterprises seeking both freedom and power, this is the moment to embrace it.
Dremio Blog: Open Data Insights
Scaling Data Lakes: Moving from Raw Parquet to Iceberg Lakehouses
Apache Iceberg closed that gap by transforming collections of Parquet files into true tables, complete with ACID transactions, schema flexibility, and time travel capabilities. And with Apache Polaris sitting on top as a catalog, organizations finally have a way to manage all those Iceberg tables consistently, delivering centralized access, discovery, and governance across every tool in the stack.
Dremio Blog: Various Insights
Partition Bucketing – Improving query performance when filtering on a high-cardinality column
Dremio can automatically take advantage of partitioning on Parquet data sets (or derivatives such as Iceberg or Delta Lake). By understanding the dataset’s partitioning, Dremio can perform partition pruning, the process of excluding irrelevant partitions of data during the query optimisation phase, to boost query performance. (See Data Partition Pruning.) Partition bucketing provides a […]
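The bucketing idea behind this technique can be sketched in a few lines of Python. This is an illustration only: Iceberg's actual bucket transform is defined over a Murmur3 hash, while this sketch substitutes SHA-256 from the standard library, and the column values and bucket count are made up.

```python
import hashlib

def bucket_of(value: str, num_buckets: int = 16) -> int:
    """Map a high-cardinality value to one of num_buckets partitions.

    Illustrative stand-in: Iceberg's bucket transform uses a Murmur3
    hash; here we use SHA-256 so the sketch needs only the stdlib.
    """
    digest = hashlib.sha256(value.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big") % num_buckets

# Hypothetical high-cardinality column: one million distinct customer ids
# would be unworkable as a plain partition key, but they fold neatly
# into 16 buckets.
customer_ids = [f"C-{i:05d}" for i in range(1000)]
files_by_bucket: dict[int, list[str]] = {}
for cid in customer_ids:
    files_by_bucket.setdefault(bucket_of(cid), []).append(cid)

# A query filtering on customer_id = 'C-00042' only needs to scan the
# single bucket that value hashes to; the other buckets are pruned.
needed = bucket_of("C-00042")
print(f"scan bucket {needed} of {len(files_by_bucket)}")
```

Because the bucket is a pure function of the filter value, the engine can compute it at planning time and skip every other bucket's files, which is exactly the pruning behaviour the article describes.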
Dremio Blog: Various Insights
The Growing Apache Polaris Ecosystem (The Growing Apache Iceberg Catalog Standard)
What makes Polaris especially exciting is the trajectory it’s on. Today, it is a powerful, open catalog for Iceberg tables. Tomorrow, it could serve as the central control plane for managing a full range of lakehouse assets, unifying governance, access, and interoperability across an increasingly complex data ecosystem.
Engineering Blog
Column Nullability Constraints in Dremio
Column nullability serves as a safeguard for reliable data systems. Apache Iceberg's capabilities in enforcing and evolving nullability rules are crucial for ensuring data quality. Understanding how nulls behave, along with the specifics of engine support, is essential for constructing dependable data systems.
Product Insights from the Dremio Blog
Dremio Reflections – The Journey to Autonomous Query Acceleration
Reflections are a query acceleration capability unique to Dremio that works by minimising data processing times and reducing computational workloads. Debuting in the early days of Dremio, Reflections accelerate data lake queries by creating optimised Apache Iceberg data structures from file-based datasets, delivering orders-of-magnitude performance improvements. However, the game-changing aspect of this technology was not […]
Dremio Blog: Various Insights
Optimizing Apache Iceberg Tables – Manual and Automatic
When combined with Dremio’s query acceleration, unified semantic layer, and zero-ETL data federation, Enterprise Catalog creates a truly self-managing data platform: one where optimization is just something that happens, not something you have to think about.
Dremio Blog: Various Insights
Optimizing Apache Iceberg for Agentic AI
By using Dremio as the data gateway, organizations improve security, reduce complexity, and give their agents the reliable, performant access they need, without reinventing the data stack. This frees developers to focus less on credentials, connectors, and workarounds, and more on building the intelligent workflows that drive business impact.
Product Insights from the Dremio Blog
Realising the Self-Service Dream with Dremio & MCP
A promise of self-service data platforms, such as the Data Lakehouse, is to democratise data. The idea is that they empower business users (BUs), those with little or no technical expertise, to access, prep, and analyse data for themselves. With the right platform and tools your subject matter experts can take work away from your […]
Product Insights from the Dremio Blog
5 Ways Dremio Makes Apache Iceberg Lakehouses Easy
Dremio simplifies all of it. By bringing together query federation, an integrated Iceberg catalog, a built-in semantic layer, autonomous performance tuning, and flexible deployment options, Dremio makes it easier to build and run a lakehouse without stitching together multiple tools.
Product Insights from the Dremio Blog
Who Benefits From MCP on an Analytics Platform?
The MCP Server is a powerful alternative to the command line or UI for interacting with Dremio. But are data analysts the only ones who can benefit from this transformative technology?
Dremio Blog: Open Data Insights
Celebrating the Release of Apache Polaris (Incubating) 1.0
With the release of Apache Polaris 1.0, the data ecosystem takes a meaningful step forward in establishing a truly open, interoperable, and production-ready metadata catalog for Apache Iceberg. Polaris brings together the reliability enterprises expect with the openness developers and data teams need to innovate freely.
Dremio Blog: Open Data Insights
Quick Start with Apache Iceberg and Apache Polaris on your Laptop (quick setup notebook environment)
By following the steps in this guide, you now have a fully functional Iceberg and Polaris environment running locally. You have seen how to spin up the services, initialize the catalog, configure Spark, and work with Iceberg tables. Most importantly, you have set up a pattern that closely mirrors what modern data platforms are doing in production today.
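The "configure Spark" step in a setup like this typically amounts to pointing an Iceberg Spark catalog at Polaris's REST endpoint. The fragment below is a sketch, not the guide's exact configuration: the catalog name `polaris`, the warehouse name `quickstart_catalog`, the client credentials, and the localhost port are all assumed placeholders.

```
# Illustrative spark-defaults.conf fragment (names and endpoint are assumptions)
spark.sql.extensions                 org.apache.iceberg.spark.extensions.IcebergSparkSessionExtensions
spark.sql.catalog.polaris            org.apache.iceberg.spark.SparkCatalog
spark.sql.catalog.polaris.type       rest
spark.sql.catalog.polaris.uri        http://localhost:8181/api/catalog
spark.sql.catalog.polaris.warehouse  quickstart_catalog
spark.sql.catalog.polaris.credential <client-id>:<client-secret>
```

With a configuration along these lines, `CREATE TABLE polaris.db.t (...)` in Spark SQL would create an Iceberg table whose metadata is tracked by the Polaris catalog.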
Engineering Blog
Query Results Caching on Iceberg Tables
Seamless result cache for Iceberg was enabled for all Dremio Cloud organizations in May 2025. Since then, our telemetry shows that between 10% and 50% of a single project’s queries have been accelerated by the result cache. That’s a huge cost saving on executors. Looking forward, Dremio is researching how to bring its Reflection matching and query-rewrite capabilities to the result cache. For example, once a user generates a result cache entry, it should be possible to trim, filter, sort, and roll up from that entry. Limiting the search space and efficient matching through hashes will be key to making matching on the result cache possible. Stay tuned for more!
Product Insights from the Dremio Blog
Test Driving MCP: Is Your Data Pipeline Ready to Talk?
Back in April of this year, Dremio debuted its own MCP server, giving the LLM of your choice intelligent access to Dremio’s powerful lakehouse platform. With the Dremio MCP Server, the LLM knows how to interact with Dremio: the server facilitates authentication, executes requests against the Dremio environment, and returns results to the LLM. The intention is […]