Dremio Blog: Open Data Insights
What’s New in Apache Iceberg 1.10.0, and what comes next!
Apache Iceberg 1.10.0 represents a turning point in the evolution of the open lakehouse. With the general availability of format-version 3, Iceberg now offers a more complete solution for organizations seeking the flexibility of data lakes combined with the reliability of data warehouses. Features like binary deletion vectors, default column values, and row-level lineage aren’t just incremental improvements; they redefine what’s possible in managing massive, ever-changing datasets.
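As a rough illustration, here is a minimal PySpark sketch of creating a table on format-version 3 with a default column value, and of upgrading an existing table in place. The catalog and table names are assumptions, and the DEFAULT clause requires a Spark/Iceberg combination that supports column defaults, so treat this as a sketch rather than version-exact DDL.

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

# Create a new table on format-version 3 to opt into the v3 features.
# The DEFAULT clause illustrates v3 default column values.
spark.sql("""
    CREATE TABLE lakehouse.db.events (
        id BIGINT,
        region STRING DEFAULT 'unknown'
    )
    USING iceberg
    TBLPROPERTIES ('format-version' = '3')
""")

# Existing tables can be upgraded in place via a table property.
spark.sql(
    "ALTER TABLE lakehouse.db.other_events SET TBLPROPERTIES ('format-version' = '3')"
)
```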
Looking back on the last year in Lakehouse OSS: Advances in Apache Arrow, Iceberg & Polaris (incubating)
The direction is clear: the open lakehouse is no longer about choosing between flexibility and performance, or between innovation and governance. With Arrow, Iceberg, and Polaris maturing side by side, and with Dremio leading the charge, the open lakehouse has become a complete, standards-driven foundation for modern analytics. For enterprises seeking both freedom and power, this is the moment to embrace it.
Scaling Data Lakes: Moving from Raw Parquet to Iceberg Lakehouses
Apache Iceberg closed that gap by transforming collections of Parquet files into true tables, complete with ACID transactions, schema flexibility, and time travel capabilities. And with Apache Polaris sitting on top as a catalog, organizations finally have a way to manage all those Iceberg tables consistently, delivering centralized access, discovery, and governance across every tool in the stack.
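For a flavor of what that unlocks, here is a small Spark SQL snippet showing Iceberg time travel; the catalog, table, snapshot ID, and timestamp are placeholders for your own environment.

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

# Read the table as of a specific snapshot or point in time.
spark.sql("SELECT * FROM lakehouse.db.events VERSION AS OF 123456789").show()
spark.sql(
    "SELECT * FROM lakehouse.db.events TIMESTAMP AS OF '2025-01-01 00:00:00'"
).show()
```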
Apache Polaris Releases Version 1.1.0 (Better Federation, MinIO Support, and more)
Support for Hive Metastore federation and modular catalog integration makes Polaris more adaptable to real-world data environments. Enhancements to external authentication and Helm-based deployment reduce friction for teams operating in secure, regulated environments. And with expanded support for S3-compatible storage, the catalog can now accompany your lakehouse architecture into hybrid and edge deployments without compromise.
Celebrating the Release of Apache Polaris (Incubating) 1.0
With the release of Apache Polaris 1.0, the data ecosystem takes a meaningful step forward in establishing a truly open, interoperable, and production-ready metadata catalog for Apache Iceberg. Polaris brings together the reliability enterprises expect with the openness developers and data teams need to innovate freely.
Quick Start with Apache Iceberg and Apache Polaris on your Laptop (quick setup notebook environment)
By following the steps in this guide, you now have a fully functional Iceberg and Polaris environment running locally. You have seen how to spin up the services, initialize the catalog, configure Spark, and work with Iceberg tables. Most importantly, you have set up a pattern that closely mirrors what modern data platforms are doing in production today.
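The heart of that setup is pointing Spark at Polaris's Iceberg REST endpoint. A minimal PySpark sketch follows, assuming a local Polaris on port 8181; the warehouse name and credential are placeholders from your own bootstrap step.

```python
from pyspark.sql import SparkSession

# Assumes the iceberg-spark-runtime package is on the classpath
# (e.g. via spark.jars.packages) and Polaris is running locally.
spark = (
    SparkSession.builder
    .config("spark.sql.catalog.polaris", "org.apache.iceberg.spark.SparkCatalog")
    .config("spark.sql.catalog.polaris.type", "rest")
    .config("spark.sql.catalog.polaris.uri", "http://localhost:8181/api/catalog")
    .config("spark.sql.catalog.polaris.warehouse", "my_catalog")
    .config("spark.sql.catalog.polaris.credential", "<client_id>:<client_secret>")
    .config("spark.sql.catalog.polaris.scope", "PRINCIPAL_ROLE:ALL")
    .getOrCreate()
)

spark.sql("CREATE NAMESPACE IF NOT EXISTS polaris.demo")
spark.sql("CREATE TABLE IF NOT EXISTS polaris.demo.t (id BIGINT) USING iceberg")
```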
Benchmarking Framework for the Apache Iceberg Catalog, Polaris
The Polaris benchmarking framework provides a robust mechanism to validate performance, scalability, and reliability of Polaris deployments. By simulating real-world workloads, it enables administrators to identify bottlenecks, verify configurations, and ensure compliance with service-level objectives (SLOs). The framework’s flexibility allows for the creation of arbitrarily complex datasets, making it an essential tool for both development and production environments.
Why Dremio co-created Apache Polaris, and where it’s headed
Polaris is a next-generation metadata catalog, born from real-world needs, designed for interoperability, and open-sourced from day one. It’s built for the lakehouse era, and it’s rapidly gaining momentum as the new standard for how data is managed in open, multi-engine environments.
Understanding the Value of Dremio as the Open and Intelligent Lakehouse Platform
With Dremio, you’re not locked into a specific vendor’s ecosystem. You’re not waiting on data engineering teams to build yet another pipeline. You’re not struggling with inconsistent definitions across departments. Instead, you’re empowering your teams to move fast, explore freely, and build confidently on a platform that was designed for interoperability from day one.
Extending Apache Iceberg: Best Practices for Storing and Discovering Custom Metadata
By using properties, Puffin files, and REST catalog APIs wisely, you can build richer, more introspective data systems. Whether you're developing an internal data quality pipeline or a multi-tenant ML feature store, Iceberg offers clean integration points that let metadata travel with the data.
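As one concrete pattern, table properties with a namespaced key prefix keep custom metadata attached to the table itself. The sketch below uses PyIceberg against a REST catalog; the catalog URI, table name, and "dq." key prefix are illustrative assumptions.

```python
from pyiceberg.catalog import load_catalog

# Connect to a REST catalog (credentials omitted; supply whatever your
# catalog requires) and load an existing table.
catalog = load_catalog("rest", uri="http://localhost:8181/api/catalog")
table = catalog.load_table("db.events")

# Namespacing keys (here "dq.") avoids collisions with engine-reserved
# properties and makes your metadata easy to discover later.
with table.transaction() as tx:
    tx.set_properties({"dq.last_check": "2025-01-01", "dq.status": "passed"})
```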
A Journey from AI to LLMs and MCP – 10 – Sampling and Prompts in MCP — Making Agent Workflows Smarter and Safer
When a server needs the LLM to generate a completion mid-workflow, that’s where Sampling comes in. And when you want to give the user, or the LLM itself, reusable, structured prompt templates for common workflows, that’s where Prompts come in. In this final post of the series, we explore how sampling allows servers to request completions from LLMs, how prompts enable reusable, guided AI interactions, best practices for both features, and real-world use cases that combine everything we’ve covered so far.
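To make prompts concrete, here is a minimal, hypothetical prompt definition using the MCP Python SDK's FastMCP helper; the server name and prompt body are illustrative, not taken from the post.

```python
from mcp.server.fastmcp import FastMCP

mcp = FastMCP("demo-server")

# A reusable, parameterized prompt template that clients can list
# and fill in for a common workflow.
@mcp.prompt()
def summarize_table(table_name: str) -> str:
    """Ask the LLM to summarize a table's schema and recent changes."""
    return f"Summarize the schema and recent changes of the table {table_name}."

if __name__ == "__main__":
    mcp.run()
```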
The Case for Apache Polaris as the Community Standard for Lakehouse Catalogs
The future of the lakehouse depends on collaboration. Apache Polaris embodies the principles of openness, vendor neutrality, and enterprise readiness that modern data platforms demand. By aligning around Polaris, the data community can reduce integration friction, encourage ecosystem growth, and give organizations the freedom to innovate without fear of vendor lock-in.
A Journey from AI to LLMs and MCP – 9 – Tools in MCP — Giving LLMs the Power to Act
Tools are executable functions that an LLM (or the user) can call via the MCP client. Unlike resources, which are passive data, tools are active operations.
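For a sense of the shape of a tool, here is a minimal, hypothetical example with the MCP Python SDK's FastMCP helper; the lookup is stubbed and the names are illustrative.

```python
from mcp.server.fastmcp import FastMCP

mcp = FastMCP("demo-server")

# The function signature becomes the tool's input schema; the client
# (and through it, the LLM) can invoke this as an active operation.
@mcp.tool()
def row_count(table_name: str) -> int:
    """Return the row count for a table (stubbed lookup)."""
    fake_counts = {"db.events": 1204}
    return fake_counts.get(table_name, 0)

if __name__ == "__main__":
    mcp.run()
```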
A Journey from AI to LLMs and MCP – 8 – Resources in MCP — Serving Relevant Data Securely to LLMs
One of MCP’s most powerful capabilities is its ability to expose resources to language models in a structured, secure, and controllable way.
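A resource is passive data addressed by URI. The sketch below, again using the MCP Python SDK's FastMCP helper, exposes a made-up table schema under an illustrative URI scheme.

```python
from mcp.server.fastmcp import FastMCP

mcp = FastMCP("demo-server")

# Resources are read-only data the client can attach as context;
# unlike tools, reading one has no side effects.
@mcp.resource("schema://db/events")
def events_schema() -> str:
    """Expose a table schema for the client to surface to the LLM."""
    return "id BIGINT, region STRING, ts TIMESTAMP"

if __name__ == "__main__":
    mcp.run()
```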
A Journey from AI to LLMs and MCP – 7 – Under the Hood — The Architecture of MCP and Its Core Components
By the end, you’ll understand how MCP enables secure, modular communication between LLMs and the systems they need to work with.