Dremio Blog: Open Data Insights
The Release of Apache Polaris 1.3.0 (Incubating): Improvements to catalog federation, handling non-Apache Iceberg datasets and more
Apache Polaris 1.3.0 shows steady progress in how the project handles security, storage, events, and development workflow. The release tightens access control, expands format coverage, and offers clearer audit signals. It also brings faster builds and cleaner client tools. Each change supports a stable and flexible catalog that can serve many engines and many teams.
Ingesting Data into Apache Iceberg Using Python Tools with Dremio Catalog
In this blog, you will learn how to connect each Python tool to a REST catalog like Dremio Catalog, using bearer tokens and vended credentials to keep your pipelines secure and portable.
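As a rough illustration of that pattern, here is a minimal PyIceberg sketch; the endpoint, warehouse name, and token are placeholders, and the namespace and table names are hypothetical, not values from the post.

```python
import pyarrow as pa
from pyiceberg.catalog import load_catalog

# Connect to a REST catalog (e.g., Dremio Catalog / Apache Polaris) with a bearer token.
# The "header." property asks the catalog to vend temporary storage credentials.
catalog = load_catalog(
    "dremio",
    **{
        "type": "rest",
        "uri": "https://<catalog-host>/api/catalog",   # placeholder endpoint
        "warehouse": "<warehouse-name>",               # placeholder warehouse
        "token": "<bearer-token>",                     # placeholder bearer token
        "header.X-Iceberg-Access-Delegation": "vended-credentials",
    },
)

# Create a namespace and table, then append a small PyArrow batch.
catalog.create_namespace_if_not_exists("demo")
orders = catalog.create_table_if_not_exists(
    "demo.orders",
    schema=pa.schema([("id", pa.int64()), ("amount", pa.float64())]),
)
orders.append(pa.table({"id": [1, 2], "amount": [10.5, 20.0]}))
```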
Understanding Dremio Cloud MCP Servers and How to Use Them
You can move unstructured content into Iceberg tables with AI functions. You can use Dremio’s integrated AI agent for natural language exploration. You can connect external assistants through MCP to build multi-step workflows. All these pieces work together, giving data teams a clear path from raw information to AI-powered insights that stay accurate and trustworthy.
Data management for AI: Tools and best practices
AI data management is the practice of preparing, organizing, governing, and serving enterprise data so it can be used effectively by AI models and agents. It includes collecting data from multiple systems, maintaining high data quality, enforcing governance, and delivering fast, consistent access to that data for training and inference.
What is AI-ready data? Definition and architecture
AI-ready data is structured, governed, and accessible in a way that supports machine learning, large language models (LLMs), and real-time intelligent agents. Unlike traditional analytics data, it’s not just clean; it’s optimized for rapid, automated decision-making at scale. AI-ready data supports diverse formats, is accessible without ETL, and maintains the context required to train and operate intelligent systems.
What’s New in Apache Polaris 1.2.0: Fine-Grained Access, Event Persistence, and Better Federation
Apache Polaris 1.2.0 continues to make the case for a fully open, production-grade Iceberg catalog. These changes reflect real-world needs: better control, stronger security, broader compatibility, and early hooks for observability. As Iceberg adoption grows, Polaris is becoming the default choice for teams who want to avoid vendor lock-in while building modern lakehouse infrastructure. Whether you’re using Dremio Catalog or deploying Polaris yourself, this release brings features that support scale, safety, and flexibility.
Exploring the Evolving File Format Landscape in the AI Era: Parquet, Lance, Nimble, and Vortex, and What It Means for Apache Iceberg
Formats like Lance, Nimble, and Vortex show that innovation is accelerating. Each one introduces new ideas about how to structure, index, and access data. But none of them exist in isolation. They need to integrate with engines, catalogs, and governance layers to be truly useful.
Try Apache Polaris (incubating) on Your Laptop with MinIO
Running Polaris locally with MinIO gives you a hands-on view of how an open catalog governs Iceberg tables. But when it’s time to move beyond experimentation, deploying and managing a full Polaris environment can take more effort. That’s where Dremio’s integrated catalog comes in. Dremio’s catalog is built directly on Apache Polaris. It delivers all the same open governance, authentication, and interoperability while removing the setup work. You get a production-ready Polaris deployment with a complete user interface, automated security management, and fine-grained access controls, all without maintaining separate services.
What’s New in Apache Iceberg 1.10.0, and what comes next!
Apache Iceberg 1.10.0 represents a turning point in the evolution of the open lakehouse. With the general availability of format-version 3, Iceberg now offers a more complete solution for organizations seeking the flexibility of data lakes combined with the reliability of data warehouses. Features like binary deletion vectors, default column values, and row-level lineage aren’t just incremental improvements; they redefine what’s possible in managing massive, ever-changing datasets.
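For instance, opting a table into the new spec comes down to the format-version table property. A minimal PySpark sketch, assuming a SparkSession already wired to an Iceberg 1.10 catalog named "lakehouse" (hypothetical), with illustrative table names:

```python
# Assumes `spark` is a SparkSession configured with the Iceberg 1.10 Spark runtime
# and a catalog named "lakehouse" (hypothetical); table names are illustrative.
spark.sql("""
    CREATE TABLE IF NOT EXISTS lakehouse.demo.events (
        id     BIGINT,
        status STRING
    )
    USING iceberg
    TBLPROPERTIES ('format-version' = '3')
""")

# Existing tables can generally be upgraded in place by raising the same property.
spark.sql("""
    ALTER TABLE lakehouse.demo.orders
    SET TBLPROPERTIES ('format-version' = '3')
""")
```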
Looking back at the last year in Lakehouse OSS: Advances in Apache Arrow, Iceberg & Polaris (incubating)
The direction is clear: the open lakehouse is no longer about choosing between flexibility and performance, or between innovation and governance. With Arrow, Iceberg, and Polaris maturing side by side, and with Dremio leading the charge, the open lakehouse has become a complete, standards-driven foundation for modern analytics. For enterprises seeking both freedom and power, this is the moment to embrace it.
Scaling Data Lakes: Moving from Raw Parquet to Iceberg Lakehouses
Apache Iceberg closed that gap by transforming collections of Parquet files into true tables, complete with ACID transactions, schema flexibility, and time travel capabilities. And with Apache Polaris sitting on top as a catalog, organizations finally have a way to manage all those Iceberg tables consistently, delivering centralized access, discovery, and governance across every tool in the stack.
Apache Polaris Releases Version 1.1.0 (Better Federation, MinIO Support and more)
Support for Hive Metastore federation and modular catalog integration makes Polaris more adaptable to real-world data environments. Enhancements to external authentication and Helm-based deployment reduce friction for teams operating in secure, regulated environments. And with expanded support for S3-compatible storage, the catalog can now accompany your lakehouse architecture into hybrid and edge deployments without compromise.
Celebrating the Release of Apache Polaris (Incubating) 1.0
With the release of Apache Polaris 1.0, the data ecosystem takes a meaningful step forward in establishing a truly open, interoperable, and production-ready metadata catalog for Apache Iceberg. Polaris brings together the reliability enterprises expect with the openness developers and data teams need to innovate freely.
Quick Start with Apache Iceberg and Apache Polaris on your Laptop (quick setup notebook environment)
By following the steps in this guide, you now have a fully functional Iceberg and Polaris environment running locally. You have seen how to spin up the services, initialize the catalog, configure Spark, and work with Iceberg tables. Most importantly, you have set up a pattern that closely mirrors what modern data platforms are doing in production today.
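For reference, the Spark side of that local setup might look like the sketch below. This is a hedged example, not the guide’s exact configuration: the endpoint, warehouse name, credentials, and package version are assumptions based on a default local Polaris deployment.

```python
from pyspark.sql import SparkSession

# Minimal local sketch: Iceberg Spark runtime + a REST catalog pointed at a local
# Polaris instance. Credentials and the warehouse name are placeholders.
spark = (
    SparkSession.builder
    .appName("polaris-quickstart")
    .config("spark.jars.packages",
            "org.apache.iceberg:iceberg-spark-runtime-3.5_2.12:1.10.0")
    .config("spark.sql.extensions",
            "org.apache.iceberg.spark.extensions.IcebergSparkSessionExtensions")
    .config("spark.sql.catalog.polaris", "org.apache.iceberg.spark.SparkCatalog")
    .config("spark.sql.catalog.polaris.type", "rest")
    .config("spark.sql.catalog.polaris.uri", "http://localhost:8181/api/catalog")
    .config("spark.sql.catalog.polaris.warehouse", "<warehouse-name>")
    .config("spark.sql.catalog.polaris.credential", "<client-id>:<client-secret>")
    .config("spark.sql.catalog.polaris.scope", "PRINCIPAL_ROLE:ALL")
    .getOrCreate()
)

# Create a namespace and an Iceberg table through the Polaris catalog.
spark.sql("CREATE NAMESPACE IF NOT EXISTS polaris.demo")
spark.sql("""
    CREATE TABLE IF NOT EXISTS polaris.demo.events (id BIGINT, name STRING)
    USING iceberg
""")
```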
Benchmarking Framework for the Apache Iceberg Catalog, Polaris
The Polaris benchmarking framework provides a robust mechanism to validate performance, scalability, and reliability of Polaris deployments. By simulating real-world workloads, it enables administrators to identify bottlenecks, verify configurations, and ensure compliance with service-level objectives (SLOs). The framework’s flexibility allows for the creation of arbitrarily complex datasets, making it an essential tool for both development and production environments.