Dremio Blog: Open Data Insights
Quick Start with Apache Iceberg and Apache Polaris on your Laptop (quick setup notebook environment)
By following the steps in this guide, you now have a fully functional Iceberg and Polaris environment running locally. You have seen how to spin up the services, initialize the catalog, configure Spark, and work with Iceberg tables. Most importantly, you have set up a pattern that closely mirrors what modern data platforms are doing in production today.
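As a hedged illustration of the "configure Spark" step, here is a minimal PySpark session wired to a local Polaris instance over the Iceberg REST protocol. The endpoint, credentials, catalog name, and runtime version are placeholders from a typical quickstart; match them to your own setup.

```python
from pyspark.sql import SparkSession

# A minimal sketch of a Spark session backed by Polaris via the Iceberg REST
# catalog. The URI, credential, and warehouse values are placeholders; pick an
# iceberg-spark-runtime version that matches your Spark build.
spark = (
    SparkSession.builder.appName("polaris-quickstart")
    .config("spark.jars.packages",
            "org.apache.iceberg:iceberg-spark-runtime-3.5_2.12:1.6.1")
    .config("spark.sql.extensions",
            "org.apache.iceberg.spark.extensions.IcebergSparkSessionExtensions")
    .config("spark.sql.catalog.polaris", "org.apache.iceberg.spark.SparkCatalog")
    .config("spark.sql.catalog.polaris.type", "rest")
    .config("spark.sql.catalog.polaris.uri", "http://localhost:8181/api/catalog")
    .config("spark.sql.catalog.polaris.credential", "<client-id>:<client-secret>")
    .config("spark.sql.catalog.polaris.warehouse", "quickstart_catalog")
    .config("spark.sql.catalog.polaris.scope", "PRINCIPAL_ROLE:ALL")
    .getOrCreate()
)

# Smoke test: create a namespace and an Iceberg table through the catalog.
spark.sql("CREATE NAMESPACE IF NOT EXISTS polaris.demo")
spark.sql(
    "CREATE TABLE IF NOT EXISTS polaris.demo.events (id BIGINT, name STRING) "
    "USING iceberg"
)
```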
Benchmarking Framework for the Apache Iceberg Catalog, Polaris
The Polaris benchmarking framework provides a robust mechanism to validate performance, scalability, and reliability of Polaris deployments. By simulating real-world workloads, it enables administrators to identify bottlenecks, verify configurations, and ensure compliance with service-level objectives (SLOs). The framework’s flexibility allows for the creation of arbitrarily complex datasets, making it an essential tool for both development and production environments.
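This is not the benchmarking framework itself, but a minimal hand-rolled sketch of the kind of measurement it automates: timing repeated requests against a catalog's standard Iceberg REST routes. The endpoint, token, and catalog prefix below are assumptions.

```python
import statistics
import time

import requests  # third-party HTTP client, assumed installed

POLARIS_URI = "http://localhost:8181/api/catalog"  # placeholder endpoint
TOKEN = "<bearer-token>"  # obtained from your Polaris auth flow

def probe(path: str, n: int = 50) -> None:
    """Time n GET requests against one REST route and print p50/p95 latency."""
    latencies_ms = []
    for _ in range(n):
        start = time.perf_counter()
        resp = requests.get(POLARIS_URI + path,
                            headers={"Authorization": f"Bearer {TOKEN}"})
        resp.raise_for_status()
        latencies_ms.append((time.perf_counter() - start) * 1000)
    p50 = statistics.median(latencies_ms)
    p95 = statistics.quantiles(latencies_ms, n=20)[18]  # 95th percentile
    print(f"{path}: p50={p50:.1f} ms, p95={p95:.1f} ms")

# List namespaces via the standard Iceberg REST route; 'quickstart_catalog'
# is a placeholder for your catalog name.
probe("/v1/quickstart_catalog/namespaces")
```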
Why Dremio co-created Apache Polaris, and where it’s headed
Polaris is a next-generation metadata catalog, born from real-world needs, designed for interoperability, and open-sourced from day one. It’s built for the lakehouse era, and it’s rapidly gaining momentum as the new standard for how data is managed in open, multi-engine environments.
Understanding the Value of Dremio as the Open and Intelligent Lakehouse Platform
With Dremio, you’re not locked into a specific vendor’s ecosystem. You’re not waiting on data engineering teams to build yet another pipeline. You’re not struggling with inconsistent definitions across departments. Instead, you’re empowering your teams to move fast, explore freely, and build confidently, on a platform that was designed for interoperability from day one.
Extending Apache Iceberg: Best Practices for Storing and Discovering Custom Metadata
By using properties, Puffin files, and REST catalog APIs wisely, you can build richer, more introspective data systems. Whether you're developing an internal data quality pipeline or a multi-tenant ML feature store, Iceberg offers clean integration points that let metadata travel with the data.
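For the simplest of those integration points, table properties, a sketch like the following shows custom metadata traveling with the table. It assumes an Iceberg-enabled Spark session like the quick-start one above; the 'acme.' key prefix is a made-up namespacing convention.

```python
from pyspark.sql import SparkSession

# Assumes a session already configured with an Iceberg catalog named 'polaris'.
spark = SparkSession.builder.getOrCreate()

# Attach custom metadata as Iceberg table properties. Prefixing keys
# ('acme.' here is hypothetical) keeps them out of engine-reserved namespaces.
spark.sql("""
    ALTER TABLE polaris.demo.events SET TBLPROPERTIES (
        'acme.data-quality.last-validated' = '2024-01-15',
        'acme.owner' = 'analytics-team'
    )
""")

# The properties live in table metadata, so any engine reading the table
# through the catalog can discover them.
spark.sql("SHOW TBLPROPERTIES polaris.demo.events").show(truncate=False)
```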
A Journey from AI to LLMs and MCP – 10 – Sampling and Prompts in MCP — Making Agent Workflows Smarter and Safer
What if a server needs to ask the LLM for a completion partway through a workflow? That’s where Sampling comes in. And what if you want to give the user (or the LLM) reusable, structured prompt templates for common workflows? That’s where Prompts come in. In this final post of the series, we’ll explore:
- How sampling allows servers to request completions from LLMs
- How prompts enable reusable, guided AI interactions
- Best practices for both features
- Real-world use cases that combine everything we’ve covered so far
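As a taste of the prompts side, here is a minimal sketch using the official MCP Python SDK's FastMCP helper; the server name and prompt body are made up for illustration.

```python
from mcp.server.fastmcp import FastMCP

mcp = FastMCP("demo-server")  # hypothetical server name

@mcp.prompt()
def summarize_table(table_name: str) -> str:
    """A reusable prompt template that clients can list and invoke by name."""
    return (
        f"Summarize the schema of the table '{table_name}' and flag any "
        "columns that look unused."
    )

if __name__ == "__main__":
    mcp.run()  # serves the prompt over stdio by default
```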
The Case for Apache Polaris as the Community Standard for Lakehouse Catalogs
The future of the lakehouse depends on collaboration. Apache Polaris embodies the principles of openness, vendor neutrality, and enterprise readiness that modern data platforms demand. By aligning around Polaris, the data community can reduce integration friction, encourage ecosystem growth, and give organizations the freedom to innovate without fear of vendor lock-in.
A Journey from AI to LLMs and MCP – 9 – Tools in MCP — Giving LLMs the Power to Act
Tools are executable functions that an LLM (or the user) can call via the MCP client. Unlike resources — which are passive data — tools are active operations.
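A minimal sketch of registering such a function, again with the MCP Python SDK's FastMCP helper (the tool itself is a toy):

```python
from mcp.server.fastmcp import FastMCP

mcp = FastMCP("demo-server")  # hypothetical server name

@mcp.tool()
def add(a: int, b: int) -> int:
    """Add two numbers; the LLM can invoke this via the MCP client."""
    return a + b
```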
A Journey from AI to LLMs and MCP – 8 – Resources in MCP — Serving Relevant Data Securely to LLMs
One of MCP’s most powerful capabilities is its ability to expose resources to language models in a structured, secure, and controllable way.
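For example, a FastMCP server can expose read-only data behind a URI template; the 'dataset://' scheme and the schema string below are illustrative, not a real API.

```python
from mcp.server.fastmcp import FastMCP

mcp = FastMCP("demo-server")  # hypothetical server name

@mcp.resource("dataset://{name}/schema")  # made-up URI template
def dataset_schema(name: str) -> str:
    """Passive data the client can read and hand to the LLM as context."""
    return f"Schema for dataset '{name}': id BIGINT, name STRING"
```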
A Journey from AI to LLMs and MCP – 7 – Under the Hood — The Architecture of MCP and Its Core Components
By the end, you’ll understand how MCP enables secure, modular communication between LLMs and the systems they need to work with.
Journey from AI to LLMs and MCP – 6 – Enter the Model Context Protocol (MCP) — The Interoperability Layer for AI Agents
What if we had a standard that let any agent talk to any data source or tool, regardless of where it lives or what it’s built with? That’s exactly what the Model Context Protocol (MCP) brings to the table.
A Journey from AI to LLMs and MCP – 5 – AI Agent Frameworks — Benefits and Limitations
Enter agent frameworks — open-source libraries and developer toolkits that let you create goal-driven AI systems by wiring together models, memory, tools, and logic. These frameworks enable some of the most exciting innovations in the AI space… but they also come with trade-offs.
What’s New in Apache Iceberg Format Version 3?
Now, with the introduction of format version 3, Iceberg pushes the boundaries even further. V3 is designed to support more diverse and complex data types, offer greater control over schema evolution, and deliver performance enhancements suited for large-scale, high-concurrency environments. This blog explores the key differences between V1, V2, and the new V3, highlighting what makes V3 a significant step forward in Iceberg's evolution.
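Opting a table into the new format is a single table property at creation time. A hedged sketch, assuming an Iceberg release and engine that already support V3; the table name is a placeholder.

```python
from pyspark.sql import SparkSession

# Assumes an Iceberg-enabled session with a catalog named 'polaris'.
spark = SparkSession.builder.getOrCreate()

# 'format-version' is the standard Iceberg table property; whether '3' is
# accepted depends on your Iceberg and engine versions.
spark.sql("""
    CREATE TABLE polaris.demo.events_v3 (id BIGINT, payload STRING)
    USING iceberg
    TBLPROPERTIES ('format-version' = '3')
""")
```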
A Journey from AI to LLMs and MCP – 4 – What Are AI Agents — And Why They’re the Future of LLM Applications
We’ve explored how Large Language Models (LLMs) work, and how we can improve their performance with fine-tuning, prompt engineering, and retrieval-augmented generation (RAG). These enhancements are powerful, but the applications they produce are still fundamentally stateless and reactive.
A Journey from AI to LLMs and MCP – 3 – Boosting LLM Performance — Fine-Tuning, Prompt Engineering, and RAG
In this post, we’ll walk through the three most popular and practical ways to boost the performance of Large Language Models (LLMs):
- Fine-tuning
- Prompt engineering
- Retrieval-Augmented Generation (RAG)
Each approach has its strengths, trade-offs, and ideal use cases. By the end, you’ll know when to use each, and how they work under the hood.
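To make the RAG idea concrete before diving in, here is a toy sketch: retrieval is naive word overlap and the model call is left as a comment, since a real system would use vector embeddings and an actual LLM.

```python
# Toy retrieval-augmented generation (RAG): pick the document that shares the
# most words with the question, then build an augmented prompt around it.
def retrieve(query: str, docs: list[str]) -> str:
    def overlap(doc: str) -> int:
        return len(set(query.lower().split()) & set(doc.lower().split()))
    return max(docs, key=overlap)

docs = [
    "Apache Iceberg is an open table format for huge analytic datasets.",
    "Apache Polaris is an open source catalog for Apache Iceberg tables.",
]
question = "What is Apache Polaris?"
context = retrieve(question, docs)
prompt = (
    f"Answer using only the context below.\n\nContext: {context}\n\n"
    f"Question: {question}"
)
print(prompt)  # in a real pipeline, this augmented prompt goes to the LLM
```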