Dremio is now part of SAP

The Dremio Blog

Browse All Blog Articles

Dremio Blog: News Highlights

Apache Ossie (Incubating): The New Name for Open Semantic Interchange

Apache Ossie is currently undergoing incubation at The Apache Software Foundation (ASF). If you've been following the Open Semantic Interchange project — the open specification for semantic layer and ontology — there's an important update. The project has been accepted into the Apache Incubator under a new name: Apache Ossie (Incubating). The spec, the community, and […]
Dremio Blog: Open Data Insights

How Data Lake Table Storage Degrades Over Time

An Iceberg table that works well on day one will not work well on day 365 without maintenance. Every append, update, and delete operation adds files and metadata.

Alex Merced
Dremio Blog: Various Insights

What’s The Deal With Apache Parquet?

Apache Parquet is the recommended file format used in every modern data platform, and for good reason. But what are those reasons? And would it really matter if you stuck with CSV? The short answer is "YES". The slightly longer answer is "Yes, because columns". The full answer is below, so keep on reading to […]

Will Martin
Dremio Blog: Open Data Insights

When Catalogs Are Embedded in Storage

This article examines a newer approach: embedding the catalog directly inside the storage layer. Traditional Iceberg architectures have three components: the query engine, a standalone catalog, and object storage.

Alex Merced
Dremio Blog: Various Insights

The Semantic Layer: From Human Shortcut to Agent Guardrail

For most of its history, the semantic layer was considered a solved problem. You built it once, business users queried it wherever it lived, and (hopefully) everyone would agree on what "revenue" meant. However, much like the information in your data dictionary, the popularity of the semantic layer went stale and businesses turned to new, […]

Will Martin
Dremio Blog: Various Insights

Dremio ELT: Load, Transform, and Govern Data Without Leaving the Lakehouse

Data pipelines used to require a lot of infrastructure to keep running: separate compute for transformation, staging layers between systems, and a growing stack of tools to manage it all. Dremio changes the equation. With native ingestion, flexible transformation, and AI-assisted pipeline development, teams can build and operate end-to-end ELT workflows directly in the lakehouse, […]

Mark Shainman
Dremio Blog: Various Insights

Why AI Agents Need a CLI, Not Just an MCP Server

Most conversations about AI and data platforms start with MCP. That's understandable: the Model Context Protocol has become the standard way to give AI agents a window into a data system, and Dremio's MCP server does this well. But MCP solves the specific problem of giving agents a supervised, conversational interface to your data. What […]

Will Martin
Dremio Blog: Open Data Insights

What Are Lakehouse Catalogs? The Role of Catalogs in Apache Iceberg

A lakehouse catalog is the component that answers one question: "Where is the current metadata for this table?" Without a catalog, every engine would need to independently locate and track metadata files. With a catalog, there is a single source of truth that coordinates reads, writes, and access control across all engines.

Alex Merced
Dremio Blog: Open Data Insights

Enterprise Agentic Analytics Explained

Learn how agentic workflows for enterprise analytics connect AI agents, governed data and multi-step analysis to improve complex business decisions.

Alex Merced
Dremio Blog: Various Insights

Agentic Analytics Benefits and Key Features

Learn the benefits of agentic analytics and how enterprise teams use natural language queries, governed data and AI agents to improve decisions.

Alex Merced
Dremio Blog: Various Insights

Agentic AI in Insurance: From Competitive Advantage to Competitive Baseline: How Dremio Fuels Agentic AI at Scale

The insurance industry is undergoing a structural shift. What was once a slow-moving, data-heavy sector is now being reshaped by real-time intelligence, automation, and advanced analytics powered by artificial intelligence. Agentic AI is no longer a futuristic concept or a “nice to have” innovation; it is rapidly becoming the competitive baseline that […]

Joe Rodriguez
Dremio Blog: Open Data Insights

Writing to an Apache Iceberg Table: How Commits and ACID Actually Work

Understanding the write process is critical because it explains why Iceberg can provide ACID guarantees on top of object storage, something that seems impossible when you consider that S3, ADLS, and GCS have no built-in transaction support.

Alex Merced
Dremio Blog: Open Data Insights

Agentic Lakehouse: The Architecture Built for AI-Native Analytics

The Agentic Lakehouse is not a new name for the same architecture. It represents a genuine shift in what a data platform is responsible for. A traditional lakehouse is a managed repository. An Agentic Lakehouse is an active participant in AI workflows: it provides context, enforces governance, and optimizes itself autonomously.

Alex Merced
Dremio Blog: Open Data Insights

Text-to-SQL vs Agentic Analytics: What the Upgrade Requires

Text-to-SQL on a governed semantic layer is significantly more reliable than text-to-SQL on a raw production schema. The semantic layer constrains what the model can access, provides business-friendly terminology, and enforces metric definitions. The accuracy improvement is material.

Alex Merced
Dremio Blog: Open Data Insights

Semantic Layer vs Data Catalog: What’s the Difference?

The convergence of AI agents, open table formats, and semantic tooling is making this architecture decision more consequential than it was a few years ago. AI agents that query through ungoverned raw tables or that cannot discover what data exists are not reliable.

Alex Merced
