Why SQL Must Evolve in the Era of Agentic Apps and Data-Aware AI
SQL has long been the universal language of data. But with the rise of Generative AI and agentic applications, a major shift is underway. We're entering an era where natural language is the interface, and agents are the client.
There are two major trends converging here—both fueled by GenAI and both reliant on data:
Agents need data to do their work. Autonomous agents are being deployed to perform tasks like generating personalized marketing campaigns, running financial simulations, or triaging support tickets. To do these jobs effectively, they need access to company data—and many of them are now fluent in SQL. SQL is emerging as the preferred language for agents to retrieve and interact with structured data.
Humans still prefer natural language over SQL. Despite years of SQL training and the proliferation of BI tools, many users—from analysts to marketers—struggle to write precise queries. They want to express what they need in plain English. Agents can help here too—acting as translators that convert natural language into executable SQL queries.
In both cases, agents need to interact directly with data systems like Dremio. But without a common protocol, every integration becomes a custom effort. Just as REST standardized how services communicate, we now need a standard for agent-data interaction.
That’s where MCP (Model Context Protocol) comes in.
Introducing MCP: The Missing Link for Agentic AI
MCP, developed by Anthropic and backed by a growing ecosystem including OpenAI, Microsoft, and now Dremio, is designed to standardize how agents interact with tools, systems, and data.
In simple terms, MCP lets agents:
Discover what capabilities are available (e.g., “query a dataset” or “get schema metadata”)
Understand how to use them (parameters, expected results, etc.)
Invoke them dynamically, in real time, as part of a reasoning process
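To make the discover, understand, and invoke steps concrete, here is a minimal sketch of the messages an agent might exchange with an MCP server. MCP frames its messages as JSON-RPC 2.0, with tools/list for discovery and tools/call for invocation. The tool name query_dataset, its description, and its sql parameter are hypothetical and used only for illustration; no real server is contacted.

```python
import json
from itertools import count

_ids = count(1)  # JSON-RPC requests carry an id so responses can be matched


def rpc_request(method, params=None):
    """Build a JSON-RPC 2.0 request envelope, the framing MCP uses."""
    message = {"jsonrpc": "2.0", "id": next(_ids), "method": method}
    if params is not None:
        message["params"] = params
    return message


# 1. Discover: ask the server which tools it offers.
list_request = rpc_request("tools/list")

# 2. Understand: a hand-written stand-in for the server's reply.
#    The tool below is hypothetical; a real server returns its own list.
list_result = {
    "tools": [
        {
            "name": "query_dataset",
            "description": "Run a SQL query and return the rows.",
            "inputSchema": {
                "type": "object",
                "properties": {"sql": {"type": "string"}},
                "required": ["sql"],
            },
        }
    ]
}


def required_arguments(tool):
    """Read the required argument names from a tool's JSON Schema."""
    return tool["inputSchema"].get("required", [])


# 3. Invoke: call the tool by name with arguments that fit its schema.
tool = list_result["tools"][0]
call_request = rpc_request(
    "tools/call",
    {"name": tool["name"], "arguments": {"sql": "SELECT 1"}},
)

print(json.dumps(call_request, indent=2))
```

Because the tool description travels with the server's reply, the agent does not need the parameter list compiled in ahead of time; it reads inputSchema at run time and builds the call from it.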
This makes MCP the OpenAPI of the agentic world—except broader, semantically richer, and designed for intelligent systems.
How Do LLMs Work?
To understand why MCP matters, it helps to briefly understand how LLMs reason and interact with tools.
An LLM receives a “context” (a sequence of instructions, background knowledge, history, and available tools) and then determines the next best token to produce. In many agent frameworks, this context includes a list of tools that the model can invoke, each represented as a function signature: a name, a description, and typed parameters.
When a user asks a question like “How many customers signed up in California last month?”, the model doesn’t just generate a SQL query directly. It breaks the task into steps:
Search for relevant tables or metadata
Figure out which columns represent state and signup date
Construct a valid SQL query
Invoke the SQL tool with that query
Interpret the results and respond to the user
This entire process is possible because the model understands the tools available and how to call them. And it’s exactly this context—this interface between tools and models—that MCP standardizes.
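The steps above can be sketched in a few lines of Python. This is only an illustration under loud assumptions: an in-memory SQLite database with made-up rows stands in for the data system, the helper names (search_tables, get_columns, run_sql, answer_signups) are hypothetical, and the step order is hard-coded, whereas a real agent would let the model decide each step.

```python
import sqlite3

# Stand-in data system: an in-memory SQLite database with made-up rows.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (id INTEGER, state TEXT, signup_date TEXT);
    INSERT INTO customers VALUES
        (1, 'CA', '2024-05-03'),
        (2, 'CA', '2024-05-20'),
        (3, 'NY', '2024-05-11'),
        (4, 'CA', '2024-04-28');
""")


def search_tables(keyword):
    """Step 1: find tables whose name mentions the keyword."""
    rows = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' AND name LIKE ?",
        (f"%{keyword}%",),
    ).fetchall()
    return [name for (name,) in rows]


def get_columns(table):
    """Step 2: list the column names of a table."""
    return [row[1] for row in conn.execute(f"PRAGMA table_info({table})")]


def run_sql(sql, params=()):
    """Step 4: the 'SQL tool' that executes a query and returns rows."""
    return conn.execute(sql, params).fetchall()


def answer_signups(state, start, end):
    table = search_tables("customer")[0]
    columns = get_columns(table)
    state_col = next(c for c in columns if "state" in c)
    date_col = next(c for c in columns if "date" in c)
    # Step 3: construct the query from the discovered table and columns.
    sql = (
        f"SELECT COUNT(*) FROM {table} "
        f"WHERE {state_col} = ? AND {date_col} >= ? AND {date_col} < ?"
    )
    total = run_sql(sql, (state, start, end))[0][0]
    # Step 5: interpret the result for the user.
    return f"{total} customers signed up in {state} in that period."


print(answer_signups("CA", "2024-05-01", "2024-06-01"))
```

Each helper corresponds to a tool the model would see in its context; the only thing the sketch fixes in advance is the order in which they are called.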
Without MCP: Why This Was Painful Before
Before MCP, building agentic experiences that interact with real systems was painful and repetitive. Here’s what it used to involve:
Every tool needed a bespoke interface. You had to manually define each tool’s parameters and hardcode function specs for your specific model/runtime.
No shared vocabulary. Each model had its own format for tool calls, making it hard to build once and reuse across providers.
Limited discoverability. Agents couldn’t explore available capabilities—they needed everything preloaded.
Hard to scale or compose tools. Chaining tasks (like querying Dremio, exporting to Sheets, and summarizing results) was error-prone and manual.
With MCP, this all changes.
Why Does This Matter for Dremio?
This is why we’re so excited about MCP. It standardizes the interface between LLMs and tools—removing the need for one-off integrations and enabling dynamic discovery and composition of capabilities.
Dremio is built around openness. We believe in an open lakehouse architecture where your data isn’t locked behind proprietary APIs—it’s accessible, queryable, and now, agent-ready.
LLMs have a data retrieval problem: they cannot natively access, retrieve, or accurately interpret real-time, private, or structured data without external systems that augment their capabilities.
MCP, combined with Dremio, addresses this challenge head-on. With Dremio’s rich metadata and semantic layer, and MCP’s standardization, agents gain native access to discover datasets, generate SQL queries, and return insights—securely and at scale.
Enter the Dremio MCP Server
We’re introducing the Dremio MCP Server—an open-source project that allows any AI agent using MCP to communicate directly with Dremio.
With this server, agents can:
Discover datasets, views, and metadata
Translate natural language into SQL queries
Explore your lakehouse with rich, contextual understanding
This isn’t just for “data analysts.” An agent might:
Help a marketer pull client segmentation data for campaign personalization
Assist a finance bot in compiling quarterly reporting numbers
Translate an executive’s question into a SQL query for sales performance
And all of this happens seamlessly—without needing the user to know SQL.
What’s Under the Hood?
Here’s how the Dremio MCP integration works:
Tooling – Tools like RunSqlQuery, GetSchemaOfTable, and RunSemanticSearch are defined and registered with the MCP Server.
Auto-Discovery – Agents use MCP metadata to understand available functions, parameters, and expected outputs.
Invocation – Agents invoke tools directly and use the results to proceed with the next step in reasoning.
Because of MCP, agents can reason over which capability to use—no hardcoding required.
This lets you build natural-language interfaces to your lakehouse and enable automated workflows powered by agents.
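The tooling, auto-discovery, and invocation steps can be pictured with a small registry. The sketch below is not the Dremio MCP Server's implementation: the tool names RunSqlQuery and GetSchemaOfTable come from this post, but their parameters, return shapes, and stubbed bodies are assumptions, and the tool decorator, describe_tools, and invoke are hypothetical helpers.

```python
import inspect

TOOLS = {}  # registry: tool name -> Python callable


def tool(func):
    """Tooling: register a function as a tool, keyed by its name."""
    TOOLS[func.__name__] = func
    return func


@tool
def RunSqlQuery(sql: str) -> list:
    """Execute SQL (stubbed: echoes the query instead of running it)."""
    return [{"echo": sql}]


@tool
def GetSchemaOfTable(table: str) -> dict:
    """Return a table's schema (stubbed with fixed, made-up columns)."""
    return {"table": table, "columns": {"state": "VARCHAR", "signup_date": "DATE"}}


def describe_tools():
    """Auto-discovery: derive names, descriptions, and parameters."""
    described = []
    for name, func in TOOLS.items():
        parameters = {
            p.name: p.annotation.__name__
            for p in inspect.signature(func).parameters.values()
        }
        described.append(
            {"name": name, "description": inspect.getdoc(func), "parameters": parameters}
        )
    return described


def invoke(name, arguments):
    """Invocation: dispatch a call by tool name with keyword arguments."""
    if name not in TOOLS:
        raise KeyError(f"unknown tool: {name}")
    return TOOLS[name](**arguments)


for description in describe_tools():
    print(description["name"], description["parameters"])
print(invoke("GetSchemaOfTable", {"table": "customers"}))
```

Deriving the descriptions from the function signatures keeps what the agent discovers in sync with what the server can actually run, which is the property that makes hardcoding unnecessary.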
Initial tools include:
RunSqlQuery – Execute SQL directly on your cluster
GetSchemaOfTable – Retrieve schema, descriptions, and tags
RunSemanticSearch – Let agents explore your metadata with LLM-powered search
This is just the beginning.
Developers Wanted
This is an open ecosystem—and your contributions matter. We welcome:
New capabilities (semantic layers, visualizations, data transformations)
Better dev tooling, monitoring, and logs
Real-world feedback on use cases and performance
Let’s shape the future of data-native AI together.
Final Thoughts
In the near future, natural language will be the API—and agents will be the clients. But to make that future real, we need open, expressive, and secure ways for those agents to interact with systems.
MCP offers that promise. And with the Dremio MCP Server, we’re helping make it real.