Most discussions about AI in the enterprise focus on the models themselves: which LLM is fastest, which one is cheapest, which one writes the best SQL. That focus misses the harder problem. Models are getting more capable, fast. The bottleneck isn't intelligence. It's access.
Agentic analytics is the discipline of connecting AI agents to enterprise data in a way that's fast, accurate, and governed. And building the infrastructure to support it requires more than plugging an LLM into a database. There are three foundational pillars every serious agentic analytics platform must deliver, and most platforms today have only one or two.
This post breaks down what agentic analytics actually means, why it differs from traditional BI or even conversational analytics, and what separates platforms that can genuinely support it from those that are retrofitting AI features onto aging architectures.
What Agentic Analytics Actually Means
Traditional analytics puts humans in the loop at every step. A data engineer models the data. A BI developer builds the dashboard. A business user reads the dashboard and asks a follow-up question. That follow-up question goes back to the engineer. Repeat.
Agentic analytics changes the loop. An AI agent receives a business question, or generates one on its own based on a task, and then autonomously discovers relevant data, writes SQL, runs queries, interprets results, and returns an answer. The human sets the goal; the agent handles the path.
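The loop above can be sketched in miniature. This is a toy illustration, not Dremio's implementation: the planner is a stub where a real agent would call an LLM, and all table and column names are hypothetical.

```python
import sqlite3

# Toy agentic loop: discover schema, form SQL, run the query, interpret results.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (order_id INTEGER, region TEXT, amount REAL);
    INSERT INTO orders VALUES (1, 'EMEA', 120.0), (2, 'AMER', 80.0), (3, 'EMEA', 50.0);
""")

def discover_schema(conn):
    # Step 1: the agent inspects what data exists.
    return [row[0] for row in conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table'")]

def plan_query(question, tables):
    # Step 2: stubbed planner; a real agent would have an LLM write this SQL.
    assert "orders" in tables
    return "SELECT region, SUM(amount) FROM orders GROUP BY region ORDER BY region"

def answer(question):
    tables = discover_schema(conn)
    sql = plan_query(question, tables)
    rows = conn.execute(sql).fetchall()               # Step 3: run the query.
    return {region: total for region, total in rows}  # Step 4: interpret.

print(answer("What is revenue by region?"))  # {'AMER': 80.0, 'EMEA': 170.0}
```

The human supplies only the question; every intermediate step happens inside `answer`.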
That sounds simple. It isn't, because agents face a set of challenges that humans using dashboards don't: they can't rely on institutional memory, they don't know your business definitions unless you've encoded them somewhere, they can't distinguish between a table that's production-ready and one that was abandoned six months ago, and they'll happily query data they're not supposed to see if nothing stops them.
This is why "just give an LLM access to your database" doesn't work at enterprise scale. You need infrastructure built around how agents actually operate.
The Three Pillars of an Agentic Analytics Platform
Think of an agentic analytics platform as three concentric layers of capability. Strip any one of them away, and what you have is incomplete.
Figure 1: Dremio's three-pillar architecture for agentic analytics. Each pillar is necessary; none is sufficient on its own.
Pillar 1: Unified Data, Agents Can't Query What They Can't Reach
The first problem is access to data itself. Enterprises don't keep their data in one place. A typical organization has data split across cloud data lakes (S3, ADLS, GCS), relational databases (PostgreSQL, MySQL, Oracle), SaaS platforms (Salesforce, HubSpot), and possibly one or more data warehouses.
An AI agent trying to answer a business question doesn't care about any of that topology. It just needs answers. The platform has to unify all those sources into something an agent can query coherently.
Two capabilities matter here: federation and lakehouse management. Federation means the ability to query across all those sources with a single SQL interface, without physically moving data or running ETL pipelines. The lakehouse component means native, governed management of your own data lake, so you're not just federating out to other systems but maintaining a high-performance, governed home for your core data.
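As a toy analogy for federation, SQLite's `ATTACH` lets one SQL statement span two physically separate databases, much as a federated engine's connectors span a lake, Postgres, and a warehouse. The databases and table names here are hypothetical, and SQLite only stands in for the real connectors.

```python
import sqlite3

conn = sqlite3.connect(":memory:")                 # first source ("lake")
conn.execute("ATTACH DATABASE ':memory:' AS crm")  # second, independent source

conn.executescript("""
    CREATE TABLE main.orders (customer_id INTEGER, amount REAL);
    INSERT INTO main.orders VALUES (1, 100.0), (2, 40.0), (1, 60.0);

    CREATE TABLE crm.customers (customer_id INTEGER, name TEXT);
    INSERT INTO crm.customers VALUES (1, 'Acme'), (2, 'Globex');
""")

# The agent sees a single SQL interface over both sources -- one join, no ETL.
rows = conn.execute("""
    SELECT c.name, SUM(o.amount) AS total
    FROM main.orders o
    JOIN crm.customers c ON c.customer_id = o.customer_id
    GROUP BY c.name
    ORDER BY c.name
""").fetchall()

print(rows)  # [('Acme', 160.0), ('Globex', 40.0)]
```

The point of the analogy: the agent never learns where each table physically lives; it just writes one query.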
Without data unification, agents either get a fragmented view of the enterprise (querying only what's in the warehouse, missing context from everywhere else) or they spend time navigating multiple connection points that were never designed to work together. Neither situation produces reliable analytics.
Pillar 2: Data Meaning, Agents Need Context, Not Just Rows
Raw data access isn't enough. An agent that can run SQL against your tables can still fail spectacularly if it doesn't know what the tables mean.
What does the "revenue" column actually count: bookings, cash received, or recognized revenue? Is the customer_id field in the orders table the same entity as cust_id in the support tickets table? Which of the three "active users" tables should be queried for monthly retention analysis?
Humans learn these things through months of onboarding and constant conversation with colleagues. Agents learn them from what's encoded in the data platform.
This is the role of a semantic layer: a governed, business-readable representation of your data that defines metrics, resolves naming conflicts, documents table purposes, and provides the context an agent needs to form accurate queries. Without it, agents guess. Sometimes they guess correctly. More often, especially on ambiguous business questions, they don't.
A proper semantic layer also has to be AI-aware. It's not enough to have written documentation that no one reads. The platform needs to surface that context to agents dynamically at query time, so they operate with understanding rather than pattern-matching on column names.
Pillar 3: Governed Agentic Access, Speed and Security Aren't Optional
The third pillar is where most platforms fall short. Giving agents access to data is one thing. Giving them governed, fast, purpose-built access is another entirely.
Three things matter here:
Access controls that actually work for agents. Row-level security, column masking, and role-based permissions need to apply to agents exactly as they apply to humans. An agent connected to a sales analyst's credentials shouldn't be able to see HR data. The governance model has to extend cleanly to agentic workloads.
A purpose-built interface for AI agents. REST APIs designed for humans are clunky for agents. The most practical interface that's emerged is MCP, the Model Context Protocol, which gives agents structured tools to discover schemas, run queries, and search for relevant datasets. A platform with a native MCP server removes integration friction entirely: agents connect, authenticate, and start working.
AI capabilities inside the data platform. Not every unstructured data problem can be solved before query time. PDFs, support ticket logs, product reviews, and images often need AI analysis at query time, directly in SQL. AI functions that call LLMs inside SQL queries let agents work with unstructured data the same way they work with structured tables.
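The "AI functions in SQL" pattern can be sketched with a user-defined function: something callable from inside a query, applied row by row to unstructured text. The classifier below is a keyword stub standing in for an LLM call, and the function and table names are hypothetical, not Dremio's actual AI function syntax.

```python
import sqlite3

def ai_classify(text):
    # Stub for an LLM-backed sentiment classifier.
    negative_markers = ("broken", "refund")
    return "negative" if any(w in text.lower() for w in negative_markers) else "positive"

conn = sqlite3.connect(":memory:")
conn.create_function("ai_classify", 1, ai_classify)  # register UDF for use in SQL

conn.executescript("""
    CREATE TABLE tickets (id INTEGER, body TEXT);
    INSERT INTO tickets VALUES
        (1, 'The dashboard is great'),
        (2, 'Feature arrived broken, please send a refund');
""")

# The model call happens inside the query, one row at a time.
rows = conn.execute(
    "SELECT id, ai_classify(body) FROM tickets ORDER BY id"
).fetchall()
print(rows)  # [(1, 'positive'), (2, 'negative')]
```

The structural point survives the stub: unstructured text becomes queryable through the same SQL interface as structured tables.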
Speed matters too. Conversational analytics with a 30-second query latency isn't conversational. Sub-second response times are what make agentic workflows feel natural and useful.
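The access-control requirement above can be sketched as query-time predicate injection: the platform rewrites the agent's query with a row filter derived from the caller's role, so an agent on a sales analyst's credentials never sees HR rows. The roles, policy strings, and table here are hypothetical, and a real platform would enforce this below the SQL surface rather than by string rewriting.

```python
import sqlite3

# Hypothetical role-to-predicate policy table.
ROW_POLICIES = {
    "sales_analyst": "department = 'sales'",  # agent on sales credentials
    "admin": "1 = 1",                         # unrestricted
}

def governed_query(conn, role, base_sql):
    # Wrap the agent's query so the role's row filter always applies.
    predicate = ROW_POLICIES[role]
    return conn.execute(f"SELECT * FROM ({base_sql}) WHERE {predicate}").fetchall()

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE employees (name TEXT, department TEXT, salary REAL);
    INSERT INTO employees VALUES
        ('Ana', 'sales', 90000), ('Bo', 'hr', 85000), ('Cy', 'sales', 70000);
""")

sales_view = governed_query(conn, "sales_analyst", "SELECT * FROM employees")
print([row[0] for row in sales_view])  # only the sales rows: ['Ana', 'Cy']
```

The agent wrote an unrestricted query; governance decided what it actually saw.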
Figure 2: Each missing pillar produces a predictable, distinct failure mode. Missing unification means incomplete answers. Missing meaning means wrong answers. Missing governance means unsafe, slow answers.
Why Most Platforms Can't Deliver All Three
The challenge is architectural. Most modern data platforms were designed for one of these pillars and have tried to extend into the others.
Data warehouses were built for fast, governed queries, but they centralize data, which means ETL, which means delay and duplication. Bolting on federation as a feature works for simple queries but degrades at scale.
Data catalogs and semantic tools were built to support data meaning, but they're often read-only metadata stores without a query layer to power agent actions.
And most platforms that have rushed to add AI features have treated them as top-of-stack additions rather than architectural commitments. Natural language to SQL is a feature. An AI-aware semantic layer that actively resolves business ambiguity for agents is an architecture.
Dremio: Built for All Three Pillars
Dremio is positioned as The Agentic Lakehouse, and that positioning reflects a deliberate architectural choice, not a rebranding exercise.
The federation layer comes from Dremio's SQL query engine, built on Apache Arrow, which queries data across any source (lakes, databases, and warehouses) without ETL. Data stays where it lives; agents get a unified SQL interface over all of it.
The lakehouse layer provides native Apache Iceberg management with ACID transactions, time travel, schema evolution, and autonomous performance optimization (more on this in Part 2 of this series). Dremio is a core contributor to Apache Iceberg and the co-creator of Apache Polaris, the open catalog standard.
The semantic layer provides AI-generated wikis, business context, and metric definitions that surface to agents at query time, not just as human documentation but as structured context that agents can use to form better queries. And the MCP Server gives any AI agent (Claude, LangChain, Codex, or custom frameworks) zero-integration access to Dremio's full data environment.
Autonomous Reflections handle the performance side: they detect query patterns and automatically build and maintain materializations that bring latency from minutes to sub-second, without any manual tuning.
Figure 3: Traditional analytics routes every question through a human analyst. Agentic analytics collapses that loop: agents discover, query, and return answers in the same session.
What This Blog Series Covers
This is the first post in a four-part series examining what it actually takes to build an agentic analytics platform.
Over the next three posts, we'll go deep on each pillar:
Part 2: Data Unification, how federation and lakehouse architecture give agents reliable, comprehensive data access across any source
Part 3: Data Meaning, how a well-built semantic layer transforms what agents can do with enterprise data
Part 4: Governed Agentic Access, how access controls, MCP, AI SQL functions, and sub-second performance combine to make agentic analytics safe and fast at scale
Each post examines the problem the pillar solves, why it matters specifically for agents (not just human analysts), and how Dremio's architecture addresses it.
Getting Started
If agentic analytics is on your roadmap, or if you're already building AI applications that need to connect to enterprise data, it's worth auditing where your current platform sits across these three pillars. Most gaps show up fastest when agents start hitting data quality issues, permission errors, or ambiguous schema definitions that a human analyst would have talked their way around.