An AI agent that confidently returns the wrong revenue number is more dangerous than one that returns no number at all. Wrong answers that look plausible get acted on. They end up in board decks, budget decisions, and quarterly reports before anyone notices the refunds were never excluded. This is the real problem with AI agents on enterprise data, and it has nothing to do with the quality of the underlying model.
The failure is architectural. And a semantic layer for AI agents is the architectural fix.
The Demo Looks Impressive. Production Data Is a Different Story.
Every AI data tool demo runs on the same type of schema: a sales table, a customers table, maybe an orders table. Column names are human-readable. There's one definition of revenue. The AI agent asks a question, generates clean SQL, and returns the right number. Everyone in the room nods.
Your production schema looks nothing like that.
Real enterprise schemas have hundreds of tables built over years by different teams with different naming conventions. Column names like acct_rev_adj_usd_net_q2, trans_amt, and cust_seg_cd are not edge cases. They are the norm. No documentation exists alongside the columns. And revenue, the single most important number in any business question, appears in at least five different tables with five different definitions:
gross_revenue: includes refunds, includes test accounts
net_revenue: excludes refunds
adj_revenue: excludes refunds and test transactions
booked_revenue: revenue recognized on the accounting date, not the transaction date
arr_revenue: annualized recurring revenue, for SaaS contexts only
An AI agent has to choose between these. And it will choose based on column names and table names alone, because that's all it has access to.
The risk is not that the agent crashes. The risk is that it picks gross_revenue, returns $2.4M for the quarter, and your finance team acts on a number that's $300K too high because refunds were never excluded. The model does not tell you it guessed.
Try Dremio’s Interactive Demo
Explore this interactive demo and see how Dremio's Intelligent Lakehouse enables Agentic AI
Why AI Agents Hallucinate on Your Data (And It Is Not the Model's Fault)
LLMs generate SQL through pattern matching. The model looks at the user's question, the available table and column names, any schema metadata it has been given, and predicts the most statistically likely SQL to produce the intended result. This works well when column names clearly describe what they contain. It fails when names are abbreviated, ambiguous, or duplicated across tables.
The confidence problem compounds this. A language model asked to choose between net_revenue, transaction_amount, and acct_rev_adj_usd_net_q2 does not say "I'm not sure." It picks one and generates SQL as if it made a deliberate, informed choice. The output looks identical whether the model was right or wrong. There is no warning icon, no confidence score attached to the column selection decision.
Prompt engineering helps at the margins but cannot fix this structurally. You can write a detailed system prompt explaining what each column means. That context works for a few queries in a fresh conversation. But over longer sessions, across multiple agents, across queries from different business users, the prompt-based context degrades. You cannot document 10,000 columns in a system prompt that remains useful.
Anthropic's Claude model specification emphasizes that reliable AI behavior depends on grounding outputs in accurate, verifiable context. An undocumented production schema is the opposite of verifiable context. The model is left to infer meaning from names, and names alone are insufficient.
The fix must be structural. The schema needs to be made interpretable before the AI ever touches it.
The Three AI Hallucination Failure Modes in Analytics
AI hallucination on data is not random. It follows predictable patterns. Understanding the three failure modes tells you exactly what the semantic layer needs to prevent.
Wrong Table Join
Your schema has customer_dim, customer_legacy, customer_staging, and customer_archive. The AI needs to join orders to customers. It picks customer_legacy because the table name appears earlier in the schema metadata or because the column name in that table is a closer string match to the join key.
Result: the join brings in duplicated or stale customer records. Revenue aggregated by customer looks inflated. The join is technically valid SQL. It executes without error. The output is confidently, silently wrong.
Wrong Column (Revenue Disambiguation)
Five revenue columns. The AI sees them all and picks transaction_amount because it's the most generically named and appears in the table the AI already selected. The column includes refunds, test transactions, and pending settlements.
The correct answer is $2.1M. The AI returns $2.4M. The difference is real money that the business has already paid back to customers. But the number looks reasonable, so nobody audits it.
Wrong Aggregation Logic (Missing Business Rules)
Even if the AI picks the right table and the right column, it can still apply the wrong logic. Business rules accumulated over years exist in BI tools, notebooks, and tribal knowledge, but not in the raw schema. Rules like:
Exclude transactions where test_account = true
Convert all currencies to USD at the end-of-quarter exchange rate
Apply timezone normalization before filtering by date
Exclude transactions with status pending or disputed
The AI generates SUM(transaction_amount) WHERE transaction_date >= '2024-01-01' and misses every one of these conditions. The SQL is syntactically correct. The business answer is wrong.
-- Without semantic layer: AI guesses which revenue column to use
-- Possible wrong result: $2.4M (gross, includes refunds and test accounts)
SELECT SUM(transaction_amount) FROM raw_transactions
WHERE transaction_date >= '2024-01-01';
-- With semantic layer: AI queries documented view with business logic encoded
-- Correct result: $2.1M net revenue (refunds excluded, test transactions excluded)
SELECT net_revenue
FROM business.financial.quarterly_revenue_summary
WHERE quarter = '2024-Q1';
The difference between those two queries is not the AI's intelligence. It is the presence or absence of a semantic layer.
How a Semantic Layer for AI Agents Fixes This
To understand what a semantic layer is in the context of AI agents, think of it as a translation layer: it converts a cryptic, undocumented production schema into a clean, documented, business-friendly interface that AI can reason about reliably.
Named, Documented Virtual Datasets
Instead of exposing raw_transactions directly, the semantic layer exposes a virtual dataset called quarterly_revenue_summary. The name alone gives the AI agent meaningful context. More importantly, attached documentation explains what the view contains, what business logic is applied, and what the columns represent.
The AI no longer has to guess what acct_rev_adj_usd_net_q2 means. It reads the view name and its wiki, which say: "Net revenue for each quarter, excluding refunds and test account transactions. Source: finance_gl. Last updated: 2024-Q4."
Metric Definitions Encoded in Views
The most important function of the semantic layer is that business logic lives in the view definition, not in the AI's reasoning. When the net revenue view applies WHERE refund_flag = 0 AND test_account = 0 AND transaction_status = 'settled', every query through that view automatically applies those conditions. The AI does not have to know the rules. It queries the view and the rules are enforced by the SQL engine.
This is the key insight: you are not trying to teach the AI your business rules. You are encoding those rules in SQL where they cannot be forgotten or hallucinated away.
Natural Language Wikis and Labels
Every table and column in the semantic layer has a text description the AI reads before generating SQL. These descriptions serve two purposes. First, they help the AI match user intent to the right data object. If a user asks "what was total revenue this quarter," the AI searches for tables with descriptions containing "revenue" and "recognized in the period" rather than pattern-matching on column names. Second, labels and tags create conceptual categories. Tag every revenue-related column with revenue_metric, every cost column with cost_metric. The AI can narrow its search to the right category before selecting a specific column.
Column-Level Lineage
With column-level lineage, the AI knows that net_revenue in the quarterly summary view was derived from trans_amt in raw_transactions after applying three filtering conditions. This provenance information allows the AI to explain its answers: "This result uses net_revenue, which excludes refunds. It came from the finance_gl schema." That kind of explanation is what separates a trustworthy AI analytics response from a black-box number.
Dremio's AI Semantic Layer in Practice
Dremio's implementation of the AI semantic layer is built into the Agentic Lakehouse platform and designed specifically for AI agent workloads. The features work together to solve each failure mode systematically.
AI-Generated Wikis
Writing documentation for thousands of tables and columns is the work nobody does, which is why schemas are undocumented in the first place. Dremio addresses this by using AI to auto-generate wiki descriptions. The system samples data from each column, analyzes value distributions and patterns, and generates a natural language description of what the column likely contains. A column containing values like USD, EUR, GBP gets documented as "Currency code for the transaction." A column with values between 0 and 1 gets described as a rate or ratio.
Human review is still recommended for critical business metrics. But AI-generated first drafts reduce the documentation burden from "impossible at scale" to "review and refine."
Semantic Search by Business Concept
When an AI agent asks "which table contains quarterly revenue by region," Dremio's semantic search matches the query to the right view by concept, not by exact name. The agent does not need to know that the view is called business.financial.quarterly_revenue_summary. It describes what it's looking for, and semantic search finds the match. This is how the agent bridges the gap between natural language questions and technical schema objects.
Labels and Tags for Metric Taxonomy
Labels in Dremio allow you to build a conceptual taxonomy across your entire schema. Apply revenue_metric to every column that measures revenue, regardless of which schema or table it lives in. Apply pii to every column containing personally identifiable information. When an AI agent needs a revenue metric, it filters to revenue_metric-tagged columns first, cutting the search space from potentially thousands of columns to a dozen curated options.
Fine-Grained Access Control as a Semantic Layer Function
The semantic layer is not just about accuracy. It is about safety. An AI agent for an EMEA sales analyst should not be able to return North America revenue data, even if it generates SQL that tries to access it. Dremio's Fine-Grained Access Control (FGAC) applies row-level and column-level restrictions at query execution time, enforced by the query engine rather than by application logic.
-- Dremio FGAC: row-level security applied automatically at query execution
-- AI agent for EMEA region cannot see North America data regardless of the SQL it generates
CREATE FUNCTION row_filter_by_region(region_col VARCHAR)
RETURNS BOOLEAN
RETURN region_col = SESSION_USER_ATTRIBUTE('authorized_region');
This means the AI agent literally cannot return unauthorized data. Not because of a prompt instruction. Not because of an application-layer check. Because the query engine blocks it. Column masking applies the same principle to sensitive columns: if the AI generates SQL that selects salary or ssn, those values are masked according to the user's permissions before the result is returned.
A Real Example: Querying Revenue With and Without a Semantic Layer
Consider a concrete scenario: a business analyst asks an AI agent "What was our total revenue this quarter?" This is one of the most common business questions and one of the most commonly wrong answers when the AI has direct access to raw schemas.
Without a semantic layer: The AI scans available tables, finds raw_transactions, and identifies transaction_amount as the most likely revenue column based on its name. It generates:
SELECT SUM(transaction_amount) FROM raw_transactions
WHERE transaction_date >= '2024-01-01'
AND transaction_date < '2024-04-01';
This query includes refunded transactions ($180K), test account transactions ($120K), and transactions still in pending status ($55K). The result is $2.4M. The correct answer is $2.1M.
The $300K error is not visible. The SQL executed correctly. The agent presents the answer with confidence. Someone puts $2.4M in a slide deck.
With a semantic layer: The AI receives a query against documented views. Semantic search matches "total revenue this quarter" to quarterly_revenue_summary. The view is documented: "Net revenue by quarter. Excludes refunds, test accounts, and pending transactions." The AI generates:
SELECT net_revenue
FROM business.financial.quarterly_revenue_summary
WHERE quarter = '2024-Q1';
The view's underlying SQL already applies all business rules. The result is $2.1M. The agent can explain: "This is net revenue from the quarterly summary view, which excludes refunds and test account transactions per finance team definition."
The difference in output quality is not a function of which AI model was used. It is a function of what the AI was allowed to query.
The Trust Problem: Why Semantic Layer Changes the Math on Agentic Analytics
Without a semantic layer, the practical limit of agentic analytics is "generate a query, have a human review it." The AI speeds up SQL writing but does not speed up decision-making, because every AI-generated query is a liability until someone verifies it. Data teams become a bottleneck in a workflow that was supposed to remove bottlenecks.
With a semantic layer, the math changes. The AI is constrained to documented, tested business concepts. "Net revenue" always means net revenue, regardless of who asks or which agent runs the query. The domain of possible errors narrows from "any SQL imaginable" to "misapplication of a documented concept." That is a much smaller problem, and it is auditable.
This is what makes agentic analytics operationally viable. Not zero-trust, because no system involving AI and complex data is zero-trust. But bounded trust, where the trust floor is high enough that routine business questions can be answered autonomously, and complex or novel questions are flagged for human review rather than blocked entirely.
Anthropic's Claude model specification frames reliable AI behavior in terms of verifiable grounding. A well-constructed semantic layer is verifiable grounding at the data layer. It is the difference between an AI reasoning in the dark and an AI reasoning from documented facts.
Scenario
Without Semantic Layer
With Semantic Layer
Table selection
AI guesses from hundreds of tables
AI finds right view via semantic search
Column choice
AI picks most similarly named column
AI queries documented virtual column
Business logic
AI omits refund exclusion rules
Logic encoded in view, applied automatically
Access control
Depends on application-layer logic
Enforced by FGAC at query execution
Human review needed
Every query
Complex or novel queries only
Trust level
Low: any query could be subtly wrong
High: bounded to documented concepts
Building a Semantic Layer Specifically for AI Agents
The goal is not a perfect semantic layer covering every table. The goal is a semantic layer that covers the queries your AI agents will actually run. For a complete guide to building a semantic layer from the ground up, Dremio's resources cover the full architecture. For the AI-specific approach, here is the practical sequence.
Start With the Top 20 Business Questions
What does your business ask most often? Revenue by period, customer count by segment, conversion rate, churn, cost per acquisition. These 20 questions drive 80% of business analytics. Build views that answer these questions correctly. Document them. Test them against known historical values.
This is not a boil-the-ocean approach. Twenty well-built views with accurate documentation and business logic encoded in SQL deliver most of the value.
Use AI to Write the Documentation
The irony of building a semantic layer for AI is that you can use AI to build it. Dremio's AI wiki generation samples columns and generates natural language descriptions. You can also use any capable LLM to draft table and column descriptions by feeding it schema information and sample data. The drafts need human review for critical metrics, but AI-generated documentation is far better than no documentation.
Tag Metrics Explicitly
Create a label taxonomy before you start. revenue_metric, cost_metric, customer_metric, operational_metric, pii. Apply these tags consistently across all schemas. The tags become navigation primitives for AI agents, narrowing the search space and reducing the likelihood of the agent selecting an off-topic column.
Build a Gold-Layer View Set for AI Access
Create a curated set of views specifically designed for AI agent access. Think of this as the "AI API" for your data. It exposes only the concepts an agent needs, with clean names, full documentation, and business logic encoded. Raw tables should not be directly accessible to AI agents. Only the gold-layer views should be in scope.
This also provides a clean governance boundary: you control exactly what the AI can see by controlling what is in the gold layer.
Plan for Ongoing Maintenance
The biggest ongoing cost of a semantic layer is keeping it synchronized with business logic changes. When your refund policy changes, the net revenue view needs to be updated. When a new product line is added, the revenue views need to reflect it. Build maintenance into your workflow from the start: update views, re-run tests against known totals, get sign-off from the relevant business team before deploying changes.
Automation helps. Scheduled tests that query views and compare results against expected ranges catch view drift before it propagates to AI agent outputs.
The Honest Tradeoffs
Building a high-quality semantic layer is work. Weeks to months upfront for the initial gold-layer views, ongoing maintenance as business rules evolve, and a human review process for critical metric definitions. This is real cost, and it should be factored into any agentic analytics evaluation.
The semantic layer also does not solve every AI SQL generation problem. Complex multi-step analytical questions (cohort analysis, attribution modeling, forecasting queries) can still go wrong even with a well-built semantic layer. The AI is still doing reasoning on top of the documented views, and that reasoning can still be imperfect.
What the semantic layer solves is the most common, highest-impact class of errors: wrong table, wrong column, missing business rule. These errors affect 80% of business queries and are the ones most likely to produce wrong numbers that end up in actual decisions.
For the tail cases, the complex and novel one-off analytical questions, human review remains necessary. That is not a failure. That is appropriate division of labor. An AI agent that handles routine analytics reliably and flags complex queries for human review is more valuable than one that attempts everything and gets wrong answers for a quarter of them.
Building the Trust Infrastructure for AI Analytics
The reason so many agentic analytics projects stall at proof-of-concept is not the AI model. It is the absence of the infrastructure that would make the AI trustworthy on real data. A semantic layer is that infrastructure.
Start narrow. Pick the five metrics your business asks about most often. Build documented, tested views for each one. Use Dremio's AI wiki generation to document the columns. Apply labels. Configure FGAC for your user personas. Run the AI agent against the gold-layer views for those five metrics and validate the outputs against known historical values.
When those five work reliably, add five more. The semantic layer grows incrementally, and so does the domain where AI agents can be trusted to operate autonomously.
The alternative is deploying AI agents directly against raw production schemas and hoping the model figures it out. That produces the demo, not the production result.
Intro to Dremio, Nessie, and Apache Iceberg on Your Laptop
We're always looking for ways to better handle and save money on our data. That's why the "data lakehouse" is becoming so popular. It offers a mix of the flexibility of data lakes and the ease of use and performance of data warehouses. The goal? Make data handling easier and cheaper. So, how do we […]
Aug 16, 2023·Dremio Blog: News Highlights
5 Use Cases for the Dremio Lakehouse
With its capabilities in on-prem to cloud migration, data warehouse offload, data virtualization, upgrading data lakes and lakehouses, and building customer-facing analytics applications, Dremio provides the tools and functionalities to streamline operations and unlock the full potential of data assets.
Aug 31, 2023·Dremio Blog: News Highlights
Dremio Arctic is Now Your Data Lakehouse Catalog in Dremio Cloud
Dremio Arctic bring new features to Dremio Cloud, including Apache Iceberg table optimization and Data as Code.