Something I keep hearing in conversations with customers and prospects right now is a version of the same frustration: we have executive buy-in, budget approved, models selected, and we still cannot ship anything meaningful. The AI initiative is stalled, and everyone is looking at the data team.
Then there are the teams that are already in production: agents running on live data, costs coming down, engineers focused on building instead of firefighting. When I dig into what separates these two groups, it almost never comes down to which model they picked.
It comes down to how their data is set up.
The assumption that made sense for BI does not work for AI
For years, the standard playbook was to move data into a central warehouse and query it from there. That worked well enough when the primary consumers were human analysts pulling dashboards. The data could be a day old. The definitions could vary a little between teams. Analysts were flexible enough to work around it.
AI agents are not flexible in that way. An agent querying a warehouse copy of your data is querying something that is already stale. The semantic definitions across systems often do not match. The answers come back confidently wrong, and trust in the whole initiative erodes fast.
The architecture was not broken for the old use case. It just was not built for this one.
What the teams that are shipping actually did differently
The pattern I see across customers who have broken through is consistent: they stopped treating data movement as the default and started querying data where it already lives.
The results are concrete. A global manufacturer operating in more than 100 countries brought analytics latency down from 30 minutes to 3 seconds; production data that used to require daily batch jobs now feeds AI-driven decisions in real time. A pharmaceutical company managing more than 5,000 product SKUs hit sub-second query performance for LLM automation and cut 30 percent of the overhead that used to go to legacy ETL. An enterprise technology company reduced onboarding time for new data domains from weeks to hours, so AI initiatives could scale without proportional headcount growth.
Nucleus Research looked at this independently across Dremio customers and found 61 to 73 percent cost savings on data reads. That is a direct consequence of eliminating copies. Fewer copies means lower cost, lower latency, and more consistent data. The savings are almost mechanical.
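To see why the savings are nearly mechanical, here is a toy cost model. The numbers are illustrative assumptions, not Dremio pricing or the Nucleus Research methodology: the point is simply that every copy carries its own storage and refresh-pipeline cost, so removing copies removes that cost directly.

```python
# Toy cost model (illustrative numbers only, not Dremio pricing):
# each warehouse copy adds storage plus a recurring refresh pipeline,
# while querying data in place pays only for the reads themselves.

def monthly_cost(copies, reads_per_month,
                 read_cost=0.01,          # cost per read
                 storage_per_copy=300.0,  # monthly storage per copy
                 refresh_per_copy=650.0): # monthly pipeline cost per copy
    """Total monthly cost as a function of the number of data copies."""
    return (reads_per_month * read_cost
            + copies * (storage_per_copy + refresh_per_copy))

with_copies = monthly_cost(copies=2, reads_per_month=100_000)
zero_copy = monthly_cost(copies=0, reads_per_month=100_000)
savings = 1 - zero_copy / with_copies

print(f"cost with 2 copies: ${with_copies:,.0f}")   # $2,900
print(f"cost with 0 copies: ${zero_copy:,.0f}")     # $1,000
print(f"savings: {savings:.0%}")                    # 66%
```

Under these made-up inputs the zero-copy setup costs roughly a third as much, which is the shape of the effect: the savings scale with how many copies you stop maintaining, not with any clever optimization.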
The compounding problem for teams that wait
Here is the thing that concerns me when I talk to teams still in the first group: the delay is not neutral.
The teams that have made this shift are not just ahead, they are accelerating. Their data engineers are spending time building new capabilities instead of maintaining pipelines. Their agents have access to live, governed data. The business value is compounding.
The teams waiting are not paused at their own baseline. They are falling behind relative to peers who are actively in production and moving faster every quarter.
What making the shift actually looks like
This is not a rip-and-replace project, and I think that misconception is part of what slows teams down.
The Agentic Lakehouse works with data where it already lives. Apache Iceberg provides a storage foundation that is open and not locked to any vendor. The AI Semantic Layer adds the business context agents need to return answers that are actually trustworthy, without requiring custom engineering for every use case. Autonomous Reflections handle query optimization automatically, learning from usage patterns over time.
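To make the semantic-layer idea concrete, here is a minimal sketch of what it gives an agent: business terms mapped onto governed definitions over live sources, so every consumer resolves "revenue" the same way. All table, column, and function names here are hypothetical, for illustration only; this is not Dremio's API.

```python
# Minimal sketch of a semantic layer: business terms mapped to governed
# definitions over live source tables. All names are hypothetical.

SEMANTIC_LAYER = {
    "revenue": {
        "source": "sales.orders",        # queried where it lives, no copy
        "expression": "SUM(amount_usd)",
        "description": "Recognized revenue in USD, net of refunds",
    },
    "active_customers": {
        "source": "crm.accounts",
        "expression": "COUNT(DISTINCT account_id)",
        "description": "Accounts with an order in the last 90 days",
    },
}

def resolve(metric: str) -> str:
    """Translate a business term into SQL against the live source."""
    m = SEMANTIC_LAYER[metric]
    return f"SELECT {m['expression']} FROM {m['source']}"

print(resolve("revenue"))  # SELECT SUM(amount_usd) FROM sales.orders
```

The design point is that an agent asking for "revenue" always receives the one governed definition instead of guessing at raw tables, which is what turns confidently wrong answers into trustworthy ones.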
The customers I see move fastest do not wait until everything is perfect. They connect to the sources they already have, surface semantic definitions that already exist somewhere in the organization, and start delivering use cases within weeks.
Timing is the real variable
The AI mandate is not going to wait for infrastructure to catch up on its own schedule. The teams shipping today did not have a perfect architecture or an ideal budget window. They used what they had and built from there.
The gap between the two groups I keep seeing will be measurable and significant within the next 18 months. The organizations that move now are the ones building the foundation that everything else runs on.