Alex Merced

Head of DevRel, Dremio

Alex Merced is Head of DevRel for Dremio, a developer, and a seasoned instructor with a rich professional background. He has worked with companies such as GenEd Systems, Crossfield Digital, CampusGuard, and General Assembly.

Alex is a co-author of the O’Reilly book “Apache Iceberg: The Definitive Guide.” With a deep understanding of the subject matter, Alex has shared his insights as a speaker at events including Data Day Texas, OSA Con, P99Conf, and Data Council.

Driven by a profound passion for technology, Alex has been instrumental in disseminating his knowledge through various platforms. His tech content can be found in blogs, videos, and his podcasts, Datanation and Web Dev 101.

Moreover, Alex Merced has made contributions to the JavaScript and Python communities by developing a range of libraries. Notable examples include SencilloDB, CoquitoJS, and dremio-simple-query, among others.

Alex Merced's Articles and Resources


Blog Post

Hidden Partitioning: How Iceberg Eliminates Accidental Full Table Scans

This is Part 5 of a 15-part Apache Iceberg Masterclass. Part 4 covered partition evolution. This article covers hidden partitioning, the feature that ensures users never need to know how their data is physically organized. The most expensive mistake in data lake querying is the accidental full table scan: a query that reads every file because the user […]

Read more ->

Blog Post

Semantic Layer Governance: Control What AI Agents Access

AI agents can execute hundreds of queries per minute, with no human reviewing each result before the agent acts on it. That is the governance gap that most data architecture teams have not yet closed. Traditional access controls were designed for a world where a person ran a report, read the output, and made a […]

Read more ->

Blog Post

Semantic Layer for AI Agents: Stop Getting the Numbers Wrong

An AI agent that confidently returns the wrong revenue number is more dangerous than one that returns no number at all. Wrong answers that look plausible get acted on. They end up in board decks, budget decisions, and quarterly reports before anyone notices the refunds were never excluded. This is the real problem with AI […]

Read more ->

Blog Post

MCP Server Data Lakehouse: Connect AI Agents to Your Data

Every AI agent that touches enterprise data without a standard protocol needs its own custom integration code: bespoke authentication, hand-rolled schema discovery, one-off SQL generation tuned to a specific database. Multiply that by a dozen agents and half a dozen data sources, and you have an integration backlog that grows faster than your team can […]

Read more ->

Blog Post

Apache Iceberg Small Files Problem: Causes, Fixes, and Prevention

A single streaming job writing to Apache Iceberg every minute, across 10 partitions, produces 14,400 files per day from that one pipeline alone. Within a week, your table has over 100,000 files. Query planning that once took milliseconds now takes 15 to 30 seconds before a single row of data is read. That is the […]

Read more ->

Blog Post

Partition Evolution: Change Your Partitioning Without Rewriting Data

This is Part 4 of a 15-part Apache Iceberg Masterclass. Part 3 covered metadata-driven performance. This article explains how Iceberg handles the problem that has plagued data lakes for over a decade: what happens when your partition strategy needs to change. Partitioning determines how data is physically organized in storage, and it is the single most impactful factor […]

Read more ->

Blog Post

Apache Iceberg REST Catalog: What It Is and How to Use It

Iceberg tables are files. Without a catalog, they are just Parquet and Avro files scattered across object storage with no name, no schema, and no access control layer binding them together. The Apache Iceberg REST Catalog solves this problem with a vendor-neutral HTTP API that any engine can use to discover, read, and write Iceberg tables. Understanding […]

Read more ->

Blog Post

Apache Iceberg Partition Evolution: Change Your Partitioning Strategy Without Rewriting Data

A 10 TB events table partitioned by month was the right call two years ago. Now your data volume has grown tenfold, your team runs daily SLA dashboards, and every query that touches “last 7 days” is scanning an entire month’s worth of files. In a traditional Hive-style warehouse, fixing this means a full table […]

Read more ->

Blog Post

What Is Agentic Analytics? How It Differs from BI and AI Assistants

A business analyst gets a question from the CMO on Monday morning: “Why did customer churn spike last month?” Under a traditional BI model, that question enters a backlog, a report gets built over two days, and the answer lands on the CMO’s desk on Wednesday reflecting data from three weeks ago. By that point, the […]

Read more ->

Blog Post

Agentic Lakehouse vs Data Lakehouse: What Actually Changes

The traditional data lakehouse was designed for human analysts. Every architectural decision, from how performance is tuned to how business context is stored, assumed that a person would be sitting at the end of the pipeline, writing queries, interpreting results, and carrying those results into decisions. That assumption is no longer reliable. AI agents are […]

Read more ->

Blog Post

Apache Polaris 1.5.0: Deep-Dive Into the Future of Open Data Catalogs

Catalog governance is the biggest bottleneck in building a multi-engine lakehouse. When you query the same Apache Iceberg tables with Spark, Flink, and Dremio, synchronizing permissions and access credentials across different engines is traditionally a manual, error-prone chore. Apache Polaris solves this by providing a centralized, open-source REST catalog for Apache Iceberg tables. Instead of […]

Read more ->

Blog Post

Agentic Lakehouse Architecture: The Four Technical Layers

Choosing the right concept is only half the job. Plenty of teams have adopted the lakehouse model, picked open formats, and still built systems that fail when AI agents start querying them at scale. The Agentic Lakehouse architecture solves a specific problem: how do you structure a data platform so that AI agents can discover, […]

Read more ->

Blog Post

Performance and Apache Iceberg’s Metadata

This is Part 3 of a 15-part Apache Iceberg Masterclass. Part 2 covered the metadata structures of all five table formats. This article focuses on exactly how query engines use Iceberg’s metadata to avoid reading data they don’t need. The single biggest performance advantage of Iceberg over raw data lakes is not a clever algorithm or a faster […]

Read more ->

Blog Post

Apache Iceberg V2 vs V3: What Changed and What It Means for Your Tables

Apache Iceberg is not a static format. The spec version number stamped into every table’s metadata controls which features that table can use, which engines can read it, and how efficiently row-level changes are handled. The jump from Apache Iceberg V2 to V3 is not cosmetic. It introduces deletion vectors, a native semi-structured column type, and nanosecond […]

Read more ->

Blog Post

Apache Iceberg Machine Learning: Solving Data Versioning for AI

Your model’s accuracy dropped 7 percentage points after last month’s retraining. You need to reproduce the exact dataset it trained on three months ago. Your data lake has been through dozens of batch loads, partition rewrites, and schema updates since then. That dataset is gone. This is the data versioning problem that Apache Iceberg machine […]

Read more ->

Blog Post

Build an Agentic Lakehouse on Dremio: Getting Started

You can have a working agentic lakehouse in under two hours. That is not a marketing claim. It is what you get when you combine a free Dremio Cloud trial, a three-view semantic layer, and the built-in AI Agent. This guide walks you through every step to build an agentic lakehouse on Dremio, from signing […]

Read more ->

Blog Post

Migrate Delta Lake to Apache Iceberg: Step-by-Step Guide

Apache Iceberg now has native read/write support across more than a dozen query engines, from Spark and Flink to Trino, Dremio, DuckDB, Snowflake, and BigQuery. That breadth of support is the single biggest reason data engineering teams in 2025 and 2026 are choosing to migrate Delta Lake to Apache Iceberg, and this guide walks through […]

Read more ->

Blog Post

Dremio Semantic Layer: A Practical Step-by-Step Guide

Most data teams using Dremio start the same way: analysts connect directly to raw source tables, write their own joins, and define their own filters. Within six months, “net revenue” means four different things depending on which dashboard you open. The Dremio semantic layer is the fix for this. It gives you a single governed […]

Read more ->

Blog Post

Agentic Analytics in Financial Services: How AI Agents Query Regulated Data Safely

Financial services is the industry where a wrong answer from an AI agent doesn’t just produce a bad dashboard. It produces a regulatory violation. That single fact changes every architectural decision you make about agentic analytics in banking, insurance, and capital markets. The data is maximally sensitive: Social Security numbers, account numbers, primary account numbers […]

Read more ->

Blog Post

What’s New in Apache Iceberg 1.11.0

Apache Iceberg 1.11.0 shipped in May 2026, and it is not a routine maintenance release. Two parallel threads of work converge here: a significant architectural shift in how Iceberg handles file formats, and the arrival of production maturity for the V3 specification features the community has been building toward for the past two years. If […]

Read more ->

Blog Post

What is a model context protocol (MCP) server?

If you want to know what is a model context protocol server, it is a lightweight application that connects artificial intelligence models to external data sources and tools. This server acts as a translator between an AI system and your corporate databases. It exposes file systems, developer tools, and APIs through a standardized communication layer. […]

Read more ->

Blog Post

Snowflake Competitors: More Affordable and Open Source Alternatives

Snowflake changed cloud data warehousing with its separate compute-and-storage architecture and multi-cloud support. But rising costs, vendor lock-in concerns and the shift toward open data formats have pushed many organizations to look at Snowflake competitors that offer better pricing, open source foundations, or both. This guide covers 19 alternatives to Snowflake across cloud data warehouses, […]

Read more ->

Blog Post

Agentic Analytics vs Traditional BI Tools: What Do You Need for the Future?

The way organizations analyze data is changing fast. Traditional BI dashboards and manual SQL queries served teams well for years, but they can’t keep pace with the speed of modern business decisions. Agentic analytics vs traditional BI tools is now a critical comparison for any data leader planning their next move. Agentic analytics platforms use […]

Read more ->

Blog Post

How Dremio Keeps Agentic Analytics Fast Without Manual Tuning

A BI analyst runs the same sales dashboard every Monday morning. A data engineer can look at that query, understand the access pattern, and build a materialized view to make it fast. That model works because the query patterns are predictable and stable. An AI agent doesn’t work that way. When a business analyst asks […]

Read more ->

Blog Post

Definitive Guide to the Data Lakehouse

Most companies still run a separate data warehouse and a data lake. They pay twice for storage, run duplicate pipelines, and spend weeks reconciling numbers that don’t match between the two systems. The data lakehouse pattern exists to collapse that complexity into one open architecture. This guide covers the full picture: how we got to […]

Read more ->

Blog Post

The Metadata Structure of Modern Table Formats

This is Part 2 of a 15-part Apache Iceberg Masterclass. Part 1 covered why table formats exist. This article breaks down exactly how each format organizes its metadata. The metadata structure of a table format determines everything: how fast queries start planning, how efficiently concurrent writes are handled, how schema changes propagate, and how much overhead accumulates over […]

Read more ->

Blog Post

17 Best AI Integration Platforms for Agents and Automation

AI integration platforms have become a critical piece of enterprise architecture. Organizations building AI agents, automation workflows, and AI-ready data pipelines need platforms that connect data sources, enforce governance, and support the high-throughput, low-latency access patterns that AI systems demand. This guide covers 17 of the best AI integration platforms available in 2026, selection criteria […]

Read more ->

Blog Post

Top 11 Hadoop Alternatives to Use in 2026

Apache Hadoop was the default platform for big data processing for much of the 2010s. By 2026, most organizations have moved on. The architecture that made Hadoop groundbreaking — distributed storage with MapReduce computation — has been replaced by faster, more flexible, and less operationally demanding systems. This guide covers the 11 best Hadoop alternatives […]

Read more ->

Blog Post

Enterprise Data Fabric: Architecture and Best Practices

Enterprise data fabrics have become a central topic for data and technology leaders working to support AI, real-time analytics, and cross-cloud operations. As organizations accumulate data across cloud providers, on-premises systems, SaaS applications, and partner environments, the challenge of maintaining consistent, governed, and accessible data grows with each new source added. This guide explains what […]

Read more ->

Blog Post

Enterprise Data Platforms: The Definitive Guide

The amount of data enterprises generate has grown beyond what traditional storage and processing systems can handle. Enterprise data platforms have emerged as the infrastructure layer that brings order to this complexity, enabling analytics teams and AI systems to work from a single, governed foundation. This guide covers what enterprise data platforms are, how their […]

Read more ->