Dremio Blog

18 minute read · March 6, 2026

How to Reduce Amazon Redshift Costs by 40-60% with Dremio’s Agentic Lakehouse

Alex Merced, Head of DevRel, Dremio

Amazon Redshift pricing looks straightforward on paper: provisioned clusters with reserved instances or Serverless with per-second RPU billing. In practice, the bill gets complicated fast. Provisioned clusters charge even when idle if you forget to pause them. Serverless bills a 60-second minimum per query, which inflates costs for the short dashboard queries that make up most BI workloads. Concurrency Scaling adds on-demand charges when parallel queries exceed your cluster's capacity. And Redshift Spectrum charges $5 per TB scanned when querying S3 data.

Dremio provides an alternative: keep Redshift for the workloads that need it, but offload the repetitive, expensive dashboard and reporting queries to Dremio's engine. Dremio's Autonomous Reflections serve those queries from Apache Iceberg tables on your own S3 storage, bypassing Redshift compute entirely. The result is a 40-60% reduction in Redshift compute costs in the first month, without migrating a single table.

This guide walks through the cost reduction strategy, the modern AI features that make Dremio more than a query accelerator, and a step-by-step setup.

Where Redshift Money Goes

Redshift compute costs break into two models, and both have structural inefficiencies.


Provisioned Clusters: Paying for Capacity You Don't Use

Provisioned Redshift clusters bill by the hour based on node type and count. A typical production cluster runs 3 ra3.xlplus nodes at $1.086/hour each, totaling $3.26/hour or approximately $2,350/month running 24/7.
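The provisioned figure is just node count × hourly rate × hours in a month. A quick sanity check, using the ra3.xlplus on-demand rate quoted above (us-east-1 list price; verify the rate for your region):

```python
# Back-of-envelope provisioned Redshift cost for a 3-node ra3.xlplus cluster
# running 24/7. Rates are the list prices quoted in the text above.
NODE_RATE = 1.086          # $/hour per ra3.xlplus node
NODES = 3
HOURS_PER_MONTH = 24 * 30  # ~720 hours

hourly = NODES * NODE_RATE           # $3.26/hour
monthly = hourly * HOURS_PER_MONTH   # ~$2,346/month, matching the ~$2,350 above

print(f"${hourly:.2f}/hour, ~${monthly:,.0f}/month")
```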

The problem: most analytics workloads are bursty. Dashboards fire queries during business hours. ETL runs overnight. Between those peaks, the cluster sits idle, but you still pay. Pausing the cluster saves money but introduces latency when users need it resumed. Resizing a provisioned cluster to handle peak load adds capacity you pay for during off-peak hours.

Serverless: The 60-Second Tax

Redshift Serverless charges per RPU-hour (approximately $0.375/RPU-hour in us-east-1) with a minimum of 4 RPUs. The catch is the 60-second minimum billing per query. A dashboard query that takes 3 seconds still bills for 60 seconds of RPU compute.

For BI workloads, this 60-second minimum is expensive. If a user refreshes a dashboard with 8 charts, each chart fires a query. Even if each query runs in 5 seconds, you are billed for 8 full minutes of RPU compute (8 queries × 60 seconds). Multiply that across 50 users opening dashboards throughout the day and the serverless bill inflates rapidly.
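The math behind that inflation is straightforward: compare billed seconds to actual runtime for the 8-chart refresh above. This is a simplified calculation under the per-query minimum described here, so treat it as illustrative rather than a reproduction of AWS's exact billing logic:

```python
# One dashboard refresh: 8 charts, each firing a ~5-second query.
# Under a 60-second-per-query minimum, billed time dwarfs actual runtime.
QUERIES_PER_REFRESH = 8
ACTUAL_SECONDS = 5     # real runtime per query
BILLED_MINIMUM = 60    # minimum billed seconds per query

actual = QUERIES_PER_REFRESH * ACTUAL_SECONDS   # 40 seconds of real work
billed = QUERIES_PER_REFRESH * BILLED_MINIMUM   # 480 seconds billed

print(f"billed {billed}s for {actual}s of work: {billed // actual}x inflation")
```

A 12x markup for 5-second queries; even shorter queries push the multiple toward the 10-20x range.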

The Full Cost Picture

| Cost Component | Provisioned | Serverless | Impact |
| --- | --- | --- | --- |
| Idle compute | Cluster runs 24/7, idle nights and weekends | No idle charges (advantage) | Provisioned wastes 40-60% of capacity for bursty workloads |
| Short query tax | Minimal (cluster already running) | 60-second minimum per query | Serverless inflates BI dashboard costs 10-20x |
| Concurrency Scaling | 1 free hour/day, then on-demand rates | Included in RPU-hour | Provisioned sees unpredictable peaks |
| Redshift Spectrum | $5/TB scanned on S3 | Included in RPU-hour | Provisioned adds S3 query costs |
| Reserved Instance discount | Up to 75% with 3-year commitment | Up to 45% with 3-year reservation | Both require long-term commitment to save |
| Storage (RMS) | $0.024/GB-month | $0.024/GB-month | Same for both |

How Dremio Stops the Redshift Meter

Dremio connects to Redshift as a federated data source. When you enable Reflections, Dremio queries Redshift once to build an optimized Apache Iceberg copy of the data on your S3 storage. Every subsequent matching query is served by Dremio's engine. Redshift never sees the query. No RPU hours consumed. No cluster cycles used.

Cost Reduction Scenario: 50 Active Dashboard Users

Before Dremio (Provisioned Cluster):

  • 3× ra3.xlplus nodes running 24/7
  • Monthly compute: $2,350
  • Concurrency Scaling during peak hours: $400/month
  • Redshift Spectrum for S3 queries: $200/month
  • Total: $2,950/month

Before Dremio (Serverless):

  • 50 users × 10 dashboard refreshes/day × 8 queries/refresh
  • 4,000 queries/day × 60-second minimum × 8 RPU base
  • Approximately $2,700/month in RPU-hours
  • Total: $2,700/month

After Dremio with Autonomous Reflections:

  • 80% of dashboard queries served from Reflections (no Redshift compute)
  • Redshift handles remaining 20% of queries
  • Provisioned: Downsize to 2 nodes or switch to Serverless for remaining load = $600-900/month
  • Dremio Cloud cost (DCUs + cloud infrastructure): $800-1,200/month
  • Total: $1,400-2,100/month (30-50% savings)

Note: Dremio Cloud pricing includes both Dremio Compute Units (DCUs) and the underlying cloud infrastructure. Even accounting for both, the total is significantly less than the Redshift spend it replaces.
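Putting the scenario numbers together (all figures are the illustrative estimates above, not quotes):

```python
# Savings range implied by the provisioned-cluster scenario above.
before = 2950                         # $/month: cluster + Concurrency Scaling + Spectrum
after_low, after_high = 1400, 2100    # $/month: reduced Redshift + Dremio Cloud

best = 1 - after_low / before         # ~53% savings at the low end of "after"
worst = 1 - after_high / before       # ~29% savings at the high end of "after"

print(f"total-bill savings: {worst:.0%} to {best:.0%}")
```

That range rounds to the roughly 30-50% total-bill savings cited above; the 40-60% headline figure refers to the Redshift compute line specifically, which drops further because most of the "after" spend shifts to Dremio.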

Step-by-Step: Connect Redshift to Dremio

Step 1: Create a Free Dremio Cloud Account

Sign up at Dremio Cloud. The 30-day free trial includes all features, including Autonomous Reflections and the AI capabilities covered later in this guide.

Step 2: Add Redshift as a Source

In the Dremio console, go to Sources → Add Source → Amazon Redshift.

Configure the connection:

  • Host: Your Redshift cluster endpoint or Serverless workgroup endpoint
  • Port: 5439 (default)
  • Database: The Redshift database containing your analytics tables
  • Username/Password: A Redshift user with SELECT privileges on the target tables

Dremio connects and catalogs your Redshift tables immediately.

Step 3: Build the Semantic Layer

Create virtual datasets (views) that encode your business logic:

CREATE VDS analytics.monthly_revenue AS
SELECT
  DATE_TRUNC('month', order_date) AS order_month,
  sales_region,
  product_line,
  SUM(line_total) AS revenue,
  COUNT(DISTINCT order_id) AS order_count,
  COUNT(DISTINCT customer_id) AS active_customers
FROM redshift_source.public.order_lines ol
JOIN redshift_source.public.customers c ON ol.customer_id = c.id
GROUP BY 1, 2, 3

This view becomes the governed, documented data asset that BI tools and AI agents query. Dremio's AI-generated metadata feature can auto-document these views with descriptions and labels, saving hours of manual cataloging.

Step 4: Enable Reflections

Open the virtual dataset and navigate to the Reflections tab:

  1. Enable Raw Reflections to cache the full result set with optimized sort and partition orders. Best for queries that filter or scan rows.
  2. Enable Aggregate Reflections to pre-compute SUM, COUNT, and AVG aggregations. Best for dashboard queries and summary reports.

Which type to create depends on how the data is used: create the type that matches your dominant query patterns, or enable both if usage is mixed.
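That rule of thumb can be sketched as a tiny selector over workload statistics. The threshold values and the function below are illustrative assumptions, not Dremio guidance:

```python
def suggest_reflection(agg_query_share: float) -> list[str]:
    """Toy heuristic: choose Reflection types from the share of
    aggregation-style queries (GROUP BY / SUM / COUNT / AVG) in the
    workload hitting a dataset. Thresholds are illustrative only."""
    if agg_query_share >= 0.7:
        return ["aggregate"]   # dashboards and summary reports dominate
    if agg_query_share <= 0.3:
        return ["raw"]         # row-level filters and scans dominate
    return ["raw", "aggregate"]  # mixed usage: create both

# A dataset where 90% of queries are dashboard-style aggregations:
print(suggest_reflection(0.9))
```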

Dremio reads from Redshift once, builds the Reflection as an Apache Iceberg table on S3, and then serves all matching queries locally.

Step 5: Turn On Autonomous Reflections

Navigate to Project Settings → Reflections → Enable Autonomous Reflections.

Dremio now monitors your query patterns over 7 days and automatically creates, manages, and drops Reflections based on what your users actually run.

Important: Autonomous Reflections only consider datasets natively stored as Iceberg (or UniForm tables, Parquet datasets, and views built on them). While all Reflections are internally stored as Iceberg tables, those internal tables are not exposed from Dremio's catalog. Only your natively-stored Iceberg datasets are eligible for autonomous optimization. This gives you another reason to migrate data from Redshift to Iceberg over time: it unlocks autonomous performance management.

Dremio is an Iceberg-native lakehouse platform with a built-in Open Catalog based on Apache Polaris. Your Iceberg data is stored on your own S3 and accessible to any Iceberg-compatible engine (Spark, Flink, Trino). Dremio autonomously manages query performance without creating engine lock-in.

The system provisions a dedicated small engine for Reflection maintenance that auto-suspends after 30 seconds of idle time.

Dremio's AI Features: Beyond Query Offloading

AI SQL Functions

Run classification, generation, and summarization directly in SQL without exporting data from your analytics environment:

-- Detect PII in customer support tickets stored in Redshift
SELECT
  ticket_id,
  ticket_text,
  AI_CLASSIFY(ticket_text,
    ARRAY['contains_pii', 'no_pii']) AS pii_flag
FROM analytics.support_tickets
WHERE created_date > CURRENT_DATE - INTERVAL '30' DAY

With Redshift alone, this requires building an external ML pipeline: export data → process in SageMaker or a Lambda function → load results back. With Dremio, it is one SQL query.

The AI Agent

Dremio's built-in AI Agent handles the full analytical workflow:

  • Discover: Browse your Redshift tables and Dremio views using natural language. The Agent uses the semantic layer (virtual datasets, wikis, and labels) to understand your business terminology.
  • Analyze: Ask business questions in plain English and get SQL + results. The Agent writes the query, executes it, and presents the output without the user touching SQL.
  • Visualize: Generate charts directly in the Dremio console. The Agent can create bar charts, line charts, and tables from query results in a single conversational turn.
  • Optimize: The Agent can review slow queries, identify bottlenecks, and suggest performance improvements. It can also analyze past job history to surface recurring inefficiencies.

This reduces the number of ad-hoc Redshift queries because analysts self-serve through the Agent, which runs against Reflections rather than Redshift. Every question answered by the Agent is a query that did not consume RPU-hours or cluster compute.

Results Cache

Beyond Reflections, Dremio also maintains a Results Cache that automatically stores and reuses query results. If the same query is run again and the underlying data has not changed, Dremio returns the cached result instantly. This eliminates redundant computation for repetitive dashboard queries and further reduces the load on both Dremio's engine and any federated sources. Combined with Autonomous Reflections, the Results Cache creates a multi-layered acceleration strategy that maximizes cache hit rates across your analytical workload.
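Conceptually, a results cache keys each entry on both the query text and a version of the underlying data, so a cached answer is reused only while the data is unchanged. A minimal sketch of that idea (not Dremio's actual implementation):

```python
class ResultsCache:
    """Toy results cache: reuse a stored result only when the same
    query runs again AND the source data version has not changed."""

    def __init__(self):
        self._cache = {}  # (query, data_version) -> result

    def get_or_compute(self, query, data_version, compute):
        key = (query, data_version)
        if key not in self._cache:
            self._cache[key] = compute()  # cache miss: execute the query
        return self._cache[key]

cache = ResultsCache()
calls = []
run = lambda: calls.append(1) or 42  # stand-in for real query execution

cache.get_or_compute("SELECT 1", "v1", run)  # miss: executes
cache.get_or_compute("SELECT 1", "v1", run)  # hit: served from cache
cache.get_or_compute("SELECT 1", "v2", run)  # data changed: executes again
```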

AI-Generated Metadata

Dremio uses generative AI to auto-document your data catalog. By sampling table schemas and data, Dremio generates descriptions (wikis) and suggests tags (labels) for every dataset. This is particularly valuable for Redshift migrations because it builds the documentation that governance teams need without the weeks of manual effort that would normally be required.

MCP Server and External AI Integration

Dremio's MCP Server lets external AI tools (ChatGPT, Claude, custom agents) connect to your Dremio environment. Those AI tools query Dremio's Reflections, not Redshift. Every AI-generated query that hits Dremio instead of Redshift is a query you did not pay Redshift to process.

Architecture: How Dremio and Redshift Work Together

┌─────────────────────────────────────────────────────┐
│          BI Tools / AI Agents / Data Apps           │
│     (QuickSight, Tableau, Power BI, ChatGPT)        │
└──────────────────────┬──────────────────────────────┘
                       │
                       ▼
┌─────────────────────────────────────────────────────┐
│                   Dremio Cloud                      │
│                                                     │
│  ┌─────────────┐  ┌───────────────┐  ┌────────────┐ │
│  │  Semantic   │  │  Autonomous   │  │ AI Agent + │ │
│  │  Layer      │  │  Reflections  │  │ AI SQL     │ │
│  │  (Views +   │  │  (7-day       │  │ Functions  │ │
│  │  AI-gen     │  │  pattern      │  │            │ │
│  │  metadata)  │  │  learning)    │  │            │ │
│  └─────────────┘  └──────┬────────┘  └────────────┘ │
│                          │                          │
│  ┌───────────────────────▼───────────────────────┐  │
│  │    Apache Iceberg on S3 (Your Account)        │  │
│  │    Reflections + migrated data                │  │
│  └───────────────────────────────────────────────┘  │
└──────────────────────┬──────────────────────────────┘
                       │ Only cache-miss
                       │ queries
                       ▼
┌─────────────────────────────────────────────────────┐
│            Amazon Redshift                          │
│    (Provisioned or Serverless, reduced load)        │
└─────────────────────────────────────────────────────┘

The Long-Term Path: From Cost Reduction to Full Modernization

Phase 1 is query offloading and Reflections. Phase 2 is migrating data from Redshift to Apache Iceberg tables on S3.

Dremio's Open Catalog, built on Apache Polaris, manages your Iceberg tables with automatic maintenance: compaction, vacuuming, manifest rewriting, and sort ordering run in the background. Migrating to Iceberg is also what enables Autonomous Reflections, which only consider datasets natively stored as Iceberg. Once your data lives in Iceberg, you can query it with Dremio, Spark, Flink, or any Iceberg-compatible engine. Redshift can also query Iceberg tables via Spectrum, so the same data is accessible from both platforms. The Open Catalog uses the standard Iceberg REST protocol, so other engines can write to the same tables that Dremio is autonomously optimizing. No vendor lock-in.

| Phase | Action | Redshift Impact |
| --- | --- | --- |
| Phase 1 | Connect Redshift + enable Reflections | 40-60% compute reduction |
| Phase 2 | Migrate cold/warm tables to Iceberg on S3 | Eliminate Redshift storage costs |
| Phase 3 | Downsize or decommission Redshift cluster | Eliminate Redshift compute costs |

The migration is incremental. You do not need to move everything at once. Start with the most expensive tables (highest query volume) and work outward.
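Prioritizing "most expensive first" is just a sort over estimated per-table cost. A hedged sketch with entirely hypothetical table names and numbers:

```python
# Rank candidate tables for Iceberg migration by estimated monthly
# Redshift cost (query volume x estimated cost per query).
# All names and figures here are hypothetical, for illustration only.
tables = [
    {"name": "order_lines", "queries_per_month": 120_000, "cost_per_query": 0.02},
    {"name": "customers",   "queries_per_month": 40_000,  "cost_per_query": 0.01},
    {"name": "audit_log",   "queries_per_month": 500,     "cost_per_query": 0.05},
]
for t in tables:
    t["monthly_cost"] = t["queries_per_month"] * t["cost_per_query"]

# Migrate in descending cost order: the busiest tables move first.
migration_order = sorted(tables, key=lambda t: t["monthly_cost"], reverse=True)
print([t["name"] for t in migration_order])
```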

What Redshift Does Not Provide

| Capability | Redshift | Dremio |
| --- | --- | --- |
| Autonomous query acceleration | No built-in equivalent | Autonomous Reflections learn from 7-day query patterns |
| AI SQL functions | Limited ML functions via Redshift ML | AI_CLASSIFY, AI_GENERATE, AI_COMPLETE (LLM-native) |
| Built-in AI agent | None | Full analytical co-pilot with charts and recommendations |
| Federated queries | Limited to Redshift Spectrum (S3) and DataShares | Query Redshift + S3 + PostgreSQL + MongoDB + Snowflake + more |
| Open data format | Proprietary internal format | Native Apache Iceberg with full multi-engine interoperability |
| Self-documenting catalog | Manual Redshift Data Sharing/Glue descriptions | AI-generated wikis and labels on every dataset |
| Per-second billing (no minimum) | 60-second minimum (Serverless) | Consumption-based with no artificial minimums |

Get Started

  1. Sign up for Dremio Cloud: https://www.dremio.com/get-started
  2. Connect Redshift and enable Reflections on your busiest dashboard tables
  3. Monitor your AWS Cost Explorer for Redshift spend over the first week

The Autonomous Reflections feature means savings compound automatically. As Dremio learns your query patterns, it creates optimized Reflections without manual tuning. Most teams measure savings within 7 days.

For documentation and tutorials, visit Dremio Docs or take free courses at Dremio University.

Try Dremio Cloud free for 30 days

Deploy agentic analytics directly on Apache Iceberg data with no pipelines and no added overhead.