Amazon Redshift pricing looks straightforward on paper: provisioned clusters with reserved instances or Serverless with per-second RPU billing. In practice, the bill gets complicated fast. Provisioned clusters charge even when idle if you forget to pause them. Serverless bills a 60-second minimum per query, which inflates costs for the short dashboard queries that make up most BI workloads. Concurrency Scaling adds on-demand charges when parallel queries exceed your cluster's capacity. And Redshift Spectrum charges $5 per TB scanned when querying S3 data.
Dremio provides an alternative: keep Redshift for the workloads that need it, but offload the repetitive, expensive dashboard and reporting queries to Dremio's engine. Dremio's Autonomous Reflections serve those queries from Apache Iceberg tables on your own S3 storage, bypassing Redshift compute entirely. The typical result is a 40-60% reduction in Redshift compute costs within the first month, without migrating a single table.
This guide walks through the cost reduction strategy, the modern AI features that make Dremio more than a query accelerator, and a step-by-step setup.
Where Redshift Money Goes
Redshift compute costs break into two models, and both have structural inefficiencies.
Provisioned Clusters: Paying for Capacity You Don't Use
Provisioned Redshift clusters bill by the hour based on node type and count. A typical production cluster runs 3 ra3.xlplus nodes at $1.086/hour each, totaling $3.26/hour or approximately $2,350/month running 24/7.
The problem: most analytics workloads are bursty. Dashboards fire queries during business hours. ETL runs overnight. Between those peaks, the cluster sits idle, but you still pay. Pausing the cluster saves money but introduces latency when users need it resumed. Resizing a provisioned cluster to handle peak load adds capacity you pay for during off-peak hours.
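The idle-capacity math is easy to sketch. The model below uses the cluster from the example above plus an assumed busy window (12 hours on 22 weekdays); both are placeholder assumptions to swap for your own usage profile:

```python
# Back-of-envelope model of provisioned-cluster waste. The node rate and
# count come from the example above; the busy window is an assumption.
NODE_RATE = 1.086        # $/hour per ra3.xlplus node (us-east-1)
NODES = 3
HOURS_PER_MONTH = 720    # 30 days x 24 hours

monthly_cost = NODE_RATE * NODES * HOURS_PER_MONTH

# Assume the cluster does real work 12 h/day on 22 weekdays a month.
busy_hours = 12 * 22
idle_fraction = 1 - busy_hours / HOURS_PER_MONTH

print(f"monthly cost: ${monthly_cost:,.0f}")             # ~$2,346
print(f"idle share of paid hours: {idle_fraction:.0%}")  # ~63%
```

Under these assumptions, well over half of the paid node-hours go to an idle cluster, which is the gap the offloading strategy targets.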
Serverless: The 60-Second Tax
Redshift Serverless charges per RPU-hour (approximately $0.375/RPU-hour in us-east-1) with a minimum of 4 RPUs. The catch is the 60-second minimum billing per query. A dashboard query that takes 3 seconds still bills for 60 seconds of RPU compute.
For BI workloads, this 60-second minimum is expensive. If a user refreshes a dashboard with 8 charts, each chart fires a query. Even if each query runs in 5 seconds, you are billed for 8 full minutes of RPU compute (8 queries × 60 seconds). Multiply that across 50 users opening dashboards throughout the day and the serverless bill inflates rapidly.
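The inflation is easy to verify with the assumed figures from this example (8 charts, 5-second queries, a 4-RPU base, and the 60-second floor):

```python
# Billed vs. actual serverless compute for one dashboard refresh,
# using the assumed figures from the example above.
RPU_RATE = 0.375 / 3600   # $ per RPU-second (~$0.375/RPU-hour, us-east-1)
MIN_RPUS = 4              # minimum base capacity
BILL_FLOOR = 60           # minimum billed seconds per query

queries, runtime = 8, 5   # 8 chart queries, 5 seconds each

billed = queries * max(runtime, BILL_FLOOR) * MIN_RPUS * RPU_RATE
actual = queries * runtime * MIN_RPUS * RPU_RATE
print(f"billed ${billed:.2f} for ${actual:.3f} of work "
      f"({billed / actual:.0f}x inflation)")
```

At 5-second queries the floor produces a 12x markup, which is why the 10-20x range for BI dashboards in the table below is plausible.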
The Full Cost Picture
| Cost Component | Provisioned | Serverless | Impact |
| --- | --- | --- | --- |
| Idle compute | Cluster runs 24/7, idle nights and weekends | No idle charges (advantage) | Provisioned wastes 40-60% of capacity for bursty workloads |
| Short query tax | Minimal (cluster already running) | 60-second minimum per query | Serverless inflates BI dashboard costs 10-20x |
| Concurrency Scaling | 1 free hour/day, then on-demand rates | Included in RPU-hour | Provisioned sees unpredictable peaks |
| Redshift Spectrum | $5/TB scanned on S3 | Included in RPU-hour | Provisioned adds S3 query costs |
| Reserved Instance discount | Up to 75% with 3-year commitment | Up to 45% with 3-year reservation | Both require long-term commitment to save |
| Storage (RMS) | $0.024/GB-month | $0.024/GB-month | Same for both |
How Dremio Stops the Redshift Meter
Dremio connects to Redshift as a federated data source. When you enable Reflections, Dremio queries Redshift once to build an optimized Apache Iceberg copy of the data on your S3 storage. Every subsequent matching query is served by Dremio's engine. Redshift never sees the query. No RPU hours consumed. No cluster cycles used.
Cost Reduction Scenario: 50 Active Dashboard Users
Note: Dremio Cloud pricing includes both Dremio Compute Units (DCUs) and the underlying cloud infrastructure. Even accounting for both, the total is significantly less than the Redshift spend it replaces.
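One way to sketch the Redshift-side spend in a scenario like this is shown below. Every figure (refresh frequency, chart count) is a hypothetical placeholder, not measured data; substitute your own numbers:

```python
# Hypothetical estimate of serverless dashboard spend for 50 users.
# All usage figures below are illustrative assumptions.
users = 50
refreshes_per_user_per_day = 10
charts_per_dashboard = 8   # each chart fires one query
rpu_rate = 0.375           # $/RPU-hour
min_rpus, floor_seconds = 4, 60

queries_per_day = users * refreshes_per_user_per_day * charts_per_dashboard
rpu_hours = queries_per_day * floor_seconds * min_rpus / 3600
daily_cost = rpu_hours * rpu_rate
print(f"{queries_per_day} dashboard queries/day, ~${daily_cost:.0f}/day "
      "of serverless spend that Reflections could absorb")
```

Under these assumptions the dashboards alone generate roughly $100/day of RPU billing, all of which is eligible for offloading once Reflections cover the underlying views.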
Step-by-Step: Connect Redshift to Dremio
Step 1: Create a Free Dremio Cloud Account
Sign up at Dremio Cloud. The 30-day free trial includes every feature, including Autonomous Reflections and the AI capabilities covered later in this guide.
Step 2: Add Redshift as a Source
In the Dremio console, go to Sources → Add Source → Amazon Redshift.
Configure the connection:
Host: Your Redshift cluster endpoint or Serverless workgroup endpoint
Port: 5439 (default)
Database: The Redshift database containing your analytics tables
Username/Password: A Redshift user with SELECT privileges on the target tables
Dremio connects and catalogs your Redshift tables immediately.
Step 3: Build the Semantic Layer
Create virtual datasets (views) that encode your business logic:
```sql
CREATE VDS analytics.monthly_revenue AS
SELECT
    DATE_TRUNC('month', order_date) AS order_month,
    sales_region,
    product_line,
    SUM(line_total) AS revenue,
    COUNT(DISTINCT order_id) AS order_count,
    COUNT(DISTINCT customer_id) AS active_customers
FROM redshift_source.public.order_lines ol
JOIN redshift_source.public.customers c ON ol.customer_id = c.id
GROUP BY 1, 2, 3
```
This view becomes the governed, documented data asset that BI tools and AI agents query. Dremio's AI-generated metadata feature can auto-document these views with descriptions and labels, saving hours of manual cataloging.
Step 4: Enable Reflections
Open the virtual dataset and navigate to the Reflections tab:
Enable Raw Reflections to cache the full result set with optimized sort and partition orders. Best for queries that filter or scan rows.
Enable Aggregate Reflections to pre-compute SUM, COUNT, and AVG aggregations. Best for dashboard queries and summary reports.
Which type to create depends on how the data is used. Create the type that matches your dominant query patterns, or both if usage is mixed.
Dremio reads from Redshift once, builds the Reflection as an Apache Iceberg table on S3, and then serves all matching queries locally.
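The economics follow directly from that read-once pattern: the single Redshift scan is a one-time cost amortized over every query the Reflection then serves. A toy model with purely illustrative dollar figures:

```python
# Amortization sketch for a Reflection. All dollar figures are
# illustrative assumptions, not measured Redshift or Dremio prices.
build_scan_cost = 0.50     # one-time Redshift read to materialize
per_query_cost = 0.10      # what each offloaded query would have cost
matching_queries = 200     # queries the Reflection serves per day

redshift_spend_avoided = per_query_cost * matching_queries
net_savings = redshift_spend_avoided - build_scan_cost
break_even = build_scan_cost / per_query_cost
print(f"break-even after {break_even:.0f} queries; "
      f"~${net_savings:.2f}/day of Redshift spend avoided thereafter")
```

The break-even point arrives after a handful of queries; everything after that is Redshift compute the meter never sees (Dremio-side compute, covered in the pricing note above, is excluded from this sketch for simplicity).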
Step 5: Turn On Autonomous Reflections
Navigate to Project Settings → Reflections → Enable Autonomous Reflections.
Dremio now monitors your query patterns over 7 days and automatically creates, manages, and drops Reflections based on what your users actually run.
Important: Autonomous Reflections only consider datasets natively stored as Iceberg (or UniForm tables, Parquet datasets, and views built on them). While all Reflections are internally stored as Iceberg tables, those internal tables are not exposed from Dremio's catalog. Only your natively-stored Iceberg datasets are eligible for autonomous optimization. This gives you another reason to migrate data from Redshift to Iceberg over time: it unlocks autonomous performance management.
Dremio is an Iceberg-native lakehouse platform with a built-in Open Catalog based on Apache Polaris. Your Iceberg data is stored on your own S3 and accessible to any Iceberg-compatible engine (Spark, Flink, Trino). Dremio autonomously manages query performance without creating engine lock-in.
The system provisions a dedicated small engine for Reflection maintenance that auto-suspends after 30 seconds of idle time.
Dremio's AI Features: Beyond Query Offloading
AI SQL Functions
Run classification, generation, and summarization directly in SQL without exporting data from your analytics environment:
```sql
-- Detect PII in customer support tickets stored in Redshift
SELECT
    ticket_id,
    ticket_text,
    AI_CLASSIFY(ticket_text, ARRAY['contains_pii', 'no_pii']) AS pii_flag
FROM analytics.support_tickets
WHERE created_date > CURRENT_DATE - INTERVAL '30' DAY
```
With Redshift alone, this requires building an external ML pipeline: export data → process in SageMaker or a Lambda function → load results back. With Dremio, it is one SQL query.
The AI Agent
Dremio's built-in AI Agent handles the full analytical workflow:
Discover: Browse your Redshift tables and Dremio views using natural language. The Agent uses the semantic layer (virtual datasets, wikis, and labels) to understand your business terminology.
Analyze: Ask business questions in plain English and get SQL + results. The Agent writes the query, executes it, and presents the output without the user touching SQL.
Visualize: Generate charts directly in the Dremio console. The Agent can create bar charts, line charts, and tables from query results in a single conversational turn.
Optimize: The Agent can review slow queries, identify bottlenecks, and suggest performance improvements. It can also analyze past job history to surface recurring inefficiencies.
This reduces the number of ad-hoc Redshift queries because analysts self-serve through the Agent, which runs against Reflections rather than Redshift. Every question answered by the Agent is a query that did not consume RPU-hours or cluster compute.
Results Cache
Beyond Reflections, Dremio also maintains a Results Cache that automatically stores and reuses query results. If the same query is run again and the underlying data has not changed, Dremio returns the cached result instantly. This eliminates redundant computation for repetitive dashboard queries and further reduces the load on both Dremio's engine and any federated sources. Combined with Autonomous Reflections, the Results Cache creates a multi-layered acceleration strategy that maximizes cache hit rates across your analytical workload.
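Conceptually, a results cache behaves like memoization keyed on the query plus a data version, so a hit requires the underlying data to be unchanged. A toy sketch of that idea (not Dremio's actual internals):

```python
# Conceptual illustration of a results cache: reuse a result only while
# both the query text and the data version match a prior execution.
cache = {}

def run_query(sql, data_version, execute):
    key = (sql, data_version)
    if key not in cache:
        cache[key] = execute(sql)   # miss: compute once and store
    return cache[key]

executions = []
def engine(sql):
    executions.append(sql)          # track real work done
    return f"rows for {sql}"

run_query("SELECT 1", "v1", engine)
run_query("SELECT 1", "v1", engine)  # hit: data unchanged, no work
run_query("SELECT 1", "v2", engine)  # data changed, re-executed
print(len(executions))               # 2 executions for 3 queries
```

Three identical queries cost only two executions; the second call does no work at all, which is the behavior repetitive dashboard refreshes benefit from.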
AI-Generated Metadata
Dremio uses generative AI to auto-document your data catalog. By sampling table schemas and data, Dremio generates descriptions (wikis) and suggests tags (labels) for every dataset. This is particularly valuable for Redshift migrations because it builds the documentation that governance teams need without the weeks of manual effort that would normally be required.
MCP Server and External AI Integration
Dremio's MCP Server lets external AI tools (ChatGPT, Claude, custom agents) connect to your Dremio environment. Those AI tools query Dremio's Reflections, not Redshift. Every AI-generated query that hits Dremio instead of Redshift is a query you did not pay Redshift to process.
Architecture: How Dremio and Redshift Work Together
The Long-Term Path: From Cost Reduction to Full Modernization
Phase 1 is query offloading and Reflections. Phase 2 is migrating data from Redshift to Apache Iceberg tables on S3.
Dremio's Open Catalog, built on Apache Polaris, manages your Iceberg tables with automatic maintenance: compaction, vacuuming, manifest rewriting, and sort ordering run in the background. Migrating to Iceberg is also what enables Autonomous Reflections, which only consider datasets natively stored as Iceberg. Once your data lives in Iceberg, you can query it with Dremio, Spark, Flink, or any Iceberg-compatible engine. Redshift can also query Iceberg tables via Spectrum, so the same data is accessible from both platforms. The Open Catalog uses the standard Iceberg REST protocol, so other engines can write to the same tables that Dremio is autonomously optimizing. No vendor lock-in.
| Phase | Action | Redshift Impact |
| --- | --- | --- |
| Phase 1 | Connect Redshift + enable Reflections | 40-60% compute reduction |
| Phase 2 | Migrate cold/warm tables to Iceberg on S3 | Eliminate Redshift storage costs |
| Phase 3 | Downsize or decommission Redshift cluster | Eliminate Redshift compute costs |
The migration is incremental. You do not need to move everything at once. Start with the most expensive tables (highest query volume) and work outward.
What Redshift Does Not Provide
| Capability | Redshift | Dremio |
| --- | --- | --- |
| Autonomous query acceleration | No built-in equivalent | Autonomous Reflections learn from 7-day query patterns |
To get started:

- Connect Redshift and enable Reflections on your busiest dashboard tables.
- Monitor AWS Cost Explorer for Redshift spend over the first week.
The Autonomous Reflections feature means savings compound automatically. As Dremio learns your query patterns, it creates optimized Reflections without manual tuning. Most teams measure savings within 7 days.