Dremio Blog

31 minute read · February 27, 2026

Build Healthcare Analytics with Dremio Cloud

Alex Merced Alex Merced Head of DevRel, Dremio
Start For Free
Build Healthcare Analytics with Dremio Cloud
Copied to clipboard

Key Takeaways

  • Healthcare organizations face challenges with patient data from various sources, leading to outdated analytics.
  • This tutorial teaches you to build a healthcare analytics pipeline on Dremio Cloud to unify data in real-time.
  • You'll create Bronze, Silver, and Gold views to analyze readmission risk and costs using natural language queries.
  • Apache Iceberg helps manage healthcare data by ensuring compliance, schema evolution, and automated performance.
  • Finally, you can use the AI Agent for insightful queries that enhance real-time decision-making in healthcare.

Healthcare organizations collect patient data across Electronic Health Record (EHR) systems, insurance claims platforms, and pharmacy databases. A care coordinator trying to identify patients at risk of readmission needs data from all three. Most organizations solve this with batch ETL jobs that run overnight, meaning clinicians are always working with yesterday's data.

This tutorial shows you how to build a healthcare analytics pipeline on Dremio Cloud that unifies patient, claims, and prescription data in real time. You'll create sample datasets, model them into Bronze, Silver, and Gold views, and use the AI Agent to analyze readmission risk and cost patterns through natural language questions.

Try Dremio’s Interactive Demo

Explore this interactive demo and see how Dremio's Intelligent Lakehouse enables Agentic AI

What You'll Build

  • Source tables simulating an EHR system (PostgreSQL), a claims processor (S3), and a pharmacy database (MySQL)
  • Bronze views that standardize dates and column naming
  • Silver views that join patient records with encounters and claims
  • Gold views for readmission risk scoring and cost-per-encounter analysis
  • AI-generated wikis and tags for clinical metadata
  • Natural language analytics for healthcare questions

Step 1: Sign Up for Dremio Cloud

Go to dremio.com/get-started and create a free trial account. You get 30 days and $400 in compute credits.

Your trial includes a project with an Open Catalog built on Apache Polaris and Dremio-managed storage. Navigate to the SQL Runner from the left sidebar.

Step 2: Create Folders and Seed Data

Create a folder structure to simulate the different clinical systems:

CREATE FOLDER IF NOT EXISTS healthcareexp;
CREATE FOLDER IF NOT EXISTS healthcareexp.postgres;
CREATE FOLDER IF NOT EXISTS healthcareexp.s3;
CREATE FOLDER IF NOT EXISTS healthcareexp.mysql;

Note: Tables in the built-in Open Catalog use folder.subfolder.table_name without a catalog prefix. External sources use source_name.schema.table_name.

Patients Table (EHR)

CREATE TABLE healthcareexp.postgres.patients (
  patient_id INT,
  first_name VARCHAR,
  last_name VARCHAR,
  date_of_birth DATE,
  gender VARCHAR,
  zip_code VARCHAR,
  insurance_type VARCHAR,
  primary_condition VARCHAR
);

INSERT INTO healthcareexp.postgres.patients VALUES
(1,'Maria','Santos','1958-03-12','F','10001','Medicare','diabetes_type2'),
(2,'James','Cooper','1972-07-25','M','30301','Commercial','hypertension'),
(3,'Linda','Park','1965-11-08','F','60601','Medicare','heart_failure'),
(4,'Robert','Hughes','1980-02-14','M','90001','Commercial','asthma'),
(5,'Susan','Rivera','1955-09-30','F','77001','Medicare','copd'),
(6,'Michael','Foster','1968-06-18','M','33101','Commercial','diabetes_type2'),
(7,'Patricia','Kim','1975-12-05','F','98101','Medicaid','depression'),
(8,'William','Brooks','1960-04-22','M','02101','Medicare','heart_failure'),
(9,'Jennifer','Lee','1983-08-17','F','85001','Commercial','hypertension'),
(10,'David','Morgan','1952-01-09','M','19101','Medicare','copd'),
(11,'Elizabeth','Chen','1970-10-28','F','94101','Commercial','diabetes_type2'),
(12,'Richard','Davis','1963-05-15','M','48201','Medicare','heart_failure'),
(13,'Nancy','Wilson','1978-03-03','F','28201','Commercial','asthma'),
(14,'Thomas','Garcia','1956-12-20','M','75201','Medicare','copd'),
(15,'Karen','Martinez','1981-07-11','F','80201','Medicaid','depression'),
(16,'Charles','Anderson','1967-09-06','M','55401','Commercial','hypertension'),
(17,'Betty','Thomas','1954-02-18','F','97201','Medicare','diabetes_type2'),
(18,'Joseph','Jackson','1973-11-25','M','32801','Commercial','heart_failure'),
(19,'Dorothy','White','1979-04-30','F','37201','Medicaid','asthma'),
(20,'Daniel','Harris','1961-08-14','M','44101','Medicare','copd'),
(21,'Sandra','Clark','1976-01-22','F','78201','Commercial','diabetes_type2'),
(22,'Mark','Lewis','1958-06-09','M','92101','Medicare','heart_failure'),
(23,'Ashley','Robinson','1984-10-05','F','46201','Commercial','hypertension'),
(24,'Paul','Walker','1962-03-17','M','53201','Medicare','copd'),
(25,'Emily','Hall','1971-12-28','F','40201','Medicaid','depression'),
(26,'Steven','Allen','1966-05-03','M','27601','Commercial','diabetes_type2'),
(27,'Margaret','Young','1953-09-21','F','23451','Medicare','heart_failure'),
(28,'Andrew','Hernandez','1977-02-12','M','78701','Commercial','asthma'),
(29,'Donna','King','1985-07-08','F','32601','Medicaid','depression'),
(30,'Joshua','Wright','1959-11-16','M','20001','Medicare','copd'),
(31,'Carol','Lopez','1974-04-24','F','89101','Commercial','hypertension'),
(32,'Kevin','Hill','1964-08-30','M','63101','Medicare','diabetes_type2'),
(33,'Michelle','Scott','1982-01-15','F','73101','Commercial','asthma'),
(34,'Brian','Green','1957-10-07','M','43201','Medicare','heart_failure'),
(35,'Amanda','Adams','1969-06-23','F','30101','Medicaid','depression'),
(36,'George','Baker','1975-03-09','M','21201','Commercial','hypertension'),
(37,'Deborah','Gonzalez','1960-12-01','F','84101','Medicare','copd'),
(38,'Edward','Nelson','1972-05-18','M','15201','Commercial','diabetes_type2'),
(39,'Laura','Carter','1986-09-27','F','45201','Medicaid','asthma'),
(40,'Ronald','Mitchell','1955-07-14','M','70112','Medicare','heart_failure'),
(41,'Stephanie','Perez','1968-02-06','F','87101','Commercial','hypertension'),
(42,'Timothy','Roberts','1963-11-22','M','29401','Medicare','copd'),
(43,'Rebecca','Turner','1980-04-10','F','50301','Commercial','diabetes_type2'),
(44,'Jeffrey','Phillips','1956-08-25','M','37801','Medicare','heart_failure'),
(45,'Sharon','Campbell','1973-01-18','F','99501','Medicaid','depression'),
(46,'Ryan','Parker','1967-06-30','M','68101','Commercial','asthma'),
(47,'Cynthia','Evans','1954-10-12','F','39201','Medicare','copd'),
(48,'Jacob','Edwards','1978-03-28','M','96801','Commercial','hypertension'),
(49,'Kathleen','Collins','1983-12-04','F','35801','Medicaid','depression'),
(50,'Gary','Stewart','1961-05-21','M','57101','Medicare','diabetes_type2');

Encounters Table (EHR)

CREATE TABLE healthcareexp.postgres.encounters (
  encounter_id INT,
  patient_id INT,
  encounter_date DATE,
  encounter_type VARCHAR,
  diagnosis_code VARCHAR,
  diagnosis_desc VARCHAR,
  provider_id INT,
  department VARCHAR,
  discharge_disposition VARCHAR
);

INSERT INTO healthcareexp.postgres.encounters VALUES
(1,1,'2024-01-15','inpatient','E11.65','Type 2 diabetes with hyperglycemia',101,'endocrinology','home'),
(2,1,'2024-02-28','emergency','E11.65','Type 2 diabetes with hyperglycemia',102,'emergency','inpatient_admit'),
(3,1,'2024-03-05','inpatient','E11.65','Type 2 diabetes with hyperglycemia',101,'endocrinology','home'),
(4,3,'2024-01-20','inpatient','I50.9','Heart failure unspecified',103,'cardiology','home'),
(5,3,'2024-02-10','emergency','I50.9','Heart failure unspecified',104,'emergency','inpatient_admit'),
(6,3,'2024-02-15','inpatient','I50.9','Heart failure unspecified',103,'cardiology','snf'),
(7,5,'2024-01-08','inpatient','J44.1','COPD with acute exacerbation',105,'pulmonology','home'),
(8,5,'2024-03-20','emergency','J44.1','COPD with acute exacerbation',104,'emergency','inpatient_admit'),
(9,5,'2024-03-25','inpatient','J44.1','COPD with acute exacerbation',105,'pulmonology','home'),
(10,8,'2024-02-01','inpatient','I50.9','Heart failure unspecified',103,'cardiology','home'),
(11,8,'2024-03-15','inpatient','I50.9','Heart failure unspecified',103,'cardiology','home'),
(12,10,'2024-01-25','inpatient','J44.1','COPD with acute exacerbation',105,'pulmonology','home'),
(13,10,'2024-04-10','emergency','J44.1','COPD with acute exacerbation',104,'emergency','observation'),
(14,12,'2024-02-05','inpatient','I50.9','Heart failure unspecified',103,'cardiology','snf'),
(15,12,'2024-03-01','emergency','I50.9','Heart failure unspecified',104,'emergency','inpatient_admit'),
(16,12,'2024-03-08','inpatient','I50.9','Heart failure unspecified',103,'cardiology','home'),
(17,14,'2024-01-12','inpatient','J44.1','COPD with acute exacerbation',105,'pulmonology','home'),
(18,17,'2024-02-20','outpatient','E11.65','Type 2 diabetes with hyperglycemia',101,'endocrinology','home'),
(19,20,'2024-03-10','inpatient','J44.1','COPD with acute exacerbation',105,'pulmonology','home'),
(20,22,'2024-01-30','emergency','I50.9','Heart failure unspecified',104,'emergency','inpatient_admit'),
(21,22,'2024-02-05','inpatient','I50.9','Heart failure unspecified',103,'cardiology','home'),
(22,24,'2024-03-18','inpatient','J44.1','COPD with acute exacerbation',105,'pulmonology','snf'),
(23,27,'2024-02-12','emergency','I50.9','Heart failure unspecified',104,'emergency','inpatient_admit'),
(24,27,'2024-02-18','inpatient','I50.9','Heart failure unspecified',103,'cardiology','home'),
(25,30,'2024-01-05','inpatient','J44.1','COPD with acute exacerbation',105,'pulmonology','home'),
(26,32,'2024-04-01','outpatient','E11.65','Type 2 diabetes with hyperglycemia',101,'endocrinology','home'),
(27,34,'2024-02-25','inpatient','I50.9','Heart failure unspecified',103,'cardiology','snf'),
(28,34,'2024-03-20','emergency','I50.9','Heart failure unspecified',104,'emergency','inpatient_admit'),
(29,34,'2024-03-28','inpatient','I50.9','Heart failure unspecified',103,'cardiology','home'),
(30,37,'2024-01-18','inpatient','J44.1','COPD with acute exacerbation',105,'pulmonology','home'),
(31,40,'2024-03-05','emergency','I50.9','Heart failure unspecified',104,'emergency','inpatient_admit'),
(32,40,'2024-03-12','inpatient','I50.9','Heart failure unspecified',103,'cardiology','home'),
(33,42,'2024-02-08','inpatient','J44.1','COPD with acute exacerbation',105,'pulmonology','home'),
(34,44,'2024-04-05','inpatient','I50.9','Heart failure unspecified',103,'cardiology','snf'),
(35,44,'2024-04-25','emergency','I50.9','Heart failure unspecified',104,'emergency','inpatient_admit'),
(36,47,'2024-01-22','inpatient','J44.1','COPD with acute exacerbation',105,'pulmonology','home'),
(37,50,'2024-03-15','outpatient','E11.65','Type 2 diabetes with hyperglycemia',101,'endocrinology','home'),
(38,2,'2024-02-15','outpatient','I10','Essential hypertension',106,'internal_medicine','home'),
(39,4,'2024-03-08','outpatient','J45.20','Mild intermittent asthma',107,'pulmonology','home'),
(40,6,'2024-01-28','outpatient','E11.65','Type 2 diabetes with hyperglycemia',101,'endocrinology','home'),
(41,9,'2024-04-12','outpatient','I10','Essential hypertension',106,'internal_medicine','home'),
(42,11,'2024-02-22','outpatient','E11.65','Type 2 diabetes with hyperglycemia',101,'endocrinology','home'),
(43,16,'2024-03-25','outpatient','I10','Essential hypertension',106,'internal_medicine','home'),
(44,21,'2024-04-08','outpatient','E11.65','Type 2 diabetes with hyperglycemia',101,'endocrinology','home'),
(45,23,'2024-01-15','outpatient','I10','Essential hypertension',106,'internal_medicine','home'),
(46,26,'2024-02-28','outpatient','E11.65','Type 2 diabetes with hyperglycemia',101,'endocrinology','home'),
(47,31,'2024-03-30','outpatient','I10','Essential hypertension',106,'internal_medicine','home'),
(48,36,'2024-04-15','outpatient','I10','Essential hypertension',106,'internal_medicine','home'),
(49,38,'2024-01-10','outpatient','E11.65','Type 2 diabetes with hyperglycemia',101,'endocrinology','home'),
(50,43,'2024-02-18','outpatient','E11.65','Type 2 diabetes with hyperglycemia',101,'endocrinology','home');

Claims Table (Insurance)

CREATE TABLE healthcareexp.s3.claims (
  claim_id INT,
  patient_id INT,
  encounter_id INT,
  claim_date DATE,
  claim_type VARCHAR,
  billed_amount DECIMAL(10,2),
  allowed_amount DECIMAL(10,2),
  paid_amount DECIMAL(10,2),
  status VARCHAR
);

INSERT INTO healthcareexp.s3.claims VALUES
(9001,1,1,'2024-01-20','institutional',18500.00,15200.00,12800.00,'paid'),
(9002,1,2,'2024-03-05','institutional',8200.00,6800.00,5400.00,'paid'),
(9003,1,3,'2024-03-12','institutional',22000.00,18500.00,15200.00,'paid'),
(9004,3,4,'2024-01-28','institutional',32000.00,27500.00,22000.00,'paid'),
(9005,3,5,'2024-02-15','institutional',9500.00,7800.00,6200.00,'paid'),
(9006,3,6,'2024-02-25','institutional',45000.00,38000.00,30400.00,'paid'),
(9007,5,7,'2024-01-15','institutional',16000.00,13200.00,10500.00,'paid'),
(9008,5,8,'2024-03-25','institutional',7800.00,6500.00,5200.00,'paid'),
(9009,5,9,'2024-04-01','institutional',19500.00,16000.00,12800.00,'paid'),
(9010,8,10,'2024-02-08','institutional',28000.00,23500.00,18800.00,'paid'),
(9011,8,11,'2024-03-22','institutional',25000.00,21000.00,16800.00,'paid'),
(9012,10,12,'2024-02-02','institutional',14500.00,12000.00,9600.00,'paid'),
(9013,10,13,'2024-04-15','institutional',5200.00,4300.00,3400.00,'paid'),
(9014,12,14,'2024-02-12','institutional',38000.00,32000.00,25600.00,'paid'),
(9015,12,15,'2024-03-08','institutional',8800.00,7200.00,5700.00,'paid'),
(9016,12,16,'2024-03-18','institutional',30000.00,25500.00,20400.00,'paid'),
(9017,14,17,'2024-01-20','institutional',15000.00,12500.00,10000.00,'paid'),
(9018,17,18,'2024-02-25','professional',450.00,380.00,300.00,'paid'),
(9019,20,19,'2024-03-18','institutional',17000.00,14000.00,11200.00,'paid'),
(9020,22,20,'2024-02-05','institutional',9200.00,7600.00,6000.00,'paid'),
(9021,22,21,'2024-02-12','institutional',26000.00,22000.00,17600.00,'paid'),
(9022,24,22,'2024-03-25','institutional',35000.00,29500.00,23600.00,'paid'),
(9023,27,23,'2024-02-18','institutional',8500.00,7000.00,5600.00,'paid'),
(9024,27,24,'2024-02-25','institutional',24000.00,20000.00,16000.00,'paid'),
(9025,30,25,'2024-01-12','institutional',13500.00,11200.00,8900.00,'paid'),
(9026,32,26,'2024-04-08','professional',380.00,320.00,250.00,'paid'),
(9027,34,27,'2024-03-05','institutional',40000.00,34000.00,27200.00,'paid'),
(9028,34,28,'2024-03-25','institutional',9000.00,7500.00,6000.00,'paid'),
(9029,34,29,'2024-04-05','institutional',28000.00,23500.00,18800.00,'paid'),
(9030,37,30,'2024-01-25','institutional',14000.00,11500.00,9200.00,'paid'),
(9031,40,31,'2024-03-12','institutional',8000.00,6600.00,5200.00,'paid'),
(9032,40,32,'2024-03-20','institutional',27000.00,22500.00,18000.00,'paid'),
(9033,42,33,'2024-02-15','institutional',15500.00,12800.00,10200.00,'paid'),
(9034,44,34,'2024-04-12','institutional',36000.00,30500.00,24400.00,'paid'),
(9035,44,35,'2024-05-02','institutional',9500.00,7800.00,6200.00,'paid'),
(9036,47,36,'2024-01-30','institutional',13000.00,10800.00,8600.00,'paid'),
(9037,50,37,'2024-03-22','professional',420.00,350.00,280.00,'paid'),
(9038,2,38,'2024-02-20','professional',350.00,290.00,230.00,'paid'),
(9039,4,39,'2024-03-15','professional',400.00,330.00,260.00,'paid'),
(9040,6,40,'2024-02-05','professional',450.00,380.00,300.00,'paid'),
(9041,9,41,'2024-04-18','professional',320.00,270.00,210.00,'paid'),
(9042,11,42,'2024-03-01','professional',450.00,380.00,300.00,'paid'),
(9043,16,43,'2024-04-02','professional',350.00,290.00,230.00,'paid'),
(9044,21,44,'2024-04-15','professional',450.00,380.00,300.00,'paid'),
(9045,23,45,'2024-01-22','professional',320.00,270.00,210.00,'paid'),
(9046,26,46,'2024-03-08','professional',450.00,380.00,300.00,'paid'),
(9047,31,47,'2024-04-08','professional',350.00,290.00,230.00,'paid'),
(9048,36,48,'2024-04-22','professional',320.00,270.00,210.00,'paid'),
(9049,38,49,'2024-01-18','professional',450.00,380.00,300.00,'paid'),
(9050,43,50,'2024-02-25','professional',420.00,350.00,280.00,'paid');

Prescriptions Table (Pharmacy)

CREATE TABLE healthcareexp.mysql.prescriptions (
  rx_id INT,
  patient_id INT,
  prescribed_date DATE,
  medication VARCHAR,
  dosage VARCHAR,
  frequency VARCHAR,
  prescriber_id INT,
  refills_remaining INT
);

INSERT INTO healthcareexp.mysql.prescriptions VALUES
(7001,1,'2024-01-15','Metformin','1000mg','twice_daily',101,5),
(7002,1,'2024-01-15','Lisinopril','10mg','once_daily',101,5),
(7003,3,'2024-01-20','Furosemide','40mg','once_daily',103,3),
(7004,3,'2024-01-20','Carvedilol','12.5mg','twice_daily',103,3),
(7005,5,'2024-01-08','Albuterol','90mcg','as_needed',105,2),
(7006,5,'2024-01-08','Tiotropium','18mcg','once_daily',105,5),
(7007,8,'2024-02-01','Furosemide','40mg','once_daily',103,3),
(7008,8,'2024-02-01','Lisinopril','20mg','once_daily',103,5),
(7009,10,'2024-01-25','Albuterol','90mcg','as_needed',105,2),
(7010,12,'2024-02-05','Furosemide','80mg','once_daily',103,3),
(7011,12,'2024-02-05','Carvedilol','25mg','twice_daily',103,3),
(7012,14,'2024-01-12','Tiotropium','18mcg','once_daily',105,5),
(7013,2,'2024-02-15','Amlodipine','5mg','once_daily',106,5),
(7014,4,'2024-03-08','Albuterol','90mcg','as_needed',107,2),
(7015,6,'2024-01-28','Metformin','500mg','twice_daily',101,5),
(7016,9,'2024-04-12','Losartan','50mg','once_daily',106,5),
(7017,11,'2024-02-22','Metformin','1000mg','twice_daily',101,5),
(7018,16,'2024-03-25','Lisinopril','10mg','once_daily',106,5),
(7019,17,'2024-02-20','Metformin','500mg','twice_daily',101,5),
(7020,20,'2024-03-10','Tiotropium','18mcg','once_daily',105,5),
(7021,21,'2024-04-08','Metformin','1000mg','twice_daily',101,5),
(7022,22,'2024-01-30','Furosemide','40mg','once_daily',103,3),
(7023,24,'2024-03-18','Albuterol','90mcg','as_needed',105,2),
(7024,26,'2024-02-28','Metformin','500mg','twice_daily',101,5),
(7025,27,'2024-02-12','Furosemide','60mg','once_daily',103,3),
(7026,30,'2024-01-05','Tiotropium','18mcg','once_daily',105,5),
(7027,32,'2024-04-01','Metformin','500mg','twice_daily',101,5),
(7028,34,'2024-02-25','Furosemide','80mg','once_daily',103,3),
(7029,34,'2024-02-25','Carvedilol','25mg','twice_daily',103,3),
(7030,37,'2024-01-18','Albuterol','90mcg','as_needed',105,2),
(7031,40,'2024-03-05','Furosemide','40mg','once_daily',103,3),
(7032,42,'2024-02-08','Tiotropium','18mcg','once_daily',105,5),
(7033,44,'2024-04-05','Furosemide','60mg','once_daily',103,3),
(7034,47,'2024-01-22','Albuterol','90mcg','as_needed',105,2),
(7035,50,'2024-03-15','Metformin','1000mg','twice_daily',101,5),
(7036,7,'2024-02-10','Sertraline','50mg','once_daily',108,5),
(7037,15,'2024-03-01','Sertraline','100mg','once_daily',108,5),
(7038,19,'2024-04-05','Albuterol','90mcg','as_needed',107,2),
(7039,25,'2024-03-15','Fluoxetine','20mg','once_daily',108,5),
(7040,29,'2024-02-20','Sertraline','50mg','once_daily',108,5),
(7041,33,'2024-03-12','Albuterol','90mcg','as_needed',107,2),
(7042,35,'2024-04-10','Fluoxetine','20mg','once_daily',108,5),
(7043,39,'2024-01-15','Albuterol','90mcg','as_needed',107,2),
(7044,41,'2024-02-06','Losartan','50mg','once_daily',106,5),
(7045,43,'2024-03-20','Metformin','500mg','twice_daily',101,5),
(7046,45,'2024-04-18','Sertraline','75mg','once_daily',108,5),
(7047,46,'2024-03-28','Albuterol','90mcg','as_needed',107,2),
(7048,48,'2024-04-12','Amlodipine','5mg','once_daily',106,5),
(7049,49,'2024-03-05','Fluoxetine','20mg','once_daily',108,5),
(7050,36,'2024-04-15','Lisinopril','10mg','once_daily',106,5);

Why Apache Iceberg Matters for Healthcare Data

Healthcare data has strict regulatory requirements and high analytical demands. Apache Iceberg addresses both.

HIPAA Audit Trails. Iceberg's time travel capability lets you query the exact state of any table at a specific point in time. When an auditor asks what patient data was accessible on a particular date, you can answer with a query, not a backup tape.

Schema Evolution. Clinical data models change. New diagnosis codes get added. Regulatory fields get introduced. Iceberg lets you add, rename, or drop columns without rewriting existing data files. Historical records stay intact while new records pick up the updated schema.

Automated Performance Management. Dremio automatically compacts small files, rewrites manifests, and clusters data in your Iceberg tables. Encounter tables that grow by thousands of records per week stay fast without manual maintenance.

Autonomous Reflections and Results Cache. Common clinical queries like readmission rate by diagnosis or average cost per encounter type follow repeating patterns. Dremio detects these patterns and automatically creates pre-computed Reflections that accelerate future queries. The Results Cache returns identical queries instantly, which is critical when multiple clinicians run the same population health reports.

Iceberg vs. Federated: When to Use Which

In this tutorial, all tables are Iceberg because they're in the Open Catalog. In production, the choice depends on clinical workflow requirements.

Keep data federated when clinical timeliness matters. An EHR system updates patient records in real time as encounters happen. If you replicate that data to Iceberg on an hourly ETL schedule, a care coordinator reviewing a patient who was just admitted might be working with stale data. By querying the EHR's PostgreSQL database directly through Dremio's federation, every query sees the latest admissions, discharges, and lab results. You can still create manual Reflections on those federated views to accelerate repeated queries, and you control the refresh cadence to balance performance against freshness.

Migrate data to Iceberg when you need Dremio's automated performance management or when the data is historical. Closed claims, finalized encounter records, and historical prescription data don't change after they're settled. These datasets benefit from automatic compaction, manifest optimization, clustering, and Autonomous Reflections without any freshness trade-off.

The practical approach for healthcare: federate your live EHR, pharmacy, and active claims systems for real-time visibility. Migrate settled claims, historical encounters, and aggregated population health datasets to Iceberg for automated optimization.

Step 3: Build Bronze Views

Bronze views standardize the raw tables from each clinical system into a consistent interface. Healthcare data is notoriously inconsistent across systems: EHRs store dates differently than claims platforms, column naming conventions vary, and field semantics overlap (a "status" column means different things in encounters vs. claims). The Bronze layer normalizes these differences so downstream views work with a predictable schema.

Create the layer folders:

CREATE FOLDER IF NOT EXISTS healthcareexp.bronze;
CREATE FOLDER IF NOT EXISTS healthcareexp.silver;
CREATE FOLDER IF NOT EXISTS healthcareexp.gold;

This view standardizes the patient demographics table. The date-of-birth cast to TIMESTAMP ensures consistent date arithmetic when calculating age or filtering by date ranges downstream.

CREATE VIEW healthcareexp.bronze.ehr_patients AS
SELECT
  patient_id,
  first_name,
  last_name,
  CAST(date_of_birth AS TIMESTAMP) AS dob_timestamp,
  gender,
  zip_code,
  insurance_type,
  primary_condition
FROM healthcareexp.postgres.patients;

The encounters view brings in the encounter date, type, diagnosis code, department, and discharge disposition. The discharge_disposition field is critical for readmission modeling because patients discharged to a Skilled Nursing Facility (SNF) have different readmission patterns than those discharged home.

CREATE VIEW healthcareexp.bronze.ehr_encounters AS
SELECT
  encounter_id,
  patient_id,
  CAST(encounter_date AS TIMESTAMP) AS encounter_timestamp,
  encounter_type,
  diagnosis_code,
  diagnosis_desc,
  provider_id,
  department,
  discharge_disposition
FROM healthcareexp.postgres.encounters;

The claims view renames status to claim_status to avoid collisions and maintains the three-tier cost structure (billed, allowed, paid) that's essential for financial analysis. The difference between billed_amount and paid_amount represents patient responsibility, which we'll compute in the Gold layer.

CREATE VIEW healthcareexp.bronze.insurance_claims AS
SELECT
  claim_id,
  patient_id,
  encounter_id,
  CAST(claim_date AS TIMESTAMP) AS claim_timestamp,
  claim_type,
  billed_amount,
  allowed_amount,
  paid_amount,
  status AS claim_status
FROM healthcareexp.s3.claims;

The pharmacy view standardizes prescriptions with a date cast. Prescription data is useful for understanding medication adherence patterns and correlating treatment intensity with clinical outcomes.

CREATE VIEW healthcareexp.bronze.pharmacy_rx AS
SELECT
  rx_id,
  patient_id,
  CAST(prescribed_date AS TIMESTAMP) AS prescribed_timestamp,
  medication,
  dosage,
  frequency,
  prescriber_id,
  refills_remaining
FROM healthcareexp.mysql.prescriptions;

Step 4: Build Silver Views

Silver views combine clinical data across systems to create patient-centric analytics. Each Silver view answers a specific clinical question by joining data that previously required manual chart review.

Patient Encounter History

This view answers: "What does each patient's utilization pattern look like?" It counts total encounters by type (inpatient, emergency, outpatient) and tracks SNF discharges. These are the core indicators that CMS (Centers for Medicare & Medicaid Services) readmission models use to predict 30-day readmission risk. The LEFT JOIN ensures patients with zero encounters still appear, which matters for population health reporting.

CREATE VIEW healthcareexp.silver.patient_encounters AS
SELECT
  p.patient_id,
  p.first_name || ' ' || p.last_name AS patient_name,
  p.gender,
  p.insurance_type,
  p.primary_condition,
  COUNT(e.encounter_id) AS total_encounters,
  SUM(CASE WHEN e.encounter_type = 'inpatient' THEN 1 ELSE 0 END) AS inpatient_count,
  SUM(CASE WHEN e.encounter_type = 'emergency' THEN 1 ELSE 0 END) AS ed_visits,
  SUM(CASE WHEN e.discharge_disposition = 'snf' THEN 1 ELSE 0 END) AS snf_discharges,
  MIN(e.encounter_timestamp) AS first_encounter,
  MAX(e.encounter_timestamp) AS last_encounter
FROM healthcareexp.bronze.ehr_patients p
LEFT JOIN healthcareexp.bronze.ehr_encounters e ON p.patient_id = e.patient_id
GROUP BY p.patient_id, p.first_name, p.last_name, p.gender, p.insurance_type, p.primary_condition;

Patient Claims Summary

This view answers: "How much does each patient cost the system?" It aggregates all claims per patient, computing total billed, total paid, and average claim amount. The split between total_billed and total_paid reveals the financial burden on both the payer and the patient. We use COALESCE to ensure patients without claims show zeros rather than NULLs.

CREATE VIEW healthcareexp.silver.patient_claims AS
SELECT
  p.patient_id,
  p.first_name || ' ' || p.last_name AS patient_name,
  p.insurance_type,
  COUNT(c.claim_id) AS total_claims,
  COALESCE(SUM(c.billed_amount), 0) AS total_billed,
  COALESCE(SUM(c.paid_amount), 0) AS total_paid,
  COALESCE(AVG(c.paid_amount), 0) AS avg_claim_paid
FROM healthcareexp.bronze.ehr_patients p
LEFT JOIN healthcareexp.bronze.insurance_claims c ON p.patient_id = c.patient_id
GROUP BY p.patient_id, p.first_name, p.last_name, p.insurance_type;

Step 5: Build Gold Views

Gold views are the final clinical analytics layer. Clinicians, care managers, and the AI Agent query these directly. Each Gold view pre-computes a specific clinical decision so the end user gets actionable information without needing to understand the underlying data model.

Patient Readmission Risk

This is the primary view for care management teams. It classifies patients into readmission risk categories (High, Moderate, Low, Minimal) by combining encounter utilization patterns with total cost. A patient with 2+ ED visits and 2+ inpatient stays is "High" risk because this pattern strongly correlates with 30-day hospital readmission. Patients discharged to a SNF or with any ED visit are "Moderate" because they need follow-up but aren't in the highest-risk bracket. This classification drives outreach prioritization: care coordinators focus on "High" patients first.

CREATE VIEW healthcareexp.gold.patient_readmission_risk AS
SELECT
  enc.patient_id,
  enc.patient_name,
  enc.primary_condition,
  enc.insurance_type,
  enc.total_encounters,
  enc.inpatient_count,
  enc.ed_visits,
  enc.snf_discharges,
  cl.total_billed,
  cl.total_paid,
  CASE
    WHEN enc.ed_visits >= 2 AND enc.inpatient_count >= 2 THEN 'High'
    WHEN enc.ed_visits >= 1 OR enc.snf_discharges >= 1 THEN 'Moderate'
    WHEN enc.inpatient_count >= 1 THEN 'Low'
    ELSE 'Minimal'
  END AS readmission_risk
FROM healthcareexp.silver.patient_encounters enc
JOIN healthcareexp.silver.patient_claims cl ON enc.patient_id = cl.patient_id;

Cost Per Encounter

This view enables financial analysis at the individual encounter level. It joins each encounter with its corresponding claim and computes patient_responsibility (the difference between billed and paid amounts). The cost_category classification (High Cost > $20K, Medium Cost > $5K, Standard Cost) lets financial analysts and the AI Agent quickly identify expensive encounters. This is useful for answering questions like "Which departments generate the highest-cost encounters?" or "What percentage of heart failure admissions are High Cost?"

CREATE VIEW healthcareexp.gold.cost_per_encounter AS
SELECT
  e.encounter_id,
  p.patient_id,
  p.first_name || ' ' || p.last_name AS patient_name,
  p.insurance_type,
  e.encounter_type,
  e.diagnosis_desc,
  e.department,
  e.discharge_disposition,
  c.billed_amount,
  c.paid_amount,
  c.billed_amount - c.paid_amount AS patient_responsibility,
  CASE
    WHEN c.paid_amount > 20000 THEN 'High Cost'
    WHEN c.paid_amount > 5000 THEN 'Medium Cost'
    ELSE 'Standard Cost'
  END AS cost_category
FROM healthcareexp.bronze.ehr_encounters e
JOIN healthcareexp.bronze.ehr_patients p ON e.patient_id = p.patient_id
LEFT JOIN healthcareexp.bronze.insurance_claims c ON e.encounter_id = c.encounter_id;

Step 6: Enable AI-Generated Wikis and Tags

  1. Click Admin in the left sidebar, then go to Project Settings.
  2. Select the Preferences tab.
  3. Scroll to the AI section and enable Generate Wikis and Labels.
  4. Go to the Catalog and navigate to your Gold views under healthcareexp.gold.
  5. Click the Edit button (pencil icon) next to the desired view.
  6. In the Details tab, find the Wiki section and click Generate Wiki. Do the same for the Tags section by clicking Generate Tags.
  7. Repeat for each Gold view.

The AI will produce descriptions like: "patient_readmission_risk: One row per patient with encounter frequency, ED visit count, SNF discharge count, total cost, and a readmission risk classification (High/Moderate/Low/Minimal) based on utilization patterns."

To enhance the generated wiki with clinical context, copy the output into the AI Agent on the homepage and ask it to produce an improved version in a markdown code block. For example, you might ask the Agent to add: "High readmission risk patients are flagged for care management outreach within 48 hours of discharge. SNF indicates Skilled Nursing Facility." Copy the refined output and paste it back into the wiki editor.

Step 7: Ask Questions with the AI Agent

Navigate to the Homepage and use the AI Agent:

"Which patients have a High readmission risk?"

The Agent queries healthcareexp.gold.patient_readmission_risk and returns patients with 2+ ED visits and 2+ inpatient stays.

"Show me a chart of total cost by primary condition."

The Agent generates a bar chart from the claims and encounter data, highlighting which conditions drive the most spending.

"What is the average cost per encounter by department?"

This query uses healthcareexp.gold.cost_per_encounter. The Agent groups by department and calculates the average, showing that cardiology and pulmonology inpatient stays cost significantly more than outpatient visits.

"Create a chart showing monthly encounter volume by type."

The Agent generates a monthly trend line separated by inpatient, emergency, and outpatient encounters.

Next Steps

  • Connect real clinical systems. Replace the simulated folders with federated connections to your EHR, claims, and pharmacy databases.
  • Add FGAC for PHI. Use Fine-Grained Access Control to mask patient names and dates of birth for analysts who only need de-identified data.
  • Use Iceberg time travel. Run point-in-time queries for HIPAA compliance audits without maintaining separate audit databases.
  • Scale with Autonomous Reflections. Readmission risk and cost queries run frequently. Dremio will automatically create Reflections to keep them fast as data volumes grow.

The healthcare analytics pipeline you built here demonstrates that clinical analytics doesn't require a 6-month data warehouse project. With Dremio, you go from raw clinical data to AI-powered insights in a single session.

Try Dremio Cloud free for 30 days and build healthcare analytics on your own data.

Try Dremio Cloud free for 30 days

Deploy agentic analytics directly on Apache Iceberg data with no pipelines and no added overhead.