23 minute read · March 26, 2025
Disaster Recovery for Apache Iceberg Tables – Restoring from Backup and Getting Back Online

· Head of DevRel, Dremio

Disaster recovery isn’t just for databases and monolithic data warehouses anymore. In the world of modern data lakehouses—where Apache Iceberg is increasingly used to power large-scale, multi-engine analytics—knowing how to recover from failure is just as important as knowing how to scale.
Imagine this: your data platform experiences a serious outage. Maybe a cloud region goes offline. Maybe a migration goes wrong. Maybe a human error wipes out part of your data lake. You’ve got a backup of your files, and you restore them to storage. But when you go to query your Iceberg tables… nothing works.
What happened?
The truth is, restoring Iceberg tables from a filesystem backup isn’t as simple as putting files back in place. There are two critical questions you need to ask:
- Does your catalog metadata still reference the correct metadata.json file?
- Do your metadata files still reference valid data and manifest file paths, especially since Iceberg uses absolute paths?
If the answer to either is “no,” your tables won’t function properly—even if all your data is physically present.
The good news: Apache Iceberg provides tools for precisely this situation. Spark procedures like register_table and rewrite_table_path give you the ability to bring your tables back to life by realigning the catalog and metadata with the restored files.
In this blog, we’ll walk through:
- How Iceberg stores metadata and why that matters for recovery
- Common recovery scenarios and how to handle them
- How to use built-in Spark procedures to fix broken references
- Best practices for disaster-proofing your Iceberg tables in the future
If your backup strategy includes Iceberg tables—or should—this is the guide to keep handy.
Common Recovery Scenarios After a Filesystem Restore
Let’s say you’ve restored your Iceberg table files—metadata, manifests, and data—from a backup. That’s a good start. But unless your environment is exactly the same as it was pre-disaster, chances are, something is out of sync.
Let’s look at the two most common recovery scenarios and how to fix them using Iceberg’s built-in Spark procedures.
Scenario 1: The Catalog Is Missing or Points to the Wrong metadata.json
In this scenario, all your table files are present, but the catalog doesn’t point to the correct snapshot of the table. You might be dealing with:
- A catalog entry pointing to a newer metadata.json from after your last backup
- A table that was never registered in the current catalog environment
This is where Iceberg’s register_table procedure comes in.
✅ How to Fix It: Register the Latest Metadata File
You’ll need to:
- Locate the most recent metadata.json file in the restored table’s metadata/ directory.
- Use the register_table procedure to create a new catalog entry pointing to that file.
CALL spark_catalog.system.register_table(
  table => 'db.restored_table',
  metadata_file => 's3a://restored-bucket/db/restored_table/metadata/v17.metadata.json'
);
🔐 Important:
- Only use register_table if the table is not already registered, the current reference is incorrect, or you’re intentionally moving it to a new catalog.
- Registering the same metadata in multiple catalogs can lead to corruption or lost updates.
Scenario 2: File Paths Have Changed During Restore
This scenario is a bit trickier. Let’s say you originally stored your data in HDFS, but you restored it to S3 during recovery. Or maybe you moved the files to a different bucket, base directory, or region. In either case, your metadata files still reference the old paths.
Because Iceberg uses absolute file paths for all its references—data files, delete files, manifests, even snapshot lineage—these path mismatches break everything.
✅ How to Fix It: Rewrite Metadata Paths with rewrite_table_path
Iceberg provides the rewrite_table_path procedure to safely rewrite all metadata references from one path prefix to another.
CALL spark_catalog.system.rewrite_table_path(
  table => 'db.my_table',
  source_prefix => 'hdfs://nn:8020/warehouse/my_table',
  target_prefix => 's3a://my-bucket/warehouse/my_table'
);
This procedure:
- Rewrites the metadata.json, manifests, and manifest lists with the new prefix
- Outputs the location of a CSV mapping file, showing which files need to be physically copied to the new location
After this, you’ll still need to:
- Use a tool like DistCp or the AWS CLI to actually copy the data files based on the mapping (a quick way to inspect the mapping file is sketched below)
- Optionally, use register_table if the table has not yet been registered in the catalog
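If you want a quick look at that copy plan before kicking off the copy, Spark SQL can read the CSV directly. A minimal sketch, assuming a hypothetical file path; substitute the file_list_location value returned by your own rewrite_table_path call:

-- Inspect the copy plan emitted by rewrite_table_path.
-- Replace the path below with the file_list_location the procedure returned.
SELECT *
FROM csv.`s3a://staging-bucket/db.my_table/file-list.csv`
LIMIT 20;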
🛠 Optional: Incremental Rewrites
If you only need to update a subset of metadata versions—say, from v2.metadata.json to v20.metadata.json—you can specify a version range and a custom staging location:
CALL spark_catalog.system.rewrite_table_path(
  table => 'db.my_table',
  source_prefix => 's3a://old-bucket/db.my_table',
  target_prefix => 's3a://new-bucket/db.my_table',
  start_version => 'v2.metadata.json',
  end_version => 'v20.metadata.json',
  staging_location => 's3a://staging-bucket/db.my_table'
);
This provides more control and limits the scope of changes—ideal for partial restores or migrations.
Deep Dive – Spark Procedures for Iceberg Recovery
Once you’ve diagnosed the issue—whether it’s a missing catalog entry or outdated file paths—you’ll need to get hands-on with the Spark procedures Iceberg provides for recovery. These aren't just utility functions—they’re purpose-built tools that give you surgical control over table state.
Let’s examine each procedure in more detail, including what it does, how to use it, and where to be careful.
register_table: Reconnect the Catalog to Your Table
The register_table procedure allows you to create a new catalog entry for an Iceberg table, pointing to an existing metadata.json file. This is useful when:
- The catalog was lost, corrupted, or never existed in the restored environment
- You’ve manually restored metadata and need to reintroduce the table to the catalog
- You’re moving a table between catalogs (e.g., from Hive Metastore to AWS Glue)
Usage
CALL spark_catalog.system.register_table(
  table => 'db.restored_table',
  metadata_file => 's3a://my-bucket/db/restored_table/metadata/v17.metadata.json'
);
Arguments
Name | Required | Type | Description |
---|---|---|---|
table | ✔️ | string | The table identifier to register in the catalog |
metadata_file | ✔️ | string | Full path to the Iceberg metadata.json to register |
Output
Output Name | Type | Description |
---|---|---|
current_snapshot_id | long | ID of the current snapshot in the table |
total_records_count | long | Total record count in the restored table |
total_data_files_count | long | Number of data files in the restored table |
⚠️ Caution
Avoid using this procedure if the same table is still registered in another catalog. Having the same table registered in multiple catalogs can lead to:
- Conflicting updates
- Lost snapshots
- Corrupted metadata
Always de-register a table from one catalog before re-registering it in another.
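As a hedged sketch of that hand-off (the catalog and table names below are illustrative): with the Spark/Iceberg integration, DROP TABLE without PURGE typically removes only the catalog entry and leaves the underlying files in place, but verify that behavior for your specific catalog and Iceberg version before relying on it.

-- Remove the entry from the old catalog; data files are kept because PURGE is not specified
-- (confirm this behavior for your catalog before running it).
DROP TABLE old_catalog.db.restored_table;

-- Re-register the table in the target catalog from its metadata.json.
CALL new_catalog.system.register_table(
  table => 'db.restored_table',
  metadata_file => 's3a://my-bucket/db/restored_table/metadata/v17.metadata.json'
);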
rewrite_table_path: Update File Paths in Metadata
The rewrite_table_path procedure is used when the physical file locations have changed, and you need to update all metadata references accordingly.
This could be due to:
- Restoring from HDFS to S3
- Moving to a new cloud bucket
- Renaming directory prefixes during recovery
Usage
CALL spark_catalog.system.rewrite_table_path(
  table => 'db.my_table',
  source_prefix => 'hdfs://nn:8020/warehouse/db.my_table',
  target_prefix => 's3a://new-bucket/warehouse/db.my_table'
);
Arguments
Name | Required | Default | Type | Description |
---|---|---|---|---|
table | ✔️ | | string | Name of the Iceberg table |
source_prefix | ✔️ | | string | Prefix to be replaced in absolute file paths |
target_prefix | ✔️ | | string | Replacement prefix for the updated file paths |
start_version | ❌ | First | string | First metadata.json to include in rewrite |
end_version | ❌ | Latest | string | Last metadata.json to include in rewrite |
staging_location | ❌ | New dir | string | Directory to write new metadata files to |
Output
Output Name | Type | Description |
---|---|---|
latest_version | string | Name of the last metadata file rewritten |
file_list_location | string | Path to CSV with source-to-target file copy plan |
Modes of Operation
- Full Rewrite: Rewrites all reachable metadata files (default).
- Incremental Rewrite: Set start_version and end_version to limit the scope.
What Happens After Rewrite?
- New metadata files are written to the staging_location
- You’ll receive a CSV mapping of which data files need to be moved or copied
- You’re responsible for copying those files (e.g., with DistCp, aws s3 cp, etc.)
- Once in place, you can use register_table to reconnect the rewritten metadata to the catalog
Putting It All Together
In a real disaster recovery situation, you might use both procedures in sequence:
- Use rewrite_table_path to update all absolute paths in metadata
- Physically move or sync the files to the new location
- Use register_table to register the updated table in your target catalog
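A rough end-to-end sketch of that sequence in Spark SQL (the paths, catalog name, table identifier, and metadata file version below are illustrative, and the file copy in step 2 still happens outside of SQL):

-- 1. Rewrite all absolute paths in the metadata tree, writing new metadata to a staging location.
CALL spark_catalog.system.rewrite_table_path(
  table => 'db.my_table',
  source_prefix => 'hdfs://nn:8020/warehouse/db.my_table',
  target_prefix => 's3a://new-bucket/warehouse/db.my_table',
  staging_location => 's3a://staging-bucket/db.my_table'
);

-- 2. Copy the files listed in the returned file_list_location CSV to the target prefix
--    using DistCp, aws s3 cp, or similar (outside Spark SQL).

-- 3. Register the relocated table in the target catalog, pointing at the latest rewritten
--    metadata.json now under the target prefix (your version number will differ).
CALL spark_catalog.system.register_table(
  table => 'db.my_table',
  metadata_file => 's3a://new-bucket/warehouse/db.my_table/metadata/v20.metadata.json'
);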
Let's explore best practices for backing up and restoring Iceberg tables and how to ensure a smooth, stress-free recovery.
Best Practices for Backing Up and Restoring Iceberg Tables
Apache Iceberg’s architecture gives you fine-grained control over data recovery, but with that control comes responsibility. If you want disaster recovery to be painless (or at least manageable), it's critical to design your backup strategy and recovery plan around the unique way Iceberg handles metadata and file paths.
Here are the top best practices to follow before and after a restore.
1. Always Back Up Metadata and Data Together
Iceberg separates metadata from data, but you need both for recovery.
- Metadata includes the metadata/ directory (containing v1.metadata.json, v2.metadata.json, etc.), manifest files, and manifest lists.
- Data includes all your Parquet, ORC, or Avro files referenced by those manifests.
If you’re using tools like DistCp, rsync, or object storage replication, make sure both metadata and data directories are included in the same snapshot or backup window.
⚠️ Backing up only data files without corresponding metadata makes restoration nearly impossible.
2. Track the Latest metadata.json in Every Backup
When backing up an Iceberg table, always record the path to the latest metadata.json file. This file acts as the entry point to the entire table state.
Why this matters:
- If the catalog entry is lost, you’ll need this file to re-register the table using register_table.
- If you restore multiple tables, it can be hard to determine which snapshot was the most recent without a tracking log.
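At backup time, you can pull that latest path from the live table itself. A minimal sketch using Iceberg’s metadata_log_entries metadata table in Spark SQL (available in recent Iceberg releases; the table identifier below is illustrative):

-- Record the most recent metadata.json path for the backup log.
SELECT file AS latest_metadata, timestamp
FROM spark_catalog.db.customer_events.metadata_log_entries
ORDER BY timestamp DESC
LIMIT 1;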
Create a backup log or manifest for each table with entries like:
table_name: customer_events
latest_metadata: s3a://backups/iceberg/customer_events/metadata/v57.metadata.json
timestamp: 2025-03-24T04:00:00Z
3. Check for File Path Changes Before Recovery
One of the most common issues after a restore is that metadata files contain absolute paths that no longer match your storage layout. For example:
- You restored from hdfs:// to s3a://
- You renamed the top-level directory or changed bucket prefixes
- You used a new storage mount or alias
Before re-registering a table, inspect the metadata and manifest files to confirm whether the file paths match their new locations. If not, use rewrite_table_path to adjust them before registering.
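One hedged way to spot-check this from Spark SQL is to register the restored metadata under a throwaway identifier and look at the paths Iceberg has recorded; if the query fails or still lists the old prefixes, run rewrite_table_path first. The scratch names and paths below are illustrative, and the recovery_check namespace is assumed to already exist in the catalog.

-- Register the restored metadata under a scratch name, for inspection only.
CALL spark_catalog.system.register_table(
  table => 'recovery_check.my_table',
  metadata_file => 's3a://restored-bucket/db/my_table/metadata/v17.metadata.json'
);

-- If the recorded paths no longer exist, this query will fail or show old prefixes
-- in file_path, both signs that a path rewrite is needed before real registration.
SELECT file_path
FROM spark_catalog.recovery_check.my_table.files
LIMIT 10;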
4. Automate Validation Post-Restore
After restoring and registering a table, don’t assume everything works—validate it.
Things to check:
- Can you run a simple SELECT COUNT(*)?
- Does the snapshot history include what you expect?
- Are partition pruning and data skipping functioning normally?
- Do metadata tables like db.my_table.history and db.my_table.snapshots return expected results?
You can automate these checks as part of your recovery workflow using Spark or Trino queries to catch misconfigurations early.
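A minimal sketch of such a check in Spark SQL, assuming the restored table has been registered as db.restored_table (adapt the identifiers to your environment):

-- Basic sanity check: the table is readable end to end.
SELECT COUNT(*) FROM spark_catalog.db.restored_table;

-- Snapshot lineage: confirm the expected snapshots survived the restore.
SELECT committed_at, snapshot_id, operation
FROM spark_catalog.db.restored_table.snapshots
ORDER BY committed_at;

-- History: verify which snapshot is current.
SELECT made_current_at, snapshot_id, is_current_ancestor
FROM spark_catalog.db.restored_table.history
ORDER BY made_current_at;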
5. Dry-Run Your Recovery Plan
Don’t wait until a real disaster to test your recovery workflow.
- Restore a test table from a known-good backup to a staging catalog
- Practice using rewrite_table_path and register_table
- Measure how long the recovery takes and what tools you need
- Document the entire process in an internal runbook
Even a few dry runs will make a difference when you're under pressure in an actual outage.
Conclusion
In data lakehouses, disaster recovery is no longer just an afterthought. As Apache Iceberg powers more mission-critical workloads across cloud and hybrid environments, your ability to recover tables reliably becomes as important as your ability to query them quickly.
Unlike traditional databases, Iceberg doesn’t bundle storage, metadata, and catalog into a single system. Instead, it gives you flexibility—with the tradeoff that restoring from a backup requires understanding how those components fit together:
- If your catalog is missing or misaligned, use register_table to re-establish the link to your latest metadata.json.
- If your file paths have changed, use rewrite_table_path to rewrite metadata and create a clean recovery copy.
- And if you want to avoid firefighting in the future, implement proactive backup practices, track your metadata versions, and validate your recovery process in advance.
The truth is, Iceberg gives you everything you need to be resilient—but it doesn't hold your hand. That’s a strength, not a weakness, for teams ready to take ownership of their data systems.
Recovery doesn’t have to be a high-stakes scramble. It can be a smooth, auditable, and even routine process with a thoughtful plan, the right tools, and a few good habits.
So the next time disaster strikes your data lake—whether it’s a misconfigured job, a storage failure, or a region outage—you won’t be scrambling. You’ll be executing.