October 2, 2025
Minimizing Iceberg Table Management with Smart Writing
Head of DevRel, Dremio
Managing Apache Iceberg tables effectively is often a balancing act between write performance, storage efficiency, and query speed. While Iceberg’s flexibility enables powerful features like time travel and schema evolution, many teams find themselves running frequent OPTIMIZE jobs to compact small files and rebalance partitions. These jobs improve performance but also consume valuable compute resources, extend maintenance windows, and can complicate low-latency or streaming pipelines.
The good news is that many of these challenges can be mitigated before they ever appear. By designing ingestion pipelines, both batch and streaming, to write optimized data up front, data engineers and architects can reduce or even eliminate the need for heavy post-write optimization. This not only lowers operational costs but also keeps systems responsive in real-time scenarios where every second counts.
In this blog, we’ll explore practical strategies for smart writing in Iceberg, including:
- Choosing effective file sizes and compression formats
- Designing partitions and clustering to align with query patterns
- Buffering and batching streaming writes to avoid “small files”
- Leveraging Iceberg table properties for compaction-friendly layouts
By the end, you’ll have a clear framework for building ingestion flows that minimize the need for reactive maintenance, helping you focus less on running compaction jobs and more on delivering value from your data.
Batch Ingestion Best Practices
When loading data into Iceberg tables in batch mode, whether via scheduled ETL jobs or one-time bulk imports, your choices at write time set the stage for future performance and maintenance. Poorly sized files, fragmented partitions, and inefficient compression schemes all increase the likelihood that you’ll need frequent OPTIMIZE jobs later. The goal is to design your batch ingestion so that each write produces data that is “close to optimal” from the beginning.
1. Write Fewer, Larger Files
Iceberg performs best when files are in the 128 MB – 512 MB range. Writing too many small files creates metadata overhead and slows down queries, while very large files reduce query parallelism. Most engines, including Spark and Flink, provide options to control the target file size (for example, via the write.target-file-size-bytes table property). Setting this appropriately ensures your batch jobs emit well-sized Parquet files right away.
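As a concrete illustration, here is a minimal PySpark sketch; the catalog, database, and table names are placeholders, and it assumes a Spark session already configured with an Iceberg catalog. It sets the target file size property and reins in the number of output tasks so each task actually fills its files:

```python
# Minimal sketch: raise the target data file size for batch writes.
# Assumes an Iceberg-enabled Spark session; all names below are placeholders.
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

# Ask Iceberg to roll to a new data file around ~512 MB.
spark.sql("""
    ALTER TABLE my_catalog.db.events
    SET TBLPROPERTIES ('write.target-file-size-bytes' = '536870912')
""")

# The property is only an upper bound per write task; tiny tasks still emit
# tiny files, so also keep the number of output tasks in check.
df = spark.table("my_catalog.db.staging_events")
(df.coalesce(8)                      # fewer, larger output files
   .writeTo("my_catalog.db.events")
   .append())
```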
2. Use Compression and Row Group Settings Wisely
Dremio and Iceberg both recommend Zstandard compression as the default because it offers an excellent balance between size reduction and decompression speed. In addition, configuring row group sizes to align with your target file size keeps files compact while still enabling predicate pushdown and efficient scans.
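A short sketch of both settings, reusing the `spark` session from the earlier example and the same placeholder table name; it sets Zstandard compression and roughly 128 MB row groups via standard Iceberg write properties:

```python
# Sketch: align Parquet compression and row-group size with the file size
# target, so a ~512 MB file holds a handful of row groups.
# Assumes an existing Iceberg-enabled `spark` session; names are placeholders.
spark.sql("""
    ALTER TABLE my_catalog.db.events SET TBLPROPERTIES (
        'write.parquet.compression-codec'    = 'zstd',
        'write.parquet.row-group-size-bytes' = '134217728'
    )
""")
```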
3. Align Partitioning with Query Workloads
Partitioning is one of the most powerful features of Iceberg, but it can be a double-edged sword if not designed carefully. When writing batch data:
- Partition on fields that are frequently used in filters (e.g., date, region).
- Avoid over-partitioning, which creates many small partitions and leads to scattered small files.
- Use hidden partitioning in Iceberg when possible to simplify write logic and reduce ingestion complexity.
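For instance, here is a minimal sketch of hidden partitioning; the schema, column names, and table name are illustrative. Iceberg derives the day from the timestamp at write time, so neither writers nor queries need to manage a separate date column:

```python
# Sketch: hidden partitioning on a timestamp plus one categorical dimension.
# Assumes an existing Iceberg-enabled `spark` session; names are placeholders.
spark.sql("""
    CREATE TABLE IF NOT EXISTS my_catalog.db.events (
        event_id  BIGINT,
        event_ts  TIMESTAMP,
        region    STRING,
        payload   STRING
    )
    USING iceberg
    PARTITIONED BY (days(event_ts), region)
""")
```

Queries that filter on event_ts automatically prune the derived day partitions, with no extra column for ingestion to populate.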
4. Consider Pre-Clustering or Sorting Data
If your queries often scan by a particular column (such as customer_id or event_time), you can pre-sort your batch data before writing. This ensures related records are physically co-located, reducing the need for costly reclustering later.
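A small PySpark sketch of this idea, again with placeholder names and the same assumed session: rows are grouped by the partition column and sorted within each write task before the append, so related records land in the same files and row groups:

```python
# Sketch: pre-cluster batch data before writing to Iceberg.
# Assumes an existing Iceberg-enabled `spark` session; names are placeholders.
df = spark.table("my_catalog.db.staging_events")
(df.repartition("region")                       # group rows by partition value
   .sortWithinPartitions("event_ts")            # co-locate related records
   .writeTo("my_catalog.db.events")
   .append())
```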
5. Monitor and Adjust Over Time
Batch ingestion isn’t “set it and forget it.” Regularly check the distribution of file sizes and partition balance in your Iceberg tables. If you consistently see small files being produced, tune your job settings, such as reducing write parallelism or increasing the flush size, to correct course before small-file accumulation becomes a bigger issue.
Streaming Ingestion Strategies
While batch jobs give you full control over file sizes and partitioning, streaming pipelines introduce new challenges. Continuous ingestion, especially with tools like Kafka Connect, Flink, or Spark Structured Streaming, often leads to many small files being written to your Iceberg tables. Left unchecked, this creates the classic small files problem, which degrades query performance and increases the need for frequent optimization jobs. With the right techniques, however, you can significantly reduce this overhead.
1. Buffer Events into Micro-Batches
Instead of writing every incoming event as a new file, configure your streaming framework to buffer records into small in-memory batches before flushing to storage. For example:
- Adjust checkpoint intervals in Spark or Flink so files flush every few minutes rather than seconds.
- Set flush thresholds (by record count or size) in Kafka Connect so that each commit produces files in the 128–256 MB range.
This approach keeps latency low while still producing files large enough for efficient reads.
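As a rough sketch, here is a Structured Streaming job that buffers about five minutes of events per commit. The Kafka broker, topic, checkpoint path, and table names are all placeholders, and it assumes the Spark Kafka connector and an Iceberg catalog are already configured in the session:

```python
# Sketch: micro-batch streaming into Iceberg with a multi-minute trigger.
# Assumes an Iceberg-enabled `spark` session and the Kafka connector package.
events = (spark.readStream
          .format("kafka")
          .option("kafka.bootstrap.servers", "broker:9092")   # placeholder
          .option("subscribe", "events")                      # placeholder topic
          .load()
          .selectExpr("CAST(value AS STRING) AS payload",
                      "timestamp AS event_ts"))

query = (events.writeStream
         .format("iceberg")
         .outputMode("append")
         .trigger(processingTime="5 minutes")          # buffer ~5 min per commit
         .option("fanout-enabled", "true")              # helps if the target table is partitioned
         .option("checkpointLocation", "s3://bucket/checkpoints/events")  # placeholder
         .toTable("my_catalog.db.raw_events"))

query.awaitTermination()
```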
2. Tune File Size Properties for Streaming Workloads
Iceberg provides table properties like write.target-file-size-bytes, which defaults to ~512 MB. For streaming, you may want to lower this target so that files don’t take too long to fill up while still avoiding tiny files. Choosing a target between 128–256 MB often balances latency and efficiency.
3. Leverage Partitioning that Matches Stream Characteristics
Streaming data often arrives sorted by time or another event attribute. Use that to your advantage:
- Partition by event time windows (e.g., date, hour) to keep related records together.
- Avoid overly granular partitions (like minute), which can generate sparse partitions and increase metadata overhead.
4. Align Flush Intervals with Downstream SLAs
Every streaming pipeline has a tradeoff between data freshness and file quality. If downstream systems require data to be queryable within seconds, you may accept smaller files. But if a few minutes of latency is acceptable, you can buffer longer and produce well-sized files that minimize the need for compaction later.
5. Monitor and Adjust in Real Time
Unlike batch ingestion, streaming workloads evolve with traffic patterns. Monitor your file size distribution and number of files per partition daily. If you see a buildup of undersized files, adjust your stream’s buffer or interval settings immediately. Some organizations even implement adaptive flush logic, where flush intervals scale dynamically with throughput.
Partitioning and Clustering Design
Even if your ingestion pipelines are writing data at optimal file sizes, the way those files are organized across partitions and sort orders has a huge impact on query performance and maintenance overhead. Poorly chosen partitions can scatter data, create thousands of small files, and force the system to scan unnecessary data. On the other hand, smart partitioning and clustering strategies ensure that queries read only what they need, while minimizing the burden of post-ingestion optimization.
1. Partition for Your Most Common Queries
Partition columns should reflect how your users and applications filter the data. For example:
- Time-based partitions (e.g., date, hour) are effective when most queries slice data by time ranges.
- Categorical partitions (e.g., region, customer_id) make sense when workloads are strongly aligned with those dimensions.
The key is to choose high-cardinality fields carefully: while they can reduce scan sizes, they may also produce too many small partitions.
2. Avoid Over-Partitioning
It’s tempting to create highly granular partitions (like date + hour + region) for precision, but this often leads to an explosion of small files. Remember that Iceberg supports hidden partitioning, which automatically derives partitions (like extracting year/month/day from a timestamp) without exposing this complexity to the query layer. This keeps ingestion simple while still optimizing file placement.
3. Use Clustering (Sorting) for Secondary Optimizations
Partitioning is not always enough, especially when queries filter by multiple dimensions. In these cases, clustering (sorting) your data at write time can yield big performance gains:
- Sorting by event_time within each date partition makes range scans much more efficient.
- Sorting by customer_id ensures all rows for a customer are co-located, speeding up point lookups and joins.
Dremio’s OPTIMIZE command can recluster data post-write, but pre-clustering during ingestion means fewer heavy compaction jobs later.
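If you write with Spark, one option is to record a table-level sort order using the Iceberg Spark SQL extensions (they must be enabled in the session); the column and table names below are illustrative. Engines that honor the declared sort order will cluster rows on every write:

```python
# Sketch: declare a write sort order so compliant writers cluster rows
# by customer and time. Requires the Iceberg Spark SQL extensions;
# names are placeholders.
spark.sql("""
    ALTER TABLE my_catalog.db.events
    WRITE ORDERED BY customer_id, event_ts
""")
```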
4. Balance Query Flexibility with Maintenance Cost
Every partition and clustering decision represents a tradeoff: more selective queries versus more complex ingestion. As a rule of thumb:
- Start with time-based partitions plus one high-value dimension.
- Add clustering on secondary dimensions if queries consistently filter by them.
- Reevaluate your partition strategy as workloads evolve; what works for a batch warehouse may not work for a real-time application.
5. Monitor Partition Health
Partition skew, where some partitions grow very large while others stay tiny, can reduce parallelism and create hotspots. Regularly check partition distribution to ensure your strategy is still aligned with data volume and query behavior. Adjust as needed before it snowballs into a costly optimization problem.
Table Properties Tuning
Beyond ingestion strategies and partitioning, Iceberg’s table properties give data engineers and architects fine-grained control over how data is written and managed. Tuning these settings at the table level helps ensure files are sized, compressed, and organized properly from the start, reducing reliance on heavy post-processing like OPTIMIZE.
1. Control File Sizes at Write Time
The property write.target-file-size-bytes sets the desired file size for new data files. By default, it targets ~512 MB, which works well for large batch jobs but may be too high for streaming workloads. Adjusting this value to 128–256 MB in low-latency scenarios helps avoid small files while still producing query-friendly sizes.
2. Optimize Parquet Write Behavior
Iceberg exposes a range of Parquet-specific settings, including:
- write.parquet.compression-codec – defaults to Zstandard for a balance of size and speed.
- write.parquet.row-group-size-bytes – controls how data is chunked within files; larger row groups reduce overhead but may increase scan sizes.
- write.parquet.page-size-bytes – manages the encoding page size, affecting predicate pushdown performance.
Fine-tuning these properties allows you to align storage format with query patterns while keeping files compact and efficient.
3. Manage Metadata Growth
Iceberg uses manifests to track file metadata. Left unchecked, manifest files can accumulate and increase query planning costs. Properties like commit.manifest.target-size-bytes (default ~8 MB) let you control the target size of manifests, ensuring they don’t fragment into too many small files. Pair this with occasional manifest compaction (via OPTIMIZE … REWRITE MANIFESTS) to keep metadata lean.
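A short sketch of both steps, with placeholder names and an assumed Spark session: it pins the manifest target size and compacts manifests with Spark's rewrite_manifests procedure; in Dremio, the equivalent maintenance is the OPTIMIZE … REWRITE MANIFESTS command mentioned above.

```python
# Sketch: keep manifests from fragmenting and compact them occasionally.
# Assumes an Iceberg-enabled `spark` session; names are placeholders.
spark.sql("""
    ALTER TABLE my_catalog.db.events
    SET TBLPROPERTIES ('commit.manifest.target-size-bytes' = '8388608')
""")

# Merge small manifests into fewer, larger ones.
spark.sql("CALL my_catalog.system.rewrite_manifests('db.events')")
```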
4. Configure Snapshot Expiration Defaults
Snapshots drive time travel in Iceberg but also consume storage. Table properties such as history.expire.max-snapshot-age-ms (default 5 days) and history.expire.min-snapshots-to-keep (default 1) determine how long old snapshots are retained. Setting these appropriately can reduce storage overhead and automate cleanup without needing to run VACUUM too frequently.
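For example, a minimal sketch (placeholder names and retention values) that keeps roughly three days of history and at least five snapshots, for engines and maintenance jobs that honor these properties when expiring snapshots:

```python
# Sketch: tighten snapshot retention defaults at the table level.
# Assumes an Iceberg-enabled `spark` session; names and values are placeholders.
spark.sql("""
    ALTER TABLE my_catalog.db.events SET TBLPROPERTIES (
        'history.expire.max-snapshot-age-ms'   = '259200000',
        'history.expire.min-snapshots-to-keep' = '5'
    )
""")
```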
5. Tune for Your Workload, Not Just Defaults
The default values provided by Iceberg and Dremio are designed for general-purpose use. In practice, every workload is different:
- High-throughput batch jobs may benefit from larger target file sizes and row groups.
- Streaming ingestion may require lower targets and smaller flush intervals.
- Metadata-heavy pipelines may need more aggressive manifest sizing.
Monitoring table health (file counts, file sizes, metadata size) will help you decide when tuning is necessary.
Monitoring and Adjustment
Even with the best ingestion patterns, partitioning strategies, and table property tuning, data systems are not static. Data volumes, query patterns, and workload priorities evolve over time. To ensure your Iceberg tables remain optimized with minimal post-processing, you need to actively monitor their health and adjust settings as needed.
1. Track File Size Distribution
Regularly analyze the distribution of file sizes within your Iceberg tables. If you see a buildup of files below 128 MB, that’s a clear signal that your ingestion job parameters, like flush intervals, checkpoint settings, or file size targets, need adjustment. Conversely, if you’re consistently creating files above 512 MB, consider lowering the target size to improve parallelism in query execution.
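One lightweight way to do this from Spark is to bucket file sizes straight out of the table's files metadata table; the thresholds and names below are illustrative, and an Iceberg-enabled session is assumed:

```python
# Sketch: summarize data file sizes to spot a build-up of undersized files.
# Assumes an Iceberg-enabled `spark` session; names and thresholds are placeholders.
spark.sql("""
    SELECT
        CASE
            WHEN file_size_in_bytes < 128 * 1024 * 1024 THEN 'under 128 MB'
            WHEN file_size_in_bytes > 512 * 1024 * 1024 THEN 'over 512 MB'
            ELSE '128-512 MB'
        END AS size_bucket,
        COUNT(*) AS file_count
    FROM my_catalog.db.events.files
    GROUP BY 1
""").show()
```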
2. Watch Partition Balance
Partition skew can silently degrade query performance. If a few partitions contain the bulk of data while others remain sparse, queries may overburden certain executors while others stay idle. Monitoring partition-level row counts and file counts helps you identify when partitioning logic needs to be rethought.
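A quick sketch against the partitions metadata table (placeholder names, session assumed) surfaces the heaviest partitions and their file counts so skew is easy to spot:

```python
# Sketch: list the largest partitions by row count, with file counts.
# Assumes an Iceberg-enabled `spark` session; names are placeholders.
spark.sql("""
    SELECT partition, record_count, file_count
    FROM my_catalog.db.events.partitions
    ORDER BY record_count DESC
    LIMIT 20
""").show(truncate=False)
```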
3. Monitor Metadata Growth
Over time, manifest files and snapshots accumulate. If query planning starts slowing down, check the number and size of metadata files. This is often an early indicator that you need to adjust manifest sizing properties, expire old snapshots, or schedule lightweight maintenance like REWRITE MANIFESTS.
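A simple sketch (placeholder names, session assumed) that counts snapshots and manifests as a rough gauge of metadata growth:

```python
# Sketch: track metadata volume via the snapshots and manifests metadata tables.
# Assumes an Iceberg-enabled `spark` session; names are placeholders.
spark.sql("""
    SELECT COUNT(*) AS snapshot_count
    FROM my_catalog.db.events.snapshots
""").show()

spark.sql("""
    SELECT COUNT(*) AS manifest_count, SUM(length) AS total_manifest_bytes
    FROM my_catalog.db.events.manifests
""").show()
```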
4. Establish KPIs for Table Health
To stay proactive, define metrics that represent “healthy” tables in your environment, such as:
- Average file size in the 128–512 MB range
- Partition balance within acceptable skew ratios
- Metadata size staying within predictable growth patterns
By monitoring these KPIs, you can react to issues before they require heavy-handed optimization jobs.
5. Iterate as Workloads Change
As data grows, query patterns shift, or new applications come online, revisit your ingestion and table configurations. What worked at 1 TB of data may not scale at 100 TB. Iterative tuning ensures your system remains performant and minimizes the need for frequent OPTIMIZE operations.
Conclusion
The real secret to minimizing Iceberg table maintenance isn’t running more optimization jobs, it’s writing smarter data from the very beginning. By combining batch and streaming ingestion best practices, designing thoughtful partitioning and clustering strategies, tuning table properties, and monitoring file health, you can dramatically reduce the frequency and cost of downstream operations like OPTIMIZE.
For data engineers and architects, this shift in mindset is powerful:
- Lower operational overhead – fewer compaction jobs mean less scheduling, fewer compute spikes, and simpler pipelines.
- More predictable performance – consistently well-sized files and balanced partitions lead to stable, high-performing queries.
- Future-ready systems – as workloads evolve, your ingestion design adapts more easily than constantly relying on reactive optimizations.
Ultimately, smart writing turns maintenance into a proactive practice rather than a reactive burden. Instead of firefighting small files and partition skew, your team can focus on building reliable, performant data platforms that scale cleanly with growth.