May 2, 2024

Optimizing Data Lakehouse Performance: Leveraging Dremio’s SQL Query Engine, Lakehouse Management Features and Apache Iceberg for Scalable Analytics

This talk explores advanced strategies and tools for maximizing data processing efficiency and scalability in a data lakehouse.

Central to the discussion is the integration of Dremio and Apache Iceberg. We begin with the ease of partitioning in Iceberg, which improves query performance by organizing data to match how it is queried, thereby reducing the volume of data scanned. We then cover essential Iceberg maintenance practices that keep your data lakehouse optimized for current needs and future scalability.
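To make the partitioning point concrete, here is a minimal, purely illustrative sketch of partition pruning in plain Python. It does not use the Iceberg or Dremio APIs; the function names (`partition_by_day`, `scan`) and the sample data are hypothetical, and the only assumption is that rows are grouped by an `event_date` field so a date-range query can skip groups outside its range.

```python
from collections import defaultdict
from datetime import date


def partition_by_day(rows):
    """Group rows by event date, loosely analogous to a day-based partition layout."""
    partitions = defaultdict(list)
    for row in rows:
        partitions[row["event_date"]].append(row)
    return partitions


def scan(partitions, start, end):
    """Return matching rows plus how many rows were read to find them."""
    rows_read = 0
    matches = []
    for day, day_rows in partitions.items():
        # Partitions outside the requested range are skipped without reading rows.
        if start <= day <= end:
            rows_read += len(day_rows)
            matches.extend(day_rows)
    return matches, rows_read


# Hypothetical data: 10 days in May 2024, 3 rows per day.
rows = [
    {"event_date": date(2024, 5, d), "amount": d * 10}
    for d in range(1, 11)
    for _ in range(3)
]

partitions = partition_by_day(rows)
matches, rows_read = scan(partitions, date(2024, 5, 2), date(2024, 5, 3))
print(f"total rows: {len(rows)}, rows read: {rows_read}, matches: {len(matches)}")
```

In this toy setup, a two-day query reads only the rows in the two matching partitions rather than the whole dataset, which is the effect the abstract attributes to aligning partitions with query patterns.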

Topics Covered

Iceberg and Table Formats
Performance and Cost Optimization

Sign up to watch all Subsurface 2024 sessions