Memory Arbiter (MA) is a new and important feature that was released to Dremio Cloud and is coming to Dremio 25.0. MA enables executors to make better use of their direct memory by accurately tracking its usage across certain operators and, where possible, dynamically reducing the memory footprint of those operators so that others can consume it. Together with Spillable Hash Join, it is the first of several planned features that help Dremio react to diminishing resources and keep running where it previously would have failed.
How Memory Arbiter changes execution
Memory Arbiter changes execution for a particular set of operators once a fragment of work has landed on an executor and processing is underway. Query planning and the initial memory allocation remain unchanged and follow this flow:
A query is submitted, and a plan is generated
Based on the constituent operators of the plan and metadata regarding the source, a minimum viable estimate is generated
For operators that do not hold or expand as data is processed, this is a small static number
For operators that do hold or expand as data is processed, formulas are leveraged to generate this initial allocation based on the stats of the queried datasets
The initial memory allocation for a fragment of execution is made on the executors
Further memory allocations are made throughout the execution of the fragment as needed
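As a rough illustration of the estimation step above, here is a minimal Python sketch. The operator names, the 1 MiB static reservation, and the sizing formula are all invented for illustration; they are not Dremio's actual planner logic:

```python
# Hypothetical sketch of a planner's minimum viable memory estimate.
# Operator names, the static reservation, and the formula are invented.

STATIC_OPERATOR_RESERVATION = 1 << 20  # 1 MiB for pass-through operators


def initial_allocation(operator, row_count, avg_row_bytes):
    """Return a minimum viable memory estimate (bytes) for one operator."""
    if operator in ("scan", "project", "filter"):
        # Operators that do not hold or expand data: small static number.
        return STATIC_OPERATOR_RESERVATION
    if operator in ("hash_join", "hash_agg", "external_sort", "topn"):
        # Buffering operators: derive the estimate from dataset stats.
        return row_count * avg_row_bytes
    raise ValueError(f"unknown operator: {operator}")
```

The executor then makes this initial allocation per fragment and grows it during execution as needed.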
The key differentiator is the final point. Previously, all operators would request as much memory as needed until query completion or memory exhaustion. Existing guard rails did offer some fault tolerance but would often intervene too late and not be able to prevent query failure. For example, spillable operators had their spill trigger point defined during planning; however, if one encountered an out-of-memory exception, it would start spilling instead. This would prevent failures, but such an exception signals there is little headroom available to continue execution. With Memory Arbiter, executors are now able to react more quickly to changes in allocatable memory instead of relying on arbitrary spill points set in the plan or an out-of-memory exception.
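The contrast between the old and new triggers can be sketched in a few lines of Python. The byte values, threshold fraction, and function names here are hypothetical, not Dremio's actual trigger logic:

```python
# Hypothetical contrast between plan-time and executor-wide spill triggers.
# All values and names are invented for illustration.

PLANNED_SPILL_BYTES = 900  # spill point fixed during planning


def should_spill_old(operator_bytes, allocation_failed):
    # Old behavior: spill at a static plan-time threshold, or react to an
    # out-of-memory exception, by which point little headroom remains.
    return operator_bytes >= PLANNED_SPILL_BYTES or allocation_failed


def should_spill_new(executor_used, executor_capacity, threshold=0.8):
    # New behavior: react to executor-wide allocatable memory, shrinking
    # as soon as total usage crosses a fraction of capacity.
    return executor_used >= executor_capacity * threshold
```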
How Memory Arbiter Works
As mentioned above, Memory Arbiter only changes the execution of four operators (as of the time of writing):
Hash Join
Hash Agg
External Sort
TopN
These operators are often the highest memory consumers and, importantly, also have a mechanism to shrink their memory usage. Hash Join, Hash Agg, and External Sort can spill mid-operation, and TopN can discard data structures that were used during processing and are no longer needed for the output. Each executor has its own Memory Arbiter that tracks each instance of these operators, the amount of memory it is using, and, importantly, the amount of memory it is able to give up. If memory usage on an executor crosses a threshold, Memory Arbiter starts requesting that these operators shrink their usage, beginning with the one with the most to give. Memory Arbiter also acts as a middleman for new memory allocation requests from these operators, so requests to shrink can be paired with temporarily blocking new allocations, reducing both existing allocations and the growth rate.
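A minimal Python sketch of the per-executor arbiter behavior described above. The class names, the 80% threshold, and the shrink semantics are invented for illustration and differ from Dremio's real implementation:

```python
# Hypothetical model of a per-executor Memory Arbiter. Names, the
# threshold, and shrink semantics are illustrative only.

class TrackedOperator:
    def __init__(self, name, used, shrinkable):
        self.name = name
        self.used = used              # bytes currently allocated
        self.shrinkable = shrinkable  # bytes it could give up (e.g. by spilling)

    def shrink(self):
        """Release all shrinkable memory (e.g. spill to disk)."""
        freed = self.shrinkable
        self.used -= freed
        self.shrinkable = 0
        return freed


class MemoryArbiter:
    def __init__(self, capacity, threshold=0.8):
        self.capacity = capacity
        self.threshold = threshold
        self.operators = []

    def register(self, op):
        self.operators.append(op)

    def total_used(self):
        return sum(op.used for op in self.operators)

    def request(self, op, amount):
        """Operators route new allocation requests through the arbiter."""
        while self.total_used() + amount > self.capacity * self.threshold:
            # Ask the operator with the most to give to shrink first.
            victim = max(self.operators, key=lambda o: o.shrinkable)
            if victim.shrinkable == 0:
                return False  # nothing left to reclaim: block the request
            victim.shrink()
        op.used += amount
        return True
```

In this toy model, a request that would push usage past the threshold first forces the biggest shrinkable operator to spill; only if no operator has anything left to give is the allocation blocked.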
Memory Arbiter tracks only Direct memory, since that is where data processing happens. However, not all operations in Direct memory are tracked, so despite Memory Arbiter's best efforts, Direct memory exhaustion remains possible.
Testing Memory Arbiter and Spillable Hash Join
The addition of Memory Arbiter and Spillable Hash Join is a significant change that requires extensive testing. We used TPC-DS, our regression suite, and a new suite based on customer queries. The customer queries selected were either very complex or prone to failures before Memory Arbiter. All of these runs were compared against runs of the same version of Dremio with the features disabled.
| Test Suite | Data | Run style | Number of executors |
| --- | --- | --- | --- |
| TPC-DS | 1TB, partitioned and non-partitioned | In-series | 2, 4, 8 |
| TPC-DS | 1TB, partitioned and non-partitioned | Concurrent | 10, 20, 50 |
| TPC-DS | 10TB, partitioned and non-partitioned | In-series | 16 |
| TPC-DS | 100TB, partitioned | In-series | 32 |
| Regression | Synthetic; relatively small to allow for faster cycles | In-series | 8 |
| Customer Queries | Synthetic; same size as the original data set | In-series | 8 |
We targeted zero failures and a performance delta within 5% for all executor counts.
With the inclusion of Memory Arbiter and Spillable Hash Join, Dremio could complete a full 100TB TPC-DS run without failure, which was previously impossible.
Initial Customer Testing
Multiple EE Cloud customers took part in the private beta. Some had been experiencing memory-related issues, while others joined primarily to gauge any perceived performance impact. Comparing same-sized time windows before and after the features were enabled (close to two months on either side), we saw some very positive results for customers facing memory-related issues.
The best result was a 90% decrease in Direct memory allocation errors; the smallest was a 56% decrease. This graph shows the customer that saw the 90% decrease:
MA = Memory Arbiter, SHJ = Spillable Hash Join. The two spikes were driven by an automated process, which repeatedly submitted the same query, even if it had not succeeded before and was destined to fail again.
The graph shows that direct memory errors occurred at a relatively constant background rate, with occasional spikes. After enablement, these errors dropped immediately and stayed low.
The results for customers testing for performance degradation during day-to-day operation were also positive, with one asking if we’d forgotten to switch it on.
Recommended changes to WLM
This recommendation only concerns customers using self-managed Dremio once Memory Arbiter launches in version 25.0. If you have set queue memory limits, we recommend removing them. Queue and job memory limits help self-managed customers avoid the noisy-neighbor problem between workloads of varying priority. However, they are relatively heavy-handed and may intervene even when Memory Arbiter has the situation under control, diminishing its effectiveness.
Closing
With the Cloud general availability release behind us, we’ve been very happy with the results and are looking forward to the imminent 25.0 software release. We are already planning further enhancements, including additional shrinkable operators, a better shrink-selection algorithm, prioritized operators, and integration with other monitoring systems.