July 22, 2021

Increasing Performance with Arrow and Gandiva

In this talk, Vivek will start with an overview of how Arrow represents columnar data and why that representation is more efficient on modern processors. He will then introduce Gandiva and explain 1) how it uses LLVM to generate optimized compiled code for expressions, and 2) how it leverages SIMD instructions to gain performance. As a demonstration, Vivek will use Dremio to show how a data lake engine can use Gandiva to improve SQL query processing. To wrap up, Vivek will give a glimpse of ongoing work in Gandiva, such as a project to improve code generation.
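For readers new to the columnar idea behind the talk, here is a minimal sketch in plain Python. It is not Arrow's or Gandiva's actual code, and pure Python gains nothing from SIMD; it only illustrates the layout: each column is stored as one contiguous buffer of values plus a validity bitmap marking nulls, and an expression is evaluated over whole columns rather than row by row. The sample data and the helper names (`to_column`, `is_valid`, `multiply_columns`) are hypothetical.

```python
from array import array

# Hypothetical row-oriented records: one Python object per row.
rows = [
    {"price": 10, "qty": 2},
    {"price": 20, "qty": None},  # missing value
    {"price": 30, "qty": 4},
]


def to_column(rows, name):
    """Build a simplified Arrow-style column: contiguous values + validity bitmap."""
    values = array("q")                          # contiguous 64-bit signed integers
    validity = bytearray((len(rows) + 7) // 8)   # 1 bit per row, least-significant bit first
    for i, row in enumerate(rows):
        v = row[name]
        if v is None:
            values.append(0)                     # placeholder slot; validity bit stays 0
        else:
            values.append(v)
            validity[i // 8] |= 1 << (i % 8)     # mark row i as non-null
    return values, validity


def is_valid(validity, i):
    """Return True if row i is non-null according to the bitmap."""
    return bool(validity[i // 8] & (1 << (i % 8)))


def multiply_columns(a, b):
    """Evaluate a * b over whole columns; the result is null where either input is null."""
    (a_vals, a_valid), (b_vals, b_valid) = a, b
    out_vals = array("q", (x * y for x, y in zip(a_vals, b_vals)))
    out_valid = bytearray(x & y for x, y in zip(a_valid, b_valid))  # AND the bitmaps bytewise
    return out_vals, out_valid


price = to_column(rows, "price")
qty = to_column(rows, "qty")
total_vals, total_valid = multiply_columns(price, qty)

result = [v if is_valid(total_valid, i) else None for i, v in enumerate(total_vals)]
print(result)  # [20, None, 120]
```

Because the values sit side by side in memory and null handling is reduced to a bitwise AND over bitmaps, the inner loop has no per-row branching. That is the property that lets a compiled engine process several values per instruction.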

Topics Covered

Dremio Subsurface for Apache Arrow

In-Memory Formats

Speakers

Vivekanand Vellanki Dremio Author & Contributor

Vivek Vellanki

Vivek Vellanki comes from a systems programming background, having worked at Microsoft and MapR prior to joining Dremio. His expertise is in the areas of distributed systems, performance, Hadoop and SQL query engines.
