July 21, 2024

Flight as a Service (FaaS) for Data Pipelines: Combining fast Python functions with Dremio through Arrow

As the lakehouse architecture drives data access, new users need support for multiple use cases and the runtimes that serve them. Using Dremio for Flight streams and Arrow tables, we explain how we built a fast, serverless Python runtime for data pipelines that mix SQL and Python. The result is a developer experience that combines ease of use with support for a wide range of use cases. When multi-language pipelines run in our own cloud, we gain a high-trust, secure, and cost-effective environment. Thanks to open formats and the composable data stack, the entire organization can access a consistent, language-independent data representation through a shared Nessie catalog.

Topics Covered

Data Catalogue
Data Lakehouse
Data Science
In-Memory Formats
