Accelerating Queries with Dremio’s Snowflake ARP Connector
What is Dremio?
Faster time to insight lets organizations move quickly to action and bring new product and service capabilities to market sooner. To get there, organizations across the world have spent considerable money on data collection, storage and analytics. For the most part, data collection and storage have been addressed in the past decade. Hadoop and S3-compatible object storage became the de facto standard for storing big data on-prem, and S3, GCS and ADLS became the de facto standard for cloud storage. There is also a long list of data visualization tools, such as Tableau, Looker, Qlik, Business Objects and SiSense. However, the question of how to provide a semantic layer across the enterprise, one that bridges the BI tools and the massive volume of data to deliver faster time to insight, has not been addressed well yet. To close this gap, we need a platform that provides all of the following, does each of them well, and does not depend on other systems to do so:
- Lightning fast query engine for ad-hoc analysis
- Sub-second query performance for frequently accessed data
- Self-service layer for data engineers and analysts
- No vendor lock-in
“Dremio is a data platform that enables businesses to be data-driven by providing faster time to insight with no vendor lock-in”
Dremio addresses the above problems as follows.
- Lightning fast query engine for ad-hoc analysis. Dremio’s scale-out architecture enables querying of large data sets.
- Sub-second query performance for frequently accessed data. Data Reflections provide the sub-second performance.
- Self-service layer for data engineers and analysts. A unified web UI provides curation capabilities that empower data engineers and analysts.
- No vendor lock-in. Dremio works directly with the open source columnar data format Apache Parquet, whether it's in the cloud or on-prem, without depending on another RDBMS or any other execution engine. Data Reflections are also stored in Parquet format in cheap storage like S3 or ADLS.
The Snowflake Connector and ARP framework
Dremio’s Advanced Relational Pushdown (ARP) framework allows data consumers and developers to create custom relational connectors for cases where data is stored in uncommon locations. The ARP framework not only lets you create better connectors with improved pushdown abilities, it also makes connector development more efficient. With ARP you can build a connector for any data source that has a JDBC driver. Even the pushdowns are simply mappings defined in YAML, so almost anyone can create a connector. Using ARP, I started building a connector for Snowflake about a month ago. Today, I’m happy to announce a major release of this connector with the following features:
- Complete data type support
- Pushdown of more than 50 functions
- Verified pushdowns of all TPC-H queries supported by Snowflake
- Join data in your data lake (S3, Azure Storage, Hadoop or other relational sources) with Snowflake. No need to move all your on-prem data to Snowflake, saving you $$$
- Accelerate Snowflake data with Reflections™ to provide interactive sub-second SQL performance. This enables your business to make decisions faster
- Use CTAS to export data from Snowflake to the open source Parquet format in your cheap data lake store. This avoids vendor lock-in
- Use COPY INTO from Snowflake to export large data sets as Parquet to S3/ADLS, then run Dremio on top of those files to make use of Columnar Cloud Cache, and/or join them with data that is still in Snowflake.
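To make the join and CTAS features above more concrete, here is a minimal Python sketch that only builds the SQL text you might submit to Dremio. Every name in it is an assumption for illustration: the source names `snowflake_src` and `s3_lake`, the schemas, tables and key columns, and the `$scratch` target location are hypothetical and are not taken from the connector itself. Adjust them to whatever your own Dremio catalog shows.

```python
def quote_path(*parts: str) -> str:
    """Join path components into a dotted, double-quoted SQL identifier.

    Embedded double quotes are escaped by doubling them.
    """
    return ".".join('"' + part.replace('"', '""') + '"' for part in parts)


def join_sql(snowflake_table, sf_key, lake_table, lake_key) -> str:
    """Build a query that joins a Snowflake table with a data lake table."""
    return (
        "SELECT * FROM {sf} AS sf JOIN {lake} AS lake ON sf.{sk} = lake.{lk}"
    ).format(
        sf=quote_path(*snowflake_table),
        lake=quote_path(*lake_table),
        sk=quote_path(sf_key),
        lk=quote_path(lake_key),
    )


def ctas_export_sql(target, source) -> str:
    """Build a CTAS statement that copies a source table to a target path."""
    return "CREATE TABLE {} AS SELECT * FROM {}".format(
        quote_path(*target), quote_path(*source)
    )


if __name__ == "__main__":
    # Hypothetical catalog paths; replace with the ones in your environment.
    orders = ("snowflake_src", "PUBLIC", "ORDERS")
    customers = ("s3_lake", "sales", "customers")

    print(join_sql(orders, "O_CUSTKEY", customers, "c_custkey"))
    print(ctas_export_sql(("$scratch", "orders_parquet"), orders))
```

The sketch stops at generating SQL strings on purpose. How you submit them (the web UI, JDBC/ODBC or another client) depends on your setup and is left out here.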