How to Query Your Data Lake Using SQL Parameters in Excel

   

Table of Contents

Introduction

There are a myriad of BI tools available today. While they offer a great variety of features for almost every data analysis use case imaginable, spreadsheets are still the most popular way to consume data. If you are one of the whopping 800 million Microsoft Excel users, read on to learn how to tackle one of the most common challenges when trying to query data – from Microsoft Excel – directly on your data lake. In order to make the most out of your data lake, it is important to have swift access to your data, this will allow you to create interactive reports that will support you or your stakeholders to make timely data-driven decisions to improve operations in your organization. A common challenge when creating these reports is the real-time requirements that they bring along, i.e: the ability to change parameters in the report to satisfy data requirements by different users or departments, custom levels of granularity, etc. Up to this point, a normal workaround for this challenge was to create a static report based on each request. In this tutorial I will guide you through the process of how you can connect directly to Dremio via Excel, we show you how to write queries in Excel that get executed in Dremio whenever the sheet is refreshed and we show you how you can parameterise these queries to make Excel’s interactions with the data more dynamic. This tutorial assumes you already have the Dremio ODBC driver installed on the machine where you will run Excel, and that you can connect successfully to Dremio from this machine. If you are not familiar with, or what to refresh your Dremio basics, take a look at our Getting Oriented to Dremio tutorial before proceeding.

Add connection to Dremio and define a query on that connection

The first step in the process is to connect to Dremio in order to define the data that we eventually want to parameterize.

  1. Under the Data tab select Get Data > From Other Sources > From ODBC image alt text
  2. Select Dremio Connector from the list of Data source names, then expand the Advanced options: image alt text
  3. In the SQL Statement (optional) box, enter a valid Dremio query, then click OK. RECOMMENDATION: Add a WHERE clause that represents the type of filtering you want to perform when any parameters are introduced: image alt text
  4. When the query has executed you will see a results summary window, click Load image alt text

Now we have a query prepared and ready for a parameter to be added.

Create Excel Query Function

Before we attempt to add a parameter to the Dremio query, one of the key pieces to enabling a seamless parameterization experience is creating a generic Excel query function that allows you to specify a named Excel cell as input and provides the value in that cell as output.This is essentially a one-time task, the function can be used many times across all sheets in a single workbook. In this tutorial we assume the function is called fGetValue, but you can call it whatever you like.

  1. In Excel, under the Data tab select Get Data > From Other Sources > Blank Query image alt text
  2. In the Power Query Editor that is presented, select Advanced Editor image alt text
  3. Paste the following function into the Advanced Editor window, then click Done:
<figure class="highlight">
<pre><code class="language-sql" data-lang="sql"></code></pre>
<table class="rouge-table">
<tbody>
<tr>
<td class="gutter gl">
<pre class="lineno">1 2 3 4 5 6 7</pre>
</td>
<td class="code">
<pre><span class="n">let</span> <span class="n">fGetValue</span><span class="o">=</span><span class="p">(</span><span class="n">rangeName</span><span class="p">)</span> <span class="o">=&gt;</span> <span class="n">let</span> <span class="n">name</span> <span class="o">=</span> <span class="n">Excel</span><span class="p">.</span><span class="n">CurrentWorkbook</span><span class="p">()</span><span class="err">{</span><span class="p">[</span><span class="n">Name</span><span class="o">=</span><span class="n">rangeName</span><span class="p">]</span><span class="err">}</span><span class="p">[</span><span class="n">Content</span><span class="p">],</span> <span class="n">value</span> <span class="o">=</span> <span class="n">name</span><span class="err">{</span><span class="mi">0</span><span class="err">}</span><span class="p">[</span><span class="n">Column1</span><span class="p">]</span> <span class="k">in</span> <span class="n">value</span> <span class="k">in</span> <span class="n">fGetValue</span></pre>
</td>
</tr>
</tbody>
</table>
<pre><code class="language-sql" data-lang="sql"></code></pre>
<p><code class="language-sql" data-lang="sql"></code><code class="language-sql" data-lang="sql"></code></p>
</figure>
image alt text
  1. Change the name of the query to fGetValue, then click Close & Load: image alt text

Add a parameter to a Dremio SQL query in Excel

With our Dremio SQL query and our generic function now ready, we are ready to add the function into the SQL definition in place of any fixed value we may have added to a WHERE clause in the original query. If no WHERE clause was initially present then now will be the time to add one.

  1. Right-Click on a cell in Excel that you want to use as a value for a parameter, select Define Name image alt text
  2. Provide a name for the column, in this case we choose to name the column region2, notice how it captures the specific cell this name refers to. Click OK:
    image alt text
  3. Right-click on the data table that was previously loaded into Excel and select Table > Edit Query from the popup menu
    image alt text
  4. Select Advanced Editor image alt text
  5. In the SQL Query dialog, replace any fixed value predicates with calls to the previously created function instead, passing the name of the cell containing the parameter value into the function, then click Done. Example:
let
    Source = Odbc.Query("dsn=Dremio Connector",
        "SELECT COUNTRIES.COUNTRY_ID, COUNTRIES.COUNTRY_NAME, COUNTRIES.REGION_ID
        FROM ""Staging.Application"".COUNTRIES WHERE COUNTRIES.REGION_ID = " & Number.ToText(fGetValue("region2")))
in
    Source
image alt text
  1. Back in the Power Query Editor, click Close & Load image alt text
  2. Now we can change the value in the cell whose name we changed to region2, then click Refresh All: image alt text
  3. At this point a new query will run in Dremio with the parameter set to our new value and the results will be returned to Excel: image alt text

image alt text

Wrapping Up!

Keeping efficiency in mind is fundamental when working with data. In this tutorial we went through the steps to create parameters in SQL queries when using Microsoft Excel to consume data from your data lake using Dremio. Following these steps will help you increase the efficiency of reporting activities, especially when these reports require real-time abilities such as custom data granularity based on who is viewing the data. We saw how this is possible by implementing parameters that will allow you to use the same report to satisfy the data needs of multiple users instead of creating a static report for each one of them. To learn more about Dremio visit our tutorials and resources, also if you would like to experiment with Dremio in your own virtual lab, go ahead and checkout Dremio University, and if you have any questions visit our community forums where we are all eager to help.

Ready to Get Started? Here Are Some Resources to Help

Whitepaper Thumb

Whitepaper

Dremio Upgrade Testing Framework

read more
Whitepaper Thumb

Whitepaper

Operating Dremio Cloud Runbook

read more

Webinars

Unlock the Power of a Data Lakehouse with Dremio Cloud

read more
get started

Get Started Free

No time limit - totally free - just the way you like it.

Sign Up Now
demo on demand

See Dremio in Action

Not ready to get started today? See the platform in action.

Watch Demo
talk expert

Talk to an Expert

Not sure where to start? Get your questions answered fast.

Contact Us

Ready to Get Started?

Bring your users closer to the data with organization-wide self-service analytics and lakehouse flexibility, scalability, and performance at a fraction of the cost. Run Dremio anywhere with self-managed software or Dremio Cloud.