Welcome to Dremio!
In this tutorial we’ll orient you to the basic concepts of Dremio so you know your way around. We’ll point you to resources that will help you now and in the future, and we’ll end by suggesting the next tutorial, Working With Your First Dataset.
You can follow this tutorial without installing Dremio. However, we think it will be easier to follow along if you have an installation you can access. Installing Dremio is easy - see the Quickstart for instructions. There are options for Windows, Mac, and Linux.
We also suggest you ask questions on the Dremio Community site - we love to help.
Companies create data in a wide range of technologies, including relational databases, SaaS applications, NoSQL, Amazon S3, Hadoop, and other systems. To make sense of all this data, companies use BI tools like Tableau, Power BI, and Qlik, or data science tools like Python and R. To make data from all the different sources available to all these analytical technologies, companies build data warehouses, data marts, and data lakes, and they move data with ETL, custom scripts, or data prep tools. With this approach, companies build enormous complexity and cost around their data. They also create an environment where business users are entirely dependent on IT.
We built Dremio to help analysts, data scientists, and data engineers be more effective with data. Dremio is different from traditional approaches to data analytics because it combines:
If you’d like to know more about Dremio’s design, we suggest the Dremio Architecture Guide. Or, if you want to know more about how to deploy Dremio in your infrastructure, we suggest the Dremio Deployment Considerations guide.
If you are the first person accessing Dremio, you’ll be asked to create an administrator account:
Once the admin user is set up, users will be asked to log in to Dremio:
Once you’re authenticated, you will see the home screen. If this is a new installation, it will look pretty empty:
Let’s take a look around and start with the bottom left, Sources:
Sources are systems where data is managed in your company. Dremio connects to these sources to access datasets.
Let’s create a source for Amazon S3 to connect to some sample data. Click on the New Source button, and you’ll see the dialog for adding a new source:
Next, select Amazon S3. You’ll then be presented with a configuration screen:
Let’s call this source Samples. Because we’ll be using a public folder, you can leave the two credentials fields blank. Enter samples.dremio.com in the External Buckets field, and click Add. Now click Save. When you return to the home screen you should now see that you have your first source, Samples, listed in your Sources. On the right, you’ll see the public files available to you, which we’ll use in our next tutorial:
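The values entered in the dialog above can also be written down as data, which is handy if you want to keep a record of how the source was configured. The following is a minimal sketch in Python, not an official client. The function name `build_s3_source` is hypothetical, and the field names (`entityType`, `type`, `config`, `accessKey`, `accessSecret`, `externalBucketList`) are assumptions modeled on Dremio's REST catalog API, so check them against the API reference for your version. The code only builds and prints a dictionary and makes no network calls.

```python
import json


def build_s3_source(name, external_buckets, access_key="", access_secret=""):
    """Describe an S3 source like the one configured in the dialog.

    Field names are assumptions modeled on Dremio's REST catalog API;
    verify them against the documentation for your Dremio version.
    """
    if not name:
        raise ValueError("source name must not be empty")
    return {
        "entityType": "source",
        "type": "S3",
        "name": name,
        "config": {
            # Left blank here because the sample bucket is public.
            "accessKey": access_key,
            "accessSecret": access_secret,
            # Corresponds to the External Buckets field in the dialog.
            "externalBucketList": list(external_buckets),
        },
    }


samples = build_s3_source("Samples", ["samples.dremio.com"])
print(json.dumps(samples, indent=2))
```

Keeping the definition as plain data makes it easy to review or version alongside other configuration, whichever way you eventually create the source.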
In our next tutorial we’ll take a look at how to work with these files. For now, let’s finish getting oriented to Dremio.
Next, let’s create a space. Spaces allow users to collaborate around virtual datasets. You might create a space for a project, or for a team or department. Let’s create a space called “Stargate.” Click on the “New Space” button on the left side of the screen:
By default, new spaces are public and visible to everyone. You can also configure which users have access to a space, either now or after the space is created. Let’s leave this space public.
Click Save to finish creating the ‘Stargate’ space. The screen should now show Stargate as a space:
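If you would rather script this step than click through the dialog, a space can be described with a very small payload. The sketch below is illustrative only: the helper `build_space` is hypothetical, and the `entityType` and `name` fields are assumptions modeled on Dremio's REST catalog API that you should confirm in the API reference for your version. It builds and prints the payload without contacting a server.

```python
import json


def build_space(name):
    """Describe a space such as "Stargate" as a plain dictionary.

    The field names are assumptions modeled on Dremio's REST catalog API.
    """
    cleaned = name.strip()
    if not cleaned:
        raise ValueError("space name must not be empty")
    return {"entityType": "space", "name": cleaned}


stargate = build_space("Stargate")
print(json.dumps(stargate))
```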
At the moment there’s nothing in this space. Users can save their virtual datasets in spaces they have “Can Edit” access to. We’ll take a closer look at these abilities in our next tutorial.
Now, just above our spaces, we can see our user name next to a home icon.
This is your personal space. If you want to upload a file from your desktop, such as an Excel spreadsheet, it will be stored in your personal space and will not be visible to other users. Datasets in your personal space are like any other dataset in Dremio - they can be queried, joined to other datasets, analyzed with BI tools or Python, and so on.
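When you query datasets with SQL, sources, spaces, and your personal space are all addressed with dotted paths. The sketch below shows one way to assemble such a path in Python. Several things in it are assumptions for illustration: the helper names `quote_part` and `dataset_path`, the file and dataset names (`example.json`, `incidents`, `budget`), the user name `alice`, the convention that a personal space is written as `@` followed by the user name, and the quoting rule (double quotes around any component that is not a simple identifier, with embedded quotes doubled). Reserved words would also need quoting, which this sketch does not handle.

```python
import re

# A component made only of letters, digits, and underscores is left unquoted.
_SIMPLE = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")


def quote_part(part):
    """Wrap a path component in double quotes unless it is a simple identifier."""
    if _SIMPLE.match(part):
        return part
    return '"' + part.replace('"', '""') + '"'


def dataset_path(*parts):
    """Join path components into a dotted dataset path."""
    if not parts:
        raise ValueError("at least one path component is required")
    return ".".join(quote_part(p) for p in parts)


# A file in the Samples source (file name is hypothetical).
in_source = dataset_path("Samples", "samples.dremio.com", "example.json")
# A virtual dataset saved in the Stargate space (dataset name is hypothetical).
in_space = dataset_path("Stargate", "incidents")
# An uploaded file in a personal space (user and dataset names are hypothetical).
in_home = dataset_path("@alice", "budget")

for path in (in_source, in_space, in_home):
    print("SELECT * FROM " + path + " LIMIT 10")
```

Because every location uses the same path style, a query can join a dataset from your personal space with one from a source or a shared space without any special syntax.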
Now let’s take a closer look at the bar on the top left of the screen:
To wrap things up, let’s take a look at the buttons on the upper right:
Now that you have a sense of how Dremio works, it’s time to get started working with some data. In the next tutorial we’ll look at Working With Your First Dataset.