Dremio Jekyll

Setting Up Dremio On AWS With Oracle

Intro

In the next series of posts I want to cover several scenarios using Dremio and relational databases with the most popular cloud providers.

The first sample will be devoted to the Amazon Web Services and Oracle database. I will guide you through the process of creating virtual Linux machine and Oracle database instance, as well as Dremio installation and connecting it with the database.

Assumptions

In this tutorial we’ll use the free tier of AWS and Dremio Community Edition, so you should be able to follow free and easily. We assume you are familiar with the basics of Dremio, so if you haven’t already, we suggest you read Getting Oriented to Dremio.

We also suggest you ask questions on the Dremio Community site - we love to help.

Getting Started With Oracle on AWS

First of all, you must have an AWS account. If you don’t have an AWS account, open https://aws.amazon.com/, and then choose Create an AWS Account. Amazon provides one year free tier for the minimal services configuration, which can be used for the first steps with AWS. In this tutorial I assume using only free tier services.

Once you have Amazon account, Oracle instance can be launched. First sign in to the AWS Management Console (a button at the top right corner of the home screen), then open RDS (relational database service) link under the Database header:

AWS Services

In the top right corner of the AWS Management Console, choose the region in which you want to create the DB instance - I am using EU (Frankfurt). Then click Instances* under the RDS Dashboard header. If you are new to RDS, you’ll get an empty DB instances list:

List of DB Instances

Choose Launch DB Instance to start the Launch DB Instance Wizard, set Free tier eligible only, and select Oracle:

View Oracle RDS options

Select Oracle EE and fill the necessary fields:

Chose Oracle EE RDS

Setting Up VPC

Choose Next Step. Here you can choose default (VPC) option for the VPC Security Group(s), because later we will adjust default security group to allow access from outside VPC. Also the default VPC code for the VPC will be different for your account.

Configure VPC

Choose Launch DB Instance and then View DB Instances. You will get your DB instances list like this:

View your DB instances

You should wait about 15-20 min while the Status becomes Available. Then you can click the Details icon and get full information about the new DB instance:

Details about your DB instance

The next thing we need is to get an access to our DB instance. First it is necessary to allow external inbound connections, choose the default link under the Security Group header. You will see the Security Groups list, which includes one default security group:

Details about the Security Group

Open Inbound tab and click Edit button, Edit Inbound rules dialog will be opened. For this simple scenario we can add one rule for the Oracle client TCP connection from your public IP, and later one inbound rule for the EC2 instance. After choosing My IP as a Source, your public IP address will be automatically determined.

Configuring your IP

Setting Up Oracle SQL Developer

After saving the inbound rule we need to set up some client software, which give us a possibility to work with the Oracle database instance. I am using Oracle SQL Developer, it is a is a free integrated development environment that simplifies the development and management of Oracle Database in both traditional and Cloud deployments.

Before setting up new connection in SQL Developer, open DB instance details and copy the Endpoint address

Copy the endpoint address

Open SQL Developer and choose New connection:

Create a new connection in SQL Developer

New / Select Database Connection dialog will be opened. Fill required fields and paste Endpoint address into the Hostname.

Enter the endpoint address for Oracle hostname

Press Test to check the connection and then click Connect, and a new connection will be saved and opened.

Just for the learning purposes let us create simple table and insert a data. Open new SQL Worksheet:

Opening a new SQL worksheet

Creating Test Table

The next step is to run two simple sql scripts - create TEST table in the TEST schema (which was created when we launched a DB instance) and insert a data:

1
2
3
4
5
6
7
create table test

(TEST varchar(2000));

insert into test values ('Hello world!');

commit;

Executing the test table creation SQL

Setting Up Dremio

Now we have running Oracle DB instance, and the next task is to launch Linux instance, install Dremio and then create dataset, which reflects Oracle TEST table. Choose EC2 from the dropdown Services menu:

Choose EC2 from services menu

EC2 Dashboard will be opened:

EC2 dashboard

Press Launch Instance to open instance launching wizard:

Launch instance wizard

Check Free tier only, and select Linux image. I am using Ubuntu Server 16.04:

View your new Ubuntu image

Free tier eligible configuration is checked by default, so choose Review and Launch:

image alt text

Configuring Your Key Pair

Linux machine will be run in the same VPC (virtual private cluster), but with the new security group. By default only SSH inbound connection is allowed, later we will add new inbound rule for the Dremio. Choose Launch, and If you don’t have any Amazon key pair, you can create a new one in the opened dialog window:

Review and launch your new image

Download dremio-oracle.pem file, click Launch Instance and then View Instances at the right down corner:

Connect to your instance

To test connection to the instance, choose Connect, help dialog will be opened:

Test your connection to the instance

Setting Up Java

Now you can copy connection string, open terminal window, if you are using Linux on your client computer, and then establish ssh connection:

Setting up ssh connection

So we have running Oracle database and EC2 Ubuntu instances on Amazon cloud. Since we want to install Dremio on the EC2 instance, Java installation is also required. It could be done using apt-get utility - connect to the EC2 instance using ssh and run next commands:

1
2
3
4
5
6
7
8
9
10
11
# Update packages:

$ sudo apt-get update

# Install Java:

$ sudo apt-get install default-jre

# After Java installation you can check the result:

$ java -version

View output of java version

Installing Dremio

After that we are creating dremio folder for the better file structuring, and download Dremio archive:

1
2
3
4
5
6
7
8
9
# Create directory for Dremio.

$ mkdir dremio

$ cd dremio

# Download from www.dremio.com/download. Be sure to get the latest version.

$ wget http://download.dremio.com/community-server/1.1.0-201708121825170680-436784e/dremio-community-1.1.0-201708121825170680-436784e.tar.gz

Downloading Dremio

Now we can extract Dremio files, rename Dremio folder (just for the commands simplifying), and run the service:

1
2
3
4
5
6
7
8
9
10
11
# Untar the file you downloaded. Be sure to download the latest version of Dremio.

$ tar -xvzf dremio-community-1.1.0-201708121825170680-436784e.tar.gz

# Rename the process

$ mv dremio-community-1.1.0-201708121825170680-436784e dremio-community-1.1.0

# Start Dremio

$ dremio-community-1.1.0/bin/dremio start

Starting Dremio

Configuring Network Access

As I mentioned above, new inbound rule for the EC2 instance is required to allow external access to the Dremio port 9047. Go to the EC2 Dashboard and choose Security Groups:

Configuring AWS Security Groups

Click corresponding security group launch-wizard-1, choose Inbound tab and click Edit to open inbound rules editor. Add new rule:

Adding a new rule

After selecting My IP option, your public IP address will be automatically determined. After saving rules, Dremio server can be reached on the EC2 public DNS name and port 9047.

Also we need to add inbound rule for the RDS instance, to allow connection from the EC2 (i.e. from the Dremio). First copy launch-wizard-1 group id to the clipboard. Click corresponding security group default, choose Inbound tab and click Edit to open inbound rules editor. Add new rule:

Adding a new rule for your security group

Setting Up Dremio

Now we can open Dremio using EC2 public DNS name:

Accessing Dremio running on your EC2 instance

After filling all necessary fields on the first Dremio page, we can see the standard Dremio home page for a new installation:

Viewing the Dremio home page.

We are now ready to proceed with the Oracle source and sample dataset!

Setting Up The Oracle Data Source

Before adding Oracle source, copy RDS Endpoint to the clipboard as we did for the SQL developer connection. Choose Add Source and select Oracle, enter connection information:

Creating the connection to Oracle

After saving source definition, we can see available Oracle schemas:

Viewing available Oracle schemas

Open TEST schema and click on the TEST physical dataset. That is what we need - get data from the Oracle table using Dremio query:

Viewing contents of Oracle TEST table

Save new virtual dataset into your own space:

Create a new VDS in Dremio

In this tutorial has been built typical scenario, how to use Dremio and relational database within Amazon cloud. Scenario includes the following steps:

  • Launching Oracle database service
  • Launching EC2 virtual Linux machine
  • Running Dremio within EC2 machine
  • Establish connection between Dremio and Oracle