
How To Use Inbound Impersonation

Mark Johnson

REQUIREMENTS

  • A Dremio 4.x+ cluster is installed and accessible
  • The latest Dremio ODBC driver is installed on the client machine (the machine where Tableau, Python, etc. reside)

CONFIGURE DREMIO FOR INBOUND IMPERSONATION

Step 1: Create ‘depta_user’ and ‘deptb_user’ with the ‘User’ role within Dremio. These users will only be able to query the datasets for which they have permissions.


Step 2: Create a service account (in this case ‘tpcds_service’) to act as the generic access account for the specific data source or dataset.


Step 3: Grant the ‘tpcds_service’ user access to a specific data source (or dataset). In this case, we permit queries only on the ‘tpcds-Hive3.default’ data source directory.

Step 4: Set up the inbound impersonation policies and confirm that the exec.impersonation.inbound_policies setting has been updated.
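As a rough illustration of Step 4, the sketch below builds the SQL statement that could set such a policy. It assumes the support key is `exec.impersonation.inbound_policies` and that a policy is a JSON list pairing `proxy_principals` with `target_principals`. Neither detail is spelled out in this article, so check both against the documentation for your Dremio version. The helper name `build_policy_statement` is hypothetical.

```python
import json

# Assumption: the support key and policy layout follow Dremio's inbound
# impersonation format; verify against the docs for your version.
POLICY_KEY = "exec.impersonation.inbound_policies"


def build_policy_statement(proxy_users, target_users):
    """Return an ALTER SYSTEM statement mapping proxy users to target users."""
    policies = [
        {
            "proxy_principals": {"users": list(proxy_users)},
            "target_principals": {"users": list(target_users)},
        }
    ]
    # The policy list is passed as a single-quoted JSON string literal.
    return "ALTER SYSTEM SET \"{}\" = '{}'".format(POLICY_KEY, json.dumps(policies))


# Only depta_user may run queries as tpcds_service; deptb_user is left out.
statement = build_policy_statement(["depta_user"], ["tpcds_service"])
print(statement)
```

The generated statement would be run by an administrator in the Dremio SQL editor. Listing only ‘depta_user’ as a proxy user is what later prevents ‘deptb_user’ from using the service account.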


The cluster is now configured to support Inbound Impersonation for ODBC-related queries using depta_user and deptb_user. The following sections give examples of how to use Inbound Impersonation from Python and from BI tools such as Tableau (other ODBC-related tools follow a similar pattern).

Python ODBC Query example

Building on the configuration steps above, we will walk through an example in which a Python program connects as the requesting user and specifies its matching delegated user.

ODBC Property | Value | Comments
--- | --- | ---
UID | depta_user | This user does not require permission on the targeted dataset. However, it MUST already have been mapped to the specified DelegationUID before running the ODBC program, or you will receive the message "pyodbc.InterfaceError: ('28000', u"[28000] [Dremio][Connector] (40) User authentication failed. Server message: [30017]User authentication failed"
DelegationUID | tpcds_service | The user specified as DelegationUID must have sharing rights on the Dremio table being queried, or you will see the “User authentication failed” message. This DelegationUID must also have been mapped to the user given in the UID ODBC property.

Step 1: Create your Python ODBC program as shown below

import pandas
import pyodbc

# Connection settings for the Dremio cluster
host = "dremio-c1"
port = 31010
uid = "depta_user"      # requesting user that authenticates
pwd = "dremio123"
driver = "/Library/Dremio/ODBC/lib/libdrillodbc_sbu.dylib"
duid = "tpcds_service"  # delegated user the query runs as

cnxn = pyodbc.connect(
    "Driver={};ConnectionType=Direct;HOST={};PORT={};AuthenticationType=Plain;UID={};PWD={};DelegationUID={}".format(
        driver, host, port, uid, pwd, duid
    ),
    autocommit=True,
)

sql = '''select * from "tpcds-Hive3"."default".customer'''

# Run the query and load the result into a DataFrame
dataframe = pandas.read_sql(sql, cnxn)
print(dataframe)

Step 2: Run the Python program.

The query completes successfully.

Step 3: Validate that the query was handled by the ‘tpcds_service’ user and not the ‘depta_user’ user by looking at the Dremio Job Logs screen.

The job log shows the query type as ODBCClient, which matches how we submitted the sample Python query, and it lists the tpcds_service account rather than depta_user.

Step 4: Validate that an unauthorized user cannot leverage the DelegationUID

Continuing the example, we run the same query using ‘deptb_user’.

This time we receive the error message “Proxy user ‘deptb_user’ is not authorized to impersonate target user ‘tpcds_service’.” This confirms that a Python program cannot inappropriately hijack a delegation user.
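The two failure modes described in this article can be told apart by their message text. The following is a minimal sketch of that idea, using only the message fragments quoted above. The helper name `classify_connection_error` and the returned labels are hypothetical, and matching on message text is an assumption that may not hold across driver versions.

```python
def classify_connection_error(message):
    """Map a Dremio ODBC error message to a likely cause (hypothetical helper)."""
    text = message.lower()
    # The requesting user is not allowed to act as the DelegationUID user.
    if "not authorized to impersonate" in text:
        return "impersonation_denied"
    # Missing mapping or missing sharing rights, per the property table above.
    if "user authentication failed" in text:
        return "authentication_failed"
    return "unknown"


denied = classify_connection_error(
    "Proxy user 'deptb_user' is not authorized to impersonate "
    "target user 'tpcds_service'."
)
failed = classify_connection_error(
    "[28000] [Dremio][Connector] (40) User authentication failed. "
    "Server message: [30017]User authentication failed"
)
print(denied, failed)
```

In a real program this could be called from an `except pyodbc.Error as exc:` handler around `pyodbc.connect`, passing `str(exc)` as the message.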