h2h2h2h2h2h2h2

15 minute read · July 29, 2022

New: Row-Level and Column-Level Access Controls

Albert Vernon

Albert Vernon · Senior Product Manager, Dremio

Introduction

Dremio v16.0 introduced role-based access control (RBAC), which allows you to grant privileges to users for controlling access to objects in your cluster such as tables, views, and folders. For example, you could give Alice the SELECT privilege on a table, which lets her read from that table, or you could give Bob the ALTER privilege on a view, which lets him modify the definition of that view.

RBAC privileges apply to entire objects, but sometimes it is useful to control access at the level of individual rows or columns. To accommodate this requirement, Dremio Cloud and Dremio Software v22.0 introduce fine-grained access control, which allows you to specify column-masking and row-access policies without an external server such as Apache Ranger. You can add, remove, and view policies with SQL commands. You can set row and column policies on tables, views, and columns from most data sources including object storage and database sources. The remainder of this article describes how to create and set column-masking and row-access policies in Dremio.

Specifying Your Policies

Column-masking policies allow you to substitute values in a given column. Row-access policies let you exclude rows before returning a result set to the client. In both cases, you specify your policies as user-defined functions (UDFs), which Dremio uses to control access to rows and columns. UDFs are callable routines that take input parameters and return a single value. Expressing a policy as a UDF allows you to define the logic in one place and then set it on multiple objects. It also permits modifying your policy and having the changes apply everywhere you have set the policy.

You can also list, view, and remove UDFs as described in the blog post: Announcing Scalar User-Defined Functions.

Creating a User-Defined Function

You create a UDF with the following syntax:

CREATE FUNCTION
	function_name ( [ function_parameter [, ...] ] )
    RETURNS { data_type }
    RETURN { query }
function_parameter:
	parameter_name data_type

Here is an example that returns the product of two numbers:

CREATE FUNCTION multiply (x INT, y INT) RETURNS INT RETURN SELECT x * y;
SELECT multiply(2, 3);
-- 6

Within the body of a UDF, it can be useful to call QUERY_USER to identify the current user or IS_MEMBER to see if the current user is a member of a given role. In this way you can control access to rows and columns depending on who is running the query. The examples below use these functions.

Column-Masking Policies

Column-masking policies allow you to substitute or transform values in a given column. You typically do this when you want to redact private or sensitive columns before returning them to the client. For example, a policy could return "XXX-XX-XXXX" instead of a social security number, or a policy could return only the last four digits of a credit card number. A UDF serving as column-masking policy must accept and return the same data type as the column it is masking. In addition, a given column can have at most one column-masking policy. Setting the masking policy of a column that already has a masking policy will replace the previous one.

Setting a Column-Masking Policy

After creating a suitable UDF to serve as a column-masking policy, you can use ALTER TABLE … SET MASKING POLICY on an existing table or CREATE TABLE … MASKING POLICY when creating a new table. 

The example below has a table named employees with a sensitive column ssn_col. The UDF protect_ssn replaces the first five characters with Xs unless the current user is Dave or Mike.

-- Create UDF to serve as the column-masking policy.
CREATE FUNCTION protect_ssn (val VARCHAR) RETURNS VARCHAR RETURN
	SELECT
    	CASE
            WHEN query_user() IN ('dave', 'mike') THEN val
            ELSE CONCAT('XXX-XX-', SUBSTR(val, 8, 4))
        END;
-- Set the UDF as a policy when creating a table.
CREATE TABLE $scratch.employees (
	name VARCHAR
  	,ssn_col VARCHAR MASKING POLICY protect_ssn (ssn_col)
  	,department VARCHAR
);
-- You can also set the UDF as a policy when modifying a table.
ALTER TABLE $scratch.employees 
	MODIFY COLUMN ssn_col 
    SET MASKING POLICY protect_ssn (ssn_col);
-- Insert some test data.
INSERT INTO $scratch.employees VALUES ('Charlie', '592-32-8562', 'Quality Assurance');
-- Notice that the sensitive column has been redacted.
SELECT * FROM $scratch.employees;
-- Charlie, XXX-XX-8562, Quality Assurance

Unsetting a Column-Masking Policy

To remove a column-masking policy, use ALTER TABLE … UNSET MASKING POLICY, for example:

ALTER TABLE $scratch.employees
	MODIFY COLUMN ssn_col
    UNSET MASKING POLICY protect_ssn

To learn more, see the documentation for Dremio Cloud and Dremio Software.

Row-Access Policies

Row-access policies let you exclude entire rows before returning a result set to the client. For example, a policy could skip rows if the current user is forbidden to see them. A UDF serving as a row-access policy must return a boolean value (true or false). A table can have multiple row-access policies.

Adding a Row-Access Policy

After creating a suitable UDF to serve as a row-access policy, you can use ALTER TABLE … ADD ROW ACCESS POLICY on an existing table or CREATE TABLE … ROW ACCESS POLICY when creating a new table.

The example below has a table named officers with a column assignment. The UDF hide_undercover protects the identities of undercover officers unless the current user has the role of "chief" or "captain."

-- Create UDF to serve as the row-access policy.
CREATE FUNCTION hide_undercover (val VARCHAR) RETURNS BOOLEAN RETURN
	SELECT
		CASE
        	WHEN is_member('chief') or is_member('captain') THEN TRUE
            WHEN val = 'undercover' THEN FALSE
            ELSE TRUE
        END;
-- Set the UDF as a policy when creating a table.
CREATE TABLE $scratch.officers (
  	name VARCHAR
  	,assignment VARCHAR)
    ROW ACCESS POLICY hide_undercover(assignment);
-- You can also set the UDF as a policy when modifying a table.
ALTER TABLE $scratch.officers ADD ROW ACCESS POLICY hide_undercover(assignment);
-- Insert some test data.
INSERT INTO $scratch.officers VALUES ('Deckard', 'undercover'), ('Murphy', 'patrol');
-- Notice that the undercover officer is not in the results.
SELECT * from $scratch.officers;
-- Murphy, patrol

Removing a Row-Access Policy

To remove a row-access policy, use ALTER TABLE … DROP ROW ACCESS POLICY, for example:

ALTER TABLE $scratch.officers DROP ROW ACCESS POLICY hide_undercover(assignment);

To learn more, see the documentation for Dremio Cloud and Dremio Software.

Showing Your Policies

In Dremio Cloud, you can view row-access and column-masking policies with the following queries:

SELECT view_name,  masking_policies, row_access_policies FROM sys.project.views;
SELECT table_name, masking_policies, row_access_policies FROM sys.project."tables";

In Dremio Software, you can view row-access and column-masking policies with the following queries:

SELECT view_name,  masking_policies, row_access_policies FROM sys.views;
SELECT table_name, masking_policies, row_access_policies FROM sys."tables";

In both Dremio Cloud and Dremio Software, you can run DESCRIBE TABLE to see what column-masking policies have been set on a given table.

Healthcare Example

To illustrate the concepts from above, consider the following scenarios that demonstrate column-masking and row-access policies.

Scenario 1: Data Redaction

A hospital maintains a table of medical records. Columns containing the medical record number, patient name, and diagnosis are protected health information (PHI) that must only be shown to users in the "MEDICAL" role. Users such as billing clerks who are not in that role must see "[redacted]" instead.

-- Create a table to hold medical records.
CREATE TABLE $scratch.records (
  patient_mrn VARCHAR
  ,patient_name VARCHAR
  ,date_of_service DATE
  ,diagnosis VARCHAR
  ,is_vip BOOLEAN);
-- Insert some test data.
INSERT INTO $scratch.records VALUES ('00847571', 'John Smith', '2022-04-26', 'cough', false);
INSERT INTO $scratch.records VALUES ('00195483', 'Jane Doe', '2021-03-16', 'rash', false);
-- Create UDF to serve as the column-masking policy.
CREATE FUNCTION mask_string (val VARCHAR)
	RETURNS VARCHAR
    RETURN
    	SELECT
        CASE WHEN is_member ('MEDICAL') THEN val
        	ELSE '[redacted]'
        END;
-- Set the UDF as a policy for the sensitive columns.
ALTER TABLE $scratch.records MODIFY COLUMN patient_mrn
	SET MASKING POLICY mask_string (patient_mrn);
ALTER TABLE $scratch.records MODIFY COLUMN patient_name
	SET MASKING POLICY mask_string (patient_name);
ALTER TABLE $scratch.records MODIFY COLUMN diagnosis
	SET MASKING POLICY mask_string (diagnosis);
-- When logged in as a user that does not have the "MEDICAL" role, PHI columns are redacted.
SELECT * from $scratch.records;
-- [redacted], [redacted], 2022-04-26, [redacted], false
-- [redacted], [redacted], 2021-03-16, [redacted], false
-- When logged in as a user having the "MEDICAL" role, all columns are visible.
SELECT * from $scratch.records;
-- 00847571, John Smith, 2022-04-26, cough, false
-- 00195483, Jane Doe, 2021-03-16, rash, false

Scenario 2: Data Concealment

In addition to redacting PHI as required by law, the hospital also has a policy to protect the records of patients they consider to be high profile such as government officials and celebrities. Employees who have undergone additional training in handling sensitive data are added to the "VIP" role and are allowed to see records where the is_vip flag is true.

-- Insert some test data.
INSERT INTO $scratch.records
    VALUES ('00541357', 'Mr. Celebrity', '2020-02-13', 'insomnia', true);
-- Create UDF to serve as the row-access policy.
CREATE FUNCTION conceal_vip (vip_patient BOOLEAN)
	RETURNS BOOLEAN
    RETURN
    	SELECT
        	CASE WHEN vip_patient THEN is_member ('VIP')
        		ELSE true
         	END;
-- Set the UDF as a policy for VIP patients.
ALTER TABLE $scratch.records ADD ROW ACCESS POLICY conceal_vip (is_vip);
-- When logged in as a user having the "MEDICAL" but not the "VIP" role, VIP rows are hidden.
SELECT * from $scratch.records;
-- 00847571, John Smith, 2022-04-26, cough, false
-- 00195483, Jane Doe, 2021-03-16, rash, false
-- When logged in as a user having the "MEDICAL" and "VIP" roles,
-- all rows and columns are visible.
SELECT * from $scratch.records;
-- 00847571, John Smith, 2022-04-26, cough, false
-- 00195483, Jane Doe, 2021-03-16, rash, false
-- 00541357, Mr. Celebrity, 2020-02-13, insomnia, true

Get Started with Dremio Cloud – It’s Free!

Dremio Cloud: The easy and open, fully managed data lakehouse platform.

Sign Up Now

Everything you need to build, automate, and query your data lakehouse in production.

Ready to Get Started?

Bring your users closer to the data with organization-wide self-service analytics and lakehouse flexibility, scalability, and performance at a fraction of the cost. Run Dremio anywhere with self-managed software or Dremio Cloud.