November 29, 2022

How to Securely Copy Database Environments Using Snowflake Cloning

By Frank Meads

A recent client we worked with asked if we could leverage cloning in the Snowflake Data Cloud to copy their production database to their test environment. The idea was sound, but there was still one major challenge with cloning, security.

When you clone a whole database, all of the role privileges from the source database are also cloned – meaning (in the case of cloning production to test) our client’s production roles end up having privileges granted to them in the test database. In contrast, their test roles would have no access.

To satisfy our customer, we had to find a way to resolve this security issue.

In this blog, we will cover cloning a database and properly setting its role privileges.

The Solution: How to get the Proper Privileges

Fortunately, there were two pieces of technology to help overcome this issue.

First, Snowflake had a view in INFORMATION_SCHEMA called OBJECT_PRIVILEGES, which provided all the information about which privileges were granted to which roles.

Second, the client implemented phData’s free Project Administration (Tram) tooling to provision their Snowflake infrastructure. For those unfamiliar, Tram is a tool that automates the often slow, error-prone, and tedious process of onboarding users and projects to Snowflake.

With Tram, we knew that all the database and role names followed a standard pattern. The standard naming pattern is critical because we know what the names of the roles will be for the different environments.

For example, if a read-only production role is named PROD_DB_RO, then we know the test role will be TEST_DB_RO. With that knowledge, we can query the OBJECT_PRIVILEGES table for the production database, swap the PROD portion of the role name with TEST, and then have everything we need to grant privileges to test roles in the test database.

We still have two other issues to overcome with this solution.

Challenge 1: Not All Privileges in OBJECT_PRIVILGES Are Supported. 

Snowflake has privileges listed in the OBJECT_PRIVILEGES table that are not fully supported yet, or do not apply to the version of Snowflake. To address this, we created a table with supported privileges. We then joined OBJECT_PRIVILEGES with our new table to limit the results for those supported privileges.

Below is a list of the supported privileges:

CREATE FILE FORMT

CREATE PROCEDURE

CREATE TAG

MONITOR

CREATE MATERIALIZED VIEW

INSERT

CREATE VIEW

REFERENCE USAGE

SELECT

MODIFY

CREATE TEMPORARY TABLE

DELETE

UPDATE

CREATE SCHEMA

CREATE STAGE

CREATE STREAM

CREATE FUNCTION

CREATE TASK

REFERENCES

CREATE SESSION POLICY

CREATE PIPE

USAGE

CREATE SEQUENCE

CREATE MASKING POLICY

OWNERSHIP

CREATE EXTERNAL TABLE

CREATE ALERT

CREATE DATABASE ROLE

ADD SEARCH OPTIMIZATION

CREATE TABLE

table

We created a database table with supported privileges and the stored procedures for the rest of this solution. Below are the database and schema names we used:

Database: UTIL_DB

Schemas:

  • Routines – Contains stored procedure code.
  • Data – Contains relevant tables.

Tables:

  • SUPPORTED_PRIVILEGES

We recommend you create something similar.

Challenge 2: Getting Future Privileges.

The other problem had to do with future privileges. While OBJECT_PRIVILEGES had data on current permissions, it does not have data on future ones, so we would need another solution to handle these. 

To address the challenge, we used OBJECT_PRIVILGES to discover the roles that have access to the production environment. We could then use this list of roles to check if they also had future grants and use those results to set future permissions on the cloned database.

The Code

First, we need to write the SQL queries that will generate the grant statements. Below is the one that generates the GRANT ownership statements we want to execute against the cloned database:

				
					SELECT 'GRANT OWNERSHIP ON ' || OBJECT_TYPE || ' ' || 
    CASE IFNULL(OBJECT_CATALOG,'.') 
        WHEN '.' THEN '' 
        ELSE REPLACE(OBJECT_CATALOG,'PROD','TEST') || '.' 
        END  || 
    CASE IFNULL(OBJECT_SCHEMA,'.') 
        WHEN '.' THEN '' 
        ELSE OBJECT_SCHEMA || '.' 
        END || 
    CASE WHEN OBJECT_TYPE='DATABASE' THEN REPLACE(OBJECT_NAME,'PROD','TEST') 
        ELSE OBJECT_NAME 
        END || 
' TO ROLE '|| REPLACE(GRANTEE,'PROD','TEST') || ' REVOKE CURRENT GRANTS;' 
FROM PROD_DB.INFORMATION_SCHEMA.OBJECT_PRIVILEGES 
WHERE (OBJECT_CATALOG='PROD_DB' OR OBJECT_NAME = 'PROD_DB') AND PRIVILEGE_TYPE = 'OWNERSHIP';


				
			

The thing you will notice about this query is the case statements, and they do three things:

  1. Determine if the database name should be included in the fully qualified domain name (FQDN) of the object.
  2. Determine whether the schema name should be included in the FQDN.
  3. Determine if the object_name the privilege is for is the database, and replace the PROD portion of the name with TEST. We are assuming that we are cloning a database names PROD_DB to TEST_DB.
 

Here are the query’s results:

query-result

We can then use this query, make minor changes, and create grants for the other privileges.

				
					SELECT 'GRANT ' || P.PRIVILEGE_TYPE || ' ON ' || OBJECT_TYPE || ' ' || 
    CASE IFNULL(OBJECT_CATALOG,'.') 
        WHEN '.' THEN '' 
        ELSE REPLACE(OBJECT_CATALOG,'PROD','TEST') || '.' 
        END  || 
    CASE IFNULL(OBJECT_SCHEMA,'.') 
        WHEN '.' THEN '' 
        ELSE OBJECT_SCHEMA || '.' 
        END || 
    CASE WHEN OBJECT_TYPE='DATABASE' THEN REPLACE(OBJECT_NAME,'PROD','TEST') 
        ELSE OBJECT_NAME 
        END ||
' TO ROLE '|| REPLACE(GRANTEE,'PROD','TEST') || ';' 
FROM PROD_DB.INFORMATION_SCHEMA.OBJECT_PRIVILEGES P 
    INNER JOIN UTIL_DB.DATA.SUPPORTED_PRIVILEGES S 
    ON P.PRIVILEGE_TYPE = S.PRIVILEGE_TYPE 
WHERE (OBJECT_CATALOG='PROD_DB' OR OBJECT_NAME='PROD_DB') 
    AND P.PRIVILEGE_TYPE <> 'OWNERSHIP';


				
			

The results from the second query are below:

result-two

With the grant statements generated, we now need a way to execute them. To do this, we created two stored procedures that perform the following steps:

  1. Parameterize the environments (PROD, TEST) and the database we are cloning.
  2. Replace placeholders in the query that generates the grant statements with the passed parameter values.
  3. Generate the grant statements.
  4. Execute the grant statements.

 

Below is the code for the stored procedures. One grants ownership, and the other grants the remaining privileges. 

				
					CREATE OR REPLACE PROCEDURE UTIL_DB.ROUTINES.CLONE_DB_OWNER_PRIVS(DBNAME VARCHAR, SOURCE_ENV VARCHAR, DEST_ENV VARCHAR)
RETURNS VARCHAR
LANGUAGE JAVASCRIPT
EXECUTE AS OWNER
AS
$$
/*
Definition: Procedure resets ownership privileges in the cloned database.
Params:
   DBNAME (STRING): Name of the cloned database.
   SOURCE_ENV (STRING): Name of the source environment (PROD)
   DEST_ENV (STRING): Name of the destination environment (TEST,DEV)

*/
var SQLSTRING = "SELECT 'GRANT OWNERSHIP ON ' || OBJECT_TYPE || ' ' || \
CASE IFNULL(OBJECT_CATALOG,'.') \
   WHEN '.' THEN '' \
   ELSE REPLACE(OBJECT_CATALOG,'@','#') || '.' \
   END  || \
CASE IFNULL(OBJECT_SCHEMA,'.') \
   WHEN '.' THEN '' \
   ELSE OBJECT_SCHEMA || '.' \
   END || \
CASE WHEN OBJECT_TYPE='DATABASE' THEN REPLACE(OBJECT_NAME,'@','#') \
   ELSE OBJECT_NAME \
   END || \
' TO ROLE '|| REPLACE(GRANTEE,'@','#') || ' REVOKE CURRENT GRANTS;' \
FROM $.INFORMATION_SCHEMA.OBJECT_PRIVILEGES \
WHERE (OBJECT_CATALOG='$' OR OBJECT_NAME = '$') AND PRIVILEGE_TYPE = 'OWNERSHIP' \
;";

SQLSTRING = SQLSTRING.replaceAll("$", DBNAME);
SQLSTRING = SQLSTRING.replaceAll("@", SOURCE_ENV);
SQLSTRING = SQLSTRING.replaceAll("#", DEST_ENV);

var SQLSTMT = snowflake.createStatement( {sqlText: SQLSTRING} );
var results = SQLSTMT.execute();
// Loop through results and execute individual statements.
while( results.next() ){
   var SQLPERM = results.getColumnValue(1);
   snowflake.execute( {sqlText: SQLPERM });
}

return "DONE!"

$$;
				
			
				
					CREATE OR REPLACE PROCEDURE UTIL_DB.ROUTINES.CLONE_DB_GRANT_CURRENT_PRIVS(DBNAME VARCHAR, SOURCE_ENV VARCHAR, DEST_ENV VARCHAR)
RETURNS VARCHAR
LANGUAGE JAVASCRIPT
EXECUTE AS OWNER
AS
$$
/*
Definition: Procedure grants current privileges required by the new database clone.
Params:
   DBNAME (STRING): Name of the cloned database.
   SOURCE_ENV (STRING): Name of the source environment (PROD)
   DEST_ENV (STRING): Name of the destination environment (TEST,DEV)

*/
// SQL statement to get grant statements that will need to be
// executed against cloned database.
// There are a few statements here, but some privileges need to be handled differently.
var SQLSTRING = "SELECT 'GRANT ' || P.PRIVILEGE_TYPE || \
   ' ON ' || OBJECT_TYPE || ' ' || \
   CASE IFNULL(OBJECT_CATALOG,'.') \
   WHEN '.' THEN '' \
   ELSE REPLACE(OBJECT_CATALOG,'@','#') || '.' \
   END  || \
CASE IFNULL(OBJECT_SCHEMA,'.') \
   WHEN '.' THEN '' \
   ELSE OBJECT_SCHEMA || '.' \
   END || \
CASE WHEN OBJECT_TYPE='DATABASE' THEN REPLACE(OBJECT_NAME,'@','#') \
   ELSE OBJECT_NAME \
   END \
|| ' TO ROLE '|| REPLACE(GRANTEE,'@','#') || ';' \
FROM $.INFORMATION_SCHEMA.OBJECT_PRIVILEGES P \
   INNER JOIN UTIL_DB.DATA.SUPPORTED_PRIVILEGES S \
   ON P.PRIVILEGE_TYPE = S.PRIVILEGE_TYPE \
WHERE (OBJECT_CATALOG='$' OR OBJECT_NAME='$') AND P.PRIVILEGE_TYPE <> 'OWNERSHIP' \
;";

SQLSTRING = SQLSTRING.replaceAll("$", DBNAME);
SQLSTRING = SQLSTRING.replaceAll("@", SOURCE_ENV);
SQLSTRING = SQLSTRING.replaceAll("#", DEST_ENV);

var SQLSTMT = snowflake.createStatement( {sqlText: SQLSTRING} );
var results = SQLSTMT.execute();

// Execute individual grant statements.
while( results.next() ){
   var SQLPERM = results.getColumnValue(1);
   snowflake.execute( {sqlText: SQLPERM })
}

return "DONE!";

$$;


				
			

We can now clone a database and set the appropriate privileges by executing the following:

				
					CREATE OR REPLACE DATABASE TEST_DB CLONE PROD_DB;
CALL UTIL_DB.ROUTINES.CLONE_DB_OWNER_PRIVS('PROD_DB','PROD','TEST');
CALL UTIL_DB.ROUTINES.CLONE_DB_GRANT_CURRENT_PRIVS('PROD_DB','PROD','TEST');
				
			

Now, when we show the grants on the BRONZE schema, we see the correct role assigned to the privileges:

role assigned

Finally, we can deal with future privileges. 

				
					CREATE OR REPLACE PROCEDURE UTIL_DB.ROUTINES.CLONE_DB_GRANT_FUTURE_PRIVS(DBNAME VARCHAR, SOURCE_ENV VARCHAR, DEST_ENV VARCHAR)
returns varchar
language javascript
execute as caller
as
$$
/*
Definition: Procedure grants future privileges required by the new database clone.
Params:
   DBNAME (STRING): Name of the cloned database.
   SOURCE_ENV (STRING): Name of the source environment (PROD)
   DEST_ENV (STRING): Name of the destination environment (TEST,DEV)

*/
// We need to get all roles that have potential future grants.
var SQLGETROLES = "SELECT DISTINCT GRANTEE \
FROM $.INFORMATION_SCHEMA.OBJECT_PRIVILEGES \
WHERE OBJECT_CATALOG = '$' \
   AND PRIVILEGE_TYPE IN ('SELECT','INSERT','UPDATE','DELETE','REFERENCES') \
UNION \
SELECT DISTINCT GRANTEE \--
FROM $.INFORMATION_SCHEMA.TABLE_PRIVILEGES \
WHERE TABLE_CATALOG = '$'";

SQLGETROLES = SQLGETROLES.replaceAll("$", DBNAME);

var stmtRoles = snowflake.createStatement( {sqlText: SQLGETROLES} );
var resultRoles = stmtRoles.execute();

while(resultRoles.next()){
   // We have to first execute SHOW FUTURE GRANTS before
   // we can return a result set.
   var SQLFUTUREPRIVS = "SHOW FUTURE GRANTS TO ROLE " + resultRoles.getColumnValue(1) + ";";
   var stmtShowPrivs = snowflake.createStatement( {sqlText: SQLFUTUREPRIVS } );
   var resultsFuturePrivs = stmtShowPrivs.execute();
  
   // Now we can execute a statement that will return a results set.
   var SQLGETPRIVS = "SELECT 'GRANT ' || $2 || ' ON FUTURE ' || $3 || \
   'S IN SCHEMA ' || \
   REPLACE(REPLACE(REPLACE($4,'.<TABLE>',''),'.<VIEW>',''),'@','#') \
   || ' TO ' || REPLACE($6,'@','#') || ';' \
   FROM TABLE(RESULT_SCAN(LAST_QUERY_ID())) \
   WHERE SUBSTRING($4,1,CHARINDEX('.',$4)-1)='^';"
  
   SQLGETPRIVS = SQLGETPRIVS.replaceAll("^", DBNAME);
   SQLGETPRIVS = SQLGETPRIVS.replaceAll("@", SOURCE_ENV);
   SQLGETPRIVS = SQLGETPRIVS.replaceAll("#", DEST_ENV);
  
   var stmtGetPrivs = snowflake.createStatement( {sqlText: SQLGETPRIVS } );
   var resultsFutureGrants = stmtGetPrivs.execute();

   // Loop results and execute grant statements.
   while(resultsFutureGrants.next()){
       var SQLPERM = resultsFutureGrants.getColumnValue(1)
       var stmtGrant = snowflake.createStatement( {sqlText: SQLPERM } );
       var resultsOfGrant = stmtGrant.execute();
   }

}

return "DONE!";
$$;


				
			

Here is an outline of what the stored procedure is doing:

  1. Get a list of roles that have SELECT, INSERT, UPDATE, DELETE or REFERENCE privileges. Only roles with these rights were granted future privileges at my prior client.
  2. Loop through the roles and execute SHOW FUTURE GRANTS TO statements on those roles.
  3. Take the results from the SHOW FUTURE GRANTS statements and generate new grant statements for the test environment.
  4. Execute the grant statements for the test environment.

After we execute

				
					CALL UTIL_DB.ROUTINES.CLONE_DB_GRANT_FUTURE_PRIVS('PROD_DB', 'PROD', 'TEST') 
				
			

and

				
					SHOW FUTURE GRANTS ROLE PROD_DB_RW
				
			

we get the following results :

results

We see our test role now has future privileges required to operate on any new tables in the test database.

Conclusion

By leveraging the uniformity Tram provides to our database and role names, in conjunction with meta-data provided by Snowflake, we can successfully clone databases with proper role assignments. 

This ability allows us to test or develop against production data and reduce compute costs because we do not need to run simultaneous ELT processes for each environment. Hopefully, this solution will help you with any current or future endeavors.

If you’re looking for any additional help with Snowflake, phData can help! As the 2022 Snowflake Partner of the Year, phData is a data engineering, analytics, and machine learning consulting company that helps with Snowflake migrations, platform management, automation needs and machine learning foundations.

Data Coach is our premium analytics training program with one-on-one coaching from renowned experts.

Accelerate and automate your data projects with the phData Toolkit