November 12, 2024

How to Secure Your Snowflake Account

By Justin Delisi

Snowflake is one of the most powerful cloud-based data warehouses on the market, offering a scalable solution built for analytics. However, when storing sensitive data in Snowflake, it’s crucial to implement every security measure possible to protect it from unauthorized access and potential breaches.

In this blog, we’ll explore various strategies and best practices for securing your Snowflake account. We’ll cover features for authentication and authorization, network security, access control, and some applications you can use to determine if you’re securing your account properly.

Why Secure Your Snowflake Account?

There are some (hopefully) very obvious reasons why you should take every measure to keep your Snowflake account safe and secure. First and foremost, you must protect any sensitive data you may have stored in your account, especially that of your customers. 

Some industries have strict data privacy and security regulations (e.g., GDPR, HIPAA, PCI DSS), so those under such regulations should secure their data to comply with the policies. Lastly, protecting your intellectual property from the hands of your competitors is most likely very important to you and your business.

Authentication and Authorization

The first step to securing your account is not letting unauthorized users get in without a fight. Utilizing some key features from Snowflake is the first line of defense to keep people out of your account.

Single Sign-On (SSO) Integration

Relying on a username and password alone to log in to Snowflake is not recommended, as passwords are easy to compromise. Snowflake offers Single Sign-On (SSO) through federated authentication, with native support for Okta and Microsoft AD FS and support for most SAML 2.0-compliant vendors, such as Google Workspace (formerly G Suite) and Microsoft Entra ID. With SSO, users have fewer passwords to remember and manage, reducing the risk of password breaches. SSO also centralizes the management of user access and permissions, making it easier to control who can access Snowflake and what they can do. Alongside SSO, you can use a System for Cross-domain Identity Management (SCIM) integration with your identity provider, which provides an easy process for onboarding users into Snowflake and removing them should they leave your organization.
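
As a rough sketch, a SAML 2.0 identity provider can be registered in Snowflake with a security integration along these lines. The integration name, issuer, SSO URL, and certificate are placeholder values, not taken from a real setup; your identity provider supplies the actual ones.

```sql
-- Hypothetical SAML2 security integration; every value below is a placeholder.
CREATE SECURITY INTEGRATION my_idp_sso
  TYPE = SAML2
  ENABLED = TRUE
  SAML2_ISSUER = 'https://idp.example.com/issuer'
  SAML2_SSO_URL = 'https://idp.example.com/sso/saml'
  SAML2_PROVIDER = 'OKTA'   -- 'OKTA', 'ADFS', or 'CUSTOM' for other SAML 2.0 vendors
  SAML2_X509_CERT = '<base64-encoded IdP signing certificate>';
```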

Multi-Factor Authentication (MFA)

MFA requires users to provide a second form of authentication in addition to their password when logging in to their Snowflake account, such as a text message, a phone call, or an authentication app. This way, even if a password is compromised, an attacker cannot log in without also having the user’s phone. MFA also helps protect against phishing attacks, where attackers attempt to trick users into revealing their login credentials.
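
One possible way to require MFA enrollment is with an authentication policy, assuming that feature is available in your account. The policy name and the user `jsmith` below are hypothetical.

```sql
-- Hypothetical policy that requires users to enroll in MFA.
CREATE AUTHENTICATION POLICY require_mfa_policy
  MFA_ENROLLMENT = REQUIRED;

-- Try the policy on a single (hypothetical) user first...
ALTER USER jsmith SET AUTHENTICATION POLICY require_mfa_policy;

-- ...then, once it behaves as expected, apply it to the whole account.
ALTER ACCOUNT SET AUTHENTICATION POLICY require_mfa_policy;
```

Testing on one user before the account-wide change reduces the risk of locking people out.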

Key Management and Rotation

You will undoubtedly connect third-party and internal tools to Snowflake, many of which can’t use an SSO connection (or for which it wouldn’t be feasible). Instead of falling back on basic authentication with a username and password, Snowflake supports key pair authentication, which uses a minimum 2048-bit RSA key pair. You can generate the key pair using OpenSSL and assign the public key to a user. The tool that needs the connection then uses the private key to authenticate to Snowflake.

These keys do not expire, but it’s best practice to rotate them regularly to prevent unauthorized access. Snowflake allows up to two public keys to be assigned to a user at once, so there is no connection loss during the change.

For example, if you previously assigned a key to a user named phData_user like this:

				
					ALTER USER phData_user SET RSA_PUBLIC_KEY='JERUEHtcve...';
				
			

You can add another public key with the RSA_PUBLIC_KEY_2 parameter:

				
					ALTER USER phData_user SET RSA_PUBLIC_KEY_2='DEDHFDSHtcve...';

				
			

Then update the connection in your tooling to use the new private key, and unset the original public key:

				
					ALTER USER phData_user UNSET RSA_PUBLIC_KEY;

				
			
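To confirm which keys are currently assigned during a rotation like the one above, one option is to describe the user and filter the output for the key properties. This is a minimal sketch; it assumes the `RESULT_SCAN` query runs immediately after `DESC USER` in the same session.

```sql
-- List the user's properties, including RSA_PUBLIC_KEY_FP and RSA_PUBLIC_KEY_2_FP.
DESC USER phData_user;

-- Keep only the key-related rows from the DESC output.
SELECT "property", "value"
FROM TABLE(RESULT_SCAN(LAST_QUERY_ID()))
WHERE "property" LIKE 'RSA_PUBLIC_KEY%';
```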

Network Security

Now that we have secured direct access, we can ensure the network is secure by utilizing some Snowflake features.

Network Policy

Network policies control which networks and IP addresses can access your account, essentially locking out anyone connecting from an unauthorized IP address. A network policy lets you specify an allow list of trusted IP addresses (or ranges of addresses) permitted to connect to your Snowflake environment and/or a block list of IP addresses that are not. Here is a simple example of creating a network policy that allows an IP address range and blocks a single address within it:

				
					CREATE NETWORK RULE allow_access_rule
					  MODE = INGRESS
					  TYPE = IPV4
					  VALUE_LIST = ('192.168.1.0/24');

					CREATE NETWORK RULE block_access_rule
					  MODE = INGRESS
					  TYPE = IPV4
					  VALUE_LIST = ('192.168.1.99');

					CREATE NETWORK POLICY public_network_policy
					  ALLOWED_NETWORK_RULE_LIST = ('allow_access_rule')
					  BLOCKED_NETWORK_RULE_LIST = ('block_access_rule');
				
			
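Creating a network policy does not enforce it by itself; it takes effect once it is attached to the account or to a user. A minimal sketch, reusing the policy name from the example above (the user `phData_user` is borrowed from the earlier key pair example purely for illustration):

```sql
-- Enforce the policy for every connection to the account.
ALTER ACCOUNT SET NETWORK_POLICY = public_network_policy;

-- Or scope it to a single user instead.
ALTER USER phData_user SET NETWORK_POLICY = public_network_policy;
```

Before activating a policy at the account level, make sure the IP address you are connecting from is in the allowed list.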

Data Encryption

Snowflake includes end-to-end encryption in all its editions, ensuring that your data is encrypted when it reaches an internal Snowflake stage or a table. However, you can use client-side encryption for further protection while your data is transmitted to Snowflake (or if you are using an external stage). 

Client-side encryption is a security measure that encrypts data before it is sent to cloud storage. It follows a specific protocol defined by the cloud storage service, which the service’s SDK and third-party tools implement. The basic steps are as follows: the customer creates a master key and shares it with Snowflake; the client generates a random encryption key, encrypts the file with it before uploading, and encrypts that random key with the master key; both the encrypted file and the encrypted random key are then uploaded, with the encrypted key saved in the file’s cloud storage metadata. This ensures that your data is encrypted before leaving your on-premises systems.
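
For an external stage on Amazon S3, sharing the master key with Snowflake might look roughly like this. The stage name, bucket URL, and storage integration are hypothetical, and the key is a placeholder.

```sql
-- Hypothetical external stage whose files were encrypted client-side.
CREATE STAGE encrypted_ext_stage
  URL = 's3://example-bucket/encrypted_files/'
  STORAGE_INTEGRATION = example_s3_integration
  ENCRYPTION = (TYPE = 'AWS_CSE' MASTER_KEY = '<base64-encoded AES key>');
```

Snowflake uses the master key to decrypt each file’s random encryption key when loading from the stage.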

AWS PrivateLink and Azure Private Link

Taking security to the next level involves using private connectivity services from cloud providers: AWS PrivateLink and Azure Private Link. Snowflake does not provide these services, but it supports their use. These services set up a private connection between your cloud provider’s network and Snowflake, allowing data to flow without traversing the public internet. This significantly reduces the chances of a security breach while the data is in transit. Still, it does come with some caveats: it requires the Business Critical edition of Snowflake (or higher), not to mention complex setup and maintenance and additional costs from AWS/Azure.
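
When setting up private connectivity, the account details you need on the cloud provider side can be retrieved from Snowflake. A small sketch, assuming it is run with the ACCOUNTADMIN role:

```sql
-- Return the private connectivity settings for this account as key/value rows.
SELECT key, value
FROM TABLE(FLATTEN(INPUT => PARSE_JSON(SYSTEM$GET_PRIVATELINK_CONFIG())));
```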

Access Control

Securing your account against internal misuse is almost as important as securing it against external threats. There is always going to be data that you don’t want everyone in your Snowflake account to be able to access. Snowflake has some great features for this exact purpose:

Role-Based Access Control (RBAC)

Role-Based Access Control (RBAC) is a security mechanism in Snowflake in which privileges are granted to roles, and roles are in turn granted to users based on their function within your organization. This helps you manage access to resources and ensure that only authorized users can perform certain actions.

For instance, you may not want all of your employees to be able to access employee salary data, or an HR representative to see customers’ private data. With RBAC, you’d set up a role with access to the data a team needs and grant that role to the proper users. You may have an HR role, granted to users in the HR organization, that can view employee data. You can also set up roles with varying levels of permissions on tables.

For example, a data analyst may have read-only access, an engineer can create tables, and an admin can drop objects. It’s all part of a hierarchy to help you manage access effectively.

Here is an example of what that hierarchy may look like:
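
Expressed in SQL, one way to build such a hierarchy might look like the sketch below. The role names, the `sales_db.reporting` schema, and the user `jsmith` are all hypothetical.

```sql
-- Hypothetical roles, from least to most privileged.
CREATE ROLE analyst_role;
CREATE ROLE engineer_role;
CREATE ROLE admin_role;

-- Analysts get read-only access.
GRANT USAGE ON DATABASE sales_db TO ROLE analyst_role;
GRANT USAGE ON SCHEMA sales_db.reporting TO ROLE analyst_role;
GRANT SELECT ON ALL TABLES IN SCHEMA sales_db.reporting TO ROLE analyst_role;

-- Engineers can additionally create tables.
GRANT CREATE TABLE ON SCHEMA sales_db.reporting TO ROLE engineer_role;

-- Admins own the existing tables, which is what allows dropping them.
GRANT OWNERSHIP ON ALL TABLES IN SCHEMA sales_db.reporting
  TO ROLE admin_role COPY CURRENT GRANTS;

-- Build the hierarchy: each role inherits the privileges of the role below it.
GRANT ROLE analyst_role TO ROLE engineer_role;
GRANT ROLE engineer_role TO ROLE admin_role;
GRANT ROLE admin_role TO ROLE SYSADMIN;

-- Finally, grant a role to a user.
GRANT ROLE analyst_role TO USER jsmith;
```

Granting the lower roles to the higher ones means privileges only need to be granted once, at the lowest level that needs them.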

To help you better understand the roles and privileges of your Snowflake account, phData has created the Data Access Tool. This tool is a great way to explore your hierarchy and even visualize it.

Dynamic Data Masking/Column-Level Security

Sometimes, you may not want to restrict access to an entire table; you may only need to restrict which data users can see within it. Dynamic Data Masking, part of Snowflake’s column-level security, applies a masking policy to a column so that query results show the values masked or in plain text depending on the role used to query the table. A masking policy can also hide a column’s values entirely from roles that should not see them.
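
As an illustration, a masking policy that shows email addresses only to an HR role could be sketched like this. The policy name, the `HR_ROLE` role, and the `employees.email` column are hypothetical.

```sql
-- Return the real value for HR_ROLE and a fixed mask for every other role.
CREATE MASKING POLICY email_mask AS (val STRING) RETURNS STRING ->
  CASE
    WHEN CURRENT_ROLE() IN ('HR_ROLE') THEN val
    ELSE '*********'
  END;

-- Attach the policy to the column it should protect.
ALTER TABLE employees MODIFY COLUMN email SET MASKING POLICY email_mask;
```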

Trust Center

You may have read through this list of security features and wondered which ones you are already using and which ones you are missing out on. That’s why Snowflake created the Trust Center. The Trust Center is a feature in Snowflake that monitors an account for security risks. Background processes scan your account for risks based on its configuration, and the results are compared to Snowflake’s security recommendations. If an account violates any of the recommendations, the Trust Center UI displays the violation along with suggested strategies for mitigation. This way, you know which features you aren’t using that Snowflake recommends.

Advisor Tool

The Trust Center is great for security recommendations, but there are many more features and best practices for your account that you may be missing out on. These features could make your account more efficient and faster or even save you money. The Snowflake experts at phData created an application that scans your account much like the Trust Center does, but its recommendations cover configuration across all of Snowflake, including performance, operations, and cost-efficiency.

To easily get a comprehensive report of our best practices and ensure the parameters above stay compliant going forward, try our Snowflake Advisor tool with one click in the Snowflake Marketplace or via the phData Toolkit today!

Closing

Securing your data is one of the most important things you can do for your data warehouse. Implementing these features helps ensure that your business’s and your customers’ data is safe and secure. Snowflake and phData make it easy to know how secure your data is and whether you’re doing everything you can to keep it that way.

Reach out to phData today to discuss how we can support your data security journey and give you the peace of mind that your data is fully protected.
