October 10, 2025

Using AWS Glue’s Native Connector to Load Data into Snowflake

By Rajib Prasad

In modern data architectures, organizations frequently rely on AWS for data ingestion, transformation, and processing, while Snowflake AI Data Cloud is used as a high-performance data warehouse. 

However, connecting AWS-based ETL workflows with Snowflake can be challenging, especially when dealing with custom connectors, complex authentication methods, and multi-step data orchestration. These hurdles often slow down development, increase costs, and create operational overhead.

A native integration between AWS Glue and Snowflake eliminates much of this complexity. By offering out-of-the-box connectivity, Spark-native execution, and drag-and-drop functionality through Glue Studio, the native Snowflake connector streamlines data workflows. This means faster time to value, reduced maintenance, and the ability to scale ETL jobs without managing infrastructure or third-party dependencies.

In this blog, you’ll learn how to:

  • Set up Snowflake and AWS Glue for secure and efficient integration,

  • Use AWS Glue to transform raw data from Amazon S3, and

  • Load transformed data into Snowflake using the native connector.

Why AWS Glue + Snowflake?

AWS Glue and Snowflake together provide a modern, serverless, and scalable data integration solution designed to handle the increasing complexity and volume of data in today’s digital enterprises. Here’s why this combination is compelling:

1. Native Integration for Snowflake

AWS Glue now offers a native, high-performance Spark connector for Snowflake, enabling:

  • Seamless read/write operations to and from Snowflake.

  • Custom SQL queries as a source from Snowflake.

  • Multiple write options like append, merge, truncate, and drop for Snowflake targets.

  • End-to-end orchestration and monitoring using AWS Glue Studio.

2. Improved Price-Performance

By using AWS Glue with Snowflake:

  • Companies avoid third-party licensing costs for ETL tools.

  • Operational costs (OpEx) can be reduced by over 50% by leveraging serverless Spark on AWS Glue.

3. Broad User Appeal

This integration supports different personas, from data engineers and analysts to ML engineers:

  • Visual ETL workflows in Glue Studio.

  • No-code/low-code options.

  • Support for SQL-based transformations and custom Spark logic.

4. Faster Time to Value (TTV)

With Glue’s built-in orchestration and performance at scale, organizations can build and deploy pipelines faster, reducing time to value and enabling rapid decision-making.

5. Flexible Data Flow Patterns

Whether you’re reading from or writing to Snowflake, AWS Glue supports:

  • Reading Snowflake data using custom queries.

  • Joining Snowflake and S3 data for enrichment.

  • Writing updates or inserts into Snowflake tables using merge logic.
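The merge pattern in the last bullet corresponds to a standard Snowflake MERGE statement. A minimal sketch, using hypothetical STAGE_ORDERS (incoming data) and ORDERS (target) tables matched on ORDER_ID:

```sql
-- Hypothetical example: upsert staged rows into the target table.
-- Table and column names are illustrative, not from the blog's dataset.
MERGE INTO ORDERS AS t
USING STAGE_ORDERS AS s
  ON t.ORDER_ID = s.ORDER_ID
WHEN MATCHED THEN
  UPDATE SET t.STATUS = s.STATUS, t.UPDATED_AT = s.UPDATED_AT
WHEN NOT MATCHED THEN
  INSERT (ORDER_ID, STATUS, UPDATED_AT)
  VALUES (s.ORDER_ID, s.STATUS, s.UPDATED_AT);
```

When you pick the connector's merge write option, Glue generates upsert logic equivalent to this, driven by the matching key (or custom merge SQL) you supply.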

What is the AWS Glue Native Connector for Snowflake?

The AWS Glue Native Connector for Snowflake is a built-in, fully managed connector that allows you to read from and write to Snowflake directly from AWS Glue jobs. It supports high-performance Spark-based ETL, visual workflows in Glue Studio, and operations like append, merge, truncate, and overwrite, enabling seamless integration between AWS data sources (like S3) and Snowflake without needing custom drivers or code.
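Under the hood, a Glue Studio job using the native connector is plain PySpark that passes a `connection_options` dictionary with `connection_type="snowflake"`. The helpers below sketch the shape of those dictionaries; the key names are assumptions based on Glue Studio-generated scripts, so verify them against the script Glue generates for your own job:

```python
# Illustrative helpers modeling the connection_options dictionaries a Glue
# job passes to create_dynamic_frame.from_options(...) and
# write_dynamic_frame.from_options(...) with connection_type="snowflake".
# Key names ("connectionName", "query", "dbtable", "action") are assumptions
# drawn from Glue Studio-generated code, not an authoritative API reference.

def snowflake_read_options(connection_name: str, query: str) -> dict:
    """Read from Snowflake using a custom SQL query as the source."""
    return {
        "connectionName": connection_name,  # the Glue Data connection name
        "query": query,                     # custom SQL pushed down to Snowflake
    }

def snowflake_write_options(connection_name: str, database: str,
                            schema: str, table: str,
                            action: str = "append") -> dict:
    """Write to a Snowflake table; action models the four write modes."""
    assert action in {"append", "merge", "truncate", "drop"}
    return {
        "connectionName": connection_name,
        "dbtable": f"{database}.{schema}.{table}",
        "action": action,
    }
```

In a generated script, these dictionaries are handed to `glueContext.create_dynamic_frame.from_options(...)` for reads and `glueContext.write_dynamic_frame.from_options(...)` for writes.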

Prerequisites and Setup

1. Provision Snowflake Resources with Least Privilege

 In your Snowflake account, create the following resources to isolate and control access for AWS Glue:

  • A dedicated Warehouse

  • A dedicated User

  • A Role with only necessary privileges

  • A dedicated Database and Schema

  • Assign Role to User

  • Grant Access Privileges

-- 1. Create Warehouse
CREATE OR REPLACE WAREHOUSE <WAREHOUSE_NAME>
  WAREHOUSE_SIZE = 'XSMALL'
  AUTO_SUSPEND = 60
  AUTO_RESUME = TRUE
  INITIALLY_SUSPENDED = TRUE;
USE WAREHOUSE <WAREHOUSE_NAME>;

-- 2. Create Database and Schema
CREATE OR REPLACE DATABASE <DATABASE_NAME>;
CREATE OR REPLACE SCHEMA <DATABASE_NAME>.<SCHEMA_NAME>;

-- 3. Create Role
CREATE OR REPLACE ROLE <CUSTOM_ROLE>;

-- 4. Create User
CREATE OR REPLACE USER <USER_NAME>
  PASSWORD = '<PASSWORD>'
  DEFAULT_ROLE = <CUSTOM_ROLE>
  DEFAULT_WAREHOUSE = <WAREHOUSE_NAME>
  DEFAULT_NAMESPACE = <DATABASE_NAME>.<SCHEMA_NAME>;

-- 5. Assign Role to User
GRANT ROLE <CUSTOM_ROLE> TO USER <USER_NAME>;

-- 6. Grant Access Privileges
GRANT USAGE ON DATABASE <DATABASE_NAME> TO ROLE <CUSTOM_ROLE>;
GRANT USAGE ON SCHEMA <DATABASE_NAME>.<SCHEMA_NAME> TO ROLE <CUSTOM_ROLE>;
GRANT CREATE TABLE ON SCHEMA <DATABASE_NAME>.<SCHEMA_NAME> TO ROLE <CUSTOM_ROLE>;
GRANT USAGE ON WAREHOUSE <WAREHOUSE_NAME> TO ROLE <CUSTOM_ROLE>;

2. Create a Secrets Manager Entry

Store your Snowflake username and password securely in AWS Secrets Manager.
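The secret itself is a small JSON document. The key names below (`sfUser`/`sfPassword`) follow what AWS documents for Glue Snowflake connections, but verify them for your Glue version; the secret name in the CLI comment is a placeholder:

```python
import json

# Build the secret body the Glue Snowflake connection reads.
# Key names (sfUser / sfPassword) are the documented convention for
# Glue's Snowflake connections; values here are placeholders.
def snowflake_secret_body(user: str, password: str) -> str:
    return json.dumps({"sfUser": user, "sfPassword": password})

# Store it with the AWS CLI (secret name is illustrative):
#   aws secretsmanager create-secret \
#     --name snowflake/glue-user \
#     --secret-string '{"sfUser": "<USER_NAME>", "sfPassword": "<PASSWORD>"}'
```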

3. Create an S3 Bucket

Set up an Amazon S3 bucket to store your input data.

4. Create an IAM Role for AWS Glue

Define a dedicated AWS IAM role with the following permissions:

  • Access to Amazon S3 (for reading source data)

  • Access to AWS Glue (to run ETL jobs)

  • Access to AWS Secrets Manager (to retrieve Snowflake credentials securely)

Note: For tighter security, you can create a custom policy that restricts access to a specific Secrets Manager ARN and S3 bucket ARN.
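A sketch of such a custom policy; the bucket name, region, account ID, and secret name are placeholders to replace with your own:

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "ReadSourceData",
      "Effect": "Allow",
      "Action": ["s3:GetObject", "s3:ListBucket"],
      "Resource": [
        "arn:aws:s3:::<BUCKET_NAME>",
        "arn:aws:s3:::<BUCKET_NAME>/*"
      ]
    },
    {
      "Sid": "ReadSnowflakeSecret",
      "Effect": "Allow",
      "Action": ["secretsmanager:GetSecretValue"],
      "Resource": "arn:aws:secretsmanager:<REGION>:<ACCOUNT_ID>:secret:<SECRET_NAME>-*"
    }
  ]
}
```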

Building and Running Your Glue Job

  1. Navigate to the AWS Glue Console, and from the left navigation pane, select Data connections.

  2. Click Create connection and select Snowflake as the data source.

  3. Enter the connection details: set the Host to your Snowflake server URL and the Port to 443.

Note: Port 443 is the default port for HTTPS, which encrypts communication between client and server, so the Glue-to-Snowflake connection stays confidential and secure.

  4. Authentication: Select the IAM role and the AWS Secrets Manager secret created earlier, then test the connection.

  5. Start building the Visual ETL job.

  6. Select your preferred data source. We chose S3 as the source and provided the S3 URL, data format, and other relevant details.

  7. Apply transformations to the data using the Transform – SQL Query component.

  8. Select Snowflake as the destination and choose the connection created earlier. Provide the database, schema, and a table name of your choice.

When loading data into the target table, AWS Glue provides four handling options:

  1. Append: Adds all source data to the table. Re-running the job will duplicate data.

  2. Merge: Requires a matching key or custom merge SQL for upserts.

  3. Truncate: Clears the table before loading new data.

  4. Drop: Deletes and recreates the table using the source schema.

  9. Assign a meaningful name to your Glue job, and save the job to preserve your configuration.

  10. Execute the Glue job. Once it completes successfully, navigate to Snowsight (Snowflake’s UI) and query the target table to verify that the transformed data has been loaded as expected.
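A quick sanity check to run in Snowsight after the job finishes, using the same placeholder convention as the setup script (substitute the table name you chose for the destination):

```sql
USE WAREHOUSE <WAREHOUSE_NAME>;

-- Confirm rows landed in the target table
SELECT COUNT(*) AS row_count
FROM <DATABASE_NAME>.<SCHEMA_NAME>.<TABLE_NAME>;

-- Spot-check the transformed data
SELECT *
FROM <DATABASE_NAME>.<SCHEMA_NAME>.<TABLE_NAME>
LIMIT 10;
```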

Real-World Use Case (Optional)

A client needed to preview tabular data from an S3 bucket to define their DataOps requirements. Since we were already using Snowflake as the central data warehouse, and the client had access to Snowsight, they requested the ability to view raw S3 data directly within Snowflake.

We quickly implemented a solution using the AWS Glue Native Connector for Snowflake, which supports only username/password authentication. Our Infra team created a dedicated Snowflake user with minimal access, and we stored its credentials securely in AWS Secrets Manager.

Outcome

We delivered the solution in under two days. The client could now access S3 data through Snowsight securely, speeding up requirement gathering while maintaining proper access control.

Conclusion

AWS Glue’s native integration with Snowflake simplifies the creation of scalable, cost-effective data pipelines. By combining Glue’s serverless ETL with Snowflake’s powerful cloud data platform, teams can easily build, transform, and load data across sources using a visual, no-code interface.

With support for features like custom SQL, merge operations, and bi-directional data flow, this integration accelerates time-to-insight while reducing operational overhead. Together, Glue and Snowflake offer a modern, seamless solution for today’s data-driven organizations.


Want to use Spark and JDBC instead?

Check out this blog to learn how to connect AWS Glue to Snowflake using Spark, JDBC drivers, and CloudFormation for setup. 

Reach out to us or explore our Snowflake services and demos to accelerate your data journey.

FAQs

Can AWS Glue connect to Snowflake using only an IAM role?

No. You cannot simply tell Glue to connect to Snowflake with just an IAM role (the way you would with Redshift Spectrum). Snowflake always requires a Snowflake user identity, even if that identity is mapped from IAM via OAuth. You must use a Snowflake user/password combination or key pair authentication, typically stored securely in AWS Secrets Manager.

Why use the native connector instead of the Spark/JDBC approach?

The native connector:

  • Is easier to configure (no driver management),

  • Offers better performance with built-in Spark parallelism,

  • Simplifies security and retry logic, and

  • Reduces development overhead.
