August 8, 2022

Best Practices for Data Activation: Reverse ETL on Snowflake

By Cory Koster

Most data professionals are familiar with the ETL or ELT patterns for data ingestion and the approaches for ingesting data into a Snowflake Data Cloud. In these approaches, we ingest and transform the data, model it for our business use cases, and ultimately use it for analytics and reporting. 

Typically, this is where the data flow ends.

But what if this data could be used to solve an important operational business question like,

“How do we get the data from Snowflake into an application like an Enterprise Resource Planning (ERP) or Customer Relationship Management (CRM) tool?”

This is where Reverse ETL comes in.

In this blog, we’ll cover what reverse ETL and data activation are, best practices and approaches, and finally, next steps on your data activation journey. 

What is Reverse ETL Process on Snowflake?

From a high level, this approach is similar to ETL or ELT data ingestion, but instead is the process of moving or copying data from a data warehouse to an operational system or application. This data supplements and enhances operational data to further drive business decisions, providing more context to certain data points. 

Another way to think about this reverse ETL process is Data Activation.

What is Data Activation?

Data Activation is more than simply ingesting data to an operational system or application; it’s moving beyond analyzing data in a data warehouse context to taking action on the data. 

While there is no limit to what data is involved in this process, oftentimes the data is in the context of marketing and customer data. This data is enhanced with other key data points in the data warehouse and, instead of stopping at providing KPIs and other analytical metrics, this data is brought back to business systems in order to act upon the data.

This idea isn’t necessarily a new concept, but the approaches and technologies involved have certainly changed in recent years (and continue to do so). As the pattern of reverse ETL and data activation is a growing component to the modern data stack, it’s important to keep in mind the considerations and pitfalls to avoid.

A diagram with 3 elements, Snowflake, hightouch, and Business Applications and Systems.

Best Practices for Reverse ETL on Snowflake

The first consideration is to keep the source and destination in mind—the same technology used for data ingestion into the data warehouse cannot always be used for reverse ETL patterns to move data from the data warehouse to an operational system. It’s important to consider the appropriate type of technology for the job. 

One way to ensure your project starts off on the right path is to get this correct. Don’t assume the data ingestion tool can also be leveraged for the reverse ETL workflow as these technologies are often designed to extract data from operational or file stores and consolidate them into a data warehouse.

The second consideration is the data format and structure. It’s important to not assume the data can be copied as-is from the data warehouse. Data in a data warehouse is often structured in a way to enable fast and efficient analytics and reporting. This could include dimensional modeling or even a data vault architecture

This format will very likely not be suitable for an operational application (after all, this application is not a data warehouse). The data to be activated from the data warehouse needs to be queried and then restructured in a way the destination system can consume. This will of course vary by destination and needs to be accounted for before any pipeline is developed.

Lastly, to sync or extract the correct data from the data warehouse, a comparison needs to happen between the extract and the destination. This could be as simple as looking at timestamps or as complex as record comparison. This may seem obvious, but every system will handle updates differently. As the number of destinations grows, this can quickly become complex and difficult to manage without the proper tool.

Not only should you choose a tool that is designed for reverse ETL and data activation specifically, but you want a technology that fully leverages Snowflake’s scalability and architecture. 

As we’ve covered in previous blogs, Snowflake provides a lot of benefits as a cloud data warehouse. A reverse ETL tool that utilizes these features is a huge advantage and will provide a more seamless, integrated modern data stack.

Where To Go From Here?

There are a variety of considerations when thinking through a reverse ETL data flow, so much so that it can feel overwhelming to add another layer to the data platform to manage. So what are we to do? 

Our partners over at Hightouch adhere to best practices and make activating your data easy. Many of these considerations discussed here are resolved with Hightouch. They have a first-class Snowflake connector to leverage the features and architecture of Snowflake, making all reverse ETL pipeline development straightforward. 

Additionally, they will enable your business to further take advantage of Snowflake and move beyond the analysis of data to the activation of your data.

A simple diagram showing how Snowflake and hightouch make it easy for data activation.

Closing

phData has expertise in building modern data architecture that includes Snowflake and Hightouch. Level up your business today and reach out to phData for guidance on activating your data.

Data Coach is our premium analytics training program with one-on-one coaching from renowned experts.

Accelerate and automate your data projects with the phData Toolkit