April 19, 2024

How to Use Fivetran to Ingest Workday Data Into Snowflake

By Byron Ison

As Anne M. Mulcahy, former CEO of Xerox, once stated, “Employees are a company’s greatest asset – they’re your competitive advantage.” Workforce analytics empowers businesses to understand and enhance employee satisfaction and business performance. Centralizing and surfacing your Workday data in your analytics environment unlocks the potential value of this data.

There are a number of benefits to using Fivetran to extract data from Workday and replicate it to Snowflake. Among these are:

  • Significant reductions in engineering effort, in that Fivetran abstracts away the need to understand the Workday API, develop authentication and extraction components, and create target data structures.
  • Resource consumption (Workday API and Snowflake) is optimized through Fivetran’s Change Data Capture (CDC) feature, which minimizes data movement while keeping data in sync.
  • Minimal support footprint as Fivetran provides logging, alerting, and even failure resolution in many cases, such as API outages. Furthermore, Fivetran support is available to troubleshoot and resolve any persistent failures or outages.

This article describes a solution for replicating Workday data to the Snowflake Data Cloud using Fivetran, as well as some lessons learned from our past experiences here at phData. 

While the direction and best practices below are specific to the Human Capital Management (HCM) and Reporting-as-a-Service (RaaS) modules of Workday, the content is also applicable to the Strategic Sourcing and Financial Management modules, which follow a similar pattern.

Understanding Workday, Fivetran, and Snowflake

Fivetran is helpful in that it abstracts away both the complex and the mundane tasks required to achieve accurate and efficient replication of data between Workday and Snowflake. The breakdown below describes some key aspects of the replication process across Workday, Fivetran, and Snowflake when implementing this solution:

 

Data Engineering

  • Workday: The Workday REST API interface enables data extraction.
  • Fivetran: Connectors abstract away the task of analyzing and extracting data from each endpoint.
  • Snowflake: Data is available in Snowflake in pre-built and refined data models (shown below). Enrich data with higher-level models designed by Fivetran for implementation via dbt.

Efficient CDC*

  • Workday: Supports an extract frequency of up to every minute (when queried incrementally, as Fivetran is optimized to do).
  • Fivetran: Incremental watermarks, extracts, and writes are managed by Fivetran (CDC).
  • Snowflake: New or updated records are tagged accordingly with created/updated timestamps, which also simplifies CDC for downstream processes. Flagging of deleted records is supported via a soft-delete indicator.**

Governance

  • Workday: Protect and limit sensitive data from exposure using new or existing Workday security groups.
  • Fivetran: Built-in logging and metadata.
  • Snowflake: Fivetran logs and audit tables are available in Snowflake for analysis and reporting.

* Change Data Capture (CDC) capability enables efficient use of Workday and Snowflake resources, as well as sync frequencies up to every minute.

** Soft deletes are available on most objects, with some exceptions depending on whether the object is a parent or child of related objects, thus preventing orphaned records.
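The timestamp tagging and soft-delete flag described above can drive incremental logic in downstream processes. The following is a minimal sketch in Python, assuming the replicated rows carry Fivetran's `_fivetran_synced` and `_fivetran_deleted` system columns. The `worker_id` field, the sample rows, and the `split_changes` helper are hypothetical and used only for illustration.

```python
from datetime import datetime, timezone


def split_changes(rows, last_watermark):
    """Split rows synced after last_watermark into upserts and soft deletes.

    Assumes each row has a `_fivetran_synced` timestamp and, where supported,
    a `_fivetran_deleted` boolean (treated as False when absent).
    """
    upserts, deletes = [], []
    new_watermark = last_watermark
    for row in rows:
        synced = row["_fivetran_synced"]
        if synced <= last_watermark:
            continue  # already handled by an earlier downstream run
        new_watermark = max(new_watermark, synced)
        if row.get("_fivetran_deleted", False):
            deletes.append(row)
        else:
            upserts.append(row)
    return upserts, deletes, new_watermark


def ts(day):
    """Helper for building sample timestamps in April 2024 (UTC)."""
    return datetime(2024, 4, day, tzinfo=timezone.utc)


# Hypothetical sample rows standing in for a replicated worker table.
rows = [
    {"worker_id": "W1", "_fivetran_synced": ts(10), "_fivetran_deleted": False},
    {"worker_id": "W2", "_fivetran_synced": ts(18), "_fivetran_deleted": False},
    {"worker_id": "W3", "_fivetran_synced": ts(19), "_fivetran_deleted": True},
]

upserts, deletes, watermark = split_changes(rows, last_watermark=ts(15))
print([r["worker_id"] for r in upserts])  # rows to insert or update downstream
print([r["worker_id"] for r in deletes])  # rows flagged as deleted at the source
```

Persisting the returned watermark between runs lets each downstream run pick up only what changed since the previous one.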

Prerequisites

Access and Permissions

Fivetran Environment

An existing Fivetran account is needed. If you don’t have one, you can start for free with a 14-day trial.

Snowflake Destination

If this is the first time using Snowflake as a Fivetran destination, there are some steps to take as described below. Otherwise, this section can be skipped. 

Prior to starting, ensure that a Snowflake user account and role have been created for Fivetran, and that the role has been granted the create schema privilege on the destination database and the usage privilege on the destination schema. Note: Fivetran will create the destination schema if it does not already exist.
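The privileges described above can be sketched as a set of Snowflake statements. The Python helper below only builds the SQL text for review and does not execute anything. The role, user, and database names are hypothetical, the user is assumed to exist already, and Fivetran's setup guide may call for additional grants (for example, on a warehouse) that are not covered here.

```python
def fivetran_grant_statements(database, role="FIVETRAN_ROLE",
                              user="FIVETRAN_USER", schema=None):
    """Build Snowflake statements granting a Fivetran role access to a destination.

    All object names are illustrative placeholders; adjust them to your environment.
    """
    statements = [
        f"CREATE ROLE IF NOT EXISTS {role};",
        f"GRANT ROLE {role} TO USER {user};",
        f"GRANT USAGE ON DATABASE {database} TO ROLE {role};",
        f"GRANT CREATE SCHEMA ON DATABASE {database} TO ROLE {role};",
    ]
    if schema is not None:
        # Only needed when the destination schema was created ahead of time;
        # otherwise Fivetran creates the schema itself.
        statements.append(
            f"GRANT USAGE ON SCHEMA {database}.{schema} TO ROLE {role};"
        )
    return statements


for statement in fivetran_grant_statements("ANALYTICS_RAW"):
    print(statement)
```

Generating the statements as text keeps them easy to review with a Snowflake administrator before anything is run.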

Authentication methods supported are Key-Pair and User/Password.

Workday Source

The steps to configure Workday security are detailed here. These steps include creating an integration user and a security group to which security policies are assigned. Existing security groups and policies may be used if they align with the domain(s) you intend to replicate.

How to Configure the Integration

  1. Add the intended Snowflake destination to the Fivetran environment (if this does not already exist).

  2. From the Fivetran dashboard, add the Workday connector and populate the configuration with credentials established in the Workday Source prerequisite above.

  3. After successfully saving and testing the connection, review the schema and select the data points desired for replication from Workday to Snowflake.

See the Best Practices section below for some guidance if you don’t see the expected data points or see data that you do not want to expose.

  4. After configuring the connector (step 2) and selecting your data for replication (step 3), begin the initial sync by clicking Start Initial Sync and let Fivetran get to work!

Once complete, you will be able to view your replicated data in the destination. Fivetran will keep these tables up to date, triggering an incremental sync at the frequency selected in the setup tab (default of every six hours). Fivetran will generate a usage/cost estimate within the two-week trial period.
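The sync frequency mentioned above is normally chosen in the setup tab. For teams that prefer to script it, the sketch below builds (but does not send) an HTTP request in Python. It assumes the Fivetran REST API accepts a `PATCH` on `/v1/connectors/{connector_id}` with a `sync_frequency` value in minutes and uses basic authentication with an API key and secret. Verify both assumptions against the current Fivetran API reference before use. The connector ID and credentials shown are placeholders.

```python
import base64
import json
import urllib.request


def build_sync_frequency_request(connector_id, minutes, api_key, api_secret):
    """Build a request that would set a connector's sync frequency.

    The endpoint path and the `sync_frequency` field are assumptions to be
    checked against the Fivetran REST API documentation.
    """
    token = base64.b64encode(f"{api_key}:{api_secret}".encode()).decode()
    body = json.dumps({"sync_frequency": minutes}).encode()
    return urllib.request.Request(
        url=f"https://api.fivetran.com/v1/connectors/{connector_id}",
        data=body,
        method="PATCH",
        headers={
            "Authorization": f"Basic {token}",
            "Content-Type": "application/json",
        },
    )


# Placeholder values; 360 minutes matches the six-hour default noted above.
req = build_sync_frequency_request("my_connector_id", 360, "KEY", "SECRET")
print(req.get_method(), req.full_url)
# Sending is left to the reader, e.g. urllib.request.urlopen(req).
```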

Best Practices for Ingesting Workday Data to Snowflake with Fivetran

Fivetran Data Models

Take advantage of the data modeling that has already been done by the team at Fivetran. These models are constantly being enhanced; for example, the worker history table was added in Q1 2024. These models also enable pre-built, higher-level logical models via dbt.

Reference the updated Fivetran data models to gain an understanding of the available data and how to link the objects together.

Sensitive Data

The Workday system modules are highly likely to contain sensitive data such as employee PII or compensation information. While Fivetran does offer configuration options to block or mask sensitive data, we recommend limiting exposure in the Workday security configuration (groups and policies) instead. This prevents the sensitive data from being visible to Fivetran in the first place.

In the cases where the Fivetran user does not have access to a given domain, Fivetran will populate the sensitive data points (fields) with null.

It is ideal to start with a highly restrictive set of policies and then expand the scope if the needed data points do not show up in the schema tab of the Fivetran connector dashboard. This helps minimize the creation of sensitive objects in Snowflake, even if they are ultimately not populated.  

For example, the optics of a compensation table may be misleading or concerning, even if the object is always null or empty. For this reason, it is ideal to start with the most limiting policies possible that still meet the need.
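Because fields from inaccessible domains arrive as null, one way to review the effect of a restrictive policy is to look for columns that are null in every replicated row. The following Python sketch works on rows already fetched from the destination. The column names and sample values are hypothetical.

```python
def all_null_columns(rows):
    """Return the sorted names of columns that are None in every row."""
    if not rows:
        return []
    columns = set().union(*(row.keys() for row in rows))
    return sorted(
        col for col in columns
        if all(row.get(col) is None for row in rows)
    )


# Hypothetical sample standing in for rows read from a replicated table.
sample = [
    {"worker_id": "W1", "job_title": "Analyst", "annual_salary": None},
    {"worker_id": "W2", "job_title": None, "annual_salary": None},
]

print(all_null_columns(sample))  # ['annual_salary']
```

A column that shows up here is either blocked by the Workday security policy, as intended, or missing access that still needs to be granted.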

HCM vs RaaS Connector

It is possible to source the same data points from multiple modules. For example, worker data is accessible via both the HCM connector and the Reporting as a Service (RaaS) connector, and the question may come up as to which path to choose.  Below is some guidance based on our experience:

  • Use the HCM connector whenever possible to take advantage of the data modeling and ongoing development mentioned above.

    • New data points are constantly being added; for example, support for custom fields was recently introduced. If the needed data points are not currently available, consider submitting a feature request.

  • Use the RaaS connector in cases where:

    • There are existing reports with logic or formatting that needs to be maintained.

    • There are datasets (reports) that span multiple modules and the integration is not yet possible using the module-specific connectors.

    • The data point(s) needed are not yet included in the module-specific connector and there is an immediate need for this data.

Note: The RaaS connector has become increasingly useful following the introduction of dynamic query parameters.

Conclusion

As described above, Fivetran is an effective tool to surface the value of your Workday data in Snowflake, via data analysis, BI and Reporting, and even advanced use cases such as improving employee retention using AI or ML models. 

The Workday HCM module, as a source of employee master data, is particularly valuable because it can serve as a key data integration point across several systems and business functions.

These are just some of the many use cases unlocked by surfacing Workday data assets.

At phData, we love helping customers implement value-generating use cases with the modern data stack. If you’re interested in better utilizing the power of Workday data, the experts at phData can help!

FAQs

Currently, Fivetran supports four Workday connectors: HCM, RaaS, Strategic Sourcing, and Financial Management. While the details above are focused on HCM and RaaS, they are very much applicable to the Strategic Sourcing and Financial Management modules. In most cases, RaaS can be used to fill in for any unsupported modules; however, this does require development in Workday to create reports.

Fivetran supports a number of different destinations, and many of these support the Workday connectors. The content above is mostly destination independent and the guidance is transferable in cases of a different destination for the Workday data.
