February 23, 2024

How to Ingest Veeva Vault Data Natively in Snowflake

By John Nowak

This article was co-written by John Nowak & Troy Fokken.

Snowflake released the Healthcare and Life Sciences (HCLS) Data Cloud in March 2022 to help HCLS enterprises improve patient outcomes, optimize care delivery, enhance clinical decision-making, and accelerate research and time to market.

Snowflake’s unique architecture, near-limitless scaling, strict data governance, and security have allowed these organizations to unlock business value that was not attainable with legacy on-prem applications and data platforms.

At phData, we’re seeing more and more of our HCLS customers focus on centralizing mission-critical healthcare data sets in Snowflake in an effort to better navigate a quickly changing healthcare landscape.

The rise of generative AI and advanced analytics is accelerating the need to have comprehensive data stored and accessible in a single data platform. The datasets range from data held in complex EHR systems to transactional events such as HL7, FHIR, and EDI messages.

A customer recently challenged phData’s engineering team with designing a framework for integrating Veeva Vault data into Snowflake. In this blog, we will describe our approach.

What is Veeva Vault?

Veeva Vault is a cloud-native SaaS enterprise content management system built for the healthcare and life sciences industry. Veeva Vault manages content and data, allowing organizations to streamline end-to-end processes across commercial, medical, clinical, regulatory, quality, and safety business areas.

Because it is delivered as a cloud product, customers get an always up-to-date application, scalability and performance that serve companies large or small, and no infrastructure to manage. Traditionally, organizations have had to deploy multiple applications to manage the content and data associated with these domains.

Veeva provides a platform to design, implement, deploy, and quickly iterate as the business needs change.

Why Integrate Veeva Data?

By bringing Veeva Vault data into Snowflake, organizations can create a unified data platform that offers improved scalability, streamlined workflows, and better data democratization. This integration makes digital content operations available for analysis alongside other enterprise data and provides new opportunities for healthcare and life sciences organizations.

These organizations can now obtain actionable real-time insights, optimize their content functions, enhance patient care, and become more cost-effective by streamlining operations with data applications.

Why Snowflake for Veeva?

Snowflake is well suited to centralizing Veeva’s operational data in an analytical store that supports advanced analytics, AI, and machine learning. Recent features and product updates have made it possible to ingest data from external sources natively, directly within Snowflake.

Public-facing APIs, such as those exposed through Veeva Vault’s Open APIs, provide access to operational data that can be integrated with analytical workflows on Snowflake. 

Challenges with Veeva Integration

Veeva Vault has a specialized architecture custom-built for the healthcare and life sciences industry. This makes it complex to align Vault data with other data sources and workflows, which creates interoperability challenges when trying to streamline operations and derive business insights.

These challenges propagate across the organization, hindering data accessibility and the efficiency of business operations. Currently, there are no commercially available connectors to aid with Veeva Vault integration.

How Does Our Solution Address Challenges?

Because no commercial connectors were available, we built our own. This connector is designed to ensure a seamless flow of data between Veeva Vault and Snowflake. For the best results, we combined this connector with the phData Access Tool.

The Access Tool simplifies compliance and security in regulated industries by providing easy control of roles, privileges, and account metadata in the Snowflake Data Cloud. To make this solution available to the Snowflake community, the phData engineering team is converting this connector into a Native Application to be hosted on the Snowflake Marketplace. You can find further technical details of the solution below.

External Network Access

External network access is a Public Preview feature in Snowflake that allows secure access to specific network locations outside Snowflake. Engineers first create a network rule that defines the public endpoint. Next, they create an external access integration that references the network rule and, optionally, the authentication secrets used when interacting with the API.
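
The network rule itself is not shown in this post. As a rough sketch only, the Python helper below renders a `CREATE NETWORK RULE` statement of the kind that would need to exist before the integration can reference `veeva_vault_rule`. The helper name `network_rule_ddl` and the host `myvault.veevavault.com` are hypothetical placeholders, not values from the original solution; substitute the domain of your own Vault.

```python
from typing import Sequence


def network_rule_ddl(rule_name: str, hosts: Sequence[str]) -> str:
    """Render a CREATE NETWORK RULE statement allowing egress to the given hosts.

    Hypothetical helper for illustration; not part of the original solution.
    """
    if not hosts:
        raise ValueError("at least one host is required")
    # Quote each host as a SQL string literal, doubling any embedded single quotes.
    quoted = ", ".join("'" + host.replace("'", "''") + "'" for host in hosts)
    return (
        f"CREATE OR REPLACE NETWORK RULE {rule_name}\n"
        "  MODE = EGRESS\n"        # outbound traffic from Snowflake
        "  TYPE = HOST_PORT\n"     # values are host names (port 443 by default)
        f"  VALUE_LIST = ({quoted});"
    )


# Assumed host name; replace with your own Vault domain.
ddl = network_rule_ddl("veeva_vault_rule", ["myvault.veevavault.com"])
print(ddl)
```

The rendered statement can be run in a worksheet, or through a Snowpark session with `session.sql(ddl).collect()`, by a role that is allowed to create network rules.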

External Access Integration

USE ROLE ACCOUNTADMIN;
CREATE OR REPLACE EXTERNAL ACCESS INTEGRATION veeva_vault_access_integration
  ALLOWED_NETWORK_RULES = (veeva_vault_rule)
  ALLOWED_AUTHENTICATION_SECRETS = (oauth_token)
  ENABLED = true;

Finally, a developer can reference the External Access Integration object directly in Snowflake user-defined functions (UDFs) or stored procedures.

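Function for Veeva ingestion

USE ROLE developer;

CREATE OR REPLACE FUNCTION veeva_rim_collector(endpoint STRING)
RETURNS STRING
LANGUAGE PYTHON
RUNTIME_VERSION = 3.8
-- HANDLER must name a Python function defined in the body below
HANDLER = 'get_translation'
-- Allow outbound calls through the integration created above
EXTERNAL_ACCESS_INTEGRATIONS = (veeva_vault_access_integration)
PACKAGES = ('snowflake-snowpark-python','requests')
-- Expose the oauth_token secret to the handler under the alias 'cred'
SECRETS = ('cred' = oauth_token)
AS
$$
<VEEVA INGESTION IP>
$$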

Snowflake’s introduction of External Access Integration allows developers to build data integrations that run natively, entirely within the Snowflake compute layer.
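
The body of the ingestion function is not published here. To give a sense of the general shape such a handler could take, the following is a minimal sketch, not phData's implementation. It assumes the secret bound as `cred` is an OAuth2-type secret and that the API accepts a `Bearer` token in the `Authorization` header; both are assumptions to check against your own Vault configuration. The names `read_token`, `collect`, and `handler` are hypothetical.

```python
import json

try:
    import _snowflake  # provided only inside Snowflake's Python runtime
except ImportError:
    _snowflake = None  # lets the sketch be imported and tested locally


def read_token(secret_alias="cred"):
    """Read the secret exposed through the function's SECRETS clause."""
    if _snowflake is None:
        raise RuntimeError("secrets can only be read inside Snowflake")
    # Assumes an OAuth2-type secret. A generic string secret would use
    # _snowflake.get_generic_secret_string(secret_alias) instead.
    return _snowflake.get_oauth_access_token(secret_alias)


def collect(endpoint, token=None, http_get=None):
    """Call the endpoint and return the JSON response body as a string."""
    if token is None:
        token = read_token()
    if http_get is None:
        import requests  # listed in the function's PACKAGES clause
        http_get = requests.get
    response = http_get(
        endpoint,
        # Header format is an assumption; confirm it for your Vault.
        headers={"Authorization": f"Bearer {token}", "Accept": "application/json"},
        timeout=30,
    )
    response.raise_for_status()  # surface HTTP errors instead of returning them
    return json.dumps(response.json())


def handler(endpoint):
    """Single-argument entry point matching a UDF signature of (endpoint STRING)."""
    return collect(endpoint)
```

Passing `token` and `http_get` as optional arguments keeps the network call and the secret lookup replaceable, so the logic can be exercised outside Snowflake with a stub before it is deployed.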

Snowpark

As covered in our What is Snowpark? blog, Snowpark is a set of libraries and a runtime environment that allows developers to use non-SQL-based programming languages such as Python, Java, and Scala to build robust data applications. With Snowpark, developers can directly integrate other open-source third-party libraries with data processing, serving advanced use cases.

Solution

phData’s solution for integrating Veeva Vault data into Snowflake uses External Access Integrations and Snowpark to collect data from Veeva’s APIs, process the results, and write the data to modeled Snowflake tables.
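
As an illustration of the process-and-write step, the sketch below turns a JSON response into rows and appends them to a table with Snowpark. It is an assumption-laden example rather than the actual framework: the response shape (`responseStatus` plus a `data` array), the field names `id` and `name__v`, the column names, and the function names are all hypothetical.

```python
import json


def vault_response_to_rows(payload_text):
    """Flatten an API response into (id, name, raw_record) tuples."""
    payload = json.loads(payload_text)
    # Assumed envelope: a status field plus a list of records under "data".
    if payload.get("responseStatus") != "SUCCESS":
        raise ValueError("unexpected response status: %r" % payload.get("responseStatus"))
    rows = []
    for record in payload.get("data", []):
        rows.append((
            record.get("id"),
            record.get("name__v"),
            json.dumps(record, sort_keys=True),  # keep the full record for later modeling
        ))
    return rows


def write_rows(session, rows, table_name):
    """Append rows to a Snowflake table using a Snowpark session."""
    if not rows:
        return  # nothing to write
    df = session.create_dataframe(rows, schema=["ID", "NAME", "RAW_RECORD"])
    df.write.mode("append").save_as_table(table_name)


# Local example with a made-up response; no Snowflake connection is needed here.
sample = json.dumps({
    "responseStatus": "SUCCESS",
    "data": [{"id": "00P000000000101", "name__v": "Submission A"}],
})
rows = vault_response_to_rows(sample)
```

Keeping the raw record next to a few extracted columns is one possible design choice: it lets the modeled tables evolve without calling the API again.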

Native App

To bring this solution to other customers, phData is bundling this offering into a Native Application that will be available in the Snowflake Marketplace at a later date. The Native App Framework allows organizations to create and distribute data applications in the Snowflake Marketplace.

Is your organization using Veeva’s products and looking to centralize Veeva Vault data in Snowflake?

The experts at phData can help! Reach out today to schedule a demo of our new Veeva Solution.
