September 18, 2024

Using Fivetran’s New Hybrid Architecture to Replicate Data In Your Cloud Environment

By Will Dunlap

As data and AI continue to dominate today’s marketplace, the ability to securely and accurately process and centralize that data is crucial to an organization’s long-term success. Fivetran is a data movement platform that offers multiple system architectures that extract data from source systems and centralize it in cloud data warehouses like Snowflake AI Data Cloud, Redshift, and others.

In this blog, we’ll explore Fivetran’s new Hybrid Deployment Model, how it differs from traditional SaaS architecture, and which organizations might best benefit from this new offering.

Introduction to SaaS Architecture with Fivetran

Typically, data integration software as a service (SaaS) providers like Fivetran follow an architecture where data is extracted from the source, processed on resources owned and managed by the SaaS provider, and then delivered to the client’s warehouse.

At Fivetran, this offering comes with an extensive set of security certifications and built-in security best practices, including GDPR, SOC 2, HIPAA, HITRUST, PCI, and ISO compliance, among others.

An advantage of the managed SaaS approach to data integration is that Fivetran takes responsibility for the complexity and costs associated with maintaining and scaling data movement from source to target. This architecture is mature and supports numerous data sources (Fivetran currently has 600+ connectors), offering the flexibility to accommodate a wide range of client use cases.

Industries like Financial Services and Healthcare would greatly benefit from the managed approach to data movement that Fivetran’s SaaS solution provides. However, industry, company, or government regulations often restrict sensitive data sets from being processed by third parties, which leaves these organizations unable to take advantage of the offering.

Fortunately, Fivetran’s new Hybrid Architecture addresses this security need, so these organizations (and others) can now get the best of both worlds: a managed platform and pipelines processed in their own environment.

What is the Hybrid Deployment Model?

Fivetran’s Hybrid Architecture allows an organization to maintain ownership and control of its data through the entire data pipeline. Unlike a traditional SaaS architecture where data processing occurs on Fivetran-managed resources and data traverses as many as three networks (source, processing, target), the hybrid architecture allows the data processing agent to be installed on your own resources.

This approach not only eliminates network hops but also ensures that all connections between source, processing, and target are initiated from your network, rather than from the outside in.

How Does the Hybrid Model Work?

With the hybrid deployment architecture, a containerized agent is downloaded onto the network resources where the pipeline will run. This local agent is an application that can be deployed anywhere the container can run, such as a Docker environment or a set of VMs.

One of those containers acts as a coordinator node, which will pull pipeline configurations down from the Fivetran Cloud and spin up additional containers for any actual pipeline executions. This allows a customer to create a local, scalable environment for processing data in pipelines configured in the Fivetran web application.
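
To make the coordinator pattern described above more concrete, here is a minimal Python sketch. It is an illustration only, not Fivetran’s agent code: the names PipelineConfig, Coordinator, pull_configs, and spin_up_worker are hypothetical, and an in-memory list stands in for the Fivetran Cloud. Starting a container is simulated by recording a worker ID.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class PipelineConfig:
    """Hypothetical pipeline definition, as configured in the web application."""
    pipeline_id: str
    source: str
    destination: str


@dataclass
class Coordinator:
    """Illustrative coordinator node; all names here are assumptions."""
    agent_name: str
    running: Dict[str, PipelineConfig] = field(default_factory=dict)

    def pull_configs(self, control_plane: List[PipelineConfig]) -> List[PipelineConfig]:
        # The coordinator asks for its configurations (an outbound request);
        # the list argument stands in for the cloud control plane.
        return list(control_plane)

    def spin_up_worker(self, config: PipelineConfig) -> str:
        # Stand-in for starting one additional container per pipeline execution.
        worker_id = f"{self.agent_name}-worker-{config.pipeline_id}"
        self.running[worker_id] = config
        return worker_id

    def run_once(self, control_plane: List[PipelineConfig]) -> List[str]:
        # Pull every configuration, then start one worker for each.
        return [self.spin_up_worker(c) for c in self.pull_configs(control_plane)]


configs = [
    PipelineConfig("p1", "postgres", "snowflake"),
    PipelineConfig("p2", "salesforce", "snowflake"),
]
coordinator = Coordinator(agent_name="onprem-agent")
workers = coordinator.run_once(configs)
print(workers)
```

The point of the sketch is the division of labor: configuration lives in the cloud, while the coordinator and the workers it starts all live on your side of the firewall.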

After the containerized application is downloaded, you give it a name during the configuration process. That name makes it available as a processing agent when you set up your destination through the Fivetran interface.

Using the same agent name in the source and destination specifies that the pipeline will run on the same network as the agent.

Once one of these local agents is installed and registered with a specific Fivetran account, it will be available to select during connector setup. 
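
The registration and naming rules above can be modeled in a few lines of Python. This is a sketch under stated assumptions: AgentRegistry and runs_on_agent are hypothetical names invented for illustration, and the account and agent names are made up. It does not reflect how Fivetran stores or validates agents internally.

```python
from typing import Dict, List, Optional, Set


class AgentRegistry:
    """Hypothetical stand-in for the per-account list of registered agents."""

    def __init__(self) -> None:
        self._agents: Dict[str, Set[str]] = {}

    def register(self, account: str, agent_name: str) -> None:
        # Registering ties the agent name to one specific account.
        self._agents.setdefault(account, set()).add(agent_name)

    def available_agents(self, account: str) -> List[str]:
        # Only agents registered to this account can be selected during setup.
        return sorted(self._agents.get(account, set()))


def runs_on_agent(source_agent: Optional[str], destination_agent: Optional[str]) -> bool:
    # True only when the source and destination name the same agent.
    return source_agent is not None and source_agent == destination_agent


registry = AgentRegistry()
registry.register("acme", "onprem-agent")
print(registry.available_agents("acme"))
print(runs_on_agent("onprem-agent", "onprem-agent"))
```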

One thing to be aware of is that select metadata from these local agents is sent to Fivetran for auditing and billing purposes. To be clear, though, your data never leaves your network. The metadata includes:

  • Registration information such as orchestration server hostname, port number, client certificate, and key.

  • Local processing agent logs and metrics, including job initiation status, container errors, and job start and run times.

  • Sync job logs and metrics, including the number of rows extracted/loaded and data volume processed (MAR data for billing).
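
The distinction between metadata and data in the list above can be sketched in Python. This is an illustration, not Fivetran’s actual payload format: SyncJobMetadata, summarize_sync, and every field name are assumptions chosen to mirror the bullets. The sample rows are invented.

```python
from dataclasses import asdict, dataclass
from typing import Dict, List


@dataclass
class SyncJobMetadata:
    """Hypothetical record of what a local agent might report upstream."""
    agent_name: str
    job_status: str
    start_time: str
    run_seconds: float
    rows_extracted: int
    rows_loaded: int


def summarize_sync(agent_name: str, rows: List[Dict], start_time: str,
                   run_seconds: float) -> SyncJobMetadata:
    # Only counts and timings are derived; row values are never copied.
    return SyncJobMetadata(
        agent_name=agent_name,
        job_status="succeeded",
        start_time=start_time,
        run_seconds=run_seconds,
        rows_extracted=len(rows),
        rows_loaded=len(rows),
    )


# Invented sample rows that stay inside the local network.
rows = [{"id": 1, "ssn": "000-00-0000"}, {"id": 2, "ssn": "111-11-1111"}]
metadata = asdict(summarize_sync("onprem-agent", rows, "2024-09-18T00:00:00Z", 12.5))
print(metadata)
```

The record carries how many rows moved and how long the job ran, which is enough for auditing and billing, while the row contents themselves appear nowhere in it.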

Hybrid Source Support

Because Hybrid is its newest architecture, Fivetran focused initial development on the data sources that enterprises identified as most security-sensitive, including databases, Salesforce, GitHub, and more. The company is continuing to add sources at a fast clip so that customers can move data from any source to any destination via Hybrid Deployment.

The number of supported source systems is still growing, so it’s important to check whether your sources are covered when designing your architecture. A list of source systems supported by the Hybrid Architecture can be found here.
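
One way to fold this check into architecture planning is a small helper like the Python sketch below. Everything in it is an assumption made for illustration: choose_deployment is a hypothetical function, and HYBRID_SUPPORTED_EXAMPLES contains only the source types named in this post, not Fivetran’s actual support list.

```python
# Illustrative only: consult Fivetran's published list for real support status.
HYBRID_SUPPORTED_EXAMPLES = {"database", "salesforce", "github"}


def choose_deployment(source_type: str, must_process_locally: bool) -> str:
    """Suggest a deployment model for one source under the assumptions above."""
    normalized = source_type.strip().lower()
    if not must_process_locally:
        # No restriction on third-party processing: the managed SaaS model fits.
        return "saas"
    if normalized in HYBRID_SUPPORTED_EXAMPLES:
        return "hybrid"
    # Local processing is required but the source is not in the example set.
    return "unsupported: review requirements"


print(choose_deployment("Salesforce", True))
print(choose_deployment("some_other_source", False))
print(choose_deployment("some_other_source", True))
```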

Functionality Review

Now, let’s review the major distinctions between the two architectures.

| | SaaS Architecture | Hybrid Architecture |
|---|---|---|
| Data Processing Location | Primarily cloud-based on Fivetran-managed infrastructure. | Local processing in your network behind your own firewall, with selective cloud integration. |
| Security | Data is processed in the public cloud. At all times, Fivetran ensures stringent security measures are applied. | Reduced attack surface since data is processed locally and you control initiating all resource connections. |
| Control & Compliance | Can specify the provider and region where the pipeline runs, but the overall processing is opaque. | Greater control over data handling, ensuring compliance with strict industry regulations. |
| Flexibility & Customization | Limited customization. | Full flexibility of deployment, to support specific location and resource needs. |
| Source System Support | Numerous source system connectors. | Limited source system connectors. |

Conclusion

Fivetran’s ability to offer both a full SaaS architecture and a hybrid architecture makes it a single vendor with a solution for all of a customer’s data movement needs. The SaaS deployment model provides a straightforward solution for standard enterprise sources like Shopify or Google Analytics, while the hybrid architecture offers a compelling solution for industries and data sources with stricter data residency and connectivity ownership requirements.

As Fivetran continues to expand the capabilities of its Hybrid Deployment, this new architecture presents a strong option for businesses with specific data locality and ownership requirements.

If you’re interested in tapping into the potential of Fivetran’s Hybrid Deployment, phData can help! As Fivetran’s 2024 Partner of the Year, phData will help you optimize your data integration, uncover valuable insights, and achieve your business goals with the Fivetran platform.
