A global manufacturing company was looking to modernize and simplify its data and analytics initiatives by moving to a cloud-based data platform that would scale with the business and enable frictionless data sharing with third-party vendors. Their existing Cloudera Data Warehouse was complex, expensive, and required enormous internal resources to manage. With renewal deadlines approaching, they needed a partner who could make the migration a reality and keep it seamless.
phData put together a Proof of Concept (POC) for the client to move to the Snowflake Data Cloud, calling out specific advantages of Snowflake’s simple architecture, zero maintenance, and renowned data sharing capabilities. Once the choice was made, phData got to work immediately, migrating data from nine sources into Snowflake. The client now has a single place for all its data and can share data effortlessly within its partner ecosystem.
Our client is a renowned global manufacturing company that provides a broad range of innovative air, fluid, energy, and medical technologies. With over 40 brands selling its products across the globe, the manufacturing company relies on sharing pertinent data across its businesses and vendors to deliver exceptional products and services on a global scale.
The client's on-premises Hadoop environment was not delivering the results they expected or the scalability they needed to keep growing the business. In a purposeful effort to modernize their data, they realized they needed a new data platform that was cloud-based, scalable, reliable, simpler in architecture, and supported data sharing.
In addition, a cloud data warehouse that worked well with modern data engineering tools like HVR, Fivetran, and dbt would be a huge plus.
With upcoming warehouse demands and a rapidly approaching renewal deadline for their existing Cloudera agreement, the global manufacturing company had to make a strategic decision and execute it quickly.
Having worked successfully with phData over the past five years on a variety of data engineering projects, the client had a clear choice of partner for its data modernization needs. It just needed the “how”.
For the first part of this project, the client needed a proof of concept to determine which cloud data platform was best suited to the business and its current and future needs. phData recommended the Snowflake Data Cloud as the best option due to its storage, computing power, secure data sharing capabilities, and consumption-based pricing model.
With a new data platform selected and a deadline on the horizon, phData got to work lifting and shifting over 10,000 data objects from the client’s existing Hadoop environment to Snowflake. phData’s solution was broken down into three parts:
The first step in the process was to define the information architecture. Once defined, phData provided the tools and resources to ensure that new projects and initiatives stuck to the architecture. When the prescribed information architecture didn’t seem to fit the needs of a project, phData’s team consulted with the client’s team to identify gaps and propose workable solutions. This approach saved the company’s architects enormous time and effort.
Faced with migrating 100+ databases from Cloudera to Snowflake, phData used its purpose-built data migration validation tool, Data Source, to replicate Cloudera database object metadata in Snowflake. The team then provisioned an entire suite of pipeline infrastructure objects in Snowflake to move data from S3 into Snowflake.
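To make the metadata replication step more concrete, here is a minimal Python sketch of how table definitions from Impala could be turned into Snowflake DDL and a matching load statement. It is not phData's Data Source tool or the actual pipeline. The type mapping, the choice to map an Impala database to a Snowflake schema, the Parquet file format, the stage layout, and all function names are assumptions made for illustration.

```python
import re

# Assumed Impala-to-Snowflake type mapping (illustrative, not exhaustive).
TYPE_MAP = {
    "STRING": "VARCHAR",
    "INT": "INTEGER",
    "BIGINT": "BIGINT",
    "DOUBLE": "DOUBLE",
    "BOOLEAN": "BOOLEAN",
    "TIMESTAMP": "TIMESTAMP_NTZ",
}


def to_snowflake_type(impala_type: str) -> str:
    """Translate one Impala column type into a Snowflake column type."""
    t = impala_type.strip().upper()
    decimal = re.fullmatch(r"DECIMAL\((\d+),\s*(\d+)\)", t)
    if decimal:
        return f"NUMBER({decimal.group(1)},{decimal.group(2)})"
    if t not in TYPE_MAP:
        # Fail loudly so unmapped types are reviewed by a person.
        raise ValueError(f"no mapping for Impala type: {impala_type}")
    return TYPE_MAP[t]


def create_table_sql(schema: str, table: str, columns: list) -> str:
    """Build a CREATE TABLE statement from (name, impala_type) pairs."""
    cols = ",\n".join(
        f"  {name} {to_snowflake_type(typ)}" for name, typ in columns
    )
    return f"CREATE TABLE IF NOT EXISTS {schema}.{table} (\n{cols}\n);"


def copy_into_sql(schema: str, table: str, stage: str) -> str:
    """Build a COPY INTO statement that loads Parquet files from a stage.

    The stage name and the <schema>/<table>/ path layout are hypothetical.
    """
    return (
        f"COPY INTO {schema}.{table}\n"
        f"FROM @{stage}/{schema}/{table}/\n"
        "FILE_FORMAT = (TYPE = PARQUET)\n"
        "MATCH_BY_COLUMN_NAME = CASE_INSENSITIVE;"
    )


if __name__ == "__main__":
    columns = [("order_id", "bigint"), ("amount", "decimal(10, 2)")]
    print(create_table_sql("sales", "orders", columns))
    print(copy_into_sql("sales", "orders", "migration_stage"))
```

Generating the statements from metadata, rather than writing them by hand, is what makes an approach like this workable across 100+ databases: the same mapping rules apply to every object, and any unmapped type surfaces as an error instead of a silent mismatch.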
phData helped the client’s development team design and build a database versioning framework using Flyway, which managed and provisioned the 5,000+ views migrated from Cloudera Impala to Snowflake. With Flyway in place, the client’s development team can now version their SQL changes and apply them to Snowflake in a controlled manner.
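The idea behind versioned SQL changes can be sketched in a few lines of Python. Flyway's default naming convention for versioned migrations is `V<version>__<description>.sql`, and migrations are applied in version order, each one only once. The sketch below is not Flyway itself and not the client's framework. The file names and the `pending_migrations` helper are hypothetical, and only the ordering logic is illustrated.

```python
import re

# Flyway's default pattern for versioned migrations:
# V<version>__<description>.sql, e.g. V2__add_customer_views.sql
MIGRATION_RE = re.compile(r"^V(\d+(?:[._]\d+)*)__(\w+)\.sql$")


def parse_version(filename: str):
    """Return the version as a tuple of ints, or None if not a migration."""
    match = MIGRATION_RE.match(filename)
    if match is None:
        return None
    return tuple(int(part) for part in re.split(r"[._]", match.group(1)))


def pending_migrations(filenames: list, applied_version: tuple) -> list:
    """List migrations newer than applied_version, in the order to apply."""
    versioned = [(parse_version(name), name) for name in filenames]
    todo = [
        (version, name)
        for version, name in versioned
        if version is not None and version > applied_version
    ]
    # Sort numerically by version so V10 comes after V2, not before it.
    return [name for _, name in sorted(todo)]


if __name__ == "__main__":
    files = [
        "V2__add_customer_views.sql",
        "V10__add_vendor_views.sql",
        "V1__baseline.sql",
        "V1.1__fix_baseline.sql",
        "README.md",
    ]
    # Assume version 1 has already been applied to the target database.
    for name in pending_migrations(files, (1,)):
        print(name)
```

Because every change is a numbered file under version control, the state of the database at any point is reproducible, which matters when thousands of views are involved.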
In the first two weeks, a well-architected Snowflake environment with automated workspace/user provisioning was delivered. Within a month, the environment was integrated for change data capture (CDC) with the necessary security setup. After just 30 weeks, the global manufacturing company was working with a fully functional Snowflake environment, all with virtually no business disruptions thanks to phData.
In addition to having a modern, cloud-based data platform at its fingertips, the client was able to: