Case Study

Industrial Manufacturer Turns IoT Data to Profit on Snowflake


Customer's Challenge

A top manufacturer of mining and earth-moving equipment sought to boost revenue with new offerings, including smart-connected equipment and post-purchase proactive maintenance services. That meant transforming their existing sensor-based analytics platform into a more efficient, centralized, IoT data solution. And that meant they needed help.

phData's Solution

phData designed a cloud-native IoT solution on Snowflake and Microsoft Azure, then helped migrate a large application from Hadoop to validate production readiness. The manufacturer has transformed their small web app into a unified IoT data store, analytics, and visualization platform, all built around CI/CD and infrastructure-as-code to maximize the value of the cloud.


The manufacturing corporation now has a proven path to further break down data silos and migrate more large applications from Hadoop to the modern Snowflake-based solution architected by phData. Before long, they’ll be able to empower customers using equipment across their entire product portfolio with all the improved efficiencies of an IoT data and analytics platform built from the ground up with phData’s Cloud 2.0 approach.

The Full Story

A leading manufacturer of earth-moving equipment, including construction, mining, and forestry equipment, has increasingly come to rely on sensor data to understand how their machines are performing.

Most machines they make, from excavators and front-end loaders to subsurface mining equipment and drills, include sensors that track indicators such as hydraulic pressure, engine RPM, engine oil temperature, and wheel speed. This Internet of Things (IoT) data allows them not only to predict when individual machines will require maintenance, but also to help customers operate their products more efficiently by analyzing operational cycles and patterns.

Because many of these machines may run 24×7, and because such large machines often require similarly large capital outlays, these insights provide enormous business value. That value materialized as increased top-line revenue from new products and services, including smart-connected equipment and post-purchase proactive maintenance services.

After several previous iterations, the manufacturer had been using a Hadoop-based solution to process, store, and analyze all their sensor data. However, maintaining the platform required their small analytics team to spend more time administering the cluster than getting value from the data they collected. In addition, the static resource allocation model meant they could not scale dynamically, and their compute costs were increasing.

As a result, they decided to explore how they might take advantage of the latest cloud-native services and data technologies to streamline systems management and improve efficiency, while simultaneously consolidating their siloed data sources.

Mountains of sensor data

To meet their goals and justify the costs of moving to a new platform, the manufacturer would need to design a modern, cloud-based data analytics solution. They also needed to ensure this solution could ingest their existing data from Hadoop and handle the high volume of new IoT data being pushed daily from their equipment.

Key Challenges

Designing and validating the right solution architecture

The manufacturer knew they wanted to move off Hadoop and take advantage of cloud-native data technologies. However, they were less sure about which of those technologies were right for the job, how they should be optimized, and how to demonstrate the feasibility of the new solution.

Moving mountains of IoT data

With sensors generating thousands of data points per minute, per individual machine, the revamped solution would need to handle billions of sensor records per day. And with more than 40 TB of data to ingest, migration from their existing Hadoop-based solution was bound to be a complex challenge.
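To see how per-machine sensor rates add up to billions of records per day, here is a back-of-the-envelope sketch. The per-machine rate and the fleet size are illustrative assumptions, not figures from the manufacturer.

```python
MINUTES_PER_DAY = 24 * 60  # machines may run around the clock


def records_per_day(points_per_minute: int, machines: int) -> int:
    """Daily record count for a fleet in which every machine reports continuously."""
    return points_per_minute * MINUTES_PER_DAY * machines


# Hypothetical inputs: 2,000 data points per minute per machine, 1,000 machines.
daily = records_per_day(points_per_minute=2_000, machines=1_000)
print(f"{daily:,} records per day")  # 2,880,000,000 records per day
```

Even with these modest assumed inputs, the daily total lands in the billions, which is why ingestion throughput was a first-order design concern.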

Unifying disparate systems and data

To handle the volume and heterogeneity of sensor data from all their different equipment (often transmitted from highly remote locations with poor internet connectivity), the manufacturer was parceling the data into files and uploading them once a minute. Accordingly, the new solution would need to incorporate their existing proprietary API, then somehow convert these files into a consistent and usable format. It would also need to serve as a central repository to help tear down corporate data silos and unify the multitude of existing systems of record.

Digging deeper with a Cloud 2.0 Architecture

The phData team worked closely with the manufacturer’s analytics team to understand both their existing Hadoop-based platform and their goals for overhauling it, then provided the technology recommendations and support they needed to transform it successfully.

To deliver the required improvements in efficiency, maintainability, and data accessibility, phData designed a new architecture around Snowflake. They combined the right mix of cloud-native services and data technologies (such as Spark and Kafka for data processing, and Microsoft Azure and Kubernetes for infrastructure and orchestration) with the right “Cloud 2.0” design and deployment practices (such as taking a containerized, “infrastructure-as-code” approach to deploying the Kafka Connector on Azure Kubernetes Service) to make the most of those technologies. Finally, they proved the viability of the new solution by helping to migrate one of the manufacturer’s large applications from Hadoop to Snowflake.

The data files generated once per minute by the sensors are now uploaded to Azure Blob Storage via a proprietary REST API. These hundreds of millions of small files are then processed and normalized by Spark before being loaded into Snowflake via the Spark-Snowflake connector.
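The normalization step can be pictured with a small sketch. In the real pipeline this work is done by Spark. The plain Python below only illustrates the logic of mapping differently shaped per-minute files onto one common record layout. The file layout, field names, and alias table are all hypothetical, since the manufacturer's proprietary formats are not described here.

```python
import json
from datetime import datetime, timezone

# Hypothetical mapping from equipment-specific field names to common names.
FIELD_ALIASES = {
    "hyd_psi": "hydraulic_pressure",
    "hydraulicPressure": "hydraulic_pressure",
    "rpm": "engine_rpm",
    "engineRpm": "engine_rpm",
}


def normalize_file(raw: str) -> list[dict]:
    """Turn one per-minute file into flat records that share a common schema."""
    payload = json.loads(raw)
    records = []
    for reading in payload["readings"]:
        ts = datetime.fromtimestamp(reading["ts"], tz=timezone.utc)
        for key, value in reading.items():
            if key == "ts":
                continue  # the timestamp is carried on every record instead
            records.append({
                "machine_id": payload["machine_id"],
                "recorded_at": ts.isoformat(),
                "metric": FIELD_ALIASES.get(key, key),
                "value": float(value),
            })
    return records


# Two made-up files from different equipment types with different field names.
excavator_file = json.dumps(
    {"machine_id": "EX-1", "readings": [{"ts": 0, "hyd_psi": 2900, "rpm": 1800}]}
)
loader_file = json.dumps(
    {"machine_id": "LD-7", "readings": [{"ts": 60, "engineRpm": 1500}]}
)

rows = normalize_file(excavator_file) + normalize_file(loader_file)
for row in rows:
    print(row)
```

Emitting one record per metric reading keeps the output schema stable even when different machines report different sets of sensors.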

Once in Snowflake, the data is consolidated and enhanced, every two minutes, via a series of tables and schemas designed to flatten data structures and introduce new data columns that provide more ways to break down the data.
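As a rough illustration of what flattening and adding columns means, the sketch below collapses a nested record into a single level and derives two extra columns for slicing by date and hour. In the actual solution this happens inside Snowflake through tables and schemas. The plain Python here is a stand-in, and the record shape and derived column names are hypothetical.

```python
from datetime import datetime


def flatten(record: dict, prefix: str = "") -> dict:
    """Collapse nested dicts into one level, joining keys with underscores."""
    flat = {}
    for key, value in record.items():
        name = f"{prefix}{key}"
        if isinstance(value, dict):
            flat.update(flatten(value, prefix=f"{name}_"))
        else:
            flat[name] = value
    return flat


def enrich(flat: dict) -> dict:
    """Add derived (hypothetical) columns that give more ways to break down the data."""
    ts = datetime.fromisoformat(flat["recorded_at"])
    return {**flat, "reading_date": ts.date().isoformat(), "reading_hour": ts.hour}


# A made-up nested sensor record.
nested = {
    "machine": {"id": "EX-1", "model": "excavator"},
    "recorded_at": "2024-01-15T13:45:00+00:00",
    "engine": {"rpm": 1800, "oil_temp_c": 95.5},
}

row = enrich(flatten(nested))
print(row)
```

The flattened row exposes columns such as `machine_id` and `engine_rpm` directly, which is the kind of shape a dashboard tool can query without unpacking nested structures.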

The final result? A common data warehouse that’s easily accessible via Power BI dashboards.

Striking paydirt with IoT on Snowflake

Thanks to the solution architecture design and migration support from phData, the manufacturing corporation has transformed what started out as a small Microsoft SQL Server-based web application into a unified IoT data store, analytics, and visualization platform, one with the potential to support the entire business.
