Case Study

Luxury Automaker Improves Forecasting Capabilities with AI

The Customer’s Challenge

A high-end auto manufacturer needed more accurate sales forecasting for the release of its first electric model. The company needed to move from ‘gut feeling’ and manual regression analysis to full machine learning supported by centralized data.

phData’s Solution

Using both Snowflake’s powerful data platform and a custom Machine Learning framework, phData introduced a faster, more accurate and user-friendly sales forecasting approach. The team also set the client up with the Machine Learning models they’ll need for better Business Intelligence in the future.

The Full Story

There’s a reason sales forecasting is a critical piece of a company’s financial health: it means better planning, more accurate inventory allocation, and the ability to course-correct more quickly.

For a high-end auto manufacturer ($34BN+), it meant running manual regression analysis in Excel, at least until they rolled out a new electric model with no historical sales data.

Manual analysis, limited insights

Analysts had built confidence in the automaker’s sales and marketing pipeline by using a standard forecasting model, collaborating in Excel when necessary. But this approach had a number of limiting factors for the company:

Most importantly, the company had no historical sales data to accurately forecast sales for its debut electric model.

Centralized data was just the beginning

The client’s needs quickly stacked up — one led to the other until the full scope of the project became clear: 

Why phData?

Since the client didn’t already have its data centralized, they needed a partner to own the whole project — from moving to Snowflake to designing and implementing the Machine Learning model.

The client came to phData with all three of their needs in mind because we could:

Building the engine

Our data science team knew they were facing a unique problem: creating an ML model that can accurately forecast order demand and sales for new and existing car models, often without the baseline of historical sales data.

We needed to create an ML framework that could:
Our data science team designed a model that does three things. It uses data clustering to build a new dataset from the historical data of similar car models. It derives features, and forecasts of those features, from the significant variables in both this internal dataset and a collection of economic datasets. Finally, it feeds this data into the multivariate time-series forecasting framework (see Figure 1).
The framework provides two kinds of output: backtesting results based on historical data, and future forecasts at user-defined time intervals. To account for the lack of historical data for the new model, the framework uses data clustering to make like-for-like comparisons among the sales patterns of previous models (see Figure 2). The histories of the most similar models then augment the data available for the new model.
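
The clustering step can be illustrated with a minimal Python sketch. It is not phData’s implementation: the model names, the sales figures, the six-month launch window, the tiny k-means routine, and the `proxy_curve` helper are all hypothetical, chosen only to show how the histories of similar models could stand in for a model with no history of its own.

```python
from statistics import mean

# Hypothetical monthly unit sales for the first six months after launch.
history = {
    "sedan_a": [100, 140, 180, 200, 210, 220],  # slow ramp-up
    "coupe_c": [400, 300, 220, 160, 120, 100],  # front-loaded demand
    "sedan_b": [90, 130, 170, 190, 205, 215],
    "coupe_d": [380, 310, 210, 170, 130, 100],
}

def normalize(curve):
    """Scale a launch curve so it sums to 1, keeping only its shape."""
    total = sum(curve)
    return [v / total for v in curve]

def distance(a, b):
    """Euclidean distance between two equal-length curves."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

def kmeans(points, k, iters=20):
    """Tiny deterministic k-means: the first k points seed the centroids."""
    centroids = [list(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iters):
        # Assign every point to its nearest centroid.
        labels = [min(range(k), key=lambda c: distance(p, centroids[c]))
                  for p in points]
        # Move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = [mean(col) for col in zip(*members)]
    return labels, centroids

names = list(history)
shapes = [normalize(history[n]) for n in names]
labels, _ = kmeans(shapes, k=2)
cluster_of = dict(zip(names, labels))

def proxy_curve(analog):
    """Average the raw curves of every model clustered with `analog`."""
    peers = [history[n] for n in names if cluster_of[n] == cluster_of[analog]]
    return [mean(col) for col in zip(*peers)]

# Assumption: the new model is expected to sell like "sedan_a".
proxy = proxy_curve("sedan_a")
print(proxy)
```

Clustering on normalized curves groups models by the shape of their launch rather than by volume, so a low-volume and a high-volume model with the same ramp-up pattern land together. The averaged curve of that group is one plausible stand-in series for the new model.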

The team built advanced functionality into the forecasting system so that subject matter experts can apply their domain knowledge, identifying analogous sales scenarios to further inform the system’s output.

Best of all, the ML framework adapts to data drift at run time. In other words, the forecasting models continually refit on the latest data and keep improving as more data is collected.
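
One simple way to picture this run-time adaptation is a walk-forward loop that refits the model on the most recent data before every forecast. The sketch below is an illustration under stated assumptions, not the client’s framework: it swaps the multivariate models for a univariate least-squares trend line, and the sales series, the four-month window, and all function names are hypothetical.

```python
def fit_line(ys):
    """Least-squares line through the points (0, ys[0]), (1, ys[1]), ..."""
    n = len(ys)
    x_bar = (n - 1) / 2
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in range(n))
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in enumerate(ys))
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar

def predict_next(ys):
    """Forecast the value that follows the last observation in ys."""
    slope, intercept = fit_line(ys)
    return intercept + slope * len(ys)

def walk_forward_mae(series, start, window):
    """Refit on the last `window` points before each one-step forecast."""
    errors = []
    for t in range(start, len(series)):
        train = series[max(0, t - window):t]
        errors.append(abs(predict_next(train) - series[t]))
    return sum(errors) / len(errors)

# Hypothetical monthly sales: flat for eight months, then steady growth.
series = [100] * 8 + [110 + 10 * i for i in range(8)]
start = 8

# Static baseline: fit once on the first eight months, never refit.
slope, intercept = fit_line(series[:start])
static_errors = [abs(intercept + slope * t - series[t])
                 for t in range(start, len(series))]
static_mae = sum(static_errors) / len(static_errors)

# Adaptive model: refit on the latest four months at every step.
refit_mae = walk_forward_mae(series, start, window=4)

print(f"static MAE: {static_mae:.2f}, refit MAE: {refit_mae:.2f}")
```

On this toy series the refit model’s mean absolute error falls well below that of the model fit once and left alone. The same loop doubles as a simple backtest, since every forecast is scored against data the model had not yet seen.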

Pedal to the metal

All in all, our data science team created an ML framework that not only accurately forecasted demand for the new electric model, but also equipped the team to deal with data drift, handle incomplete datasets, and incorporate additional data in the future.

The integrated forecasting models described above can be run on demand to forecast sales at both the model and trim level.

Instead of just handing raw data over to the sales team, our engineering team built pipelines to feed the forecasted data back into Snowflake. Now, the client has access to a continuous loop of data for faster, more accurate forecasting and ad-hoc analysis. Our analogous forecasting engine allows domain experts to derive sales forecasts for brand new scenarios using only the data available from potentially similar trims and models.

With Snowflake’s User-Defined Functions (UDFs), end users don’t need to run custom Python on the command line to access insights. Instead, they can hook into relevant data in Snowflake using their existing BI products, searching for any combination of models and trim levels to turn up forecast data.


Our expert data scientists:
The outcomes included:

Take the next step with phData.

Even “classic” machine learning problems such as forecasting can have intricacies and gotchas that one might not expect. In this case, the customer had to overcome a severe lack of historical data in order to forecast quantities that had not been measured before.

Do your business problems qualify as “classic” machine learning that can be solved trivially, or are there special circumstances that will require special adaptations? More significantly, are you confident that you have a solid understanding of what scenarios would call for more advanced machine learning methods?

phData’s machine learning experts can help you plan a comprehensive project from day one, evaluate your existing plans and algorithms to help identify possible improvements, and provide full project management and execution capabilities in order to get you started right away.

Talk to an expert today, and make sure you’re getting the highest possible ROI from your machine learning program.
