January 1, 2022

Customer Retention Analysis: How Data Science Can Predict Customer Churn

By Christina Bernard

As an analytics consultant, I talk to a lot of managers who wonder what data science can actually do for their business. Oftentimes I hear:

…it’s just an overrated buzzword,

…it never works and takes too much money to implement,

…or it’s a waste of your already filled day to explore it.

Even as a lover of all things math, I also think that data science can be an overrated buzzword. But when done correctly, data science can provide key insights and remarkable cost savings.

How Can Data Science Help Businesses?

We’re going to explore how to best partner with a data scientist to answer the important question: “how do we retain customers who are leaving?” or, in other words, “how can we decrease the number of our customers leaving?” 

Customer churn is the percentage of customers who leave your product or service within a predefined time frame. That time frame can be two weeks, three months, or even six months. The right window depends on your business model and strategy.
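To make the definition concrete, here is a minimal sketch of the calculation in Python. The function name and the customer counts are made up for illustration, and it assumes churn is measured against the number of customers you had at the start of the period.

```python
def churn_rate(customers_at_start: int, customers_lost: int) -> float:
    """Return churn as a percentage of the customers present at the start of the period."""
    if customers_at_start <= 0:
        raise ValueError("customers_at_start must be positive")
    if not 0 <= customers_lost <= customers_at_start:
        raise ValueError("customers_lost must be between 0 and customers_at_start")
    return 100.0 * customers_lost / customers_at_start


# Hypothetical numbers: 400 customers at the start of the quarter, 30 of them left.
quarterly_churn = churn_rate(400, 30)
print(f"Quarterly churn: {quarterly_churn:.1f}%")  # Quarterly churn: 7.5%
```

The same function works for any time frame; only the counts you feed it change.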

Your customer churn can directly affect the revenue lost within a given period. Having a free service that churns monthly is very different from having a paid service that churns monthly.

A 5% dip in free customers doesn’t necessarily equate to a loss in revenue, but a 10% loss in paid customers could represent a 20% loss in revenue generated.
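One way a 10% loss in paid customers can become a 20% loss in revenue is when the customers who leave are your bigger spenders. The sketch below uses invented numbers (nine customers paying 40 a month and one paying 90) purely to show the arithmetic; the function name and data layout are assumptions for this example.

```python
from typing import List, Tuple


def churn_impact(customers: List[Tuple[float, bool]]) -> Tuple[float, float]:
    """Return (customer churn %, revenue churn %) for (monthly_revenue, churned) pairs."""
    total_revenue = sum(revenue for revenue, _ in customers)
    if not customers or total_revenue <= 0:
        raise ValueError("need at least one customer and positive total revenue")
    lost_customers = sum(1 for _, churned in customers if churned)
    lost_revenue = sum(revenue for revenue, churned in customers if churned)
    customer_churn_pct = 100.0 * lost_customers / len(customers)
    revenue_churn_pct = 100.0 * lost_revenue / total_revenue
    return customer_churn_pct, revenue_churn_pct


# Hypothetical paid customers: nine stay at 40/month, one paying 90/month leaves.
paid_customers = [(40.0, False)] * 9 + [(90.0, True)]
customer_pct, revenue_pct = churn_impact(paid_customers)
print(f"{customer_pct:.0f}% of customers lost, {revenue_pct:.0f}% of revenue lost")
# 10% of customers lost, 20% of revenue lost
```

Tracking both percentages side by side shows whether you are losing small accounts or the ones that carry your revenue.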

Knowing your revenue loss from customer churn can enable a data science consultant to better identify your problem and ultimately improve your customer retention efforts. 

Identifying a customer’s risk of leaving is what a data scientist does when creating a customer churn model.

The customer churn model uses behaviors such as customer purchase intervals, cancellations, follow-up calls and emails, and on-page engagement to predict when a customer will leave. The prediction usually takes the form of a score assigned to each customer.
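As a rough illustration of how behaviors turn into a score, here is a toy logistic scoring function. Everything in it is an assumption for the sake of the example: the field names, the weights, and the bias are invented, and a real churn model would learn its weights from your historical data rather than have them typed in by hand.

```python
import math
from dataclasses import dataclass


@dataclass
class CustomerBehavior:
    days_since_last_purchase: int  # proxy for the purchase interval
    cancellations: int
    support_contacts: int          # follow-up calls and emails
    weekly_page_visits: int        # on-page engagement


# Hypothetical weights: positive values push the score toward "likely to leave".
WEIGHTS = {
    "days_since_last_purchase": 0.03,
    "cancellations": 0.8,
    "support_contacts": 0.2,
    "weekly_page_visits": -0.15,
}
BIAS = -2.0  # hypothetical baseline


def churn_score(customer: CustomerBehavior) -> float:
    """Map behaviors to a score between 0 and 1 with a logistic function."""
    z = BIAS + sum(weight * getattr(customer, name) for name, weight in WEIGHTS.items())
    return 1.0 / (1.0 + math.exp(-z))


engaged = CustomerBehavior(days_since_last_purchase=5, cancellations=0,
                           support_contacts=0, weekly_page_visits=10)
at_risk = CustomerBehavior(days_since_last_purchase=60, cancellations=1,
                           support_contacts=3, weekly_page_visits=0)
print(f"engaged: {churn_score(engaged):.2f}, at risk: {churn_score(at_risk):.2f}")
```

With these made-up weights, the customer who has gone quiet scores much higher than the one who visits often, which is the behavior you want from any churn score.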

How Can We Stop Customers Churning?

The customer churn score can notify a person or system that a customer may be about to leave. That person or system can then respond with a call or an offer that has been tested to prevent customers from leaving.
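The notification step can be as simple as comparing each score to a cutoff. The sketch below is one possible shape for it, not a real integration: the 0.7 threshold, the customer IDs, and the function name are all hypothetical, and in practice the threshold should come from testing.

```python
from typing import Dict, List

DANGER_ZONE = 0.7  # hypothetical cutoff; tune it against your own results


def customers_to_contact(scores: Dict[str, float],
                         threshold: float = DANGER_ZONE) -> List[str]:
    """Return IDs of customers at or above the threshold, highest risk first."""
    at_risk = [customer_id for customer_id, score in scores.items() if score >= threshold]
    return sorted(at_risk, key=lambda customer_id: scores[customer_id], reverse=True)


# Hypothetical daily scores keyed by customer ID.
daily_scores = {"cust-001": 0.20, "cust-002": 0.90, "cust-003": 0.75}
print(customers_to_contact(daily_scores))  # ['cust-002', 'cust-003']
```

The returned list is what a salesperson or an automated offer system would work through, starting with the customer most likely to leave.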


Company A connects its churn model to HubSpot with a special column. Its salespeople see the scores daily. If a customer enters the danger zone, that customer gets a call within 24 hours. The customer churn rate drops by 2%, saving the company from losing 10% of its revenue that quarter.

Company B is an eCommerce store. With their data scientist, they developed a system that sends a personalized offer to customers who enter the churn danger zone. Their customer retention increased by 25%, and their revenue by 10%.

This sounds great, right? Keep customers, maintain or increase revenue. What’s not to love about creating a customer churn model?

Well, you see, experienced data scientists have already identified the most significant causes of customer churn. These include:


  • Switching to a competitor
  • Closing down a business venture
  • Negative customer experiences


  • Expired credit cards
  • Reaching the limit of available funds
  • Failed mechanical payment processing
  • Fraud protection on recurring payments
  • Seasonality

This means that incorporating a fact-based identification model, like a churn model, can create tension within an organization.


For example, the model can show that most customers churn after purchasing X product. That product was developed by a senior manager with ten years on the job, and she insists there is nothing wrong with her product. She doesn’t want to change the content or the packaging.

A data scientist can only point to the problem; they cannot fix internal customer support or experience issues. They can, however, help you test solutions and back up the solution with math and accountability.

What is the Ideal Outcome of a Churn Model?

The ideal outcome for a churn prediction model is a customer retention plan. A retention plan can only work when every department shown to be a part of the drop in customers is cooperative and engaged in creating a solution.

Lack of cross-department cooperation can be one of the biggest reasons why customer churn models fail. Models also fail when the data scientist does not clarify the churn situation and then builds the wrong model, is not given access to the right data, or uses overly fancy math instead of simple math (if they use the words ‘neural net’… just run). Sometimes the severity of the issue is hidden from the data scientist because employees are afraid of losing their jobs.

Tip #1: Make sure your data scientist interacts with everyone at your company so that everyone is part of the model’s success. Data scientists need to understand the problem from each perspective to turn it into mathematical mumbo-jumbo that can actually grow your business.

Can You Ensure the Success of a Churn Model?

Absolutely. Here are a few tips to make sure your customer churn model is successful and can truly help improve your customer retention:

  • Ensure that your company is collecting and labeling all the right data.
  • Have multiple sessions with a data scientist to discuss the problem before the solution is built.
  • Have the data scientist develop a clear proof of concept model first before building the full system.
  • Talk with your entire team about how the model will be incorporated and the expected responses from them.

Common Questions About Churn Models:

Yes, these systems are quite useful. However, they won’t be tailored to your business, and they can produce false positives.

Well, you can always work with me at phData! Overall, you need to find a data science consultancy that truly listens to your problem. There should always be much more listening than model creation at the beginning. Also, have your potential data science consultant design a lower-cost proof of concept to ensure the input data is available and the outcome will match your expectations.

This answer varies. The best advice I can give you is an example. Suppose you need to predict the churn rate of people who bought X product. You need to have the data on who bought that product, where they signed up, what ads they have seen, what platforms were used, where they are located… to name just a few useful pockets of information.

If you don’t have information gathering set up, don’t worry. A smaller project to ensure you are collecting the right data may slow you down in the short term, but the long-term benefits will significantly outweigh the downsides. Remember, the tortoise won the race, not the hare. 

They are not explainable. They will give you a score, but they won’t tell you why you got the score. It’s like having a teacher tell you the answer is four but not explain what 2+2 is. You won’t have the necessary information to make operational changes.
