December 21, 2021

Supervised vs. Unsupervised vs. Reinforcement Learning: What’s the Difference?

By Christina Bernard

If you’ve ever heard data scientists discussing supervised, unsupervised, or reinforcement learning, they were weighing the best way to solve a problem given the data available to them.

In this blog post, we’ll cover the core differences between supervised, unsupervised, and reinforcement learning within the realm of machine learning (ML), which is itself a subset of the field of Artificial Intelligence.

But before we get down to types of ML, what do we mean when we say learning in the first place? Fundamentally, the field of ML specializes in creating programs from examples – i.e., data. Where traditional programs are built on established rules and intuition, data scientists create new insights by applying machine learning algorithms to observed data.

At the end of the day, this is why the type of learning – supervised, unsupervised, or reinforcement – will depend on the data available for the application.

What is Supervised Learning?

Supervised learning is a methodology in data science that creates a model to predict an outcome based on labeled data. To put it simply, labeled data contains a collection of variables (features) and a specific output that we are trying to predict. 

As a very basic example, if we want our ML model to predict whether fruits are apples or bananas, the label would take the values of “apple” or “banana,” and the feature set could include weight, length, width, and any other relevant measurements of the fruits that are available.
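To make the fruit example concrete, here is a minimal sketch of a supervised model. It uses a nearest-centroid classifier, which is only one of many possible techniques, and the measurements, the function names `fit_centroids` and `predict`, and the choice of two features are all assumptions made for illustration.

```python
from math import dist

# Hypothetical labeled examples: (weight in grams, length in cm) -> label.
training_data = [
    ((150.0, 7.0), "apple"),
    ((170.0, 7.5), "apple"),
    ((160.0, 8.0), "apple"),
    ((120.0, 18.0), "banana"),
    ((130.0, 20.0), "banana"),
    ((110.0, 19.0), "banana"),
]

def fit_centroids(examples):
    """Average the feature vectors of each label (the 'training' step)."""
    sums, counts = {}, {}
    for features, label in examples:
        total = sums.setdefault(label, [0.0] * len(features))
        for i, value in enumerate(features):
            total[i] += value
        counts[label] = counts.get(label, 0) + 1
    return {label: tuple(v / counts[label] for v in total)
            for label, total in sums.items()}

def predict(centroids, features):
    """Return the label whose average fruit is closest to the new fruit."""
    return min(centroids, key=lambda label: dist(centroids[label], features))

centroids = fit_centroids(training_data)
print(predict(centroids, (155.0, 7.2)))   # apple
print(predict(centroids, (125.0, 19.5)))  # banana
```

The labels are what make this supervised: the model can only learn what an "apple" looks like because every training row says which fruit it is.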

Now, let’s look at a more business-relevant example: customer churn (attrition).

To better understand customer attrition, you must first analyze what indicators might lead to a customer leaving. Your dataset for this type of model would include features, such as days since last purchase and average purchase amount, as well as the label we want to predict, which is whether the individual is still a customer. Since we have historical data on each customer’s status, this type of dataset is a great candidate for supervised learning.
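As a rough sketch of what such a churn model could look like, the example below fits a logistic regression with plain gradient descent. The customer rows, the scaling constants, the learning rate, and the helper names (`scale`, `train`, `churn_probability`) are hypothetical; a real project would more likely rely on an established ML library.

```python
import math

# Hypothetical historical customers:
# (days since last purchase, average purchase amount in dollars, churned?)
customers = [
    (90, 20.0, 1), (120, 35.0, 1), (75, 15.0, 1), (100, 50.0, 1),
    (5, 60.0, 0), (12, 45.0, 0), (20, 80.0, 0), (8, 30.0, 0),
]

def scale(days, amount):
    """Bring both features to a similar range so gradient descent behaves."""
    return days / 100.0, amount / 100.0

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def train(rows, learning_rate=0.5, steps=10_000):
    """Fit a bias and two weights by batch gradient descent on the log loss."""
    bias, w_days, w_amount = 0.0, 0.0, 0.0
    n = len(rows)
    for _ in range(steps):
        g_bias = g_days = g_amount = 0.0
        for days, amount, churned in rows:
            x_days, x_amount = scale(days, amount)
            error = sigmoid(bias + w_days * x_days + w_amount * x_amount) - churned
            g_bias += error
            g_days += error * x_days
            g_amount += error * x_amount
        bias -= learning_rate * g_bias / n
        w_days -= learning_rate * g_days / n
        w_amount -= learning_rate * g_amount / n
    return bias, w_days, w_amount

def churn_probability(model, days, amount):
    bias, w_days, w_amount = model
    x_days, x_amount = scale(days, amount)
    return sigmoid(bias + w_days * x_days + w_amount * x_amount)

model = train(customers)
print(round(churn_probability(model, 95, 35.0), 3))  # inactive for months
print(round(churn_probability(model, 14, 55.0), 3))  # bought two weeks ago
```

The first probability should come out above 0.5 and the second below it, because the new customers resemble the churned and retained groups in the historical data, respectively.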

Here are some terms that signal a supervised learning technique is being discussed:

  • Linear classifiers
  • Support vector machines (SVM)
  • Decision trees
  • Random forest
  • Linear regression
  • Logistic regression

What Are The Key Considerations for Supervised Learning?

Supervised learning assumes that future data will behave similarly to historical data. The algorithm “learns” from a given dataset, which means it fits a model based on past behaviors and labels. Sometimes when these models see fresh data, they do not perform as well. When this happens because the model is overly tuned to the quirks of the historical data, we say that the model is “overfit.” Some data scientists alternatively say that an overfit model “does not generalize well.”
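A small, hand-made illustration of overfitting follows. The data, the two deliberately noisy labels, and the 45-day threshold are assumptions chosen to make the effect visible. The `memorizer` is a 1-nearest-neighbor model that copies the closest training label, while `simple_rule` is a far less flexible model.

```python
# Hypothetical data: (days since last purchase, churned?). Two training
# labels (22 days and 78 days) are deliberately noisy.
train_rows = [(5, 0), (10, 0), (15, 0), (22, 1), (30, 0),
              (60, 1), (70, 1), (78, 0), (85, 1), (95, 1)]
test_rows = [(8, 0), (12, 0), (20, 0), (24, 0), (33, 0),
             (55, 1), (65, 1), (76, 1), (80, 1), (90, 1)]

def memorizer(days):
    """1-nearest-neighbor: copy the label of the closest training point."""
    nearest = min(train_rows, key=lambda row: abs(row[0] - days))
    return nearest[1]

def simple_rule(days, threshold=45):
    """A deliberately simple model: long inactivity means churn."""
    return 1 if days > threshold else 0

def accuracy(model, rows):
    """Fraction of rows where the model's prediction matches the label."""
    return sum(model(days) == label for days, label in rows) / len(rows)

for name, model in [("memorizer", memorizer), ("simple rule", simple_rule)]:
    print(name, accuracy(model, train_rows), accuracy(model, test_rows))
# memorizer 1.0 0.6
# simple rule 0.8 1.0
```

The memorizer is perfect on the data it was trained on but stumbles on fresh data because it also memorized the noise. The simpler rule misses the two noisy training rows yet generalizes better.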

When a model stops performing well, data scientists are tasked with finding ways to balance the accuracy of the model with allowing for flexibility as the underlying dataset changes. Retraining and adjusting the model to anticipate these shifts generally come over time as the model is being used. This is a normal part of the data science lifecycle and a reminder that models should be consistently monitored to ensure they remain valid.

What is Unsupervised Learning?

Unsupervised learning is a technique that determines patterns and associations in unlabeled data. This technique is often used to create groups and clusters. 

For example, let’s consider an email marketing campaign. 

Your dataset may contain information about the receivers, such as past purchasing behavior, the last time they visited a website, and the average purchase amount. You do not have any field defining a specific customer group that an individual falls into, so you might consider creating your own through unsupervised learning. 

You can use unsupervised learning to take all this behavior data and cluster your customers into groups. The important benefit of ML is that you don’t need to know anything about the inherent group dynamics – the clusters are learned automatically based on the data. From those groups, you may define a business term to associate with each cluster. From there, you can determine which subset of customers you would like to target with your email campaign. 
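The clustering step can be sketched with k-means, one widely used clustering algorithm. Everything specific here is an assumption for illustration: the six customers, the two features, the choice of two clusters, and the hand-picked starting centroids. In practice, features on different scales would usually be standardized first.

```python
from math import dist

# Hypothetical customers: (days since last site visit, average purchase in dollars).
customers = [(2, 80), (4, 95), (3, 70), (40, 20), (45, 25), (50, 15)]

def assign(points, centroids):
    """Index of the nearest centroid for every point."""
    return [min(range(len(centroids)), key=lambda k: dist(p, centroids[k]))
            for p in points]

def k_means(points, centroids, iterations=10):
    """Alternate between assigning points and re-averaging centroids."""
    for _ in range(iterations):
        labels = assign(points, centroids)
        updated = []
        for k, old in enumerate(centroids):
            members = [p for p, label in zip(points, labels) if label == k]
            if members:
                updated.append(tuple(sum(axis) / len(members)
                                     for axis in zip(*members)))
            else:
                updated.append(old)  # keep an empty cluster where it was
        centroids = updated
    return centroids, assign(points, centroids)

# Starting centroids are picked by hand here; libraries usually randomize them.
centroids, labels = k_means(customers, [customers[0], customers[-1]])
print(labels)     # [0, 0, 0, 1, 1, 1]
print(centroids)
```

No row says which group a customer belongs to. The algorithm finds the two groups itself, and a person then decides what to call them, for example "recent big spenders" and "lapsed small spenders."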

Unsupervised learning is often used for exploratory analysis and anomaly detection because it helps to see how the data segments relate and what trends might be present. It can also be used to preprocess your data before using a supervised learning algorithm or other artificial intelligence techniques.

Yes, you read that correctly! In data science, multiple models are often created to get to the final result. Returning to the customer churn example, you might feed the groups you created with your unsupervised model into your supervised model as an additional feature.

Examples of unsupervised learning include:

Unsupervised learning algorithms are typically used when the dataset does not include the desired output. Without labels to represent the target behavior, it is also challenging to evaluate the accuracy of an unsupervised learning model; judging the efficacy of a model requires manual inspection of the learned output or carefully crafted heuristics.
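One example of such a heuristic is the within-cluster sum of squared distances, often called inertia, where lower values mean tighter clusters. The sketch below compares two candidate groupings of the same hypothetical customers; the data and both groupings are made up for illustration, and inertia is only one of several possible heuristics.

```python
from math import dist

# Hypothetical customers: (days since last site visit, average purchase in dollars).
customers = [(2, 80), (4, 95), (3, 70), (40, 20), (45, 25), (50, 15)]

def inertia(points, labels):
    """Sum of squared distances from each point to the mean of its cluster."""
    total = 0.0
    for k in set(labels):
        members = [p for p, label in zip(points, labels) if label == k]
        center = tuple(sum(axis) / len(members) for axis in zip(*members))
        total += sum(dist(p, center) ** 2 for p in members)
    return total

sensible_grouping = [0, 0, 0, 1, 1, 1]   # recent buyers vs. lapsed buyers
arbitrary_grouping = [0, 1, 0, 1, 0, 1]  # alternating, ignores the data

print(inertia(customers, sensible_grouping))
print(inertia(customers, arbitrary_grouping))
```

The sensible grouping should score far lower than the arbitrary one. A heuristic like this cannot say whether the clusters are useful to the business, which is why manual inspection still matters.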

In many cases, these challenges can be addressed by “stacking” unsupervised learning algorithms with other algorithms.

What is Reinforcement Learning?

Reinforcement learning is a technique that provides training feedback using a reward mechanism. The learning process occurs as a machine, or Agent, interacts with an environment and tries a variety of methods to reach an outcome. The Agent is rewarded or punished when it reaches a desirable or undesirable State.

The Agent learns which states lead to good outcomes and which are disastrous and must be avoided. Success is measured with a score so that the Agent can iteratively learn to achieve a higher score. In one popular algorithm, Q-learning, the expected score of taking a given action in a given state is denoted as Q, which is where the algorithm gets its name.

Reinforcement learning can be applied to the control of a simple machine like a car driving down a winding road. The Agent would observe its current State by taking measurements such as current speed, direction relative to the road, and distances to the sides of the road. The Agent can take actions that change its state like turning the wheel or applying the gas or brakes. 

Rewards would be given to the Agent for desired behaviors like staying near the middle of the road and completing the course, while penalties would be incurred for crashing into the sides or moving too slowly. A good implementation of reinforcement learning balances short-term and long-term rewards, allowing the machine to optimize for both. The car should learn to both avoid collisions and reach the end goal.
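The reward loop described above can be sketched with tabular Q-learning, one common reinforcement learning algorithm. The environment is a drastically simplified, hypothetical stand-in for the winding road: five positions in a line, a crash at one end, and the finish at the other. The reward values, the learning parameters, and the function names are all assumptions.

```python
import random

# A hypothetical one-dimensional "road": position 0 is a crash, position 4 is
# the finish line, and the car may start anywhere in between.
CRASH, FINISH = 0, 4
LEFT, RIGHT = 0, 1

def step(position, action):
    """Environment: return (next position, reward, episode finished?)."""
    position += 1 if action == RIGHT else -1
    if position == FINISH:
        return position, 1.0, True    # reward for completing the course
    if position == CRASH:
        return position, -1.0, True   # penalty for crashing
    return position, -0.01, False     # small penalty for taking too long

def train(episodes=2000, alpha=0.5, gamma=0.9, epsilon=0.2, seed=0):
    """Learn a table q[state][action] of expected scores by trial and error."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(FINISH + 1)]
    for _ in range(episodes):
        state, done = rng.randint(CRASH + 1, FINISH - 1), False
        while not done:
            if rng.random() < epsilon:
                action = rng.choice([LEFT, RIGHT])                      # explore
            else:
                action = max((LEFT, RIGHT), key=lambda a: q[state][a])  # exploit
            next_state, reward, done = step(state, action)
            # gamma trades off immediate reward against future reward.
            target = reward if done else reward + gamma * max(q[next_state])
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
    return q

q_table = train()
policy = ["left" if left > right else "right" for left, right in q_table[1:FINISH]]
print(policy)
```

With these settings, the learned policy should be to steer right from every starting position. Note that no dataset was supplied: every number in `q_table` came from the Agent's own attempts.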

Unlike supervised learning, reinforcement learning does not require labeled data. Further still, it doesn’t even start from an unlabeled dataset, as unsupervised learning does. Rather than seeking to discover a relationship in a fixed dataset, reinforcement learning continually optimizes over the outcomes of past experiences while also generating new experiences. In other words, it creates new data and outcomes with each try.

Some applications of reinforcement learning are:

Oftentimes, these algorithms are used to solve more complex problems. While they can play games, research has shown that they don’t play on teams well…at least not yet.

Reinforcement algorithms are being tested in recommendation engines alongside supervised and unsupervised techniques to determine which method is best suited for a particular use case.

Where Does Deep Learning Fit Into the Picture?

Deep Learning refers to a particular type of ML algorithm called an Artificial Neural Network, or just Neural Network. These networks consist of “neurons” arranged into layers, and powerful networks often use many layers, hence the term “deep” learning. Both supervised and unsupervised learning can use deep learning techniques.
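To show what "neurons arranged into layers" means in code, here is a tiny two-layer network. Its weights are set by hand rather than learned, and the task (outputting 1 only when exactly one of two inputs is 1) is an assumption chosen because a single layer cannot solve it.

```python
def relu(x):
    """A common activation function: pass positives through, clip negatives to 0."""
    return max(0.0, x)

def layer(inputs, weights, biases, activation):
    """One layer: each neuron takes a weighted sum of its inputs, then activates."""
    return [activation(sum(w * x for w, x in zip(neuron, inputs)) + b)
            for neuron, b in zip(weights, biases)]

def network(x1, x2):
    # Hidden layer with two neurons; weights are hand-picked, not trained.
    hidden = layer([x1, x2], weights=[[1.0, 1.0], [1.0, 1.0]],
                   biases=[0.0, -1.0], activation=relu)
    # Output layer with a single neuron and no activation.
    (output,) = layer(hidden, weights=[[1.0, -2.0]],
                      biases=[0.0], activation=lambda v: v)
    return output

for a, b in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    print(a, b, network(a, b))
```

The four printed outputs are 0.0, 1.0, 1.0, and 0.0. A real deep learning model has the same structure, only with many more neurons and layers, and with weights found by training instead of by hand.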

Many modern reinforcement learning algorithms also use deep learning in some capacity. Deep learning is especially effective for creating ML models that take unstructured data, such as images, audio recordings, or raw text, as input.


At the end of the day, your team of data scientists will check what data you have available, check what data you can get but haven’t recorded, and then determine what method would best solve your problem. Often, a variety or blend of approaches will be leveraged when determining which one is the best fit. 

A great rule of thumb is to see how far you can get with a simple model, something like linear regression, before moving on to a more complex algorithm, like a neural network.
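As a sketch of that rule of thumb, the snippet below fits a straight line by ordinary least squares and checks how much of the variation it explains. The data points and variable names are hypothetical.

```python
# Hypothetical data: months as a customer vs. number of purchases so far.
months = [1, 2, 3, 4, 5]
purchases = [2.9, 5.2, 6.8, 9.1, 11.0]

def fit_line(xs, ys):
    """Ordinary least squares for one feature: returns (slope, intercept)."""
    mean_x, mean_y = sum(xs) / len(xs), sum(ys) / len(ys)
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    return slope, mean_y - slope * mean_x

def r_squared(xs, ys, slope, intercept):
    """Share of the variation in ys that the fitted line explains."""
    mean_y = sum(ys) / len(ys)
    residual = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    total = sum((y - mean_y) ** 2 for y in ys)
    return 1 - residual / total

slope, intercept = fit_line(months, purchases)
print(round(slope, 2), round(intercept, 2))  # 2.01 0.97
print(r_squared(months, purchases, slope, intercept) > 0.99)  # True
```

If a line this simple already explains nearly all of the variation, a more complex model has very little left to improve on.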

