There’s a reason why report after report shows explosive growth for machine learning (ML). It’s the key to unlocking the sort of predictive intelligence and innovative automation capabilities you need to be competitive in 2021.
However, getting consistent ML results requires a consistent ML approach—hence the need for MLOps. At phData, we think of MLOps as an opinionated, automated assembly line for delivering ML models. The idea is to maximize automation, improve communication and observability, and ultimately get more reliable results.
Here are four tell-tale signs you need MLOps.
#1 - Lack of Consistency and Reusability
Does your research and experimentation feel like a black box that sometimes works and sometimes doesn’t? Or are you concerned that the original data scientist is the only person capable of maintaining your model? What happens if they get promoted or no longer work for your organization?
These are a few of the problems we’ve seen customers struggle with, and they make it impossible to have a consistent, repeatable process for developing models.
That’s because an ML model is a combination of many things: code (e.g. data prep scripts, training scripts, scripts to drive inference using the model), algorithm choices, hyperparameters, and data. With ML, data is a key part of the product, and understanding the path of exploration is important.
But more often than not, this path is captured in isolation, in personal notebooks, datasets stored on personal computers, or cloud instances. Given the high inertia of the data and the variety of other inputs into the process, individuals’ personal work habits often drive the storage and organization of everything.
This introduces new questions and problems to solve. Data scientists need to be able to share and expose their work efficiently. Models need to be made available to the business, where they ultimately deliver value.
PRO TIP: Standardized processes mean that data scientists will no longer need to focus energy on setting up code environments, structuring experiments, or other routine tasks that do not align with their areas of expertise.
One easy way to get started is to create a project template folder that you can duplicate each time a new project is started. It can contain a directory to hold data prep scripts, a directory for data exploration scripts, etc., so that new projects all share a common layout.
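If copying a folder by hand feels error-prone, a small script can generate the same layout. The sketch below is one way to do it; the directory names and the `create_project` helper are illustrative assumptions, not a prescribed structure, so adjust them to your team's conventions.

```python
from pathlib import Path

# Hypothetical layout; rename these directories to match your own conventions.
TEMPLATE_DIRS = [
    "data_prep",    # scripts that clean and transform raw data
    "exploration",  # notebooks and scripts for data exploration
    "training",     # model training scripts
    "inference",    # scripts that drive inference using the model
    "config",       # hyperparameters and algorithm choices
]


def create_project(root: str, name: str) -> Path:
    """Create a new project directory that follows the shared layout."""
    project = Path(root) / name
    # exist_ok=False means an existing project is never silently reused.
    project.mkdir(parents=True, exist_ok=False)
    for sub in TEMPLATE_DIRS:
        (project / sub).mkdir()
        # A placeholder file keeps empty directories under version control.
        (project / sub / ".gitkeep").touch()
    (project / "README.md").write_text(f"# {name}\n\nDescribe the project here.\n")
    return project
```

Calling `create_project("projects", "churn_model")` would produce a `projects/churn_model` folder with one subdirectory per entry in `TEMPLATE_DIRS`, and it raises `FileExistsError` if that project already exists.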
#2 - Need for Traceability and Governance
Most organizations have rules related to data. Most of the time these come down to how data is stored, how access to it is controlled, how long it must be kept, and when it must be destroyed. But what might not be so obvious is that, as a product of data, ML models need similar care.
We’ve worked with plenty of companies who weren’t taking steps to ensure traceability. Because of this, every model their team built felt like an island. They spent a lot of time and energy tracking down how each one was built – especially when they needed to change a data source or field destination.
Without that background, these changes were risky because they were relying on nothing but their data scientist’s assurances that the model wasn’t built with sensitive or inappropriate data.
These are some additional challenges our customers weren’t prepared to solve before implementing an MLOps strategy:
- When data becomes irrelevant or is destroyed, what does this mean for the models trained from this data?
- Was sensitive data used to train a model? What does this mean for how the model can be used?
- A model is producing “strange” predictions. How can you discover the environment in which it was trained and the data used to do so?
If your organization is struggling to answer these key questions of traceability and governance, odds are you need a more automated approach that takes them into account.
PRO TIP: In addition to strong MLOps practices, tools for tracking the lineage of data and lifecycle of ML models will help you achieve this goal. Apache Atlas can be difficult to master but is very effective for data lineage tracking. MLflow provides a comprehensive framework for model and experiment tracking.
We’ve seen our customers use a wide variety of tools to get the job done, so you should look around to see what might be a good fit for your process.
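Whatever tool you pick, it helps to know what a useful training record contains. The sketch below uses only the Python standard library to capture a fingerprint of the training data, the hyperparameters, and the environment. It is not the MLflow or Apache Atlas API; the function names and record fields are assumptions chosen for illustration.

```python
import hashlib
import json
import platform
import sys
from datetime import datetime, timezone


def fingerprint(data: bytes) -> str:
    """Return a SHA-256 digest that identifies one exact version of the data."""
    return hashlib.sha256(data).hexdigest()


def build_training_record(model_name, data_bytes, data_source,
                          hyperparameters, contains_sensitive_data):
    """Collect what is needed to trace a model back to its inputs."""
    return {
        "model_name": model_name,
        "trained_at": datetime.now(timezone.utc).isoformat(),
        "data_source": data_source,                # where the data came from
        "data_sha256": fingerprint(data_bytes),    # which version was used
        "contains_sensitive_data": contains_sensitive_data,
        "hyperparameters": hyperparameters,
        "python_version": sys.version.split()[0],  # training environment
        "platform": platform.platform(),
    }


def save_record(record, path):
    """Write the record as JSON so it can be stored next to the model artifact."""
    with open(path, "w", encoding="utf-8") as handle:
        json.dump(record, handle, indent=2, sort_keys=True)
```

With a record like this stored alongside each model, the questions above become lookups: you can check whether a model was trained on a dataset that has since been destroyed, whether it was flagged as using sensitive data, and which environment produced it.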
#3 - Inadequate Reliability
When models are built from the ground up every time, deploying the tenth model will be just as time-consuming as deploying the first. This can also make it difficult to rely on the model’s availability enough to make it part of critical business processes.
And, it’s incredibly risky and time-consuming to retrain your model because of reliability issues. This is why it’s so important to design reliability into your models from the start.
In order to do so, you need to ensure you are meeting all of the typical expectations of business-critical software:
- Your model must be highly available. This is especially critical if the model is being used in real-time.
- Periodic re-training of models must be automated.
- The data used in automated retraining must be automatically screened for quality and drift.
Don’t assume that your data transformation pipelines will flawlessly run indefinitely, or that the source data will continue to be valid. One large financial services company learned this the hard way when field agents decided one day that the easiest way to flag a transaction for follow-up was to enter “-1” in the field used for total value.
Without automated quality checks, problems like this can propagate all the way to the machine learning models and lead to corrupted business processes.
- Your models must be tested to ensure that newly-trained models pass quality checks before they are made available for production use.
Without automated quality checks, there’s a risk that a simple development error can take down an otherwise functioning service or worse – replace it with a model that makes erroneous predictions.
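A screening step for retraining data does not have to be elaborate to catch a problem like the “-1” total value. The sketch below is a minimal example under a few assumptions: records are dictionaries, the field is named `total_value`, and a batch is rejected when more than 1% of rows are invalid. All three are placeholders to adapt, and the check covers quality only, not drift.

```python
import math


def find_invalid_rows(records, field="total_value", min_value=0.0):
    """Return the indexes of rows whose value is missing, non-numeric, or too low."""
    invalid = []
    for index, row in enumerate(records):
        value = row.get(field)
        # bool is a subclass of int in Python, so exclude it explicitly.
        is_number = isinstance(value, (int, float)) and not isinstance(value, bool)
        if not is_number or math.isnan(value) or value < min_value:
            invalid.append(index)
    return invalid


def batch_is_usable(records, max_invalid_fraction=0.01, **check_options):
    """Decide whether a batch is clean enough to be used for retraining."""
    if not records:
        # An empty batch usually points to an upstream pipeline failure.
        return False
    invalid = find_invalid_rows(records, **check_options)
    return len(invalid) / len(records) <= max_invalid_fraction
```

A scheduled retraining job could call `batch_is_usable` first and stop, with an alert, whenever it returns `False`, so bad values never reach the model.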
MLOps combines DevOps with data science know-how to ensure you are delivering high-quality, validated, highly-available data products for production use.
PRO TIP: In order to meet these expectations, you’ll need to have a robust infrastructure that allows for nodes to fail (such as Kubernetes or Amazon’s EKS), a job orchestration system that can schedule maintenance tasks (we’re big fans of Airflow), and you’ll need to borrow some techniques from DevOps (like CI/CD) to handle the automated quality checks and model tests.
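As one example of a CI/CD-style model test, the sketch below gates promotion on a holdout set: a newly trained model must clear a minimum accuracy and must not be noticeably worse than the model already in production. The function names and both thresholds are assumptions for illustration, and accuracy stands in for whatever metric matters to your use case.

```python
def accuracy(predict, examples):
    """Fraction of (features, label) pairs that the predict function gets right."""
    correct = sum(1 for features, label in examples if predict(features) == label)
    return correct / len(examples)


def should_promote(candidate, production, holdout,
                   min_accuracy=0.8, max_regression=0.01):
    """Return True only if the candidate model is safe to release.

    candidate and production are callables that map features to a label.
    holdout is a non-empty list of (features, label) pairs neither model trained on.
    """
    candidate_score = accuracy(candidate, holdout)
    production_score = accuracy(production, holdout)
    meets_floor = candidate_score >= min_accuracy
    no_regression = candidate_score >= production_score - max_regression
    return meets_floor and no_regression
```

A pipeline step can call `should_promote` after every retraining run and fail the build when it returns `False`, which leaves the current production model in place.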
#4 - Poor Observability
If it’s not immediately clear how much a model is being used or what it’s predicting, it can easily end up causing more work than it’s saving. Companies that can’t determine how a specific prediction was made, or that need significant effort to test models and confirm they’re still relevant, should be looking at standardizing processes with MLOps.
Production machine learning models must be monitored like any other software because it’s impossible to escape the dependence of your models on data. Not only should you be aware of the data used to train your models, you should also monitor the data presented to your model at runtime.
To get a better understanding of how observable your models are, ask yourself these questions:
- Is the form of your runtime data drifting so much that your model fails to produce useful predictions?
- Is the meaning of various features in your data changing over time? Will seasonal changes or demographic shifts make your predictions less relevant?
- Is your model producing predictions with lower confidence?
If you answered yes to any of these (or even worse – if you have no way of knowing whether or not this is occurring), that means you may not be monitoring your models as closely as you could be and it’s a good time to consider an MLOps framework for your business.
PRO TIP: A comprehensive solution is required to improve observability, but there are some simple ways that you can get started.
By assigning a UUID to all model requests and logging input and output data, you can begin to measure changes in the distributions of inputs and outputs. Then, to determine how much drift has occurred between the training data and the data being seen in production, build a classification model that takes an input record and classifies it as being from the training set or the production set. If this classifier can tell the two apart accurately, then you know the production data differs from the training data in some systematic way.
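Both ideas can be sketched in plain Python. The example below is a rough illustration, not a production implementation: `log_prediction` and `drift_auc` are hypothetical names, the tiny logistic regression stands in for whatever classifier library you already use, and ROC AUC is our choice for measuring "accurately". It also assumes numeric features on roughly comparable scales.

```python
import json
import math
import random
import uuid


def log_prediction(features, prediction, sink):
    """Append one request as a JSON line tagged with a UUID, and return the ID."""
    request_id = str(uuid.uuid4())
    sink.append(json.dumps({"id": request_id, "input": features, "output": prediction}))
    return request_id


def _sigmoid(z):
    # Clamp the score so math.exp cannot overflow on extreme values.
    return 1.0 / (1.0 + math.exp(-max(-30.0, min(30.0, z))))


def fit_logistic(rows, labels, epochs=200, lr=0.1):
    """Train a small logistic regression with stochastic gradient descent."""
    weights, bias = [0.0] * len(rows[0]), 0.0
    for _ in range(epochs):
        for x, y in zip(rows, labels):
            error = _sigmoid(sum(w * v for w, v in zip(weights, x)) + bias) - y
            weights = [w - lr * error * v for w, v in zip(weights, x)]
            bias -= lr * error
    return weights, bias


def auc(scores, labels):
    """Probability that a random positive outscores a random negative."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def drift_auc(training_rows, production_rows, seed=0):
    """Measure how well a classifier separates training from production records."""
    rows = training_rows + production_rows
    labels = [0] * len(training_rows) + [1] * len(production_rows)
    order = list(range(len(rows)))
    random.Random(seed).shuffle(order)
    split = len(order) // 2  # fit on one half, score on the other
    fit_idx, eval_idx = order[:split], order[split:]
    weights, bias = fit_logistic([rows[i] for i in fit_idx], [labels[i] for i in fit_idx])
    scores = [sum(w * v for w, v in zip(weights, rows[i])) + bias for i in eval_idx]
    return auc(scores, [labels[i] for i in eval_idx])
```

A result near 0.5 means the classifier cannot tell training records from production records, while a result approaching 1.0 suggests the inputs have drifted. Scoring on held-out records keeps the classifier from looking accurate simply because it memorized the data it was fit on.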
When done right, MLOps enables a much more systematic and sophisticated approach to ML. If your organization is struggling to automate time-consuming steps, keep teams on the same page, and ensure consistency and transparency around model delivery, then it’s probably time to take the need for MLOps more seriously.
Need help getting there? phData MLOps provides an enterprise-tested framework and automated workflow to get ML models into production faster, more efficiently, and with less risk.
For more information on how this can help your organization minimize your technical debt, improve ML reliability, and lower your time to value, reach out to us at email@example.com.