May 18, 2021

A Beginner’s Guide to MLOps: Deploying Machine Learning to Production

By Bob Kraft

Welcome back to our introductory MLOps series. This is the second post in the series; if you missed the first post, What is MLOps and 4 Signs You Need It, be sure to give it a read.

In this article, we focus on MLOps best practices and discuss a minimally viable path to efficiently deploying ML models into production.

What is the Goal of an MLOps Pipeline?

The goal of an MLOps pipeline is to efficiently apply a machine learning model to incoming data at scale while minimizing operational costs.  

Both the data scientist and the ML engineer have challenging jobs. The data scientist is working with the unknown and has no guarantee that the data will provide valuable insights to address the needs of the business. The data may be too noisy, lack the appropriate features, or simply be insufficient. The challenge for the data scientist is to find the answer when the answer may not exist.

The ML engineer also has a challenging job. However, assuming that the data scientist successfully provides a model, the ML engineer (with enough time and resources) is almost guaranteed success. The data scientist has already proven that a solution exists. It is now the responsibility of the ML engineer to apply the model automatically to new data at scale. An MLOps pipeline is intended to reduce the effort ML engineers spend operationalizing each new model by providing common utilities for deployment.

What are the Foundations of an ML Pipeline?

As discussed in the Ultimate MLOps Guide, the four pillars of an ML pipeline are Tracking, Automation/DevOps, Monitoring/Observability, and Reliability. Adhering to these principles will help you build better ML pipelines. Here is a short review of these four pillars. 

Tracking – ML pipelines are a combination of code, models, and data. Keeping track of all of these components is critical for building systems that are reproducible and can be improved iteratively over time. Logging is an important part of the Tracking pillar: a record of which models have been run on which data is important for auditing.

Automation/DevOps – Modern DevOps has shown that working in small iterative cycles produces high-quality working software. MLOps should learn from the clear success of DevOps. By employing Continuous Integration / Continuous Deployment (CI/CD), ML engineers can deliver ML models faster and with better quality.

Your CI/CD should include automated unit tests, stress tests, integration tests, and regression tests.

With automated CI/CD, the repetitive work of deploying models can be done consistently and without human error. By removing the burden of deployment from ML engineers, CI/CD allows them to focus on higher-value work.

Monitoring/Observability – MLOps should constantly monitor ML pipelines with good logging and alerting. Monitoring ensures that the ML pipelines are functioning as designed and producing value for the business. Routinely monitoring the performance of your ML pipelines will allow you to detect problems early. For example, large changes in request latency may be an early warning sign that the latest release of your model is not functioning as designed. Likewise, if you routinely process 1,000 data sets per hour and that rate drops precipitously, you most likely have a problem with your ML pipeline.

Monitoring the performance of your ML pipeline alongside CI/CD will allow you to quickly find and resolve the problems that will inevitably arise. In addition, monitoring the models and their predictions is a best practice that may be required for auditing and to demonstrate compliance with local laws. An MLOps pipeline should monitor for data drift and invalid predictions to make sure the model is performing as expected.

Reliability – A reliable ML pipeline will operate according to design and deliver consistent value to the business. Reliable applications are built to be scalable and highly available; they also have robust testing of all custom source code and integrations.

When Should I Start Building my MLOps Pipeline?

We draw a clear distinction between model development and experimentation by the data scientist and the design and implementation of the MLOps production pipeline by the ML engineer. Ideally, the data scientist and ML engineer will work together in small iterative cycles to produce the best results. How and when they work together will depend entirely on the needs of the business and customer.

In the early stages of development, the data scientist is exploring the data by becoming familiar with it and searching for actionable insights. At this stage, it may be too early to start building the MLOps production pipeline for the model. With no guarantee that the data scientist will be successful, any effort to build an MLOps production pipeline may be wasted. However, the business may decide that an MLOps production pipeline should be developed in parallel to minimize risk and meet deadlines.

In either case, we believe that building the model and building the MLOps production pipeline are two different processes that should be loosely coupled. We consider this loose coupling an agile best practice. It allows the data scientist to iterate and improve upon the model while allowing the ML engineer to independently refine, scale, and improve the MLOps pipeline. This loose coupling doesn’t mean that the data scientist and ML engineer should work independently.

For a successful MLOps pipeline, the data scientist and ML engineer should have constant and direct communication. They should work together to understand the components of the model, the format of the model, the model inputs, and the model outputs. If the data scientist and ML engineer can decide on common interfaces, then the model can be easily updated in the ML pipeline. 
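One way to pin down such a common interface is to write it as a small contract in code. The sketch below is only an illustration under assumed names: `Model`, `PredictionRequest`, `PredictionResult`, `ThresholdModel`, and `run_pipeline` are all hypothetical and do not come from any particular library. The point is that the pipeline depends on the agreed inputs and outputs, not on a concrete model class.

```python
from dataclasses import dataclass
from typing import Protocol, Sequence, runtime_checkable


@dataclass(frozen=True)
class PredictionRequest:
    """Agreed-upon model input: one row of numeric features (hypothetical)."""
    features: Sequence[float]


@dataclass(frozen=True)
class PredictionResult:
    """Agreed-upon model output: a label plus the model version that produced it."""
    label: int
    model_version: str


@runtime_checkable
class Model(Protocol):
    """The contract the pipeline relies on; any object with these members fits."""
    version: str

    def predict(self, request: PredictionRequest) -> PredictionResult: ...


class ThresholdModel:
    """Toy stand-in for a real model: label 1 if the mean feature exceeds a cutoff."""

    def __init__(self, version: str, cutoff: float) -> None:
        self.version = version
        self.cutoff = cutoff

    def predict(self, request: PredictionRequest) -> PredictionResult:
        mean = sum(request.features) / len(request.features)
        return PredictionResult(label=int(mean > self.cutoff), model_version=self.version)


def run_pipeline(model: Model, requests: Sequence[PredictionRequest]) -> list[PredictionResult]:
    """Pipeline code depends only on the Model contract, not on a concrete class."""
    return [model.predict(r) for r in requests]
```

With a contract like this in place, the data scientist can hand over a new model version and the ML engineer can swap it in without touching `run_pipeline`, as long as the inputs and outputs keep the agreed shape.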

Model registries are a common way for data scientists and ML engineers to collectively manage models. Agreeing to use a model registry early in the development cycle will standardize the transfer of models from data science to engineering. Model registries also provide benefits to data scientists in the form of automatic logging, so more experiments can be performed with less bookkeeping overhead. Basic registry tools are lightweight and don’t require sophisticated infrastructure; the small cost of adopting the tool will pay dividends as your models evolve. MLflow is an excellent open-source tool that includes a model registry.

Deciding on these requirements and development tools early will allow the ML engineer to build the infrastructure of the MLOps pipeline. Ideally, design decisions for the MLOps pipeline should be flexible enough to work for multiple projects. This will allow components of the MLOps pipeline to be standardized and reused.

How to Start Building an ML Pipeline

Start simple. 

The implementation of any ML pipeline will depend upon the business requirements. These requirements will dictate the design and require you to ask some probing questions like: 

  • Will you design your ML pipeline to operate on-premise, in the cloud, or in a hybrid environment?
  • How will customers use your ML pipeline? 
  • Will your ML pipeline operate on data batches, streaming data, or be event-driven?
  • How will you monitor the reliability and reproducibility of your ML pipeline?
  • Does the data contain Personally Identifiable Information (PII)?
  • Does the data contain Protected Health Information (PHI)?
  • How will you secure the PII and PHI?


Once you have a basic understanding of your ML pipeline requirements, focus on simplicity as you start to build your ML pipeline. Let your pipeline evolve and adapt over time to meet the needs of your customers. Paraphrasing from the Agile Manifesto: prefer pipelines that work over detailed requirements and comprehensive documentation. Building working pipelines iteratively will allow you to slowly add features to your pipeline.  

Listed below are five steps to help you get started building an ML pipeline:

Step 1: Establish Version Control

As a first step, put all your source code under version control (e.g., Git, Visual Studio Team Services, or Team Foundation Version Control). Version control is the foundation for the Continuous Integration/Continuous Deployment framework you will want for your ML pipeline, which allows you to work quickly in small iterative cycles on a pipeline that is always working. As your pipeline grows you will have source code for:

  • Cleaning and preparing the raw data 
  • Deploying the model
  • Applying the model
  • Gathering, storing, and reporting the results
  • Testing the model with unit tests, integration tests, and regression tests
  • And the list goes on and on 
 

Keeping track of all of this code without version control software will be a difficult and time-consuming task (especially as your codebase and team grow).

In addition to source code, models will also have to be versioned. This can be achieved with a model registry. We recommend adopting a minimalist model registry such as MLflow over development of custom model-storage architectures. Even if storing models in an object store like S3 could technically pass for versioning, a registry will do much more to streamline model development and handoff.

Step 2: Implement a CI/CD Pipeline

Once your software is under version control, add on continuous integration by incorporating automated testing into your CI/CD pipeline. Each time you check in your code, you will want to verify that each test passes. This is critical for developing reliable software and ML pipelines.   
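As a rough illustration of the kind of automated tests a CI job might run on every check-in, the sketch below uses Python's standard `unittest` module. The function `predict_risk` and its coefficients are hypothetical stand-ins for your own model-serving code; the pinned value in the regression test is simply what this toy function returns, not a real benchmark.

```python
import unittest


def predict_risk(age: float, balance: float) -> float:
    """Hypothetical stand-in for the model-serving function under test.

    Returns a score clipped to the range [0, 1].
    """
    raw = 0.01 * age - 0.00001 * balance + 0.2
    return max(0.0, min(1.0, raw))


class PredictRiskTests(unittest.TestCase):
    def test_score_stays_in_valid_range(self):
        # Unit test: extreme inputs must never produce an out-of-range score.
        for age, balance in [(0, 0), (120, 0), (18, 10_000_000)]:
            score = predict_risk(age, balance)
            self.assertGreaterEqual(score, 0.0)
            self.assertLessEqual(score, 1.0)

    def test_known_case_has_not_regressed(self):
        # Regression test: a previously reviewed input/output pair is pinned.
        self.assertAlmostEqual(predict_risk(40, 10_000), 0.5, places=6)


def run_tests() -> bool:
    """Run the suite programmatically; a CI job would fail the build on False."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(PredictRiskTests)
    result = unittest.TextTestRunner(verbosity=0).run(suite)
    return result.wasSuccessful()
```

In practice the CI tool would usually invoke the test runner directly (for example `python -m unittest`) and block the merge or deployment when any test fails.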

As your ML pipeline matures, you will quickly become tired of the manual process of building, testing, and deploying your models. Manually building, testing, and deploying your ML pipeline will just lead to errors and frustration. This is especially true if you work in small iterative cycles. The sooner you can automate these processes, the faster you will be able to minimize errors and focus on higher-value work. Depending upon your development platform, you may want to use Jenkins, GitHub Actions, GitLab CI/CD, AWS CodeBuild, or Azure DevOps for your CI/CD. Whatever tools you decide to use for your CI/CD, keep it as simple as possible.  

Step 3: Implement Logging Into Your ML Model and ML Pipeline

After you have implemented a robust CI/CD pipeline, we recommend focusing on adding logging to your ML pipeline. As discussed in the Tracking section, logging is an essential part of any ML pipeline and will help you achieve clarity and reliability. We divide logging into two categories: external and internal.

  • External logging keeps track of what model is applied to which data.  
  • Internal logging will allow you to see the inner workings of your ML pipeline and debug problems quickly. 


In the beginning, your ML pipeline will be operating in short iterative cycles. As you are iterating on the ML pipeline, the data scientist may also be experimenting with different models to improve performance. Without external logging, it will be very difficult to keep track of which models were applied to which data and what the results were.

Finding the model that provides the most business value will be difficult, if not impossible, without meticulous logging. When implementing logging, make sure your logs can be traced in some way to the model and software versions tracked in your model registry and version control. The data that is applied to the model, along with the results, should also be logged. This will allow the data scientist to keep track of the model’s performance over time. An additional benefit is that these logs can be used to detect data drift.
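A minimal sketch of such an external log entry is shown below, using only the Python standard library. The field names, the `fingerprint` helper, and the example values are assumptions chosen for illustration, not a standard schema. Each record ties a prediction to the model version, the code version, and a hash of the input data.

```python
import hashlib
import json
from datetime import datetime, timezone


def fingerprint(payload: dict) -> str:
    """Stable SHA-256 hash of the input so the exact data can be identified later."""
    canonical = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def build_prediction_record(model_name, model_version, code_version, features, prediction):
    """Assemble one external-log entry linking model, code, data, and result."""
    return {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "model_name": model_name,        # name as tracked in the model registry
        "model_version": model_version,  # registry version of the model
        "code_version": code_version,    # e.g. the Git commit of the pipeline code
        "input_hash": fingerprint(features),
        "features": features,            # consider masking or omitting PII/PHI here
        "prediction": prediction,
    }


def to_log_line(record: dict) -> str:
    """Serialize as one JSON object per line, which is easy to search and parse."""
    return json.dumps(record, sort_keys=True)
```

For example, `to_log_line(build_prediction_record("churn", "3", "a1b2c3d", {"tenure": 12}, 0.82))` yields a single JSON line that can be written to whatever log store you already use. Because the input hash is computed from a key-sorted serialization, the same features always produce the same hash regardless of key order.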

As the ML pipeline matures, keeping track of all of this metadata may be challenging.  Containerizing the code and model is one way to simplify the logging of dependencies. Creating a Docker container with the code, code dependencies, and model provides a convenient way to package the core components of the ML pipeline.

Step 4: Monitoring

Monitoring your ML pipeline will be critical for extracting business value. You will want to monitor operational metrics such as uptime, calculation time, and latency. You should also monitor how often the ML pipeline delivers actionable insights, because an ML pipeline is only delivering business value if someone is acting upon the results. When you monitor your ML pipeline, keep it simple. Focus on performance and operational metrics.

Performance Metrics will help you measure the business value of the ML model. For example, a predictive model only provides value if its predictions are accurate. Comparing model predictions to actual outcomes will allow you to verify that the model functions properly and provides business value. Defining good performance metrics is extremely challenging. Do not be discouraged if it takes you several iterations before you find the right set of performance metrics.
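As one simple illustration of comparing predictions to actual outcomes, the sketch below computes accuracy over the records whose outcome has already been observed. Accuracy is only one possible performance metric, and the function name and the record-ID keyed dictionaries are assumptions made for this example.

```python
from typing import Hashable, Mapping, Optional


def accuracy(predictions: Mapping[Hashable, object],
             outcomes: Mapping[Hashable, object]) -> Optional[float]:
    """Fraction of predictions that match the outcome observed later.

    Both mappings are keyed by a record ID. Predictions whose outcome is not
    yet known are skipped. Returns None if nothing can be compared.
    """
    matched = [rid for rid in predictions if rid in outcomes]
    if not matched:
        return None
    correct = sum(1 for rid in matched if predictions[rid] == outcomes[rid])
    return correct / len(matched)
```

Skipping records without an outcome matters in practice, because actual outcomes often arrive long after the prediction was made.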

Operational Metrics will help you with the daily operation of your ML pipeline. Operational metrics are easier to define than performance metrics. You can measure latency, throughput, or the rate at which the model is called. These operational metrics will allow you to monitor the health of your pipeline over time. If your operational metrics deviate from historic norms, this could be an early warning sign that your pipeline requires additional attention.

For example, maybe your model’s throughput has reached a plateau and many of your predictions are failing. This could be an indication that your current computational resources are no longer sufficient to keep up with demand. Increasing your computational processing power and memory could be a simple solution to return your pipeline to operating norms. 
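A deviation from historic norms can be checked with very little code. The sketch below flags an operational metric (for example, data sets processed per hour) when it falls more than a chosen number of standard deviations from its historical mean. The function name and the default threshold of 3.0 are assumptions for illustration, and a simple z-score is only one of many possible alerting rules.

```python
from statistics import mean, stdev
from typing import Sequence


def deviates_from_norm(history: Sequence[float], current: float,
                       z_threshold: float = 3.0) -> bool:
    """Return True if `current` is more than `z_threshold` sample standard
    deviations away from the mean of `history`."""
    if len(history) < 2:
        raise ValueError("need at least two historical observations")
    mu = mean(history)
    sigma = stdev(history)
    if sigma == 0:
        # A perfectly flat history: any change at all is a deviation.
        return current != mu
    return abs(current - mu) / sigma > z_threshold
```

With an hourly throughput history of `[1000, 990, 1010, 1005, 995]`, a current reading of 400 would be flagged, while a reading of 1004 would not.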

In addition to monitoring performance and operational metrics, you will also need to monitor for drift. We live in a dynamic and constantly changing world. The assumptions and conditions used to train the model will eventually depart from reality, and model performance will degrade over time. This is known as model drift. You can monitor for model drift by carefully monitoring your performance metrics.
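One lightweight way to watch a performance metric for drift is to keep a rolling window of recent results and compare it to a baseline. The sketch below is a hypothetical example: the class name `DriftMonitor`, the window size, and the tolerance are all assumptions you would tune for your own model, and it uses accuracy as the tracked metric.

```python
from collections import deque
from typing import Optional


class DriftMonitor:
    """Tracks recent prediction correctness and flags a drop below a baseline."""

    def __init__(self, baseline_accuracy: float, window_size: int = 100,
                 tolerance: float = 0.05) -> None:
        self.baseline_accuracy = baseline_accuracy
        self.tolerance = tolerance
        # A bounded deque discards the oldest result once the window is full.
        self.window = deque(maxlen=window_size)

    def record(self, prediction, outcome) -> None:
        """Store whether one prediction matched its actual outcome."""
        self.window.append(prediction == outcome)

    def recent_accuracy(self) -> Optional[float]:
        """Accuracy over the current window, or None if nothing is recorded."""
        if not self.window:
            return None
        return sum(self.window) / len(self.window)

    def drift_suspected(self) -> bool:
        """True only when the window is full and accuracy has fallen more than
        `tolerance` below the baseline."""
        if len(self.window) < self.window.maxlen:
            return False
        return self.recent_accuracy() < self.baseline_accuracy - self.tolerance
```

Waiting for a full window before raising a flag avoids false alarms from the first few predictions after a deployment.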

Whenever possible, leverage tools and frameworks that reduce the overhead of monitoring metrics in real-time. AWS CloudWatch provides tools for dashboarding performance metrics of your applications. Third-party tools like Grafana and Streamlit can also relieve the burden of reporting metrics without becoming locked into a cloud provider.

Step 5: Iterate

An ML pipeline that focuses on simplicity and has version control, logging, active performance monitoring, and an established CI/CD process is a great start!

You are well on your way to extracting business value from your data with ML pipelines. As you become more experienced with building ML pipelines, you will find opportunities to improve them. You will also discover and build better tools to monitor your pipeline’s performance and operational metrics. Don’t be afraid to experiment with ways of improving your pipeline. Let your pipeline evolve to meet your needs while always focusing on simplicity.

In Closing

We hope this blog has given you some pragmatic advice on how to start building an ML production pipeline. Developing an ML production pipeline that delivers business value is extremely challenging. If it were easy, more than just 20 percent of the pipelines would be delivering business value.

Building a successful ML pipeline takes a diverse set of skills, experience, and knowledge that very few individuals, teams, or even large organizations have. Remember to keep it simple. Don’t become discouraged if your first ML pipeline does not meet all of your business objectives. 

Be sure to check out the final article in our MLOps series, The 4 Pillars of MLOps: How to Deploy Models Into Production. In this ultimate MLOps guide, you’ll learn about the four pillars of MLOps that will help your organization deploy ML models to production effectively.

Frequently Asked MLOps Questions

After the data scientist has built the model, the most challenging part of building an ML pipeline is assembling a team with the skills and experience to build it. Machine learning pipelines are still relatively new. A wide variety of technologies is needed to deploy an ML pipeline that operates automatically and reliably, and there are even more technologies you must know if you are going to deploy your ML pipeline to the cloud. The best approach is to start simple and small. The first step would be to put your code into a Git repository. Build your ML pipeline in small iterative cycles, eliminate unnecessary features, and above all else keep it simple.

The biggest challenge will likely not be related to the technology; it will probably be getting your customers to adopt your ML pipeline. Some customers will readily embrace these changes while others will be reluctant to adapt to new ideas and processes. Winning over these customers will be your biggest challenge. Every ML pipeline (no matter how good it is) will fail if the customers are unwilling to accept the results. At phData, we have found that keeping the ML pipeline simple and presenting the results clearly and concisely has been the best way to delight our customers.

For a more comprehensive guide to building machine learning pipelines, please read our Ultimate Guide to Deploying ML Models.
