Any successful machine learning program will eventually have to deploy a model.
An ML program that doesn’t produce models that interact with and change business processes might be a viable research investment for a while, but if it never leads to business value, it’s only a matter of time before the program is seen as unsuccessful.
Neither the immediate nor the long-term requirements to deploy an ML model into production and support its execution are obvious until you are well into the project.
Planning for model deployment affects all groups involved in digital transformation:
Data scientists need to perform experiments and develop models with an understanding of where they will ultimately be executed.
Support teams need to plan for the required infrastructure and business processes to support ML solutions.
Leadership should clearly understand the entire scope of what is included in and required by a successful ML project so that there is no sticker shock when it comes to long-term deployment and maintenance of the project.
In this post, we will look at the requirements and components of maintaining a production-level machine learning solution while investigating the overall costs you can expect from a machine learning model.
We’ll also include some tips on how you can lower these costs and build a system that can scale up as needed.
What is the Cost to Deploy and Maintain a Machine Learning Model?
For the bare minimum required to deploy and maintain an ML model, you can expect to spend around $60k over the model’s first five years. Keep in mind that this bare-bones system will likely not scale and will be missing critical features from day one, which will lead to performance degradation over time.
To deploy an ML model while also building a scalable framework to support future modeling activities, expect to spend closer to $95k over the first five years.
What’s the Most Significant Difference Between These Approaches?
With the bare minimum approach, the first model costs $60k. The second, third, and any additional models will also cost $60k each.
Alternatively, when committing to building a scalable framework, you will incur $95k of expense for the first model. The second model costs $24k; the third model costs $14k, and you can expect that the incremental cost of additional models will continue to decrease.
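To make the crossover concrete, here is a minimal Python sketch that accumulates the rounded per-model figures quoted above. The variable and function names are ours, chosen for illustration; only the dollar amounts come from the text.

```python
from itertools import accumulate

# Per-model five-year costs quoted in the text, in thousands of USD.
BARE_BONES_PER_MODEL = 60        # every model costs the same
MLOPS_PER_MODEL = [95, 24, 14]   # first, second, and third model


def cumulative_costs(per_model_costs):
    """Return the running total after each model is deployed."""
    return list(accumulate(per_model_costs))


bare_bones = cumulative_costs([BARE_BONES_PER_MODEL] * len(MLOPS_PER_MODEL))
mlops = cumulative_costs(MLOPS_PER_MODEL)

for n, (bb, ml) in enumerate(zip(bare_bones, mlops), start=1):
    cheaper = "bare-bones" if bb < ml else "MLOps"
    print(f"{n} model(s): bare-bones ${bb}k, MLOps ${ml}k -> {cheaper} is cheaper")
```

With these rounded numbers, the framework approach is already slightly cheaper in total by the second model, and the gap widens with the third.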
What are the Bare Minimum Requirements for a “Successful” Production Project?
The most fundamental requirements of a production machine learning solution are:
A model or algorithm developed to solve an actual business problem
A mechanism by which this model can be accessed and included in a business process
In the early proof-of-concept stages of a machine learning program, this may be all the planning that is performed.
In this scenario, you only need to have the infrastructure to run the model execution process and a service that can be used to access this infrastructure.
Over five years, the cost to support these functions can be in the neighborhood of $60,750 with the bare-bones approach and $94,500 with the MLOps framework approach.
What Exactly is MLOps?
MLOps is the process of developing a machine learning model and deploying it as a production system. Similar to DevOps, good MLOps practices increase automation and improve the quality of production models, while also focusing on governance and regulatory requirements. MLOps applies to the entire ML lifecycle, from data movement, model development, and CI/CD systems to system health, diagnostics, governance, and business metrics.
The two approaches and the cost of each are shown below. Prices are estimated using AWS infrastructure and third-party engineering for deployment support.
If you’re tempted by the lower cost of the “bare-bones” approach, you should consider the comparative cost of launching additional models. The bare-bones approach does not include any automation or systems that can scale, while the fully featured approach spends more labor up front in exchange for automation and scalable systems.
Over five years, the cost to support two additional models would be $121,500 the bare-bones way; in contrast, the additional cost can be expected to be around $38,300 the MLOps framework way.
With the bare-bones approach, you will end up with three models that have a five-year TCO of $182,250 instead of the $132,922 five-year TCO of the industrial-strength approach.
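The three-model totals can be checked with a few lines of Python. This is only a sketch using the figures quoted in the text; the constant names are ours. Note that the rounded MLOps parts sum to slightly less than the quoted $132,922, presumably because the "around $38,300" figure is itself rounded.

```python
# Five-year figures quoted in the text, in USD.
BARE_BONES_PER_MODEL = 60_750
MLOPS_FIRST_MODEL = 94_500
MLOPS_TWO_ADDITIONAL_MODELS = 38_300  # the text says "around" this amount
QUOTED_MLOPS_TCO = 132_922            # three-model total quoted in the text

MODELS = 3

bare_bones_tco = MODELS * BARE_BONES_PER_MODEL
mlops_tco_rounded = MLOPS_FIRST_MODEL + MLOPS_TWO_ADDITIONAL_MODELS

# Compare against the quoted MLOps total rather than the rounded sum.
savings = bare_bones_tco - QUOTED_MLOPS_TCO
savings_pct = 100 * savings / bare_bones_tco

print(f"Bare-bones TCO:               ${bare_bones_tco:,}")
print(f"MLOps TCO from rounded parts: ${mlops_tco_rounded:,}")
print(f"Savings with MLOps:           ${savings:,} ({savings_pct:.1f}%)")
print(f"Average per model:            ${bare_bones_tco / MODELS:,.0f} "
      f"vs ${QUOTED_MLOPS_TCO / MODELS:,.0f}")
```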
The upshot: planning is critical in ensuring that the systems you are putting into place can scale with demand and machine learning adoption.
What are the Requirements for a Successful Production Project Following MLOps Best Practices?
If a model is to be part of a critical business process, consider all of the requirements and expectations that you’d have for any other business process:
Metrics and quantifiable measurements indicating the success or failure of the process
A method of monitoring the process for errors, and an escalation process used to address errors that occur
Automated model deployment pipelines based on a common enterprise model registry
Logging of predictions generated, input requests, and associated metadata (e.g. model version) for audit
Alerting on model drift, erroneous predictions, or other operational issues to resolve issues proactively
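As a rough illustration of the logging and alerting items in this list, the sketch below writes one JSON audit record per prediction and flags drift with a simple mean-shift check. Everything here is an assumption made for illustration: the function names, the record fields, and the three-standard-deviation threshold are hypothetical, and a production system would likely use a dedicated monitoring tool and a more robust drift test.

```python
import json
import logging
import statistics
from datetime import datetime, timezone

logger = logging.getLogger("prediction_audit")  # hypothetical logger name


def log_prediction(model_version, features, prediction):
    """Emit one JSON audit record per prediction and return it."""
    record = {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "model_version": model_version,
        "features": features,
        "prediction": prediction,
    }
    logger.info(json.dumps(record))
    return record


def drift_alert(baseline, recent, threshold=3.0):
    """Return True when the recent mean sits more than `threshold`
    baseline standard deviations away from the baseline mean."""
    mean = statistics.mean(baseline)
    stdev = statistics.stdev(baseline)
    recent_mean = statistics.mean(recent)
    if stdev == 0:
        # A constant baseline: any change in the mean counts as drift.
        return recent_mean != mean
    return abs(recent_mean - mean) / stdev > threshold


record = log_prediction("v1", {"tenure_months": 12}, 0.73)
print(record["model_version"], drift_alert([1, 2, 3, 4, 5], [10, 11, 12]))
```

Recording the model version with every prediction is what makes a later audit possible: a questionable prediction can be traced back to the exact model that produced it.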
Why is an MLOps Approach Important?
Gartner predicts that through 2022, 85 percent of AI projects will deliver erroneous outcomes due to bias in data, algorithms, or the teams responsible for managing them.
We see two reasons for adopting an MLOps framework as you gear up to deploy ML models:
If one does not fully plan for the entire support system required for a successful machine learning program, then any individual project will at best have short-lived success. A deployed project will either fail to be adopted by the business due to a lack of confidence or become a critical part of the business process that then fails at a crucial juncture.
Cost. MLOps streamlines operations and reduces the long-term burden on engineering to produce a new model from scratch each time.
3 Tips for Discussing the Cost of the Industrial-Strength Approach
The hardest part about deploying ML models correctly can be convincing others to make a larger investment today in order to establish a long-term pattern of success. Here are a few things to keep in mind:
Direct focus towards the long-term vision and not the short-term cost. That’s easier said than done given the typical enterprise budgeting processes, but try to present costs in the context of protecting the investment you’ve already made in machine learning by giving it the highest possible chance of success.
For unforeseen costs, remember that following best practices today will ensure that no one inadvertently ends up in a tech dead-end situation tomorrow.
Remember that the overall goal is building up a portfolio of ML projects for true digital transformation; whenever possible, present costs in terms of dollars required to add a model to the system as opposed to the absolute cost of deploying the first model. Talk about the TCO per model once 5, 10, or 50 models have been deployed to highlight the value of building scalable systems.
Moving Forward
The costs to deploy and maintain a machine learning model vary greatly depending on choices made about short-term savings versus long-term efficiencies.
If you intend to drive lasting business transformation with a machine learning program that grows over time, then you should be able to explain to others the importance of a good foundation. Discuss costs in terms of the models deployed over the next few years. Build a roadmap for your machine learning program that illustrates the significance of proper planning.
Finally, make sure that the partners you select to help you with your program understand the importance of MLOps and can help you sell your machine learning vision to your stakeholders.
Learn more about the requirements of successful ML deployments with our Ultimate Guide to Deploying ML Models, or contact us directly to discuss your needs.
Appendix: Doing the Math
Operational Costs of a Machine Learning Solution
Model Infrastructure
Bare Bones:
Based on an always-running, on-demand AWS EC2 m6g.4xlarge instance (16 vCPUs, 64 GB of memory, $0.616/hr, ~$450/month) and 3 terabytes of EBS storage (3,000 GB x $0.10 per GB per month).
MLOps:
Based on using an 8 vCPU / 32 GB configuration (at $0.47/hr) for AWS Fargate to run 1 baseline instance during work hours (8 hours per day, 5 days per week) and scale up to 10 instances during a hypothetical two-hour peak that occurs each workday.
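The two infrastructure estimates above can be reproduced with a short Python sketch. The hourly and storage rates come from the text; the 730 hours per month, the 260 workdays per year, and the reading that the two-hour peak falls inside the eight work hours are our assumptions, so treat the results as ballpark figures rather than the exact numbers behind the totals.

```python
HOURS_PER_MONTH = 730     # assumption: average hours in a month
WORKDAYS_PER_YEAR = 260   # assumption: 52 weeks x 5 days, no holidays
YEARS = 5


def bare_bones_monthly(ec2_rate=0.616, ebs_gb=3_000, ebs_rate=0.10):
    """Always-on EC2 instance plus EBS storage, in USD per month."""
    return ec2_rate * HOURS_PER_MONTH + ebs_gb * ebs_rate


def fargate_daily(baseline=1, peak=10, work_hours=8, peak_hours=2, rate=0.47):
    """Fargate cost per workday, in USD.

    Assumption: the peak window is part of the work day, so the baseline
    count applies only to the remaining (work_hours - peak_hours) hours.
    """
    instance_hours = baseline * (work_hours - peak_hours) + peak * peak_hours
    return instance_hours * rate


bare_bones_5yr = bare_bones_monthly() * 12 * YEARS
fargate_5yr = fargate_daily() * WORKDAYS_PER_YEAR * YEARS

print(f"Bare bones: ${bare_bones_monthly():,.2f}/month, ${bare_bones_5yr:,.2f} over 5 years")
print(f"Fargate:    ${fargate_daily():,.2f}/workday, ${fargate_5yr:,.2f} over 5 years")
```

Calling fargate_daily(baseline=2, peak=20) applies the same arithmetic to the additional-model scenario described later in this appendix.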
Data Support
Bare Bones:
Based on an estimated one-time cost for a Senior Machine Learning Engineer to develop a data pull script which is then executed as a cron job on the EC2 instance.
MLOps:
Based on using AWS Managed Airflow (MWAA) for data movement using a large scheduler ($0.99/hr) and large additional workers ($0.22/hr), and requiring 10 workers to run for seven minutes to update the analytic data every 10 minutes during workdays.
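A hedged sketch of the Airflow arithmetic follows. The two hourly rates and the "10 workers for seven minutes every 10 minutes" schedule come from the text; the text does not say how many hours per day the environment is billed, so the eight-hour default below is purely our assumption, and the function name and parameters are hypothetical.

```python
def mwaa_daily_cost(env_hours=8.0, workers=10, run_minutes=7, cycle_minutes=10,
                    env_rate=0.99, worker_rate=0.22):
    """Estimated Managed Airflow cost per workday, in USD.

    env_hours is an assumption: hours per workday during which the
    scheduler is billed and refresh cycles are running.
    """
    cycles = env_hours * 60 / cycle_minutes
    # Each cycle keeps `workers` additional workers busy for `run_minutes`.
    worker_hours = cycles * workers * run_minutes / 60
    return env_hours * env_rate + worker_hours * worker_rate


print(f"8-hour day:  ${mwaa_daily_cost():.2f}")
print(f"24-hour day: ${mwaa_daily_cost(env_hours=24):.2f}")
```

Keeping the billed hours as a parameter shows how sensitive the estimate is to that one assumption: an always-on environment costs three times as much per day as one that runs only during work hours.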
Engineering / Deployment
Bare Bones:
Based on an estimated one-time cost for a Senior Machine Learning Engineer to manually deploy a single model from a research environment to an EC2 server.
MLOps:
Labor cost based on an estimated one-time cost for a Senior Machine Learning Solutions Architect and a Senior Machine Learning Engineer to develop a CI/CD system using AWS CodePipeline, AWS CodeBuild, and AWS CodeDeploy. Recurring costs based on running these services and deploying a new version of the model every day.
Deploying Additional Models to Production
Model Infrastructure
Bare Bones:
Based on provisioning an additional always-running, on-demand AWS EC2 m6g.4xlarge instance (16 vCPUs, 64 GB of memory, $0.616/hr, ~$450/month) and 3 terabytes of EBS storage (3,000 GB x $0.10 per GB per month).
MLOps:
Based on using the previous AWS Fargate installation, boosting the number of baseline instances to 2, and boosting the number of peak instances to 20.
Data Support
Bare Bones:
Based on an estimated one-time cost for a Senior Machine Learning Engineer to develop an additional data pull script which is then executed as a cron job on the new EC2 instance.
MLOps:
Based on an estimated one-time cost for a Senior Machine Learning Engineer to add additional data pipelines to the existing system.
Engineering / Deployment
Bare Bones:
Based on an estimated one-time cost for a Senior Machine Learning Engineer to manually deploy a single model from a research environment to an EC2 server.
MLOps:
Based on an estimated one-time cost for a Senior Machine Learning Engineer to configure the CI/CD system to include the new model.