What Does it Mean to Deploy a Machine Learning Model? (Deployment Series: Guide 01)


This is post 1 in my Ultimate Guide to Deploying Machine Learning Models. You can find the other posts in the series here.

I recently asked the Twitter community about their biggest machine learning pain points and what work their teams plan to focus on in 2020. One of the most frequently mentioned pain points was deploying machine learning models. More specifically, “How do you deploy machine learning models in an automated, reproducible, and auditable manner?”

Good question!

The topic of ML deployment is rarely discussed when machine learning is taught. Boot camps, data science graduate programs, and online courses tend to focus on training algorithms and neural network architectures because these are "core" machine learning ideas. I don’t disagree with that, but I’d argue that a data scientist who can’t deploy a model won’t be able to add much, if any, value to a business.

If you search for resources on how to deploy a model, you’ll find plenty of blog posts about writing Flask APIs. Many of these are well done, but not every ML model needs to be deployed behind a Flask API; in fact, sometimes that’s counterproductive. These posts rarely discuss which factors to consider when deploying a model, the variety of tools that can be used, and other important ideas. These topics are extensive, and a single blog post wouldn’t do them justice.

That’s why I’m writing a multi-part blog series on deploying machine learning models. This series will discuss what it means to deploy an ML model, what factors to consider when deploying models, what software development tactics to use, and the tools and frameworks to utilize. If you’d like to be alerted when each of these posts is published, leave me your email address!

Before discussing any tools, let’s begin by asking: what does it mean to deploy a model?


What does it mean to deploy a Machine Learning Model?

Before you think about what tools to use to deploy your model, you need to have a firm grasp on what deployment means. To attain that understanding, it’s helpful to put yourself in the shoes of a software engineer. How does a software engineer think about "deploying" code? How does the concept of deploying code transfer to the domain of machine learning? Thinking about deployment as a software engineer rather than as a data scientist will dramatically simplify what it means to deploy a model.

In order to understand what it means to deploy an ML model, let’s briefly discuss the lifecycle of an ML project. Hypothetically, a product manager (PM) will discover some user need and determine that machine learning can be used to solve this problem. This will involve creating a new product or augmenting an existing product with machine learning capabilities, typically in the form of a supervised learning model.


The PM will meet with an ML team lead to plan the project by defining project goals, choosing a metric, and setting up the codebase. If appropriate training and validation data exists, the project will be handed off to data scientists or ML engineers to handle the iterative process of feature engineering and model selection.

The goal at this stage is to build a model whose level of predictive performance meets or exceeds the goals set during the planning stage. Throughout these initial stages, the user needs that motivated the project remain unmet. These needs won’t be satisfied even once a model exists that achieves the minimum required level of predictive performance.

A machine learning model can only begin to add value to an organization when that model’s insights routinely become available to the users for which it was built. The process of taking a trained ML model and making its predictions available to users or other systems is known as deployment. Deployment is entirely distinct from routine machine learning tasks like feature engineering, model selection, or model evaluation.

As such, deployment is not well understood among data scientists and ML engineers who lack backgrounds in software engineering or DevOps. Luckily, these skills aren’t difficult to pick up: with practice, any data scientist can learn how to deploy their models to production.

How do you decide how to deploy?

In order to decide how to deploy a model, you need to understand how end users should interact with the model’s predictions. This is best understood through a few examples. We’ll work our way up in complexity, beginning with a very simple use case.


Deployment Example 1: Deploying a Lead Scoring Model

Suppose a data scientist has built a lead scoring model for a group of technical analysts who are well versed in SQL. The analysts seek to group new leads into buckets based on their likelihoods of converting into customers.

Each morning they would like to use data from the database to create/update dashboards they maintain in a BI tool.

Since the analysts know SQL and expect model scores to be stored in the database, "deploying" the lead scoring model means generating daily lead scores for new leads and storing these in the analysts’ database.

The key aspects of this deployment are:

  1. predictions can be generated on a group of new leads,
  2. these predictions need to be made available each day, and
  3. the predictions need to be stored in a database.

The deployment process needs to satisfy these three constraints in order for the ML model to add value to the business, as sketched below.
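To make this concrete, here’s a minimal sketch of what that daily batch job might look like. It assumes a scikit-learn-style classifier saved with joblib and a SQLAlchemy connection; the connection string, table names, and feature columns are all hypothetical:

```python
import joblib
import pandas as pd
from sqlalchemy import create_engine

# Hypothetical connection string and table/column names, for illustration only.
engine = create_engine("postgresql://user:pass@host/analytics")

def score_new_leads():
    # 1. Pull the batch of leads created since yesterday (Postgres syntax).
    leads = pd.read_sql(
        "SELECT lead_id, feature_a, feature_b FROM leads "
        "WHERE created_at >= CURRENT_DATE - INTERVAL '1 day'",
        engine,
    )

    # 2. Generate predictions for the whole batch at once.
    model = joblib.load("lead_scoring_model.joblib")
    leads["score"] = model.predict_proba(leads[["feature_a", "feature_b"]])[:, 1]

    # 3. Store the scores in the analysts' database so their BI tool can read them.
    leads["scored_at"] = pd.Timestamp.today().normalize()
    leads[["lead_id", "score", "scored_at"]].to_sql(
        "lead_scores", engine, if_exists="append", index=False
    )

if __name__ == "__main__":
    score_new_leads()  # scheduled to run each morning, e.g. with cron or Airflow
```

Notice that nothing here resembles a web server: a scheduled script that reads from and writes to a database fully satisfies this deployment’s constraints.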

Consider a slightly more complex situation.

The head of Sales finds out about the model and wants to make its insights available to his account executives. Naturally, and much to our chagrin, the account execs don’t know SQL, so storing the predictions in a database isn’t enough in this case.

The Product Manager determines that lead scores need to be visible in the CRM tool the account executives use in order to add business value.

Deployment aspects 1 and 2 from the previous example (generating predictions for a group of leads and doing so once a day) are still valid, but aspect 3 is not. Deployment now also means having the scores flow from the database into the CRM tool, which requires setting up additional ETL jobs.
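As a sketch of that extra ETL step, the job below reads today’s scores from the database and pushes them into the CRM through its API. The endpoint, payload, and table names are assumptions; a real CRM (Salesforce, HubSpot, etc.) would have its own API or SDK:

```python
import pandas as pd
import requests
from sqlalchemy import create_engine

engine = create_engine("postgresql://user:pass@host/analytics")

# Hypothetical CRM endpoint; substitute your CRM's actual API or SDK.
CRM_URL = "https://crm.example.com/api/leads/{lead_id}"

def push_scores_to_crm():
    # Read the lead scores generated by today's batch scoring job.
    scores = pd.read_sql(
        "SELECT lead_id, score FROM lead_scores WHERE scored_at = CURRENT_DATE",
        engine,
    )
    for row in scores.itertuples():
        # Update each lead's record so account execs see the score in the CRM.
        resp = requests.patch(
            CRM_URL.format(lead_id=row.lead_id),
            json={"lead_score": row.score},
            timeout=10,
        )
        resp.raise_for_status()
```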

Deployment Example 2: Deploying a Recommender System

For our final example let’s consider how a recommender system, a popular application of machine learning, might be deployed. Suppose that we work for an ecommerce company that wishes to show users recommendations of products to purchase. We’ll consider two variations of deployment.

Scenario 1: The company wishes to display product recommendations to users after they login to either the web or mobile application. Predictions need to be available upon request, which can be at any time of day. This places a latency constraint on our deployment, which affects whether we can generate predictions on-the-fly as a user logs in, or whether we have to generate and cache predictions beforehand. The deployment must make the model’s predictions available to both the mobile and web applications. Thus separating our deployment from either of these applications is desirable.
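One common way to satisfy both the latency constraint and the shared-access requirement is to put the model behind its own small prediction service that both the web and mobile applications call. Here’s a minimal sketch using Flask and Redis; the route, cache keys, and fallback list are illustrative assumptions, not a prescribed design:

```python
import json

import redis
from flask import Flask, jsonify

app = Flask(__name__)

# Assumes recommendations are precomputed and cached in Redis, keyed by user id.
cache = redis.Redis(host="localhost", port=6379)

# Hypothetical fallback if a user has no cached recommendations yet.
POPULAR_PRODUCTS = ["sku-123", "sku-456", "sku-789"]

@app.route("/recommendations/<user_id>")
def recommendations(user_id):
    cached = cache.get(f"recs:{user_id}")
    if cached is not None:
        # Cache hit: serve precomputed recommendations within the latency budget.
        return jsonify({"user_id": user_id, "products": json.loads(cached)})
    # Cache miss: fall back to popular items rather than scoring on the fly.
    return jsonify({"user_id": user_id, "products": POPULAR_PRODUCTS})
```

Because both clients talk to the same HTTP endpoint, the model can be retrained and redeployed without touching either application.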

Scenario 2: The company wishes to add 5 recommendations to its marketing emails to existing customers. These emails are sent to users twice a week; one email goes out Monday afternoon and another goes out Friday morning. In this case, recommendations can be computed for all users at the same time and cached. Latency requirements are much less strict compared to the previous scenario. Storing these recommendations in a database is sufficient. The process for generating the emails can look up the user’s recommendations in this database and add the top 5 to the personalized emails.
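A sketch of this batch-and-cache pattern is below. The `recommend` method on the model, the table names, and the schedule are hypothetical; the point is that predictions are precomputed in bulk, and the email job only does a cheap database lookup:

```python
import joblib
import pandas as pd
from sqlalchemy import create_engine, text

engine = create_engine("postgresql://user:pass@host/ecommerce")

def precompute_recommendations(top_k=5):
    # Runs twice a week (e.g. via a scheduler), before each email batch goes out.
    users = pd.read_sql("SELECT user_id FROM users WHERE subscribed = TRUE", engine)
    model = joblib.load("recommender.joblib")

    rows = []
    for user_id in users["user_id"]:
        # model.recommend is a hypothetical method returning ranked product ids.
        for rank, product_id in enumerate(model.recommend(user_id, n=top_k), start=1):
            rows.append({"user_id": user_id, "product_id": product_id, "rank": rank})

    # Cache the results; at email-send time all that's left is a table lookup.
    pd.DataFrame(rows).to_sql(
        "email_recommendations", engine, if_exists="replace", index=False
    )

def top_recs_for_email(user_id, top_k=5):
    # The email job reads the cached table; no model call happens at send time.
    query = text(
        "SELECT product_id FROM email_recommendations "
        "WHERE user_id = :uid ORDER BY rank LIMIT :k"
    )
    recs = pd.read_sql(query, engine, params={"uid": user_id, "k": top_k})
    return recs["product_id"].tolist()
```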

As we see from each of these examples, there are multiple factors to consider when determining how to deploy a machine learning model. These factors include:

  • how frequently predictions should be generated
  • whether predictions should be generated for a single instance at a time or a batch of instances
  • the number of applications that will access the model
  • the latency requirements of these applications

Conclusion

Automated deployment of machine learning models is one of the biggest pain points facing data scientists and ML engineers in 2020. Since models can only add value to an organization when their insights are regularly available to end users, it’s imperative that ML practitioners understand how to deploy their models as simply and efficiently as possible. The first step in determining how to deploy a model is understanding how end users should interact with that model’s predictions.

What are your thoughts on this series so far? What other sub-topics would you want me to explore? Shoot me an email at luigi at mlinproduction.com or @ me on Twitter @MLinProduction with your thoughts!

Next, be sure to check out Guide #02 in our Deployment series, which centers on Software Interfaces for Machine Learning Deployment.

Attribution

Icons made by Freepik from www.flaticon.com
