“Change is the only constant in life.”
This quote comes from the Greek philosopher Heraclitus, and it's interesting because it's literally true. 'Constant' is defined as occurring continuously over a period of time, so you could say that change is perpetual. This poses a problem for machine learning models, because a model is optimized based on the variables and parameters available at the time it was created. A common, and sometimes incorrect, assumption made while developing a machine learning model is that each data point is an independent and identically distributed (i.i.d.) random variable.
Imagine a classification model created several years ago to detect phishing (spam) emails. Back then, a spam email would look something like this:
You could tell that this is a spam email because it offers an unrealistic lump sum of money ($4.8 million USD), it includes a contact within the email, and it urges you to follow the instructions "as soon as you read this email."
Times have changed since then, and scammers are creating more sophisticated and realistic emails that are harder to tell apart from legitimate ones. Here's an example of a more recent phishing email:
Notice how different this spam email is compared to the one from several years ago. Do you think the detection model created back then would classify this email correctly? Probably not, because the presentation of phishing emails has changed, and models don't handle change well. One of the main assumptions when creating a model is that future data will be similar to the past data used to build it.
This is an example of model drift. In this article, you’ll learn what model drift is, the types of model drift, how to detect model drift, and how to address it.
What is Model Drift?
Model drift (also known as model decay) refers to the degradation of a model's predictive power due to changes in the environment, and thus in the relationships between its variables. In the example above, changes in the presentation of spam emails would cause a spam detection model created several years ago to degrade.
Types of Model Drift
There are three main types of model drift:
- Concept drift
- Data drift
- Upstream data changes
Concept drift is a type of model drift where the properties of the dependent variable change. The spam detection model above is an example of concept drift: the definition of what counts as 'fraudulent' changes over time.
Data drift is a type of model drift where the properties of the independent variable(s) change. Examples of data drift include changes in the data due to seasonality, shifts in consumer preferences, and the addition of new products.
Upstream data changes refer to operational changes in the data pipeline. One example is a feature that is no longer being generated, resulting in missing values; another is a change in units of measurement (e.g., miles to kilometers).
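To make the upstream case concrete, here is a minimal sketch of a pipeline sanity check, assuming you recorded each feature's missing-value rate and mean at training time. The feature names, statistics, and tolerance thresholds below are illustrative assumptions, not part of any particular library:

```python
def upstream_change_alerts(train_stats, live_batch, null_rate_tol=0.05, scale_tol=1.5):
    """Flag operational changes in a feature: a spike in missing values,
    or a shift in scale (e.g. a silent miles -> kilometers conversion)."""
    alerts = []
    for feature, stats in train_stats.items():
        values = [row.get(feature) for row in live_batch]
        null_rate = sum(v is None for v in values) / len(values)
        if null_rate > stats["null_rate"] + null_rate_tol:
            alerts.append(f"{feature}: missing-value rate jumped to {null_rate:.0%}")
        present = [v for v in values if v is not None]
        if present:
            ratio = (sum(present) / len(present)) / stats["mean"]
            if ratio > scale_tol or ratio < 1 / scale_tol:
                alerts.append(f"{feature}: mean changed by {ratio:.1f}x (unit change?)")
    return alerts

# A silent miles -> kilometers change shows up as a ~1.6x scale shift,
# and a dropped feature shows up as a missing-value spike.
train_stats = {"distance_mi": {"null_rate": 0.0, "mean": 12.0}}
live = [{"distance_mi": 19.0}, {"distance_mi": 20.0}, {"distance_mi": None}]
print(upstream_change_alerts(train_stats, live))
```

In practice the recorded training statistics would come from your feature store or training logs; the point is simply that upstream changes are detectable by comparison against a known baseline.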
How to Detect Model Drift
Measuring the Accuracy of the Model
The most direct way to detect model drift is to compare the values predicted by a machine learning model to the actual values. A model's accuracy worsens as its predictions deviate further and further from reality.
A common metric used among data scientists to evaluate a model is the F1 score, mainly because it encompasses both the precision and recall of the model. That being said, depending on the situation, some metrics are more relevant than others. For example, minimizing type 2 errors (false negatives) would be extremely important for a cancer-tumor image recognition model. Thus, when your chosen metric falls below a given threshold, you'll know that your model is drifting!
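To make this concrete, here is a minimal sketch of such a threshold check in plain Python. The 0.80 cutoff is an illustrative assumption; in practice you would set it relative to the model's baseline performance:

```python
def f1_score(y_true, y_pred):
    """F1 score (harmonic mean of precision and recall) for binary labels."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return 2 * precision * recall / (precision + recall) if (precision + recall) else 0.0

def is_drifting(y_true, y_pred, threshold=0.80):
    """Flag drift when F1 on freshly labeled data falls below the threshold."""
    return f1_score(y_true, y_pred) < threshold

# Example: the model misses several recent spam emails (1 = spam, 0 = not spam)
actual    = [1, 1, 1, 0, 0, 1, 0, 1]
predicted = [1, 0, 1, 0, 0, 0, 0, 1]
print(round(f1_score(actual, predicted), 2))  # 0.75, below 0.80, so drift is flagged
```

The same pattern applies to any metric: compute it on each new batch of labeled data and alert when it crosses your threshold.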
Other Methods to Detect Model Drift
Monitoring the accuracy of a model isn't always possible, because in some cases it's much harder to obtain paired predicted and actual data. For example, imagine a model that predicts the net income of a public firm. You would only be able to measure the accuracy of its predictions four times a year, from the firm's quarterly earnings reports. When you can't compare predicted values to actual values, there are other alternatives you can rely on:
- Kolmogorov-Smirnov (K-S) test: The K-S test is a nonparametric test that compares the cumulative distributions of two datasets, in this case the training data and the post-training data. The null hypothesis states that the two distributions are identical; if it is rejected, you can conclude that your model has drifted.
- Population Stability Index (PSI): The PSI is a metric that measures how a variable's distribution has changed over time. It is a popular metric for monitoring changes in the characteristics of a population, and thus for detecting model decay.
- Z-score: Lastly, you can compare the feature distributions of the training and live data using the z-score. For example, if many live data points for a given variable have z-scores beyond ±3, the distribution of that variable may have shifted.
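For the K-S test, `scipy.stats.ks_2samp` runs the two-sample version directly on your training and live data. The PSI is simple enough to sketch without dependencies; in the sketch below, the 10-bin layout and the 0.1/0.25 alert levels are common rules of thumb rather than fixed standards:

```python
import math
import random

def psi(expected, actual, n_bins=10):
    """Population Stability Index between a training (expected) and live (actual)
    sample. Bins are derived from the range of the training data; a small epsilon
    guards against empty bins."""
    lo, hi = min(expected), max(expected)
    edges = [lo + (hi - lo) * i / n_bins for i in range(n_bins + 1)]
    edges[0] = float("-inf")   # catch live values below the training range...
    edges[-1] = float("inf")   # ...and above it

    def bin_fractions(values):
        counts = [0] * n_bins
        for v in values:
            for i in range(n_bins):
                if edges[i] <= v < edges[i + 1]:
                    counts[i] += 1
                    break
        eps = 1e-6
        return [max(c / len(values), eps) for c in counts]

    e_frac, a_frac = bin_fractions(expected), bin_fractions(actual)
    return sum((a - e) * math.log(a / e) for e, a in zip(e_frac, a_frac))

# Simulated example: the live feature's mean has shifted by 1.5 standard deviations
random.seed(42)
training = [random.gauss(0, 1) for _ in range(1000)]
live = [random.gauss(1.5, 1) for _ in range(1000)]
print(psi(training, live) > 0.25)  # well above the 0.25 "significant shift" rule of thumb
```

A PSI under roughly 0.1 is usually read as no significant shift, 0.1 to 0.25 as a moderate shift worth watching, and above 0.25 as a significant shift.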
How to Address Model Drift
Detecting model drift is only the first step; the next step is addressing it. There are two main ways of doing so.
The first is to simply retrain your model in a scheduled manner. If you know that a model degrades every six months, then you may decide to retrain your model every five months to ensure that the model’s accuracy never falls below a certain threshold.
Another way to address model drift is through online learning. Online learning means having the machine learning model learn in real time: it takes in data points sequentially as they become available, rather than being trained on batches of data.
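As a sketch of the idea (a toy logistic model written from scratch, not a production recommendation), an online learner applies one small gradient update per incoming example instead of retraining on a batch:

```python
import math

class OnlineLogisticRegression:
    """Minimal online learner: one SGD update per incoming example
    instead of retraining on a full batch."""
    def __init__(self, n_features, lr=0.1):
        self.w = [0.0] * n_features
        self.b = 0.0
        self.lr = lr

    def predict_proba(self, x):
        z = self.b + sum(wi * xi for wi, xi in zip(self.w, x))
        return 1.0 / (1.0 + math.exp(-z))

    def update(self, x, y):
        """Single gradient step on one (features, label) pair as it arrives."""
        err = self.predict_proba(x) - y
        self.w = [wi - self.lr * err * xi for wi, xi in zip(self.w, x)]
        self.b -= self.lr * err

# Stream of (features, label) pairs, e.g. email features arriving over time
stream = [([1.0, 0.0], 1), ([0.0, 1.0], 0), ([1.0, 0.2], 1), ([0.1, 1.0], 0)] * 50
model = OnlineLogisticRegression(n_features=2)
for x, y in stream:
    model.update(x, y)  # the model adapts immediately, no batch retrain needed
print(model.predict_proba([1.0, 0.0]) > 0.5)
```

Because each update is incremental, the model keeps adapting as the distribution shifts. Libraries with incremental-learning support (for example, scikit-learn estimators that expose `partial_fit`) follow the same pattern.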
Overall, Detecting Model Drift Is Hard
The truth is that detecting model drift is hard and there’s no universal blueprint to detect and address model drift.
Now imagine having to detect model drift but for a hundred or even a thousand machine learning models. It almost sounds impossible. If this describes a problem that you’re facing, there are some amazing solutions out there like Datatron.
Datatron is a machine learning governance and management platform that was built for the purpose of making model production scalable, more efficient, and simple. In addition to the several features and tools that it encompasses, it also monitors all of your deployed machine learning models and detects model drift for you, so that you don’t have to. If you would like to learn more about Datatron, click here.
Thanks for reading!