
Data drift

As the distribution of a model's real-world input data changes over time, diverging from the data distribution in the training dataset, the deployed model loses predictive power. The data surrounding the model is said to be drifting, and the model may struggle to adapt to the changes in real-world conditions. By leveraging training data and prediction data (also known as inference data) added to your deployment, the Monitoring > Data drift dashboard helps you monitor a model for potential performance losses due to drift in production.

How does DataRobot track drift?

DataRobot tracks two types of drift:

  • Target drift: DataRobot stores statistics about predictions to monitor how the distribution and values of the target change over time. As a baseline for comparing target distributions, DataRobot uses the distribution of predictions on the holdout.

  • Feature drift: DataRobot stores statistics about predictions to monitor how distributions and values of features change over time. The supported feature data types are numeric, categorical, and text. As a baseline for comparing distributions of features:

    • For training datasets larger than 500MB, DataRobot uses the distribution of a random sample of the training data.

    • For training datasets smaller than 500MB, DataRobot uses the distribution of 100% of the training data.

How many features can DataRobot track?

The following limits apply to tracking and receiving features in DataRobot:

  • Managed AI Platform (SaaS): By default, DataRobot tracks up to 25 features.

  • Self-Managed AI Platform (on-premise): By default, DataRobot tracks up to 25 features; however, self-managed installations can increase the limit to 200 features using the PREDICTION_API_MONITOR_RAW_MAX_FEATURE setting in the DataRobot configuration. In addition, the maximum number of features that DataRobot can receive is set using PREDICTION_API_POST_MAX_FEATURES and the absolute maximum number of features DataRobot can receive is 300. For agent-monitored deployments, the 300 feature limit applies, even if you configure the agent to send more than 300 features using MLOPS_MAX_FEATURES_TO_MONITOR.

Target and feature drift tracking are enabled by default. You can control these drift tracking features by navigating to a deployment's Settings > Data drift tab. If feature drift tracking is turned off, a message displays on the Data drift tab to remind you to enable it.

To receive email notifications on data drift status, configure notifications, schedule monitoring, and configure data drift monitoring settings.

The Data drift dashboard provides four interactive and exportable visualizations that help identify the health of a deployed model over a specified time interval.


The export button allows you to download each chart on the Data drift dashboard as a PNG, CSV, or ZIP file.

Chart: Description
1. Drift vs. Importance: Plots the importance of a feature in a model against how much the distribution of feature values has changed, or drifted, between one point in time and another.
2. Feature details: Plots percentage of records, i.e., the distribution, of the selected feature in the training data compared to the prediction data.
3. Drift over time: Illustrates the difference in distribution over time between the training dataset of the deployed model and the datasets used to generate predictions in production. This chart tracks the change in the Population Stability Index (PSI), which is a measure of data drift.
4. Predictions over time: Illustrates how the distribution of a model's predictions has changed over time (target drift). The display differs depending on whether the project is regression or binary classification.

In addition to the visualizations above, you can identify drift trends using the Data drift > Drill down tab to compare data drift heat maps across features.

Configure the Data drift dashboard

You can customize how a deployment calculates data drift status by configuring drift and importance thresholds and additional definitions on the Settings > Data drift page. Use the following controls to configure the data drift dashboard as needed:

Control: Description
1. Model version selector: Updates the dashboard displays to reflect the model you selected from the dropdown.
2. Date slider: Limits the range of data displayed on the dashboard (i.e., zooms in on a specific time period).
3. Range (UTC) selector: Sets the date range displayed for the deployment date slider. The range selector only allows you to select dates and times between the start date of the deployment's current version of a model and the current date.
4. Resolution selector: Sets the time granularity of the deployment date slider. The following resolution settings are available, based on the selected range:
  • Hourly: If the range is less than 7 days.
  • Daily: If the range is between 1-60 days (inclusive).
  • Weekly: If the range is between 1-52 weeks (inclusive).
  • Monthly: If the range is at least 1 month and less than 120 months.
5. Segment Attribute / Segment Value: Sets the individual attribute and value to filter the data drift visualizations for segment analysis.
6. Selected Feature: Sets the feature displayed on the Feature details chart and the Drift over time chart.
7. Refresh: Initiates an on-demand update of the dashboard with new data. Otherwise, DataRobot refreshes the dashboard every 15 minutes.
8. Reset: Reverts the dashboard controls to the default settings.
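
The resolution rules listed for the Resolution selector can be read as a small decision function. The following is a minimal sketch, not DataRobot's implementation: the function name is hypothetical, and treating a month as 30 days and a week as 7 days is an assumption made only to keep the example self-contained.

```python
from datetime import timedelta

DAYS_PER_MONTH = 30  # assumption: the documentation does not define month length


def available_resolutions(range_length: timedelta) -> list:
    """Return the resolution options allowed for a selected date range."""
    days = range_length / timedelta(days=1)
    options = []
    if days < 7:                       # Hourly: range is less than 7 days
        options.append("Hourly")
    if 1 <= days <= 60:                # Daily: 1-60 days (inclusive)
        options.append("Daily")
    if 7 <= days <= 52 * 7:            # Weekly: 1-52 weeks (inclusive)
        options.append("Weekly")
    if DAYS_PER_MONTH <= days < 120 * DAYS_PER_MONTH:  # Monthly: 1 to <120 months
        options.append("Monthly")
    return options


print(available_resolutions(timedelta(days=30)))
```

Under these assumptions, a 30-day range allows daily, weekly, and monthly resolution, while a 3-day range allows only hourly and daily.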

Feature analysis charts

The Feature analysis charts visualize the drift between the training data and the prediction data during the selected time period alongside the data distribution details for the selected feature. The supported feature data types are numeric, categorical, and text.

To explore the feature analysis charts, you can switch between two views:

  • Default view: Displays the Drift vs. importance and Feature details charts, side by side.
  • Table view: Displays the combined Feature analysis table.

Features without feature impact

When manually selecting features, if you select any features without a feature impact score (not to be confused with the feature importance score shown while selecting the features), these features appear with N/A as the Importance score.

Drift vs. importance chart

The Drift vs. importance chart monitors the 25 most impactful numerical, categorical, and text-based features in your data. Use the chart to see if data is different at one point in time compared to another. Differences may indicate problems with your model or in the data itself. For example, if users of an auto insurance product are getting younger over time, the data that built the original model may no longer result in accurate predictions for your newer data. Particularly, drift in features with high importance can be a warning flag about model accuracy.

Hover over a point in the chart to identify the feature name and report the precise values for drift (Y-axis) and importance (X-axis). Click the settings icon to adjust the Importance and Drift thresholds.

To select the feature visualized in the Feature details and Drift over time charts, click the marker for that feature in the Drift vs. importance plot:

Feature drift

The Y-axis reports the Drift value for a feature. This value is a calculation of the Population Stability Index (PSI), a measure of the difference in distribution over time.
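
For readers unfamiliar with PSI, the sketch below shows the commonly used textbook formula, the sum over bins of (actual - expected) * ln(actual / expected), applied to two binned distributions. It is an illustration only: the function name and the small floor value used to avoid taking the logarithm of zero are assumptions, and DataRobot's exact calculation may differ.

```python
import math


def psi(expected, actual, eps=1e-4):
    """Population Stability Index between two binned distributions.

    expected: bin fractions for the baseline (for example, training data).
    actual:   bin fractions for the comparison (for example, scoring data).
    eps:      hypothetical floor applied to empty bins to avoid log(0).
    """
    if len(expected) != len(actual):
        raise ValueError("distributions must have the same number of bins")
    total = 0.0
    for e, a in zip(expected, actual):
        e = max(e, eps)
        a = max(a, eps)
        total += (a - e) * math.log(a / e)
    return total


training = [0.25, 0.25, 0.25, 0.25]
scoring = [0.10, 0.20, 0.30, 0.40]
print(psi(training, training))  # identical distributions give 0.0
print(psi(training, scoring))   # a shifted distribution gives a positive value
```

A value of 0 means the two distributions match; larger values indicate a larger shift away from the baseline.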

Drift metric support

While the DataRobot UI only supports the Population Stability Index (PSI) metric, the DataRobot API supports Kullback-Leibler Divergence, Hellinger Distance, Histogram Intersection, and Jensen–Shannon Divergence. In addition, using the Python API client, you can retrieve a list of supported metrics.
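
To make the alternative metrics concrete, the following sketch implements their standard mathematical definitions for two binned distributions. These are generic formulas, not calls to the DataRobot API; the function names and the floor used to avoid division by zero are assumptions. Note that histogram intersection, as defined here, is a similarity (1 for identical distributions), so a given tool may report it differently.

```python
import math


def _floor(dist, eps=1e-9):
    # Hypothetical safeguard: replace empty bins with a tiny value.
    return [max(x, eps) for x in dist]


def kl_divergence(p, q):
    """Kullback-Leibler divergence of p from q."""
    p, q = _floor(p), _floor(q)
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))


def hellinger_distance(p, q):
    """Hellinger distance, between 0 (identical) and 1 (disjoint)."""
    squared = sum((math.sqrt(pi) - math.sqrt(qi)) ** 2 for pi, qi in zip(p, q))
    return math.sqrt(squared / 2)


def histogram_intersection(p, q):
    """Overlap of two histograms, between 0 (disjoint) and 1 (identical)."""
    return sum(min(pi, qi) for pi, qi in zip(p, q))


def js_divergence(p, q):
    """Jensen-Shannon divergence: symmetrized KL against the midpoint."""
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)


same = [0.5, 0.5]
print(hellinger_distance(same, same), histogram_intersection(same, same))
```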

Feature importance

The X-axis reports the Importance score for a feature, calculated when ingesting the learning (or training) data. DataRobot calculates feature importance differently depending on the model type. For DataRobot models and custom models, the Importance score is calculated using Permutation Importance. For external models, the importance score is an ACE Score. The dot resting at the Importance value of 1 is the target prediction. The most important feature in the model will also appear at 1 (as a solid green dot).

Interpret the quadrants

The quadrants represented in the chart help to visualize feature-by-feature data drift plotted against the feature's importance. Quadrants can be loosely interpreted as follows:

Quadrant: Read as (Color indicator)
1. High importance feature(s) are experiencing high drift. Investigate immediately. (Red)
2. Lower importance feature(s) are experiencing drift above the set threshold. Monitor closely. (Yellow)
3. Lower importance feature(s) are experiencing minimal drift. No action needed. (Green)
4. High importance feature(s) are experiencing minimal drift. No action needed, but monitor features that approach the threshold. (Green)


Points on the chart can also be gray or white. Gray circles represent features that have been excluded from drift status calculation, and white circles represent features set to high importance.

If you are the project owner, you can click the settings icon in the upper-right corner of the chart to reset the quadrants. The drift threshold defaults to 0.15. The Y-axis scales from 0 to the higher of 0.25 and the highest observed drift value. You can customize the quadrants by changing the drift and importance thresholds.
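
The quadrant logic can be sketched as a simple comparison against the two thresholds. This is an illustrative reading of the quadrant descriptions, not DataRobot's code: the function name is hypothetical, the 0.15 drift default comes from the text above, and the 0.5 importance default is a placeholder assumption. Treating "high importance" as importance at or above the threshold, and "high drift" as drift strictly above the threshold, is also an interpretation.

```python
def classify_feature(drift, importance,
                     drift_threshold=0.15,
                     importance_threshold=0.5):  # 0.5 is a placeholder assumption
    """Return (quadrant, color) for one feature on the Drift vs. importance chart."""
    high_drift = drift > drift_threshold
    high_importance = importance >= importance_threshold
    if high_importance and high_drift:
        return 1, "Red"      # investigate immediately
    if high_drift:
        return 2, "Yellow"   # lower importance, drift above threshold
    if not high_importance:
        return 3, "Green"    # lower importance, minimal drift
    return 4, "Green"        # high importance, minimal drift


print(classify_feature(drift=0.30, importance=0.9))
```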

Feature details chart

The Feature details chart provides a histogram that compares the distribution of a selected feature in the training data to the distribution of that feature in the prediction data. To use the Feature details chart, select a feature from the dropdown list. The list, which defaults to the target feature, includes any of the features tracked.


You can also select a feature for the Feature details chart by clicking its marker on the Drift vs. importance chart or by setting the Selected Feature in the Data Drift Summary controls.

Numeric features

For numeric data, DataRobot computes an efficient and precise approximation of the distribution of each feature. Based on this, drift tracking is conducted by comparing the normalized histogram for the training data to the scoring data using the selected drift metrics.

The chart displays 13 bins for numeric features:

  • 10 bins capture the range of items observed in the training data.

  • Two bins capture very high and very low values, that is, extreme values in the scoring data that fall outside the range of the training data. To define these bins, values are compared against the training data range (min_training to max_training): the low value bin contains values below min_training, and the high value bin contains values above max_training.

  • One bin for the missing count, containing all records with missing feature values.
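
The 13-bin layout described above can be sketched as follows. This is a simplified illustration: it uses equal-width bins across the training range, which is an assumption made for readability, and the function name is hypothetical. It is not the centroid-histogram approach that the Data drift tab uses.

```python
import math


def bin_numeric(values, min_training, max_training, n_bins=10):
    """Count values into low, in-range (n_bins), high, and missing bins."""
    if max_training <= min_training:
        raise ValueError("max_training must be greater than min_training")
    width = (max_training - min_training) / n_bins
    counts = {"low": 0, "in_range": [0] * n_bins, "high": 0, "missing": 0}
    for v in values:
        if v is None or (isinstance(v, float) and math.isnan(v)):
            counts["missing"] += 1          # records with no feature value
        elif v < min_training:
            counts["low"] += 1              # below the training range
        elif v > max_training:
            counts["high"] += 1             # above the training range
        else:
            # Clamp so that v == max_training falls in the last in-range bin.
            index = min(int((v - min_training) / width), n_bins - 1)
            counts["in_range"][index] += 1
    return counts


scoring_values = [-5, 0, 5, 9.99, 10, 15, None, float("nan")]
print(bin_numeric(scoring_values, min_training=0, max_training=10))
```

Dividing each count by the total number of records gives the normalized histogram that a drift metric compares against the training histogram.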

How are values added to the histogram bins?

The Data drift tab uses Ben-Haim/Tom-Tov Centroid Histograms.

Categorical features

Unlike numeric data, where binning cutoffs for a histogram result from a data-dependent calculation, categorical data is inherently discrete in form (that is, not continuous), so binning is based on a defined category. Additionally, there could be missing or unseen category levels in the scoring data.

The process for drift tracking of categorical features is to calculate the fraction of rows for each categorical level ("bin") in the training data. This results in a vector of percentages for each level. The 25 most frequent levels are directly tracked—all other levels are aggregated to the others bin. This process is repeated for the scoring data, and the two vectors are compared using the selected drift metric.

For categorical features, in addition to bins for the top categories and the missing category, the chart includes two unique bins:

  • The others bin contains all categorical levels outside the 25 most frequent values. This aggregation is performed for drift-tracking purposes; it doesn't represent the model's behavior.

  • The new levels bin only displays after you make predictions with data that has a new value for a feature not in the training data. For example, consider a dataset about housing prices with the categorical feature City. If your prediction data contains the value Boston and your training data does not, the Boston value (and other unseen cities) are represented in the new levels bin.
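
The categorical binning described in this section can be sketched as below. It is a minimal illustration under stated assumptions: the function name and the bin labels "(others)", "(new levels)", and "(missing)" are hypothetical, and missing values are represented as None.

```python
from collections import Counter


def categorical_fractions(training, scoring, top_n=25):
    """Return (training_fractions, scoring_fractions) over a shared set of bins."""
    train_counts = Counter(v for v in training if v is not None)
    top_levels = [level for level, _ in train_counts.most_common(top_n)]
    top_set = set(top_levels)
    seen_in_training = set(train_counts)
    bin_names = top_levels + ["(others)", "(new levels)", "(missing)"]

    def fractions(rows):
        bins = Counter()
        for v in rows:
            if v is None:
                bins["(missing)"] += 1
            elif v in top_set:
                bins[v] += 1
            elif v in seen_in_training:
                bins["(others)"] += 1       # known level outside the top_n
            else:
                bins["(new levels)"] += 1   # level never seen in training
        total = len(rows) or 1
        return {name: bins[name] / total for name in bin_names}

    return fractions(training), fractions(scoring)


training_city = ["NYC", "NYC", "NYC", "LA", "LA", "SF", None]
scoring_city = ["NYC", "Boston", "SF", None]
train_frac, score_frac = categorical_fractions(training_city, scoring_city, top_n=2)
print(score_frac)
```

With top_n=2 in this toy example, SF falls into the others bin and Boston into the new levels bin; the two resulting vectors are what a drift metric would compare.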

Text features

Text features are a high-cardinality problem, meaning the addition of new words does not have the impact of, for example, new levels found in categorical data. The method DataRobot uses to track drift in text features accounts for the fact that writing is subjective and cultural and may have spelling mistakes. In other words, to identify drift in text fields, it is more important to identify a shift in the whole language rather than in individual words.

Drift tracking for a text feature is conducted by:

  1. Detecting occurrences of the 1000 most frequent words from rows found in the training data.
  2. Calculating, separately for the training data and the scoring data, the fraction of rows that contain these terms for that feature.
  3. Comparing the fraction in the scoring data to that in the training data.

The two vectors of occurrence fractions (one entry per word) are compared with the available drift metrics. Before applying this methodology, DataRobot performs basic tokenization by splitting the text feature into words (or characters in the case of Japanese or Chinese).
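
The text drift steps above can be sketched as follows. This is an illustration only: whitespace tokenization with lowercasing stands in for DataRobot's tokenizer, ranking words by total occurrences in the training rows is one possible reading of "most frequent", and all function names are hypothetical.

```python
from collections import Counter


def tokenize(text):
    # Assumption: simple lowercase whitespace splitting.
    return text.lower().split()


def text_occurrence_vectors(training_rows, scoring_rows, vocab_size=1000):
    """Return (vocabulary, training_fractions, scoring_fractions)."""
    # Step 1: find the most frequent words in the training data.
    word_counts = Counter(w for row in training_rows for w in tokenize(row))
    vocabulary = [w for w, _ in word_counts.most_common(vocab_size)]

    # Step 2: fraction of rows containing each vocabulary word.
    def row_fractions(rows):
        rows_containing = Counter()
        for row in rows:
            for word in set(tokenize(row)):  # count each word once per row
                rows_containing[word] += 1
        total = len(rows) or 1
        return [rows_containing[w] / total for w in vocabulary]

    # Step 3: the two vectors are then compared with a drift metric.
    return vocabulary, row_fractions(training_rows), row_fractions(scoring_rows)


train_text = ["good service", "good food", "bad service"]
score_text = ["bad food", "bad bad service"]
vocab, train_vec, score_vec = text_occurrence_vectors(train_text, score_text, vocab_size=2)
print(vocab, train_vec, score_vec)
```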

For text features, the Feature details bar chart is replaced by a word cloud visualizing data distributions for each token and revealing how much each token contributes to a feature's data drift. To access the feature drift word cloud, in the Feature details chart, select a text feature from the dropdown list. You can also select a text feature from the Selected Feature dropdown list in the Data drift dashboard controls. To interpret the feature drift word cloud for a text feature, hold the pointer over a token to view the following details:


When your pointer is over the word cloud, you can scroll up to zoom in and view the text of smaller tokens.

Chart element: Description
Token: The tokenized text. Text size represents the token's drift contribution and text color represents the dataset prevalence. Stop words are hidden from this chart.
Drift contribution: How much this particular token contributes to the feature's drift value, as reported in the Drift vs. importance and Drift over time charts.
Data distribution: How much more often this particular token appears in the training data or the prediction data.
  • Blue: This token appears X% more often in training data.
  • Red: This token appears X% more often in prediction data.

Disable word cloud view

Next to the Export button, you can click the settings icon and clear the Display text features as word cloud check box to disable the feature drift word cloud and view the standard chart.

Feature analysis table


The feature analysis table is on by default.

Feature flags: Enable Feature Drift Customization

When the Feature analysis section is in table view, you can view feature importance, drift, status, type, and feature details in a combined visualization:

Features without feature impact

When manually selecting features, if you select any features without a feature impact score (not to be confused with the feature importance score shown while selecting the features), these features appear with N/A as the Importance score.

Click the row for a specific feature to view the feature details chart for the selected feature:

The chart displayed here functions in the same way as the Feature details chart in the default view.

Drift over time chart

The Drift over time chart visualizes the difference in distribution over time between the training dataset of the deployed model and the datasets used to generate predictions in production. The drift away from the baseline established with the training dataset is measured using the Population Stability Index (PSI). As a model continues to make predictions on new data, the change in the PSI over time is visualized for each tracked feature, allowing you to identify data drift trends.

As data drift can decrease your model's predictive power, determining when a feature started drifting and monitoring how that drift changes (as your model continues to make predictions on new data) can help you estimate the severity of the issue. You can then compare data drift trends across the features in a deployment to identify correlated drift trends between specific features. In addition, the chart can help you identify seasonal effects (significant for time-aware models). This information can help you identify the cause of data drift in your deployed model, including data quality issues, changes in feature composition, or changes in the context of the target variable. The example below shows the PSI consistently increasing over time, indicating worsening data drift for the selected feature.

The Drift over time chart includes the following elements and controls:

Chart element: Description
1. Selected Feature: Selects a feature for drift over time analysis, which is then reported in the Drift Over Time chart and the Feature Details chart.
2. Time of Prediction / Sample size: Represents the time range of the predictions used to calculate the corresponding drift value (PSI). Below the X-axis, a bar chart represents the number of predictions made during the corresponding Time of Prediction. For more information on how time of prediction is represented in time series deployments, see the Time of prediction for time series deployments note.
3. Drift: Represents the range of drift values (PSI) calculated for the corresponding Time of Prediction.
4. Training baseline: Represents the 0 PSI value of the training baseline dataset.
5. Drift status information: Displays the drift status and threshold information for the selected feature. Drift status visualizations are based on the settings configured by the deployment owner. The deployment owner can also set the drift and importance thresholds in the Feature Drift vs Feature Importance chart settings. The possible drift status classifications are:
  • Healthy (Green): The feature is experiencing minimal drift. No action needed, but monitor features that approach the threshold.
  • At risk (Yellow): A lower importance feature is experiencing drift above the set threshold. Monitor closely.
  • Failing (Red): A high importance feature is experiencing drift above the set threshold. Investigate immediately.
  Feature importance is determined by comparing the feature impact score with the importance threshold value. For an important feature, the feature impact score is greater than or equal to the importance threshold.
6. Export: Exports the Drift over time chart.

Time of prediction for time series deployments

The default prediction timestamp method for time series deployments is forecast date (i.e., forecast point + forecast distance), not the time of the prediction request. Forecast date allows a common time axis to be used between the training data and the basis of data drift and accuracy statistics. For example, using forecast date, if the prediction data has dates from June 1 to June 10, the forecast point is set to June 10, and the forecast distance is set to +1 to +7 days, predictions are available and data drift is tracked for June 11 through June 17.
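
The forecast date arithmetic in this example can be checked with a few lines of Python. The function name and the year 2024 are illustrative assumptions; only the rule "forecast date = forecast point + forecast distance" comes from the text.

```python
from datetime import date, timedelta


def forecast_dates(forecast_point, min_distance_days, max_distance_days):
    """Return the forecast date for each forecast distance, in days."""
    return [
        forecast_point + timedelta(days=distance)
        for distance in range(min_distance_days, max_distance_days + 1)
    ]


dates = forecast_dates(date(2024, 6, 10), 1, 7)
print(dates[0], dates[-1])  # 2024-06-11 2024-06-17
```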

You can select from the following prediction timestamp options when deploying a model:

  • Use value from date/time feature: Default. Use the date/time provided as a feature with the prediction data (e.g., forecast date) to determine the timestamp.
  • Use time of prediction request: Use the time you submitted the prediction request to determine the timestamp.

To view additional information on the Drift Over Time chart, hover over a marker in the chart to see the Time of Prediction, PSI, and Sample size:

Drift Over Time and Predictions Over Time comparison

The X-axis of the Drift Over Time chart aligns with the X-axis of the Predictions Over Time chart below to make comparing the two charts easier. In addition, the Sample size data on the Drift Over Time chart is equivalent to the Number of Predictions data from the Predictions Over Time chart.

Predictions over time chart

The Predictions over time chart provides an at-a-glance determination of how the model's predictions have changed over time. For example:

Dave sees that his model is predicting 1 (readmitted) noticeably more frequently over the past month. Because he doesn't know of a corresponding change in the actual distribution of readmissions, he suspects that the model has become less accurate. With this information, he investigates further whether he should consider retraining.

Although the charts for binary classification and regression differ slightly, the takeaway is the same—are the plot lines relatively stable across time? If not, is there a business reason for the anomaly (for example, a blizzard)? One way to check this is to look at the bar chart below the plot. If the point for a binned period is abnormally high or low, check the histogram below to ensure there are enough predictions for this to be a reliable data point.

Time of Prediction

The Time of Prediction value differs between the Data drift and Accuracy tabs and the Service health tab:

  • On the Service health tab, the "time of prediction request" is always the time the prediction server received the prediction request. This method of prediction request tracking accurately represents the prediction service's health for diagnostic purposes.

  • On the Data drift and Accuracy tabs, the "time of prediction request" is, by default, the time you submitted the prediction request, which you can override with the prediction timestamp in the Prediction History and Service Health settings.

Additionally, both charts have Training and Scoring labels across the X-axis. The Training label indicates the section of the chart that shows the distribution of predictions made on the holdout set of training data for the model. It will always have one point on the chart. The Scoring label indicates the section of the chart showing the distribution of predictions made on the deployed model. Scoring indicates that the model is in use to make predictions. It will have multiple points along the chart to indicate how prediction distributions change over time.

For regression projects

The Predictions over time chart for regression projects plots the average predicted value, as well as a visual indicator of the middle 80% range of predicted values for both training and prediction data. Hover over a point on the chart to view its details:

Field: Description
Date: The starting date of the bin data. Displayed values are based on counts from this date to the next point along the graph. For example, if the date on point A is 01-07 and point B is 01-14, then point A covers everything from 01-07 to 01-13 (inclusive).
Average Predicted Value: The average of the values for all points included in the bin.
10th-90th Percentile: The range between the 10th and 90th percentiles of the predicted values for that time period (the middle 80% of predictions).
Predictions: The number of predictions included in the bin. Compare this value against other points if you suspect anomalous data.
Num. Anomalies: If you have enabled prediction warnings for a deployment, the yellow section of the bar chart represents the anomalous predictions for a point in time. To view the number of anomalous predictions for a specific period, hover over the point on the plot corresponding to the flagged predictions in the bar chart. Prediction warnings are only available for regression model deployments.
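
As a rough illustration of the fields above, the sketch below summarizes one time bin of predicted values into its average, 10th and 90th percentiles, and prediction count. The function names are hypothetical, and the linear-interpolation percentile is one common convention, chosen here as an assumption; DataRobot's exact percentile method is not stated in this document.

```python
def percentile(sorted_values, q):
    """Percentile with linear interpolation between the closest ranks."""
    position = (len(sorted_values) - 1) * q / 100
    lower = int(position)
    upper = min(lower + 1, len(sorted_values) - 1)
    fraction = position - lower
    return sorted_values[lower] + (sorted_values[upper] - sorted_values[lower]) * fraction


def summarize_bin(predictions):
    """Summarize the predicted values that fall into one time bin."""
    values = sorted(predictions)
    if not values:
        raise ValueError("a bin needs at least one prediction")
    return {
        "average": sum(values) / len(values),
        "p10": percentile(values, 10),
        "p90": percentile(values, 90),
        "predictions": len(values),
    }


print(summarize_bin(range(101)))
```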

Training data details

If training data is uploaded, the graph displays both the 10th-90th percentile and the mean value of the target, represented by an open circle. You can also display this information for the mean value of the target by hovering on the point in the training data.

For binary classification projects

The Predictions over time chart for binary classification projects plots the class percentages based on the labels you set when you added the deployment (in this example, 0 and 1). Hover over a data point to see the specific values.

The Predictions over time chart can display data in Continuous mode and Binary mode:

Continuous mode shows the positive class predictions as probabilities between 0 and 1, without taking the prediction threshold into account.

The following details are available in continuous mode:

Field: Description
Date: The starting date of the bin data. Displayed values are based on counts from this date to the next point along the graph. For example, if the date on point A is 01-07 and point B is 01-14, then point A covers everything from 01-07 to 01-13 (inclusive).
Average Predicted Value: The average of the values for all points included in the bin.
10th-90th Percentile: The range between the 10th and 90th percentiles of the predicted values for that time period.
Predictions: The number of predictions included in the bin. Compare this value against other points if you suspect anomalous data.

Training data details

If training data is uploaded, the graph displays both the 10th-90th percentile and the mean value of the target, represented by an open circle. You can also display this information for the mean value of the target by hovering on the point in the training data.

Binary mode takes the prediction threshold into account and shows, of all predictions made, the percentage for each possible class.

The following additional elements are available on the Predictions over time chart for a classification model in binary mode:

Element: Description
1. Display or hide the data for a class label on the predictions over time chart.
2. Switch between continuous and binary modes in the predictions over time chart for a binary classification deployment. Hover over a point on the chart to view its details.
3. View the threshold set for prediction output. The threshold is set when adding your deployment to the inventory and cannot be revised.
4. View the mean value of the target in the training data.

The following details are available in binary mode:

Field: Description
Date: The starting date of the bin data. Displayed values are based on counts from this date to the next point along the graph. For example, if the date on point A is 01-07 and point B is 01-14, then point A covers everything from 01-07 to 01-13 (inclusive).
Class label 1: For all points included in the bin, the percentage of those in the "positive" class (0 in this example).
Class label 2: For all points included in the bin, the percentage of those in the "negative" class (1 in this example).
Number of Predictions: The number of predictions included in the bin. Compare this value against other points if you suspect anomalous data.
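
The difference between continuous mode and binary mode can be sketched with a handful of positive-class probabilities. This is an illustration under stated assumptions: the function names, the class labels, the 0.5 default threshold, and the choice to assign probabilities at or above the threshold to the positive class are all hypothetical.

```python
def continuous_summary(probabilities):
    """Continuous mode: average positive-class probability, threshold ignored."""
    return sum(probabilities) / len(probabilities)


def binary_summary(probabilities, threshold=0.5,
                   positive_label="positive", negative_label="negative"):
    """Binary mode: percentage of predictions assigned to each class."""
    total = len(probabilities)
    positives = sum(1 for p in probabilities if p >= threshold)
    return {
        positive_label: 100 * positives / total,
        negative_label: 100 * (total - positives) / total,
    }


bin_probabilities = [0.1, 0.4, 0.6, 0.9]
print(continuous_summary(bin_probabilities))
print(binary_summary(bin_probabilities, threshold=0.7))
```

The same set of probabilities produces a single average in continuous mode, while the class percentages in binary mode change whenever the threshold changes.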

Drill down on the Data drift tab

The Data drift > Drill down tab visualizes the difference in distribution over time between the training dataset of the deployed model and the datasets used to generate predictions in production. The drift away from the baseline established with the training dataset is measured using the Population Stability Index (PSI). As a model continues to make predictions on new data, the change in the drift status over time is visualized as a heat map for each tracked feature, allowing you to identify data drift trends.

Using the Drill down tab, you can compare data drift heat maps across the features in a deployment to identify correlated drift trends. In addition, you can select one or more features from the heat map to view a Feature Drift Comparison chart, comparing the change in a feature's data distribution between a reference time period and a comparison time period to visualize drift. This information helps you identify the cause of data drift in your deployed model, including data quality issues, changes in feature composition, or changes in the context of the target variable.

Configure the drill down display settings

The Drill Down tab includes the following display controls:

Control: Description
1. Model: Updates the heatmap to display the model you selected from the dropdown.
2. Date slider: Limits the range of data displayed on the dashboard (i.e., zooms in on a specific time period).
3. Range (UTC): Sets the date range displayed for the deployment date slider. The range selector only allows you to select dates and times between the start date of the deployment's current version of a model and the current date.
4. Resolution: Sets the time granularity of the deployment date slider. The following resolution settings are available, based on the selected range:
  • Hourly: If the range is less than 7 days.
  • Daily: If the range is between 1-60 days (inclusive).
  • Weekly: If the range is between 1-52 weeks (inclusive).
  • Monthly: If the range is at least 1 month and less than 120 months.
5. Reset: Reverts the dashboard controls to the default settings.

Use the feature drift heat map

The Feature drift for all features heat map includes the following elements and controls:

Element: Description
1. Prediction time: Represents the time range of the predictions used to calculate the corresponding drift value (PSI). Below the X-axis, the Prediction sample size bar chart represents the number of predictions made during the corresponding prediction time range.
2. Feature: Represents the features in a deployment's dataset. Click a feature name to generate the feature drift comparison below.
3. Status heat map: Displays the drift status over time for each of a deployment's features. Drift status visualizations are based on the data drift settings. The deployment owner can also set the drift and importance thresholds in the Feature Drift vs Feature Importance chart settings. The possible drift status classifications are:
  • Healthy (Green): The feature is experiencing minimal drift. No action needed, but monitor features that approach the threshold.
  • At risk (Yellow): A lower importance feature is experiencing drift above the set threshold. Monitor closely.
  • Failing (Red): A high importance feature is experiencing drift above the set threshold. Investigate immediately.
  Feature importance is determined by comparing the feature impact score with the importance threshold value. For an important feature, the feature impact score is greater than or equal to the importance threshold.
4. Prediction sample size: Displays the number of rows of prediction data used to calculate data drift for the given time period. To view additional information on the prediction sample size, hover over a bin in the chart to see the time of prediction range and the sample size value.

Use the feature drift comparison chart

The Feature drift comparison section includes the following elements and controls:

Element: Description
1. Reference period: Sets the date range of the period to use as a baseline for the drift comparison charts.
2. Comparison period: Sets the date range of the data distribution period to compare against the reference period. You can also select an area of interest on the heat map to serve as the comparison period.
3. Feature values: Represents the range of values in the dataset for the feature in the Feature Drift Comparison chart.
4. Percentage of Records: Represents the percentage of the total dataset represented by a range of values and provides a visual comparison between the selected reference and comparison periods.
5. Add a feature drift comparison chart: Generates a Feature Drift Comparison chart for a selected feature.
6. Remove this chart: Removes a Feature Drift Comparison chart.

Set the comparison period on the feature drift heat map

To select an area of interest on the heat map to serve as the comparison period, click and drag to select the period you want to target for feature drift comparison:

To view additional information on a Feature Drift Comparison chart, hover over a bar in the chart to see the range of values contained in that bar, the percentage of the total dataset those values represent in the Reference period, and the percentage of the total dataset those values represent in the Comparison period:

Updated October 11, 2024