
Data Drift tab

As training and production data change over time, a deployed model loses predictive power. The data surrounding the model is said to be drifting. By leveraging the training data and prediction data (also known as inference data) that is added to your deployment, the Data Drift dashboard helps you to analyze a model's performance after it has been deployed. It provides three interactive and exportable visualizations that help identify the health of a deployed model over a specified time interval.

  • The Feature Drift vs. Feature Importance chart (1): plots the importance of a feature in a model against how much the distribution of actual feature values has changed, or drifted, between one point in time and another.

  • The Feature Details chart (2): plots the percentage of records (that is, the distribution) of the selected feature in the training data compared to the inference data.

  • The Predictions Over Time chart (3): illustrates how the distribution of a model's predictions has changed over time (target drift). The display differs depending on whether the project is regression or binary classification.

You can modify the display as needed:

  • The model version selector updates the displays to reflect the model you selected from the dropdown (only available for custom model deployments).

  • The time resolution dropdown sets the time granularity of the deployment date slider.

  • The refresh button refreshes the page with new data on demand. Otherwise, DataRobot refreshes every 15 minutes.

  • The deployment date slider, above the charts, limits the range of data displayed (i.e., zooms in on a specific time period).

  • The export button allows you to download each chart as a PNG, CSV, or ZIP file.

  • You can customize how a deployment calculates data drift status by configuring drift and importance thresholds and editing additional settings.

  • The Data Drift dashboard supports segmented analysis, allowing you to view data drift comparing a subset of training data to predictions data for individual segment attributes and values.

Feature Drift vs. Feature Importance chart

The Feature Drift vs. Feature Importance chart monitors the 25 most impactful numerical, categorical, and text-based features in your data.

Use the chart to see if data is different at one point in time compared to another. Differences may indicate problems with your model or in the data itself. For example, if users of an auto insurance product are getting younger over time, the data that built the original model may no longer result in accurate predictions for your newer data. In particular, drift in high-importance features can be a warning sign about your model's accuracy.

The chart's X-axis reports the Importance score DataRobot calculated when ingesting the learning (or training) data. The dot resting at the Importance value of 1 is the target prediction. The most important feature in the model also appears at 1 (as a solid green dot).

The Y-axis reports the drift value. This value is a calculation of the Population Stability Index (PSI), a measure of difference in distribution over time.

Hover over a point in the chart to identify the feature name and report the precise values for drift (Y-axis) and importance (X-axis).
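DataRobot computes the drift value internally, so you never need to calculate PSI yourself. For readers who want a feel for the metric, here is a minimal sketch of a PSI-style comparison between two already-binned distributions; the bin counts, function name, and epsilon smoothing are illustrative assumptions, not DataRobot's implementation.

```python
import numpy as np

def population_stability_index(expected_counts, actual_counts, eps=1e-4):
    """Illustrative PSI between two binned distributions.

    expected_counts: bin counts from the training (baseline) data.
    actual_counts:   bin counts from the inference data, using the same bins.
    eps:             small constant to avoid dividing by or taking the log of zero.
    """
    expected = np.asarray(expected_counts, dtype=float)
    actual = np.asarray(actual_counts, dtype=float)

    # Convert counts to proportions, guarding against empty bins.
    expected_pct = np.clip(expected / expected.sum(), eps, None)
    actual_pct = np.clip(actual / actual.sum(), eps, None)

    # PSI sums (actual% - expected%) * ln(actual% / expected%) over all bins.
    return float(np.sum((actual_pct - expected_pct) * np.log(actual_pct / expected_pct)))

# Example: one feature's values bucketed into the same five bins for both datasets.
training_bins = [120, 340, 280, 180, 80]
inference_bins = [60, 250, 310, 260, 120]
print(round(population_stability_index(training_bins, inference_bins), 3))
```

Larger PSI values indicate a bigger shift between the two distributions; identical distributions score 0.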

Interpret the quadrants

The quadrants represented in the chart help to visualize feature-by-feature data drift plotted against the feature's importance. Quadrants can be loosely interpreted as follows:

  • Quadrant 1 (Red): High-importance feature(s) are experiencing high drift. Investigate immediately.

  • Quadrant 2 (Yellow): Lower-importance feature(s) are experiencing drift above the set threshold. Monitor closely.

  • Quadrant 3 (Green): Lower-importance feature(s) are experiencing minimal drift. No action needed.

  • Quadrant 4 (Green): High-importance feature(s) are experiencing minimal drift. No action needed, but monitor features that approach the threshold.

Note that points on the chart can also be gray or white. Gray circles represent features that have been excluded from drift status calculation, and white circles represent features set to high importance.

If you are the project owner, you can click the gear icon in the upper-right corner of the chart to reset the quadrants. The drift threshold defaults to 0.15. The Y-axis scales from 0 to either 0.25 or the highest observed drift value, whichever is higher. You can customize the quadrants by changing the drift and importance thresholds.
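If it helps to reason about the quadrants programmatically, the following sketch classifies a single feature given its drift and importance values. The 0.15 drift threshold matches the documented default, while the importance threshold and function name are example values only, not DataRobot's internal logic.

```python
def quadrant(drift, importance, drift_threshold=0.15, importance_threshold=0.5):
    """Illustrative quadrant assignment for one feature."""
    high_importance = importance >= importance_threshold
    high_drift = drift >= drift_threshold
    if high_drift and high_importance:
        return 1, "red"      # investigate immediately
    if high_drift:
        return 2, "yellow"   # monitor closely
    if high_importance:
        return 4, "green"    # no action needed, but monitor
    return 3, "green"        # no action needed

print(quadrant(drift=0.22, importance=0.8))  # (1, 'red')
```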

Feature Details chart

The Feature Details chart provides a histogram that compares the distribution of a selected feature in the training data to the distribution of that feature in the inference data.

The chart displays 13 bins for numeric features:

  • 10 bins contain the top 25 values based on Feature Impact from your training data.

  • There is a bin for values under the minimum of the top 25 values and another bin for values over the maximum.

  • There is an additional bin for the Missing count, containing all records with missing feature values (that is, NaN as the value of one of the features).

For categorical features, the chart includes two unique bins:

  • The Other bin contains all values other than the 25 most frequent values.

  • The New level bin only displays after you make predictions with data that has a new value for a feature not in the training data. For example, consider a dataset about housing prices with the categorical feature City. If your inference data contains the value Boston and your training data did not, the Boston value (along with any other unseen cities) is represented in the New level bin.

To use the chart, select a feature from the dropdown. The list, which defaults to the target feature, includes all tracked features. Alternatively, click a point in the Feature Drift vs. Feature Importance chart to display that feature's details.
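To make the binning behavior concrete for a categorical feature, here is a small sketch that computes the percentage of records per bin for training versus inference data, folding infrequent values into an Other bin and unseen values into a New level bin. The max_levels parameter, function names, and sample data are assumptions for illustration, not DataRobot's implementation.

```python
from collections import Counter

def categorical_bins(training_values, inference_values, max_levels=25):
    """Illustrative per-bin percentages for a categorical feature."""
    train_counts = Counter(training_values)
    # Keep the most frequent training values as named bins; the rest fall into "Other".
    top_levels = [value for value, _ in train_counts.most_common(max_levels)]

    def bucket(value):
        if value in top_levels:
            return value
        # Values never seen in training land in "New level"; known-but-rare ones in "Other".
        return "New level" if value not in train_counts else "Other"

    def percentages(values):
        buckets = Counter(bucket(v) for v in values)
        total = sum(buckets.values())
        return {name: 100.0 * count / total for name, count in buckets.items()}

    return percentages(training_values), percentages(inference_values)

train = ["NYC", "NYC", "Chicago", "Chicago", "Austin"]
inference = ["NYC", "Boston", "Chicago", "Boston", "Austin", "NYC"]
train_pct, inference_pct = categorical_bins(train, inference, max_levels=2)
print(train_pct)      # e.g. {'NYC': 40.0, 'Chicago': 40.0, 'Other': 20.0}
print(inference_pct)  # 'Boston' is unseen in training, so it lands in 'New level'
```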

Predictions Over Time chart

The Predictions Over Time chart provides an at-a-glance determination of how the model's predictions have changed over time. For example:

Dave sees that his model is predicting 1 (readmitted) noticeably more frequently over the past month. Because he doesn't know of a corresponding change in the actual distribution of readmissions, he suspects that the model has become less accurate. With this information, he investigates further whether he should consider retraining.

Although the charts for binary classification and regression differ slightly, the takeaway is the same: are the plot lines relatively stable across time? If not, is there a business reason for the anomaly (for example, a blizzard)? One way to check is to look at the bar chart below the plot. If the point for a binned period is abnormally high or low, check the histogram below to make sure there is a sufficient number of predictions for this to be a reliable data point.

Additionally, both charts have Training and Scoring labels across the X-axis. The Training label indicates the section of the chart that shows the distribution of predictions made on the holdout set of training data for the model. It will always have one point on the chart. The Scoring label indicates the section of the chart showing the distribution of predictions made on the deployed model. Scoring indicates that the model is in use to make predictions. It will have multiple points along the chart to indicate how prediction distributions change over time.

For regression projects

The Predictions Over Time chart for regression projects plots the average predicted value, as well as a visual indicator of the middle 80% range of predicted values, for both training and prediction data. If training data is uploaded, the graph displays both the 10th-90th percentile range and the mean value of the target.

Hover over a point on the chart to view its details:

  • Date: The starting date of the bin data. Displayed values are based on counts from this date to the next point along the graph. For example, if the date on point A is 01-07 and point B is 01-14, then point A covers everything from 01-07 to 01-13 (inclusive).
  • Average Predicted Value: For all points included in the bin, this is the average of their values.
  • Predictions: The number of predictions included in the bin. Compare this value against other points if you suspect anomalous data.
  • 10th-90th Percentile: The range between the 10th and 90th percentiles of predictions for that time period.

Note that you can also display this information for the mean value of the target by hovering on the point in the training data.
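As a rough illustration of the per-bin values above, the sketch below groups synthetic predictions into weekly bins and reports the average predicted value, the number of predictions, and the 10th and 90th percentiles per bin. The weekly bin width and pandas-based aggregation are assumptions for the example, not how DataRobot bins data internally.

```python
import numpy as np
import pandas as pd

# Timestamped predictions from a deployed regression model (synthetic example data).
rng = np.random.default_rng(0)
predictions = pd.DataFrame({
    "timestamp": pd.date_range("2021-01-07", periods=200, freq="6H"),
    "prediction": rng.normal(loc=50, scale=10, size=200),
})

# Group predictions into weekly bins and summarize each bin.
binned = predictions.set_index("timestamp").resample("7D")["prediction"]
summary = pd.DataFrame({
    "Average Predicted Value": binned.mean(),
    "Predictions": binned.count(),
    "10th Percentile": binned.quantile(0.10),
    "90th Percentile": binned.quantile(0.90),
})
print(summary)
```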

For binary classification projects

The Predictions Over Time chart for binary classification projects plots the class percentages, based on the labels you set when you added the deployment (in this example, 0 and 1). It also reports the threshold set for prediction output. The threshold is set when adding your deployment to the inventory and cannot be revised.

Hover over a point on the chart to view its details:

  • Date: The starting date of the bin data. Displayed values are based on counts from this date to the next point along the graph. For example, if the date on point A is 01-07 and point B is 01-14, then point A covers everything from 01-07 to 01-13 (inclusive).
  • <class-label>: For all points included in the bin, the percentage of those in the "positive" class (0 in this example).
  • <class-label>: For all points included in the bin, the percentage of those in the "negative" class (1 in this example).
  • Number of Predictions: The number of predictions included in the bin. Compare this value against other points if you suspect anomalous data.

Additionally, the chart displays the mean value of the target in the training data. As with all plotted points, you can hover over it to see the specific values.

The chart also includes a toggle in the upper-right corner that allows you to switch between continuous and binary modes (only for binary classification deployments):

Continuous mode shows the positive class predictions as probabilities between 0 and 1, without taking the prediction threshold into account:

Binary mode takes the prediction threshold into account and shows, of all predictions made, the percentage for each possible class:
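The difference between the two modes can be sketched in a few lines: continuous mode summarizes the raw positive-class probabilities, while binary mode first applies the prediction threshold and then reports the share of each class. The probabilities, threshold, and class labels below are example values only.

```python
import numpy as np

# Positive-class probabilities returned for one time bin (synthetic example values).
positive_probabilities = np.array([0.12, 0.48, 0.55, 0.71, 0.33, 0.90, 0.62, 0.05])
threshold = 0.5  # example threshold set when the deployment was created

# Continuous mode: look at the probabilities themselves, ignoring the threshold.
print("mean probability:", positive_probabilities.mean())

# Binary mode: apply the threshold, then report the percentage of each class.
predicted_class = np.where(positive_probabilities >= threshold, "1", "0")
labels, counts = np.unique(predicted_class, return_counts=True)
for label, count in zip(labels, counts):
    print(f"class {label}: {100.0 * count / len(predicted_class):.1f}%")
```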

Prediction warnings integration

If you have enabled prediction warnings for a deployment, any anomalous prediction values that trigger a warning are flagged in the Predictions Over Time bar chart.

Note

Prediction warnings are only available for regression model deployments.

The yellow section of the bar chart represents the anomalous predictions for a point in time.

To view the number of anomalous predictions for a specific time period, hover over the point on the plot corresponding to the flagged predictions in the bar chart.

Use the version selector

You can change the data drift display to analyze the current, or any previous, version of a model in the deployment. Initially, if there has been no model replacement, you only see the Current option. The models listed in the dropdown can also be found in the History section of the Overview tab. This functionality is only supported with deployments made with models or model images.

Use the time range and resolution dropdowns

The Range and Resolution dropdowns help diagnose deployment issues by allowing you to change the granularity of the three deployment monitoring tabs: Data Drift, Service Health, and Accuracy.

Expand the Range dropdown (1) to select the start and end dates for the time range you want to examine. You can specify the time of day for each date (to the nearest hour, rounded down) by editing the value after selecting a date. When you have determined the desired time range, click Update range (2). Select the Range reset icon (3) to restore the time range to the previous setting.

Note

The date picker only allows you to select dates and times between the start date of the deployment's current model version and the current date.

After setting the time range, use the Resolution dropdown to determine the granularity of the date slider. Select from hourly, daily, weekly, and monthly granularity based on the time range selected. If the time range is longer than 7 days, hourly granularity is not available.

When you choose a new value from the Resolution dropdown, the resolution of the date selection slider changes. Then, you can select start and end points on the slider to home in on the time range of interest.

Note that the selected slider range also carries across the Service Health and Accuracy tabs (but not across deployments).

Use the date slider

The date slider limits the time range used for comparing prediction data to training data. The upper dates, shown at the left and right edges of the slider, indicate the range currently used for comparison in the page's visualizations. The lower dates indicate the full date range of prediction data available. The circles mark the "data buckets," which are determined by the time range.

To use the slider, click a point to move the line or drag the endpoint left or right.

The visualizations use predictions from the starting point of the updated time range as the baseline reference point, comparing them to predictions occurring up to the last date of the selected time range.

You can also move the slider to a different time interval while maintaining the periodicity. Click anywhere on the slider between the two endpoints to drag it (you will see a hand icon on your cursor).

In the example above, the slider spans a 3-month time interval. You can drag the slider to different dates while maintaining that 3-month interval.

By default, the slider is set to display the same date range that is used to calculate and display drift status. For example, if drift status is calculated over the last week, the default slider range spans from one week ago to the current date.

You can move the slider to any date range without affecting the data drift status display on the health dashboard. If you do so, a Reset button appears above the slider. Clicking it will revert the slider to the default date range that matches the range of the drift status.

Class selector

Multiclass deployments offer class-based configuration to modify the data displayed on the Data Drift graphs.

Predictions over Time multiclass graph:

Feature Details multiclass graph:

Customize data drift status

By default, the data drift status for deployments is marked as Failing when at least one high-importance feature exceeds the set drift metric threshold; it is marked as At Risk when no high-importance features exceed the threshold but at least one low-importance feature does. Deployment owners can customize the rules used to calculate the drift status for each deployment. Customization happens in a number of ways:

  • Define or override the list of high or low importance features in order to monitor features that are important to you, or put less emphasis on features that are less important.

  • Exclude features expected to drift from drift status calculation and alerting, so you do not get false alarms.

  • Customize what At-Risk and Failing drift statuses mean to make the drift status of each deployment personalized and tailored to your needs.

To customize drift status for a deployment, navigate to the Settings > Monitoring tab.

  1. Adjust the time range of the Comparison period, which compares training data to prediction data. Select a time range from the dropdown.

  2. Configure the thresholds of the Drift metric, which is used to calculate drift. DataRobot only supports the Population Stability Index (PSI) metric. When drift thresholds are changed, the Feature Drift vs. Feature Importance chart updates to reflect the changes.

    You can exclude features (including the target) from drift status calculation by clicking the excluded features link. A dialog box prompts you to enter the names of features you want to exclude.

    Once added, features do not affect drift status for the deployment but still display on the Feature Drift vs. Feature Importance chart. In the example below, the excluded feature, which appears as a gray circle, would normally change the drift status to Failing. Because it is excluded, the status remains Passing.

  3. Configure the thresholds of the Importance metric. The importance metric measures the most impactful features in the training data. DataRobot only supports the Permutation Importance metric. When importance thresholds are changed, the Feature Drift vs. Feature Importance chart updates to reflect the changes. In the example below, the chart has adjusted importance and drift thresholds (indicated by the arrows), resulting in more features marked At Risk and Failing than in the chart above.

    You can set features to be treated as high importance even if they were initially assigned low importance by clicking the starred features link. A dialog box prompts you to enter the names of features you want to set.

    Once added, these features are assigned high importance and ignore the importance thresholds, but still display on the Feature Drift vs. Feature Importance chart. In the example below, the starred feature, which appears as a white circle, would normally cause the drift status to be At Risk due to its initially low importance. However, since it is assigned high importance, the feature changes the drift status to Failing.

  4. Configure the values that trigger the At Risk and Failing drift statuses. For example, you can configure the rule for a deployment to mark its drift status as At Risk if one of the following is true:

    • The number of low-importance features above the drift threshold is greater than 1, OR

    • The number of high-importance features above the drift threshold is greater than 3.
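As a rough sketch of how such rules come together, the function below counts drifting features by importance, honoring excluded and starred features, and maps the counts to a status. It implements the default behavior described above with the 0.15 default drift threshold; the data structure and function name are assumptions for illustration, and the count-based triggers can be adjusted to match whatever rule you configure.

```python
def drift_status(features, drift_threshold=0.15):
    """Illustrative version of the default rule: Failing if any high-importance
    feature drifts past the threshold, At Risk if only low-importance features do.
    """
    high_drifting = low_drifting = 0
    for feature in features:
        if feature.get("excluded"):
            continue  # excluded features never affect drift status
        # Starred features are treated as high importance regardless of their score.
        importance = "high" if feature.get("starred") else feature["importance"]
        if feature["drift"] > drift_threshold:
            if importance == "high":
                high_drifting += 1
            else:
                low_drifting += 1

    if high_drifting >= 1:
        return "Failing"
    if low_drifting >= 1:
        return "At Risk"
    return "Passing"

features = [
    {"name": "age", "importance": "high", "drift": 0.22},
    {"name": "zip_code", "importance": "low", "drift": 0.31, "excluded": True},
    {"name": "income", "importance": "low", "drift": 0.05},
]
print(drift_status(features))  # "Failing": a high-importance feature drifted past 0.15
```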

When you have made your desired configuration changes, click Save new settings at the top of the page. If you are not satisfied with the configuration or want to restore the default settings, click Reset to defaults. Note that these changes affect the entire history of a deployment, and only affect periods of time in which predictions were made.


Updated November 5, 2021