Drill down on the Data Drift tab

Availability information

The data drift drill down feature is off by default. Contact your DataRobot representative or administrator for information on enabling this feature.

Feature flag: Enable Drift Drill Down Plot

The Data Drift > Drill Down chart visualizes the difference in distribution over time between the training dataset of the deployed model and the datasets used to generate predictions in production. The drift away from the baseline established with the training dataset is measured using the Population Stability Index (PSI). As a model continues to make predictions on new data, the change in the drift status over time is visualized as a heat map for each tracked feature, allowing you to identify data drift trends.
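As a rough illustration of the metric, the commonly used PSI definition sums (actual − expected) × ln(actual / expected) over the bins of a feature, where "expected" is the share of training records in a bin and "actual" is the share of prediction records in the same bin. The sketch below assumes the data has already been binned; the function name `psi`, the `eps` floor for empty bins, and the sample counts are illustrative assumptions, not the binning or smoothing that DataRobot applies internally.

```python
import math


def psi(expected_counts, actual_counts, eps=1e-6):
    """Population Stability Index between two binned distributions.

    expected_counts: per-bin record counts from the baseline (training) data.
    actual_counts:   per-bin record counts from the prediction data (same bins).
    eps:             floor applied to each bin share so empty bins do not cause
                     log(0) or division by zero (an assumption of this sketch).
    """
    if len(expected_counts) != len(actual_counts):
        raise ValueError("both inputs must use the same bins")
    expected_total = sum(expected_counts)
    actual_total = sum(actual_counts)
    if expected_total == 0 or actual_total == 0:
        raise ValueError("each distribution needs at least one record")

    total = 0.0
    for e, a in zip(expected_counts, actual_counts):
        e_pct = max(e / expected_total, eps)
        a_pct = max(a / actual_total, eps)
        total += (a_pct - e_pct) * math.log(a_pct / e_pct)
    return total


# Hypothetical bin counts for one feature.
training_bins = [50, 30, 20]
production_bins = [20, 30, 50]

print(psi(training_bins, training_bins))    # identical distributions give 0.0
print(psi(training_bins, production_bins))  # a shifted distribution gives a positive value
```

Computing this value once per feature and per time bucket yields the grid of numbers that a drift heat map summarizes.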

Because data drift can decrease your model's predictive power, determining when a feature started drifting and monitoring how that drift changes (as your model continues to make predictions on new data) can help you estimate the severity of the issue. Using the Drill Down tab, you can compare data drift heat maps across the features in a deployment to identify correlated drift trends. In addition, you can select one or more features from the heat map to view a Feature Drift Comparison chart, comparing the change in a feature's data distribution between a reference time period and a comparison time period to visualize drift. This information helps you identify the cause of data drift in your deployed model, including data quality issues, changes in feature composition, or changes in the context of the target variable.

Access the Drill Down tab

To access the Drill Down tab:

  1. Click Deployments, and then select a drift-enabled deployment from the Deployments inventory.

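  2. In the deployment, click Data Drift, and then click Drill Down.

  3. On the Drill Down tab, configure the display settings, use the feature drift heat map, and use the feature drift comparison chart, as described in the sections that follow.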

Configure the drill down display settings

The Drill Down tab includes the following display controls:

Control | Description
Model | Updates the heat map to display data for the model selected in the dropdown.
Date slider | Limits the range of data displayed on the dashboard (i.e., zooms in on a specific time period).
Range (UTC) | Sets the date range displayed for the deployment date slider.
Resolution | Sets the time granularity of the deployment date slider.
Reset | Reverts the dashboard controls to the default settings.
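To make the effect of the Range (UTC) and Resolution controls concrete, the sketch below groups prediction timestamps into time buckets inside a UTC date range and counts the rows in each bucket. It is only an approximation of the idea: the function name `sample_size_by_bucket`, the resolution names `"hourly"` and `"daily"`, and the sample timestamps are hypothetical and do not come from the DataRobot product or API.

```python
from collections import Counter
from datetime import datetime, timezone

# Hypothetical resolution names mapped to the label used for each bucket.
BUCKET_FORMATS = {"hourly": "%Y-%m-%d %H:00", "daily": "%Y-%m-%d"}


def sample_size_by_bucket(timestamps, start, end, resolution="daily"):
    """Count predictions per time bucket within [start, end).

    All datetimes are expected to be timezone-aware; each timestamp is
    converted to UTC before it is filtered and bucketed.
    """
    fmt = BUCKET_FORMATS[resolution]
    counts = Counter()
    for ts in timestamps:
        ts_utc = ts.astimezone(timezone.utc)
        if start <= ts_utc < end:
            counts[ts_utc.strftime(fmt)] += 1
    return dict(sorted(counts.items()))


predictions = [
    datetime(2022, 11, 1, 9, 30, tzinfo=timezone.utc),
    datetime(2022, 11, 1, 17, 5, tzinfo=timezone.utc),
    datetime(2022, 11, 2, 8, 0, tzinfo=timezone.utc),
    datetime(2022, 11, 5, 12, 0, tzinfo=timezone.utc),  # outside the range below
]
range_start = datetime(2022, 11, 1, tzinfo=timezone.utc)
range_end = datetime(2022, 11, 3, tzinfo=timezone.utc)

print(sample_size_by_bucket(predictions, range_start, range_end, "daily"))
print(sample_size_by_bucket(predictions, range_start, range_end, "hourly"))
```

A narrower range drops buckets from the view, while a finer resolution splits the same predictions across more, smaller buckets.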

Use the feature drift heat map

The Feature Drift for all features heat map includes the following elements and controls:

Element | Description
Prediction time (X-axis) | Represents the time range of the predictions used to calculate the corresponding drift value (PSI). Below the X-axis, the Prediction sample size bar chart represents the number of predictions made during the corresponding prediction time range.
Feature (Y-axis) | Represents the features in a deployment's dataset. Click a feature name to generate the feature drift comparison below.
Status heat map | Displays the drift status over time for each of a deployment's features. Drift status visualizations are based on the monitoring settings configured by the deployment owner. The deployment owner can also set the drift and importance thresholds in the Feature Drift vs Feature Importance chart settings. The possible drift status classifications are:
  • Healthy (Green): The feature is experiencing minimal drift. No action needed, but monitor features that approach the threshold.
  • At risk (Yellow): A lower importance feature is experiencing drift above the set threshold. Monitor closely.
  • Failing (Red): A high importance feature is experiencing drift above the set threshold. Investigate immediately.
  Feature importance is determined by comparing the feature impact score with the importance threshold value. For an important feature, the feature impact score is greater than or equal to the importance threshold.
Prediction sample size | Displays the number of rows of prediction data used to calculate the data drift for the given time period. To view additional information on the prediction sample size, hover over a bin in the chart to see the prediction time range and the sample size value.
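The status rules in the table above can be sketched as a small function. This is a minimal reading of the text, assuming that "minimal drift" means a drift value at or below the drift threshold; the function name `drift_status` and the threshold values in the example are illustrative and are not taken from any deployment's actual settings.

```python
def drift_status(drift_value, feature_impact, drift_threshold, importance_threshold):
    """Classify one heat map cell from the rules described above.

    drift_value:          PSI for the feature in one time period.
    feature_impact:       feature impact score for the feature.
    drift_threshold:      drift above this value counts as drifting.
    importance_threshold: a feature is important when its impact score is
                          greater than or equal to this value.
    """
    if drift_value <= drift_threshold:
        # Assumption of this sketch: anything not above the threshold is healthy.
        return "Healthy"
    if feature_impact >= importance_threshold:
        return "Failing"
    return "At risk"


# Hypothetical thresholds and (drift value, feature impact) pairs.
DRIFT_THRESHOLD = 0.15
IMPORTANCE_THRESHOLD = 0.5

for drift, impact in [(0.05, 0.9), (0.30, 0.2), (0.30, 0.5)]:
    status = drift_status(drift, impact, DRIFT_THRESHOLD, IMPORTANCE_THRESHOLD)
    print(drift, impact, status)
```

Because importance only matters once drift exceeds its threshold, the drift check comes first; changing either threshold in the deployment settings would shift cells between the three statuses.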

Use the feature drift comparison chart

The Feature Drift Comparison section includes the following elements and controls:

Element | Description
Reference period | Sets the date range of the period to use as a baseline for the drift comparison charts.
Comparison period | Sets the date range of the period to compare data distribution against the reference period. You can also select an area of interest on the heat map to serve as the comparison period.
Feature values (X-axis) | Represents the range of values in the dataset for the feature in the Feature Drift Comparison chart.
Percentage of Records (Y-axis) | Represents the percentage of the total dataset represented by a range of values and provides a visual comparison between the selected reference and comparison periods.
Add a feature drift comparison chart | Generates a Feature Drift Comparison chart for a selected feature.
Remove this chart | Removes a Feature Drift Comparison chart.

Set the comparison period on the feature drift heat map

To select an area of interest on the heat map to serve as the comparison period, click and drag across the period you want to target for feature drift comparison.

To view additional information on a Feature Drift Comparison chart, hover over a bar in the chart to see the range of values contained in that bar, the percentage of the total dataset those values represent in the Reference period, and the percentage of the total dataset those values represent in the Comparison period.
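The numbers behind each pair of bars can be approximated as follows: bin the feature values from each period with the same bin edges, then express each bin as a percentage of that period's records. The helper name `percent_of_records`, the bin edges, and the sample values are assumptions for illustration; the chart's actual binning is not described here.

```python
def percent_of_records(values, bin_edges):
    """Return the percentage of records that fall into each bin.

    Bins are half-open, [edge_i, edge_i+1), except the last bin, which also
    includes its right edge. Values outside the edges are not placed in any
    bin but still count toward the total number of records.
    """
    counts = [0] * (len(bin_edges) - 1)
    for value in values:
        for i in range(len(counts)):
            low, high = bin_edges[i], bin_edges[i + 1]
            is_last = i == len(counts) - 1
            if low <= value < high or (is_last and value == high):
                counts[i] += 1
                break
    total = len(values)
    if total == 0:
        return [0.0] * len(counts)
    return [100.0 * count / total for count in counts]


# Hypothetical values of one numeric feature in each period.
reference_values = [1, 2, 2, 3, 8]
comparison_values = [6, 7, 8, 9, 2]
edges = [0, 5, 10]

reference_pct = percent_of_records(reference_values, edges)
comparison_pct = percent_of_records(comparison_values, edges)

for i in range(len(edges) - 1):
    print(f"[{edges[i]}, {edges[i + 1]}]: reference {reference_pct[i]}%, "
          f"comparison {comparison_pct[i]}%")
```

In this example most reference records fall in the lower bin while most comparison records fall in the upper bin, which is the kind of shift the side-by-side bars make visible.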


Updated November 7, 2022