Individual Prediction Explanations

Note

Prediction Explanations has been renamed Individual Prediction Explanations in Workbench to better communicate the feature’s functionality as a local explanation method that calculates SHAP values for each individual row. Where DataRobot Classic supports both XEMP and SHAP explanations, Workbench supports only SHAP explanations because they provide more transparency due to their open source nature.

SHAP-based explanations help you understand what drives predictions on a row-by-row basis by estimating how much each feature contributes to a given prediction differing from the average. They answer why a model made a certain prediction (what drives a customer's decision to buy: age? gender? buying habits?) and then identify the impact of each factor on that decision. They are intuitive, unbounded (computed for all features), fast, and, because SHAP is open source, transparent. SHAP not only helps you understand model behavior quickly, it also lets you easily validate whether a model adheres to business rules.
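As a rough illustration of the idea (a minimal sketch using the open-source shap package with scikit-learn and made-up data, not the DataRobot API), the code below computes per-row SHAP values and shows that they add up to the difference between a row's prediction and the average prediction:

```python
import numpy as np
import pandas as pd
import shap
from sklearn.ensemble import RandomForestRegressor

# Toy data: predict spend from a few hypothetical customer features.
rng = np.random.default_rng(0)
X = pd.DataFrame({
    "age": rng.integers(18, 70, 200),
    "visits": rng.integers(0, 30, 200),
    "tenure_months": rng.integers(1, 120, 200),
})
y = 2.0 * X["visits"] + 0.5 * X["age"] + rng.normal(0, 5, 200)

model = RandomForestRegressor(n_estimators=50, random_state=0).fit(X, y)

explainer = shap.TreeExplainer(model)
explanation = explainer(X)

# For row 0: each feature's estimated contribution to the prediction
# differing from the average (the explainer's base value).
row = 0
for name, value in zip(X.columns, explanation.values[row]):
    print(f"{name}: {value:+.2f}")

# SHAP values are additive: base value + contributions == prediction.
print(explanation.base_values[row] + explanation.values[row].sum(),
      model.predict(X.iloc[[row]])[0])
```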

Availability information

Support for the new Individual Prediction Explanations in Workbench is on by default.

Feature flag: Universal SHAP in NextGen

The insight displays a prediction distribution chart. Hover on a bin to see the range of predictions represented by the bin and the number of predictions it contains.

Insight filters

Use the controls in the insight to change the prediction distribution chart:

| Option | Description |
|--------|-------------|
| Data selection | Set the partition and source of data to compute explanations for. |
| Data slice | Select, or create (by selecting Manage slices), a data slice to view a subpopulation of a model's data based on feature value. |
| Prediction range | In the Predictions to sample table, view only predictions within a set range. |
| Export | Download individual prediction explanations, in CSV format, based on the settings in the export modal. |

For more details about working with Individual Prediction Explanations, see the related considerations and the SHAP reference.

Set the data source

Change the data source from the Data selection dropdown when you want to use alternate data for computing explanations. The data selection consists of a dataset and, when using the current training set, a selected partition.

You can select either:

  • A partition in the current training dataset, either training, validation, or holdout. By default, the chart represents the validation partition of the training dataset.

  • An additional, perhaps external, dataset. Use this when you want to use the same model to see explanations for rows that were not in your experiment's training data. DataRobot lists all datasets associated with your Use Case (up to 100), but you can also upload external datasets. Choose:

    • The same dataset again when you want to see a different random sample of rows.
    • A different dataset (be sure to choose a dataset that the model can predict on successfully).

Note that the prediction distribution chart is not available for the training dataset's training partition.

Download explanations

To download explanations in CSV format, click Export, set each limit, and click Download. You can change the settings and download each new version; click Done to dismiss the modal when you are finished.

| Option | When checked | Otherwise |
|--------|--------------|-----------|
| Limit features per prediction | Only the specified number of top features are included in the CSV. Enter a value between 1 and the number of computed explanations, to a maximum of 100. | Explanations for all features are included. |
| Limit downloaded explanations with applied filters | Only those explanations meeting the filters set in the prediction distribution chart controls are included in the CSV. | All explanations (up to 25,000) are included. |
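If you want to analyze the exported CSV programmatically, a hedged sketch with pandas follows. The column names used here ("feature_name", "shap_value") are hypothetical, so inspect the actual header of your download first and adapt:

```python
import pandas as pd

# Load the exported file (use the path you saved it to).
df = pd.read_csv("prediction_explanations.csv")
print(df.columns.tolist())  # inspect the actual schema before relying on it

# Hypothetical columns "feature_name" and "shap_value" -- adapt these to
# the real header printed above.
df["abs_shap"] = df["shap_value"].abs()
print(
    df.groupby("feature_name")["abs_shap"]
      .mean()
      .sort_values(ascending=False)
      .head(10)  # features with the largest average absolute contribution
)
```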

Predictions to sample

The sampled rows below the prediction distribution chart are chosen according to percentiles. Each sampled row shows a preview of its single most impactful feature; expand the row to see more of its most impactful features.

Click the pencil icon to change the number of samples returned. By default, DataRobot returns five sample predictions, uniformly sampled across the range of predictions defined by the filters.
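Conceptually (this is not DataRobot's implementation), picking rows uniformly across the prediction range by percentile might look like the following sketch:

```python
import numpy as np

def sample_by_percentile(predictions: np.ndarray, k: int = 5) -> np.ndarray:
    """Return indices of k rows whose predictions sit at evenly spaced
    percentiles of the prediction range (roughly the 0th, 25th, 50th,
    75th, and 100th percentiles for k=5)."""
    order = np.argsort(predictions)
    # Evenly spaced positions across the sorted predictions.
    positions = np.linspace(0, len(predictions) - 1, k).round().astype(int)
    return order[positions]

# Usage with made-up predictions:
preds = np.random.default_rng(1).normal(size=1000)
idx = sample_by_percentile(preds, k=5)
print(preds[idx])  # lowest, ~25th percentile, median, ~75th, highest
```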

Note

The table of predictions to sample is an on-demand feature; when you click Compute, DataRobot returns details of each individual explanation. Changes to any of the settings (data source, partition, or data slice) will require recomputing the table.

Simple table view

The summary entries provide:

  • A prediction ID (for example, Prediction #1117).
  • A prediction value with colored dot corresponding to the coloring of that value in the prediction distribution chart.
  • The top contributing feature to that prediction result.

Expanded row view

Click any row in the simple table view to display additional information for its prediction. The expanded view lists, for each prediction, the features that were most impactful, ordered by SHAP score. DataRobot displays the top 10 contributing features by default; click Load more explanations to load 10 additional features per click.

The expanded view display reports:

| Field | Description |
|-------|-------------|
| SHAP score | The SHAP value assigned to this feature with respect to the prediction for this row, shown as both a visual representation and a numeric score. |
| Feature | The name of the contributing feature from the dataset. |
| Value | The value of the feature in this row. |
| Distribution | A histogram showing the distribution of the feature's values. Hover on a bar in the histogram to see bin details. |
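Ranking features by SHAP score (here taken as absolute SHAP value, so large negative contributions also surface) can be sketched as follows, with made-up feature names and values:

```python
import numpy as np

def top_features(feature_names, shap_row, feature_row, k=10):
    """Return the k (name, shap_value, feature_value) triples with the
    largest absolute SHAP value for a single row."""
    order = np.argsort(-np.abs(shap_row))[:k]
    return [(feature_names[j], shap_row[j], feature_row[j]) for j in order]

# Usage with illustrative values for a three-feature row:
names = ["age", "visits", "tenure_months"]
shap_row = np.array([0.12, -0.80, 0.35])
row_values = np.array([42, 3, 18])
for name, score, value in top_features(names, shap_row, row_values, k=10):
    print(f"{score:+6.2f}  {name} = {value}")
```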

Set prediction range

The prediction range control defines both the prediction distribution chart display and the predictions to sample output. Click the pencil icon to open a modal for setting the criteria based on prediction value; changes to the displays update immediately.

SHAP considerations

Consider the following when working with SHAP Individual Prediction Explanations in Workbench:

  • Multiclass classification experiments are not supported; they do not return SHAP Individual Prediction Explanations.

  • SHAP-based explanations for models trained into Validation and Holdout are in-sample, not stacked.

  • SHAP Individual Prediction Explanations are not supported for any project type not supported in Workbench, as well as:

    • Time-aware (OTV and time series) experiments

  • SHAP does not fully support image feature types. You can use images as features, and DataRobot returns SHAP values and SHAP impacts for them. However, the SHAP explanations chart does not show activation maps ("image explanations"); instead, it shows an image thumbnail.

  • When a link function is used, SHAP is additive in the margin space, i.e., sum(shap) = link(p) - link(p0); see the sketch after this list. The recommendation is:

    • When you require the additive qualities of SHAP, use blueprints that don't use a link function (e.g., a tree-based model).
    • When log is used as the link function, you can also explain predictions using exp(shap).
  • When the training partition is chosen as the data selection, the prediction distribution chart is not available. Once explanations are computed, however, the predictions table populates with explanations.
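As a loose illustration of the additivity note above, here is a minimal sketch using the open-source shap package with a scikit-learn gradient boosting classifier (not DataRobot's implementation). With a logit link, SHAP values add up in log-odds (margin) space, and inverting the link recovers the predicted probability:

```python
import numpy as np
import shap
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier

X, y = make_classification(n_samples=300, n_features=5, random_state=0)
model = GradientBoostingClassifier(random_state=0).fit(X, y)

explainer = shap.TreeExplainer(model)
shap_values = explainer.shap_values(X)               # log-odds (margin) space
base = float(np.ravel(explainer.expected_value)[0])  # link(p0)

row = 0
margin = base + shap_values[row].sum()               # link(p0) + sum(shap)
print(margin, model.decision_function(X[[row]])[0])  # these match

# To reason in probability space, invert the link (sigmoid for logit):
prob = 1.0 / (1.0 + np.exp(-margin))
print(prob, model.predict_proba(X[[row]])[0, 1])     # these match too
```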


Updated April 16, 2024