Individual Prediction Explanations

| Tab | Description |
|-----|-------------|
| Explanations | Helps you understand what drives predictions by estimating how much each feature contributes to a given prediction's difference from the average. |

Two insights provide alternative visualizations of this impact:

| Insight | Description |
|---------|-------------|
| SHAP Individual Prediction Explanations (this page) | Shows the effect of each feature on the prediction, on a row-by-row basis. |
| SHAP Distributions: Per Feature | Shows the distribution and density of scores per feature, visualized as a violin plot. |

For Individual Prediction Explanations, DataRobot offers two methodologies: SHAP (based on Shapley values) and XEMP (eXemplar-based Explanations of Model Predictions). XEMP-based explanations are only available in experiments that don't support SHAP.

As a result, Text Explanations in Workbench are only available in experiments that use XEMP (that is, they are not available for SHAP-based experiments). In DataRobot Classic, Text Explanations are available (when text is present) for both SHAP and XEMP projects.

SHAP-based explanations

SHAP-based explanations help you understand what drives predictions on a row-by-row basis by estimating how much each feature contributes to a given prediction's difference from the average. They answer why a model made a certain prediction (what drives a customer's decision to buy: age? gender? buying habits?) and quantify how much each factor contributed to that decision. SHAP explanations are intuitive, unbounded (computed for all features), and fast, and, because SHAP is open source, transparent. SHAP not only helps you understand model behavior quickly, it also lets you easily validate whether a model adheres to business rules.
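To make this additivity concrete, here is a minimal numpy sketch using a toy linear model, for which exact SHAP values have a simple closed form. The data, weights, and helper function are illustrative assumptions, not DataRobot code:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 3))           # toy dataset: 1000 rows, 3 features
w, b = np.array([2.0, -1.0, 0.5]), 4.0   # toy linear model: f(x) = w.x + b

def predict(X):
    return X @ w + b

# For a linear model with independent features, the exact SHAP value of
# feature i on a row x is w_i * (x_i - mean(X_i)).
phi = w * (X - X.mean(axis=0))           # SHAP values: one per row and feature

row = 0
print(phi[row].sum())                        # sum of this row's SHAP values
print(predict(X[row]) - predict(X).mean())   # prediction minus average prediction
# The two numbers match: SHAP values decompose exactly how much each feature
# moves this row's prediction away from the average prediction.
```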

Insight filters

Use the controls in the insight to change the prediction distribution chart:

| Option | Description |
|--------|-------------|
| Data selection | Set the partition and source of the data used to compute explanations. |
| Data slice | Select or create (by selecting Create slice) a data slice to view a subpopulation of the model's data based on feature values. |
| Prediction range | In the Predictions to sample table, view only predictions within a set range. |
| Export | Download individual prediction explanations, in CSV format, based on the settings in the export modal. |

For more details about working with Individual Prediction Explanations, see the related considerations and the SHAP reference.

Set the data source

Change the data source from the Data selection dropdown when you want to use alternate data for computing explanations. The data selection consists of a dataset and, when using the current training set, a selected partition.

You can choose either:

  • A partition of the current training dataset: training, validation, or holdout. By default, the chart represents the validation partition.

  • An additional, perhaps external, dataset. Use this when you want to use the same model to see explanations for rows that were not in your experiment's training data. DataRobot lists all datasets associated with your Use Case (up to 100), but you can also upload external datasets. Choose:

    • The same dataset again when you want to see a different random sample of rows.
    • A different dataset (be sure to choose a dataset that the model can predict on successfully).

Note that the prediction distribution chart is not available for the training dataset's training partition.

Download explanations

To download explanations in CSV format, click Export, set each limit, and click Download. You can change the settings and download each new version; click Done to dismiss the modal when you are finished.

| Option | When checked | Otherwise |
|--------|--------------|-----------|
| Limit features per prediction | Only the specified number of top features are included in the CSV. Enter a value between 1 and the number of computed explanations, with a maximum of 100. | Predictions for all rows are downloaded. |
| Limit downloaded explanations with applied filters | Only explanations meeting the filters set in the prediction distribution chart controls are included in the CSV. | All explanations (up to 25,000) are included. |

Predictions to sample

The sampled rows below the prediction distribution chart are chosen according to percentiles. The display for each sampled row includes a preview of the single most impactful feature for that row; expand the row to see its top contributing features.

Click the pencil icon to change the number of samples returned. By default, DataRobot returns five sampled predictions, drawn uniformly from across the range of predictions defined by the filters.
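The sampling idea can be pictured with a short sketch. This is a conceptual illustration with simulated scores, not DataRobot's internal implementation: pick evenly spaced percentiles across the filtered prediction range and return the rows closest to them.

```python
import numpy as np

rng = np.random.default_rng(1)
predictions = rng.beta(2, 5, size=10_000)    # simulated prediction scores in [0, 1]

def sample_across_range(preds, n_samples=5):
    """Return indices of rows spread uniformly across the prediction range."""
    targets = np.percentile(preds, np.linspace(0, 100, n_samples))
    # For each target percentile, take the row whose prediction is closest.
    return np.array([np.abs(preds - t).argmin() for t in targets])

rows = sample_across_range(predictions)
print(rows, predictions[rows])               # five rows, from low to high scores
```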

Note

The table of predictions to sample is an on-demand feature; when you click Compute, DataRobot returns details for each individual explanation. Changing any of the settings (data source, partition, or data slice) requires recomputing the table.

Simple table view

The summary entries provide:

  • A prediction ID (for example, Prediction #1117).
  • A prediction value with a colored dot corresponding to the coloring of that value in the prediction distribution chart.
  • The top contributing feature to that prediction result.

Expanded row view

Click any row in the simple table view to display additional information about its prediction. The expanded view lists, for each prediction, the most impactful features, ordered by SHAP score. DataRobot displays the top 10 contributing features by default; click Load more explanations to load 10 additional features with each click.

The expanded view display reports:

| Field | Description |
|-------|-------------|
| SHAP score | The SHAP value assigned to this feature with respect to the prediction for this row, shown as both a visual bar and a numeric score. |
| Feature | The name of the contributing feature from the dataset. |
| Value | The value of the feature in this row. |
| Distribution | A histogram showing the distribution of the feature's values. Hover over a bar in the histogram to see bin details. |
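The ordering by SHAP score amounts to ranking a row's features by the magnitude of their SHAP values, most impactful first. A small sketch with illustrative feature names and values (not DataRobot output):

```python
import numpy as np

# One row's SHAP values, keyed by feature name (all numbers are illustrative).
feature_names = np.array(["age", "income", "tenure", "num_purchases"])
shap_row = np.array([0.42, -1.10, 0.05, 0.87])

# Rank features by the magnitude of their contribution, most impactful first,
# mirroring how the expanded view orders explanations by SHAP score.
order = np.argsort(-np.abs(shap_row))
for name, value in zip(feature_names[order], shap_row[order]):
    print(f"{name:>14}: {value:+.2f}")
```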

Set prediction range

The prediction range control defines both the prediction distribution chart display and the predictions to sample output. Click the pencil icon to open a modal for setting the range criteria based on prediction value.

Changes to the displays update immediately.

XEMP-based explanations

XEMP-based explanations are a proprietary DataRobot method, available for all model types. They are univariate, letting you view the distribution of the effect each specific feature has on predictions. (SHAP, by contrast, is multivariate, measuring the effect of varying multiple features at once.) XEMP explanations are only available if SHAP is not supported by a model or experiment type; the appropriate Individual Prediction Explanation type is determined by DataRobot and made available when you select a model.

To access XEMP insights, click a model in the Leaderboard and choose Individual Prediction Explanation (XEMP) to expand the display. If prompted, click Compute Feature Impact.

After successful computation, the preview displays. See the DataRobot Classic documentation for full details on working with the preview, interpreting the display, and computing and downloading explanations.

Considerations

Consider the following when working with SHAP Individual Prediction Explanations in Workbench. See also the associated XEMP considerations.

  • The following experiment types do not support SHAP and return XEMP explanations instead of SHAP Individual Prediction Explanations:

    • Multiclass classification experiments.
    • Time-aware (OTV and time series) experiments.
    • Any project type not supported in Workbench.
  • SHAP-based explanations for models trained into Validation and Holdout are in-sample, not stacked.

  • SHAP does not fully support image feature types. You can use images as features and DataRobot returns SHAP values and SHAP impacts for them. However, the SHAP explanations chart will not show activation maps ("image explanations"); instead, it shows an image thumbnail.

  • When a link function is used, SHAP is additive in the margin space (sum(shap) = link(p) - link(p0)), not in the prediction space; see the sketch after this list. The recommendation is:

    • When you require the additive qualities of SHAP, use blueprints that don't use a link function (e.g., a tree-based model).
    • When log is used as the link function, you can also explain predictions using exp(shap).
  • When the training partition is chosen as the data selection, the prediction distribution chart is not available. Once explanations are computed, however, the predictions table populates with explanations.
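Margin-space additivity from the link-function consideration above can be verified numerically. Below is a minimal sketch with a toy logistic model (logit link), where exact SHAP values for the linear margin have a closed form; all data and weights are illustrative:

```python
import numpy as np
from scipy.special import expit, logit

rng = np.random.default_rng(2)
X = rng.normal(size=(1000, 3))
w, b = np.array([1.5, -0.8, 0.3]), -0.2   # toy logistic model: p = sigmoid(w.x + b)

margin = X @ w + b            # the margin (log-odds) space, i.e., link(p)
p = expit(margin)             # predicted probabilities

# Exact SHAP values for the linear margin, computed in margin space.
phi = w * (X - X.mean(axis=0))

row = 0
p0 = expit(margin.mean())     # base prediction: its logit is the average margin
print(phi[row].sum())                 # sum(shap) for this row
print(logit(p[row]) - logit(p0))      # link(p) - link(p0): matches the sum above
print(p[row] - p.mean())              # probabilities do NOT decompose additively
```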


Updated December 10, 2024