SHAP-based Prediction Explanations
This section describes SHAP-based Prediction Explanations. See also the general description of Prediction Explanations for an overview of SHAP and XEMP methodologies.
To retrieve SHAP-based Prediction Explanations, you must enable the Include only models with SHAP value support advanced option prior to model building.
SHAP-based Prediction Explanations estimate how much each feature contributes to a given prediction's difference from the average. They are intuitive, unbounded (computed for all features), fast, and, because SHAP is open source, transparent. These benefits help you understand model behavior quickly and make it easy to validate whether a model adheres to business rules.
Use SHAP to understand, for each model decision, which features are key. What drives a particular customer's decision to buy: age, gender, or buying habits? How large is each factor's effect on the decision?
Preview Prediction Explanations
DataRobot automatically computes SHAP Prediction Explanations on training data during EDA2. When the model is ready, you can review results from the Prediction Explanations tab. The display previews the top five features to provide a general "intuition" of model performance. You can then quickly compute and download explanations for the entire training dataset to perform a deeper analysis. You can also upload external datasets and manually compute (and download) explanations.
Note that you can also access explanations via the API, for both deployed and Leaderboard models.
Interpret SHAP Prediction Explanations
Open the Prediction Explanations tab to see an interactive preview of the top five features that contribute most to the difference from the average (base) prediction value. In other words, how much does each feature explain the difference? For example:
- The base prediction value—or average (1)—is 43.11.
- The prediction value for the row (2) is 67.5.
Subtract the base prediction value from the row prediction value to determine the difference from the average, in this case approximately 24.4 (67.5 - 43.11 = 24.39).
- The contribution (3) indicates, for each of the top 5 features (4), how much each feature explains the difference (the allocation of 24.4 between the features). How much is the feature responsible for pushing the target away from the average?
SHAP is additive, which means that the sum of the contributions of all features equals the difference between the row prediction value and the base prediction value.
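The additivity property can be checked on a small example. The following is a minimal sketch, not DataRobot's implementation: it computes exact Shapley values by brute force for a hypothetical three-feature toy model, using a small hypothetical background dataset to define the average (base) prediction. The model, the feature meanings, and all data values are assumptions made up for illustration.

```python
from itertools import combinations
from math import factorial

# Hypothetical toy model (an assumption for illustration only).
def model(row):
    age, income, visits = row
    return 0.5 * age + 2.0 * income + 1.5 * visits + 0.1 * age * visits

# Hypothetical background rows that define the "average" prediction.
background = [
    (30, 4.0, 2),
    (45, 6.5, 5),
    (52, 3.0, 1),
    (23, 5.5, 8),
]

def coalition_value(row, subset):
    """Average prediction when features in `subset` come from `row`
    and all remaining features come from each background row in turn."""
    total = 0.0
    for ref in background:
        mixed = tuple(row[i] if i in subset else ref[i] for i in range(len(row)))
        total += model(mixed)
    return total / len(background)

def shapley_values(row):
    """Exact Shapley values by enumerating every feature coalition."""
    n = len(row)
    values = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        phi = 0.0
        for size in range(n):
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                phi += weight * (coalition_value(row, s | {i}) - coalition_value(row, s))
        values.append(phi)
    return values

row = (40, 7.0, 6)
base_value = coalition_value(row, set())  # average prediction over the background
prediction = model(row)                   # prediction for this row
contributions = shapley_values(row)

# Additivity: contributions sum to (row prediction - base prediction).
gap = sum(contributions) - (prediction - base_value)
print(contributions, gap)
```

Brute-force enumeration is only practical for a handful of features; it is used here because it makes the additivity identity easy to verify, not because production SHAP explainers work this way.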
Some additional notes on interpreting the visualization:
- Contributions can be either positive or negative. Features that push the prediction value higher display in red and have positive contributions. Features that reduce the prediction value display in blue and have negative contributions.
- The arrows on the plot are proportional to the SHAP values that positively and negatively impact the observed prediction.
- The "Sum of all other features" is the sum of the contributions of all features that are not among the top five contributors.
See the SHAP reference for information on additivity (including possible breakages).
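To make the arithmetic in this section concrete, here is a minimal sketch that reuses the base value (43.11) and row prediction (67.5) from the example above. The feature names and individual contribution values are hypothetical, chosen only so that they sum to the difference; ranking by absolute contribution is also an assumption about how "top" features are selected.

```python
def summarize_contributions(contributions, top_n=5):
    """Return the top_n contributions by absolute size and the sum of the rest."""
    ranked = sorted(contributions.items(), key=lambda item: abs(item[1]), reverse=True)
    top = ranked[:top_n]
    other_sum = sum(value for _, value in ranked[top_n:])
    return top, other_sum

base_value = 43.11      # average (base) prediction from the example
row_prediction = 67.5   # prediction for the row from the example

# Hypothetical per-feature contributions (not taken from any real model).
contributions = {
    "age": 12.10,
    "income": 8.45,
    "tenure": -3.20,
    "visits": 4.60,
    "region": 1.75,
    "plan": 0.50,
    "device": 0.19,
}

top_features, sum_of_others = summarize_contributions(contributions)
difference = row_prediction - base_value

for name, value in top_features:
    direction = "raises" if value > 0 else "lowers"
    print(f"{name}: {value:+.2f} ({direction} the prediction)")
print(f"Sum of all other features: {sum_of_others:+.2f}")
print(f"Difference from average: {difference:.2f}")
```

The top five contributions plus the "Sum of all other features" row account for the full difference between the row prediction and the base prediction.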
View points in the distribution
Use the prediction distribution component to click through a range of prediction values and understand how the top and bottom values are explained. In the chart, the Y-axis shows the prediction value, while the X-axis indicates the frequency.
Notice that if you look at a point near the bottom of the distribution, the contribution values show more blue than red values (more negative than positive contributions). This is because the majority of key features are pushing the prediction value lower.
Computing and downloading explanations
While DataRobot automatically computes the explanations for selected records, you can compute explanations for all records by clicking the calculator icon. DataRobot computes the remaining explanations and, when they are ready, activates a download button. Click it to save the list of explanations as a CSV file. Note that the CSV contains only the top 100 explanations for each record. To see all explanations, use the API.
Upload a dataset
To compute explanations for additional data using the same model, click Upload new dataset:
DataRobot opens the Make Predictions tab where you can upload a new, external dataset. When complete, return to Prediction Explanations, where the new dataset is listed in the download area.
Click the calculator icon to compute explanations, and then download them in the same way as with the training dataset. DataRobot runs computations for the entire external dataset.