Prediction Explanations illustrate what drives predictions on a row-by-row basis. They provide a quantitative indicator of the effect variables have on the predictions, answering why a given model made a certain prediction. Understanding why a model made a particular prediction lets you validate whether the prediction makes sense. This is especially important when a human operator needs to evaluate a model decision and when a model builder needs to confirm that the model works as expected. For example, "why does the model give a 94.2% chance of readmittance?" (See more examples below.)
To access and enable Prediction Explanations, select a model on the Leaderboard and click Understand > Prediction Explanations.
DataRobot offers two methodologies for computing Prediction Explanations: SHAP (based on Shapley values) and XEMP (eXemplar-based Explanations of Model Predictions).
Because the same insight could otherwise return different results depending on the methodology used, you must enable SHAP in Advanced options before starting the project if you want SHAP-based explanations.
SHAP or XEMP-based methodology?
Both SHAP and XEMP methodologies estimate which features have stronger or weaker impact on the target for a particular row. They usually provide similar results. The list below illustrates some differences:
While the two methodologies may communicate similar findings, the values differ (because the methodologies are different).
SHAP values have a simple physical explanation.
SHAP's open source algorithm provides regulators an easy audit path. XEMP uses a well-supported DataRobot proprietary algorithm.
XEMP works for all models. SHAP works for linear models, Keras deep learning models, and tree-based models, including tree ensembles.
XEMP values are computed for up to 10 values in up to the top 50 columns. SHAP has no column or value limits.
SHAP is often 5-20 times faster than XEMP.
SHAP is additive, making it easy to see how much top-N features contribute to a prediction.
Because all blueprints are included in Autopilot when XEMP is used, XEMP-based projects may produce slightly higher accuracy. (SHAP supports all key blueprints, so accuracy is often the same.)
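The additivity property listed above can be illustrated with a short sketch. The feature names, the base value, and the SHAP values here are made-up numbers in percentage points, not DataRobot output. The only assumption carried over from SHAP itself is that the prediction equals the base (average) value plus the sum of the per-feature values.

```python
# Hypothetical SHAP values for a single row, in percentage points.
# All names and numbers are illustrative assumptions, not DataRobot output.
base_value = 10.0  # assumed average prediction
shap_values = {
    "feature_a": 18.0,
    "feature_b": -6.0,
    "feature_c": 4.0,
    "feature_d": -1.0,
    "feature_e": 0.5,
}

# Additivity: the prediction is the base value plus all per-feature values.
prediction = base_value + sum(shap_values.values())


def top_n_contribution(values, n):
    """Return the n features with the largest absolute value and their summed contribution."""
    ranked = sorted(values.items(), key=lambda item: abs(item[1]), reverse=True)
    top = ranked[:n]
    return top, sum(value for _, value in top)


top, top_total = top_n_contribution(shap_values, 2)
remaining = (prediction - base_value) - top_total

print("prediction:", prediction)
print("top features:", top)
print("explained by top 2:", top_total, "| explained by the rest:", remaining)
```

Because the values simply add up, the share of a prediction explained by the top-N features is a plain sum, with no further modeling required.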
While Prediction Explanations provide several quantitative indicators for why a prediction was made, the calculations do not fully explain how a prediction is computed. For that information, use the coefficients with preprocessing information from the Coefficients tab.
A common question when evaluating data is “why is a certain data point considered high-risk (or low-risk) for a certain event?”
A sample case for Prediction Explanations:
Sam is a business analyst at a large manufacturing firm. She does not have a lot of data science expertise, but has been using DataRobot with great success to predict the likelihood of product failures at her manufacturing plant. Her manager is now asking for recommendations for reducing the defect rate, based on these predictions. Sam would like DataRobot to produce Prediction Explanations for the expected product failures so that she can identify the key drivers of product failures based on a higher-level aggregation of explanations. Her business team can then use this report to address the causes of failure.
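The kind of higher-level aggregation Sam has in mind could look roughly like the following sketch. It assumes the per-row explanations have already been exported as (row ID, feature, strength) tuples. That layout, the feature names, and the strength values are all hypothetical and are not a DataRobot export format.

```python
from collections import defaultdict

# Hypothetical per-row explanations: (row_id, feature, strength).
# Feature names and strengths are illustrative only.
explanations = [
    (1, "line_temperature", 0.5),
    (1, "operator_shift", 0.25),
    (2, "line_temperature", 0.75),
    (2, "supplier", 0.25),
    (3, "line_temperature", 0.25),
    (3, "supplier", 0.5),
]


def aggregate_explanations(rows):
    """Count how often each feature appears and average its absolute strength."""
    counts = defaultdict(int)
    totals = defaultdict(float)
    for _, feature, strength in rows:
        counts[feature] += 1
        totals[feature] += abs(strength)
    summary = [(f, counts[f], totals[f] / counts[f]) for f in counts]
    # Most frequent drivers first; ties are broken by mean absolute strength.
    return sorted(summary, key=lambda item: (-item[1], -item[2]))


for feature, count, mean_strength in aggregate_explanations(explanations):
    print(f"{feature}: appears in {count} rows, mean |strength| = {mean_strength}")
```

Ranking by frequency is only one possible choice. Ranking by total or mean strength may suit a report better, depending on what the business team wants to act on.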
Other common use cases and possible reasons include:
What are indicators that a transaction could be at high risk for fraud? Possible explanations include transactions out of a cardholder's home area, transactions out of their “normal usage” time range, and transactions that are too large or small.
What are some reasons for setting a higher auto insurance price? Possible explanations include that the applicant is single, male, under 30 years old, and has received a DUI or tickets. A married homeowner may receive a lower rate.
SHAP estimates how much a feature is responsible for a given prediction being different from the average. Consider a credit risk example that builds a simple model with two features: number of credit cards and employment status. The model predicts that an unemployed applicant with 10 credit cards has a 50% probability of default, while the average default rate is 5%. SHAP estimates how each feature contributed to the 50% default risk prediction, determining that 25 percentage points are attributed to the number of cards and 20 percentage points are due to the customer's lack of a job. Together, these contributions account for the 45-percentage-point difference between the prediction (50%) and the average (5%).
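The credit risk example can be reproduced with a minimal sketch of an exact Shapley value calculation. It averages each feature's marginal contribution over every order in which the features can be revealed. The 5% average and the 50% prediction come from the example above. The two intermediate values (30% when only the card count is known, 25% when only the employment status is known) are assumptions chosen so the arithmetic matches, and the feature names are hypothetical. This is not a description of how DataRobot computes SHAP internally.

```python
from itertools import permutations


def shapley_values(features, value):
    """Exact Shapley values: average marginal contribution over all feature orderings."""
    totals = {feature: 0.0 for feature in features}
    orderings = list(permutations(features))
    for order in orderings:
        present = frozenset()
        for feature in order:
            with_feature = present | {feature}
            totals[feature] += value(with_feature) - value(present)
            present = with_feature
    return {feature: total / len(orderings) for feature, total in totals.items()}


# Expected prediction (percentage points) when only the listed features are known.
# 5.0 and 50.0 come from the example; 30.0 and 25.0 are assumed for illustration.
coalition_value = {
    frozenset(): 5.0,
    frozenset({"num_credit_cards"}): 30.0,
    frozenset({"employment_status"}): 25.0,
    frozenset({"num_credit_cards", "employment_status"}): 50.0,
}

phi = shapley_values(
    ["num_credit_cards", "employment_status"],
    coalition_value.__getitem__,
)
print(phi)

# Additivity check: contributions sum to prediction minus average (50 - 5 = 45).
print(sum(phi.values()))
```

With these assumed inputs, the card count receives 25 points and the employment status 20 points, matching the figures in the example. Enumerating every ordering is only practical for a handful of features, which is why practical SHAP implementations rely on model-specific shortcuts or approximations.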