Per-Class Bias

Per-Class Bias helps identify whether a model is biased and, if so, by how much and which classes it is biased toward or against. Click Per-Class Bias to view the per-class bias chart.

The Per-Class Bias tab uses the fairness threshold and fairness score of each class to determine if certain classes are experiencing bias in the model's predictive behavior. Any class with a fairness score below the threshold is likely to be experiencing bias. Once these classes have been identified, use the Cross-Class Data Disparity tab to determine where in the training data the model is learning bias.

Per-Class Bias Chart

The Per-Class Bias chart displays the individual class values for the selected protected feature on the Y-axis. Each class's fairness score, calculated using DataRobot's fairness metrics, is displayed on the X-axis. Scores can be viewed as either absolute or relative values.

A blue bar indicates that a class is above the fairness threshold; a red bar indicates that a class is below the threshold and is therefore likely to be experiencing model bias. A gray bar indicates that there is not enough data for the class, for one of the following reasons:

  • It contains fewer than 100 rows.
  • It contains between 100 and 1,000 rows, but fewer than 10% of the rows belong to the majority class (the class with the most rows of data).
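
The two gray-bar rules above can be sketched as a small check. This is only an illustration, not DataRobot's implementation: the function name `has_enough_data` is hypothetical, and the second rule is read here as "the class has fewer than 10% as many rows as the majority class", which is an assumption about the wording.

```python
def has_enough_data(class_rows: int, majority_rows: int) -> bool:
    """Return False when a class would be shown as a gray bar (sketch)."""
    # Rule 1: fewer than 100 rows is never enough.
    if class_rows < 100:
        return False
    # Rule 2 (assumed reading): between 100 and 1,000 rows, but fewer
    # than 10% as many rows as the majority class.
    if class_rows <= 1000 and class_rows < 0.10 * majority_rows:
        return False
    return True


# Hypothetical row counts for the classes of one protected feature.
row_counts = {"A": 12000, "B": 450, "C": 80, "D": 1500}
majority = max(row_counts.values())
enough = {name: has_enough_data(n, majority) for name, n in row_counts.items()}
```

With these made-up counts, classes B and C would be grayed out, while A and D have enough data to be scored.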

Hover over a class to see additional details, including both absolute and relative fairness scores, the number of values for the class, and a summary of the fairness test results.

Use the information in this chart to identify if there is bias in the outcomes between protected classes. Then, from the Cross-Class Data Disparity tab, evaluate which features are having the largest impact on this bias.

Control the chart display

This chart provides several controls that modify the display, allowing you to focus on information of particular interest.

Prediction threshold

The prediction threshold (also available in the ROC Curve tab tools) is the dividing line for interpreting results in binary classification models. The default threshold is 0.5, and every prediction above this dividing line is assigned the positive class label.

For imbalanced datasets, a threshold of 0.5 can result in a validation partition without any positive class predictions, which prevents fairness scores from being calculated on the Per-Class Bias tab. To recalculate and surface fairness scores, adjust the prediction threshold to account for the dataset imbalance.

All fairness metrics (except prediction balance) use the model's prediction threshold when calculating fairness scores. Changing this value recalculates the fairness scores and updates the chart to display the new values.
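
To make the effect of the threshold concrete, the sketch below recomputes a per-class score at two thresholds. It uses the share of positive predictions per class as a stand-in metric, which is an assumption, since this section does not define DataRobot's fairness metrics. The function name, the probabilities, and the threshold of 0.28 are all hypothetical.

```python
from collections import defaultdict


def positive_rate_by_class(probabilities, classes, threshold=0.5):
    """Share of rows per class whose predicted probability exceeds the threshold."""
    counts = defaultdict(lambda: [0, 0])  # class -> [positives, total]
    for prob, cls in zip(probabilities, classes):
        counts[cls][0] += int(prob > threshold)
        counts[cls][1] += 1
    return {cls: pos / total for cls, (pos, total) in counts.items()}


# Made-up predictions from a model trained on an imbalanced dataset:
# no probability reaches the default threshold of 0.5.
probs = [0.10, 0.30, 0.45, 0.20, 0.05, 0.35, 0.25, 0.15]
groups = ["A", "A", "A", "A", "B", "B", "B", "B"]

at_default = positive_rate_by_class(probs, groups)
at_lowered = positive_rate_by_class(probs, groups, threshold=0.28)
```

At the default threshold neither class receives a positive prediction, so there is nothing to compare between classes. Lowering the threshold produces positive predictions and different rates for A and B.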

Fairness metric

Use the Metric dropdown menu to change which fairness metric DataRobot uses to calculate the fairness score displayed on the X-axis.

Show absolute values

Select Show absolute values to display the raw score each class received for the selected fairness metric.

Show relative values

Select Show relative values to scale the class with the highest fairness score to 1 and scale every other class's fairness score relative to that class.

In this view, the fairness threshold is visible on the chart because DataRobot uses relative fairness scores to check against the fairness threshold.
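
The relative scaling and threshold check described above can be sketched as follows. This is a minimal illustration under stated assumptions, not DataRobot's code: the function names, the absolute scores, and the fairness threshold of 0.8 are hypothetical.

```python
def relative_scores(absolute):
    """Scale scores so the class with the highest absolute score becomes 1."""
    top = max(absolute.values())
    if top == 0:
        # Relative scores are undefined in this sketch when every score is 0.
        raise ValueError("cannot scale: highest fairness score is 0")
    return {cls: score / top for cls, score in absolute.items()}


def classes_below_threshold(absolute, fairness_threshold):
    """Classes whose relative score falls below the fairness threshold."""
    relative = relative_scores(absolute)
    return sorted(cls for cls, score in relative.items() if score < fairness_threshold)


# Hypothetical absolute fairness scores for three classes.
absolute = {"A": 0.50, "B": 0.25, "C": 0.45}
relative = relative_scores(absolute)
flagged = classes_below_threshold(absolute, fairness_threshold=0.8)
```

Here class A scales to 1, and only class B falls below the assumed threshold, so B is the class that would be shown in red.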

Protected feature

All protected features configured during project setup are listed on the left. Select a different protected feature to display its individual class values and fairness scores.

Updated September 14, 2021