# Multilabel: Per-Label Metrics
Availability information
Availability of multilabel modeling is dependent on your DataRobot package. If it is not enabled for your organization, contact your DataRobot representative for more information.
Multilabel: Per-Label Metrics is a visualization designed specifically for multilabel models. It helps evaluate a model by summarizing performance across the labels for different values of the prediction threshold (which can be set from the page). Configure multilabel modeling during experiment setup.

In addition to this insight, multilabel-specific modeling insights are available from the following Leaderboard tabs:

Tab | Description |
---|---|
Performance | Summarizes performance across one or several labels for different values of the prediction threshold. |

Use the Label dropdown to generate the insight for a selected label:
## Overview
The Per-Label Metrics chart depicts binary performance metrics, treating each label as a binary feature. Specifically, it:

- Displays average and per-label model performance, based on the prediction threshold, for a selectable metric.
- Helps to assess the number of labels performing well versus the number performing badly.
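As a rough sketch of the computation behind these values (illustrative only, not the DataRobot API), each label column can be binarized at the chosen threshold and scored as an independent binary problem; the arrays `y_true` and `y_prob` below are hypothetical stand-ins for the actual target and prediction matrices:

```python
import numpy as np
from sklearn.metrics import f1_score

# Hypothetical multilabel data: rows x labels, one-hot targets and predicted probabilities
y_true = np.array([[1, 0, 1],
                   [0, 1, 0],
                   [1, 1, 0],
                   [0, 0, 1]])
y_prob = np.array([[0.9, 0.2, 0.7],
                   [0.1, 0.8, 0.3],
                   [0.6, 0.7, 0.2],
                   [0.3, 0.1, 0.9]])

threshold = 0.5                                 # the prediction/display threshold
y_pred = (y_prob >= threshold).astype(int)      # binarize each label independently

per_label_f1 = f1_score(y_true, y_pred, average=None, zero_division=0)   # one value per label
macro_f1 = f1_score(y_true, y_pred, average="macro", zero_division=0)    # unweighted mean over labels

print(per_label_f1, macro_f1)
```

Raising or lowering `threshold` changes `y_pred`, which is why every per-label value in the insight moves when the display threshold moves.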
The table below describes the areas of the Multilabel: Per-Label Metrics chart. See also detailed descriptions of the ROC Curve metrics and graph interpretation.
| | Component | Description |
|---|---|---|
1 | Metric value table | Displays model performance for each target label. Changing the display or prediction threshold updates the table. |
2 | Threshold selector | Sets whether to display values for the display or prediction thresholds. Changing either value updates the metric value table and chart. |
3 | Metric value chart and metric selector | Displays graphed results based on the set display threshold. Use the dropdown to select the performance metric to display in the chart. |
4 | Average performance report | The macro-averaged model performance, over all labels, for each metric. Metrics are defined in the deep dive below. |
5 | Label and data selectors | Sets the data partition—validation, cross validation, or holdout (if unlocked)—to report per-label values for. Display all or only pinned (selected) labels. |
Deep dive: metrics explained
The following table provides a brief description of each statistic, using a classification use case described in the ROC Curve documentation to illustrate.
Statistic | Description | Sample (from use cases) | Calculation |
---|---|---|---|
F1 Score | A measure of the model's accuracy, computed based on precision and recall. | N/A | 2 × (Precision × Recall) / (Precision + Recall) |
True Positive Rate (TPR) | Sensitivity or recall. The ratio of true positives (correctly predicted as positive) to all actual positives. | What percentage of diabetics did the model correctly identify as diabetics? | TP / (TP + FN) |
False Positive Rate (FPR) | Fallout. The ratio of false positives to all actual negatives. | What percentage of healthy patients did the model incorrectly identify as diabetics? | FP / (FP + TN) |
True Negative Rate (TNR) | Specificity. The ratio of true negatives (correctly predicted as negative) to all actual negatives. | What percentage of healthy patients did the model correctly predict as healthy? | TN / (TN + FP) |
Positive Predictive Value (PPV) | Precision. For all the positive predictions, the percentage of cases in which the model was correct. | What percentage of the model’s predicted diabetics are actually diabetic? | TP / (TP + FP) |
Negative Predictive Value (NPV) | For all the negative predictions, the percentage of cases in which the model was correct. | What percentage of the model’s predicted healthy patients are actually healthy? | TN / (TN + FN) |
Accuracy | The percentage of correctly classified instances. | What is the overall percentage of the time that the model makes a correct prediction? | (TP + TN) / (TP + TN + FP + FN) |
Matthews Correlation Coefficient | Measure of model quality when the classes are of very different sizes (unbalanced). | N/A | (TP × TN - FP × FN) / √((TP + FP)(TP + FN)(TN + FP)(TN + FN)) |
Average Profit | Estimates the business impact of a model. Displays the average profit based on the payoff matrix at the current display threshold. If a payoff matrix is not selected, displays N/A. | What is the business impact of readmitting a patient? | Total Profit / total number of records |
Total Profit | Estimates the business impact of a model. Displays the total profit based on the payoff matrix at the current display threshold. If a payoff matrix is not selected, displays N/A. | What is the business impact of readmitting a patient? | Sum of the payoff values assigned to the TP, TN, FP, and FN outcomes |
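For reference, each of these statistics reduces to a simple ratio of the per-label TP, FP, TN, and FN counts. The helper below is a minimal, illustrative sketch (plain Python, not DataRobot code); the payoff values used for the profit figures are hypothetical:

```python
import math

def per_label_stats(tp, fp, tn, fn, payoff=None):
    """Binary classification statistics for one label at a fixed threshold."""
    n = tp + fp + tn + fn
    ppv = tp / (tp + fp) if (tp + fp) else 0.0          # precision
    tpr = tp / (tp + fn) if (tp + fn) else 0.0          # recall / sensitivity
    stats = {
        "f1": 2 * ppv * tpr / (ppv + tpr) if (ppv + tpr) else 0.0,
        "tpr": tpr,
        "fpr": fp / (fp + tn) if (fp + tn) else 0.0,     # fallout
        "tnr": tn / (tn + fp) if (tn + fp) else 0.0,     # specificity
        "ppv": ppv,
        "npv": tn / (tn + fn) if (tn + fn) else 0.0,
        "accuracy": (tp + tn) / n,
        "mcc": ((tp * tn - fp * fn)
                / math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
                if (tp + fp) and (tp + fn) and (tn + fp) and (tn + fn) else 0.0),
    }
    if payoff:  # payoff matrix: a value assigned to each of the four outcomes
        total = (tp * payoff["tp"] + fp * payoff["fp"]
                 + tn * payoff["tn"] + fn * payoff["fn"])
        stats["total_profit"] = total
        stats["average_profit"] = total / n
    return stats

# Hypothetical counts and payoff values
print(per_label_stats(tp=40, fp=10, tn=45, fn=5,
                      payoff={"tp": 10, "fp": -5, "tn": 0, "fn": -15}))
```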
## Metric value table
The metric value table reports a model's performance for each target label (considered as a binary feature). The metrics in the table correspond to the Display threshold; change the threshold value to view label metrics at different threshold values.
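For intuition, the kind of per-label table shown here can be reproduced offline with pandas and scikit-learn; the label names and indicator matrices below are hypothetical:

```python
import numpy as np
import pandas as pd
from sklearn.metrics import precision_score, recall_score, f1_score, accuracy_score

labels = ["spam", "promo", "urgent"]                      # hypothetical label names
y_true = np.array([[1, 0, 1], [0, 1, 0], [1, 1, 0], [0, 0, 1]])
y_pred = np.array([[1, 0, 1], [0, 1, 1], [1, 0, 0], [0, 0, 1]])  # already thresholded

rows = []
for i, label in enumerate(labels):
    t, p = y_true[:, i], y_pred[:, i]                     # each label scored as a binary problem
    rows.append({
        "label": label,
        "f1": f1_score(t, p, zero_division=0),
        "ppv": precision_score(t, p, zero_division=0),
        "tpr": recall_score(t, p, zero_division=0),
        "accuracy": accuracy_score(t, p),
    })

table = pd.DataFrame(rows)
print(table)
print(table.drop(columns="label").mean())                 # macro averages across labels
```

The final line corresponds to the macro-averaged values shown in the average performance report.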
Set the metric value table to All labels to see metric values for each label in the experiment. Use the controls at the bottom of the table to page through the display and explore all labels. Additionally, change the table view as follows:
| | Action |
|---|---|
1 | Use the search field to modify the table to display only those labels that match the search criteria. |
2 | Click on a column header to change the sort order of labels in the table. |
3 | Click the Show option to include (or remove) a specific label's results from the metric value chart. The option works whether you are displaying all or only pinned labels. |
4 | Click the pin to include (or remove) the selected label from the chart display to the left. |
The ID column (#) is static; together with sorting, it lets you assess which labels have a metric of interest above or below a given value.

For example, consider a project with 100 labels. To find the share of labels with accuracy above 0.7, sort by accuracy and note the row index of the last label whose accuracy is above 0.7. That row index, relative to the total number of rows, gives the percentage of labels at or above that accuracy.
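The same check can be written directly against a hypothetical array of per-label accuracy values:

```python
import numpy as np

per_label_accuracy = np.array([0.92, 0.81, 0.64, 0.73, 0.55])   # hypothetical values

# Share of labels meeting the cutoff (equivalent to reading the sorted row index)
share_above = (per_label_accuracy >= 0.7).mean()
print(f"{share_above:.0%} of labels have accuracy of 0.7 or higher")
```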
## Threshold selector
The threshold section provides fields for entering both a Display threshold and a Prediction threshold.
Use | To |
---|---|
Display threshold | Set the display threshold. Changes to the value update both the chart display and the metric value table to the right, which shows average model performance. |
Prediction threshold | Set the model prediction threshold, which is applied when making predictions. |
Arrows | Swap values for the current display and prediction thresholds. |
Note that only Use Case owners can update the prediction threshold.
## Metric value chart
The chart consists of graphed results and a metric selector:
The X-axis in the diagram represents different values of the prediction threshold. The Y-axis plots values for the selected metric. Overall, the diagram illustrates the average model performance curve, based on the selected metric. The threshold value set in the Display threshold field is indicated by a round, unfilled point on the line. Changes to the threshold and/or metric update the graph.
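Conceptually, the curve comes from sweeping the threshold and recomputing the macro-averaged metric at each point; a minimal sketch with hypothetical data (illustrative only, not the DataRobot implementation):

```python
import numpy as np
from sklearn.metrics import f1_score

# Hypothetical multilabel targets and predicted probabilities (rows x labels)
y_true = np.array([[1, 0, 1], [0, 1, 0], [1, 1, 0], [0, 0, 1]])
y_prob = np.array([[0.9, 0.2, 0.7], [0.1, 0.8, 0.3], [0.6, 0.7, 0.2], [0.3, 0.1, 0.9]])

thresholds = np.linspace(0.0, 1.0, 51)
curve = [
    f1_score(y_true, (y_prob >= t).astype(int), average="macro", zero_division=0)
    for t in thresholds
]
# Each (threshold, macro-F1) pair is one point on the average performance curve;
# the current display threshold is the round, unfilled point highlighted on that line.
```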
### Display label metrics
By default, the metric value chart displays the average value, as a white line, across all labels for the selected metric. You can highlight one or more labels to compare their metric values against the average. The color of the label name changes to match its line entry in the chart.
### Show option
Select Show next to a label to add the individual results for that label to the chart.
When you pin a label, Show is automatically enabled. Click the eye icon again to remove the label from the chart.
### Pinning labels
Use the pin option to select particular labels for display in the chart. Pinning a label automatically enables the Show option for that label, adding its metric value to the chart. After pinning labels, use the Pinned labels tab to show only those labels you selected.
Toggling back to All labels preserves the label's entry on the chart.