
Display and prediction thresholds

A threshold for a classification model is the point that sets the class boundary for a predicted value. The model classifies an observation below the threshold as a "false," and an observation above the threshold as a "true." In other words, DataRobot automatically assigns the positive class label to any prediction exceeding the threshold.
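The thresholding rule described above can be sketched in a few lines of Python. This is an illustrative example only (not DataRobot's API); the variable names are placeholders.

```python
# A classification threshold converts predicted probabilities into labels:
# predictions above the threshold get the positive ("true") class label.
probabilities = [0.12, 0.48, 0.51, 0.93]
threshold = 0.5

labels = ["true" if p > threshold else "false" for p in probabilities]
print(labels)  # ['false', 'false', 'true', 'true']
```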

There are two thresholds you can modify:

Display threshold

The display threshold is the basis for several visualizations on the ROC Curve tab. The threshold you set updates the Prediction Distribution graph, as well as the Chart, Matrix, and Metrics panes described in the following sections. Experiment with the threshold to meet your modeling goals.

Select from the Display Threshold settings described below.

Display Threshold: Displays the threshold value you set. Click to select the threshold settings. You can also update the display threshold by clicking in the Prediction Distribution graph. The Display Threshold defaults to the value that maximizes F1.

If you switch to a different model, the Display Threshold updates to maximize F1 for the new model, making it easy to compare classification results between models. If you select a different data source (by selecting Holdout, Cross Validation, or Validation in the Data Selection list), the Display Threshold updates to maximize F1 for the new data.

Threshold: Drag the slider or enter a display threshold value; the visualization tools update accordingly.

Maximize option: Select the threshold that maximizes the F1 score, MCC (Matthews Correlation Coefficient), or profit. To maximize profit, first set a payoff by clicking +Add payoff on the Matrix pane.

Use as Prediction Threshold: Click to set the Prediction Threshold to the current value of the Display Threshold. At prediction time, the threshold then serves as the boundary between positive and negative classifications: observations above the threshold receive the positive class's label and those below it receive the negative class's label. The Prediction Threshold is used when you generate profit curves and when you make predictions before or after deployment.

Threshold Type: Select Top % of highest predictions or a Prediction value (0-1). See Threshold Type for details.

In this example, the Display Threshold is set to 0.3151, which maximizes the F1 score. You can then view the resulting values in the Chart, Matrix, and Metrics panes.
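To make the "maximize F1" option concrete, here is a hypothetical sketch of a threshold search over candidate cutoffs. It is not DataRobot's implementation; the data and candidate set are invented for illustration.

```python
# Find the threshold that maximizes F1 by trying each observed
# probability as a candidate cutoff (illustrative only).
def f1_at(threshold, probs, actuals):
    preds = [p >= threshold for p in probs]
    tp = sum(p and a for p, a in zip(preds, actuals))
    fp = sum(p and not a for p, a in zip(preds, actuals))
    fn = sum((not p) and a for p, a in zip(preds, actuals))
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

probs = [0.1, 0.3, 0.35, 0.6, 0.8, 0.9]     # predicted positive-class probabilities
actuals = [False, False, True, True, False, True]  # true class labels

best = max(probs, key=lambda t: f1_at(t, probs, actuals))
print(best)  # 0.35 -- the cutoff yielding the highest F1 on this data
```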

Threshold Type

You have a choice of two bases for the display threshold—a prediction value (0-1) or a prediction percentage. The prediction value represents the numeric value used to determine the class boundary. The percentage option allows you to set the top or bottom n% of records that are categorized as one class or another. You may want to do this, for example, to filter top predictions and compute recall using that boundary. Then, you can use the value as a comparison metric or to simply inspect the top percentage of records.

Note

The display represents the closest data point to the specified threshold (for example, if you enter 20%, the display might show 20.7%). The box reports the exact value after you press Enter.
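The Top % option can be sketched as follows: sort the predictions, keep the top n%, and take the prediction value at that boundary as the cutoff. This is an illustrative approximation of the behavior described above, not DataRobot's code.

```python
# Derive a prediction-value cutoff from a "Top %" setting: the
# threshold snaps to the value of the last record inside the top n%.
probs = sorted([0.15, 0.22, 0.40, 0.55, 0.71, 0.74, 0.82, 0.88, 0.90, 0.97],
               reverse=True)
top_pct = 20  # request the top 20% of predictions

cutoff_count = max(1, round(len(probs) * top_pct / 100))
threshold = probs[cutoff_count - 1]  # prediction value at the boundary
print(threshold)  # 0.9 -- the cutoff bounding the top 20% of these records
```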

Set the display threshold

Set the display threshold using one of the following methods.

To set a value manually:

  1. On the ROC Curve tab, click the Display Threshold dropdown menu.

  2. Use the slider or enter a value to set the display threshold.

    If the Threshold Type is Top %, enter a value between 0 and 100 (which will update to the exact point after entry). If the Threshold Type is Prediction value, enter a number between 0.0 and 1.0. If the input is not valid, a warning appears to the right.

  3. Click outside of the dropdown to view the effects of the display threshold on the visualization tools.

To set the threshold from a metric maximum:

  1. Select a metric maximum to use for the display threshold. Choose from F1, MCC, or profit. The maximum value for each metric displays.

    Note

    You must set the Matrix pane to a Payoff Matrix to be able to maximize profit. Otherwise, the Maximize profit option is greyed out.

  2. Click outside of the dropdown to view the effects of the display threshold on the visualization tools.

To set the threshold from the Prediction Distribution graph:

  1. Hover over the Prediction Distribution graph until a "ghost" line appears with the corresponding value above it.

  2. Click to automatically update the display threshold to the new selected value.

Valid input for the Display Threshold changes the following page elements:

  • Updated values are displayed in the Metrics pane and the confusion matrix (in the Matrix pane).
  • The dividing line on the Prediction Distribution graph moves to the selected value and is marked with a circle.
  • On the current curve displayed in the Charts pane—for example, a ROC curve or a profit curve—the new point is selected (indicated by a circle). Some curves also have line intercepts corresponding to the point.

Prediction threshold

Prediction requests for binary classification models return both the probability of the positive class and a label. Although DataRobot automatically calculates a threshold (the display threshold), the threshold used to apply the label at prediction time defaults to 0.5. In the resulting predictions, records with values above this threshold receive the positive class's label (in addition to the probability). If the 0.5 default does not match your intended class boundary, you would otherwise need to post-process predictions to apply the correct label; you can skip that step by changing the prediction threshold.
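The shape of such a prediction response can be sketched as below. This is an illustrative mock-up, not the actual DataRobot prediction API; the field names are invented for the example.

```python
# Each binary prediction carries the positive-class probability plus a
# label derived from the prediction threshold (default 0.5).
def label_predictions(probabilities, prediction_threshold=0.5,
                      positive="True", negative="False"):
    return [
        {"positive_probability": p,
         "label": positive if p > prediction_threshold else negative}
        for p in probabilities
    ]

# Raising the threshold above the 0.5 default flips borderline records,
# avoiding the need to re-label predictions in post-processing.
rows = label_predictions([0.42, 0.58, 0.91], prediction_threshold=0.6)
print([r["label"] for r in rows])  # ['False', 'False', 'True']
```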

Set the prediction threshold

  1. On the ROC Curve tab, click the Display Threshold dropdown menu.

  2. Update the display threshold if necessary.

  3. Select Use as Prediction Threshold.

    Once deployed, all predictions made with this model that fall above the new threshold will return the positive class label.

The Prediction Threshold value set here is also saved to several related tabs. Changing the value in any of these tabs writes the new value back to all of them. Once a model is deployed, the threshold cannot be changed within that deployment.


Updated October 7, 2021