Prediction intervals for regression projects

Availability information

Prediction intervals for regression projects is off by default. Contact your DataRobot representative or administrator for information on enabling the feature.

Feature flag: Prediction intervals for regression projects

Introduction to prediction intervals

For many use cases involving a regression problem, there is interest not only in predicting a numerical target variable but also in quantifying the accuracy of that prediction.

Consider the problem of predicting the average temperature Y for tomorrow, given today’s weather conditions x. On a given day, a model may predict the average temperature of 15℃ and additionally state that the temperature will likely be between 14℃ and 16℃. This is a much more useful statement than predicting the temperature of 15℃ with a confidence range between 10℃ and 20℃.

The temperature ranges [14, 16] and [10, 20] mentioned above are called prediction intervals. A prediction interval is always associated with a confidence probability p. Typical values for the confidence probability are 0.8, 0.9, or 0.99. The choice of probability depends on the use case at hand and is driven by business considerations.

For a given probability p and the associated prediction interval [L, U], you expect that the actual observation of the target Y will fall into [L, U] with probability p. This statement has the following frequentist counterpart: randomly sampling from the feature distribution X, calculating a prediction y for Y, and calculating the prediction interval [L, U], will lead to Y landing in [L, U] for 100·p percent of the samples. Note that, for the remaining 100·(1-p) percent of the samples, the actual observation Y will lie outside of the prediction interval [L, U].
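The frequentist reading above can be checked empirically by counting how often observations land inside their intervals. The following is a minimal sketch on synthetic data, not DataRobot code: the function name `empirical_coverage`, the Gaussian noise model, and the constant prediction of 15 are all assumptions made for illustration.

```python
import random


def empirical_coverage(observations, intervals):
    """Return the fraction of observations y that fall inside their interval [lower, upper]."""
    hits = sum(
        1
        for y, (lower, upper) in zip(observations, intervals)
        if lower <= y <= upper
    )
    return hits / len(observations)


# Synthetic illustration (assumed setup): the model always predicts 15 and the
# actual temperature is the prediction plus standard Gaussian noise.
rng = random.Random(0)
p = 0.9
z = 1.6449  # approximate 0.95 quantile of the standard normal, giving a two-sided 90% interval
prediction = 15.0

observations = [prediction + rng.gauss(0.0, 1.0) for _ in range(20000)]
intervals = [(prediction - z, prediction + z)] * len(observations)

coverage = empirical_coverage(observations, intervals)
print(f"target p = {p}, empirical coverage = {coverage:.3f}")
```

With a well-calibrated interval, the printed coverage should be close to p, and roughly 100·(1-p) percent of the observations fall outside the interval.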

DataRobot uses conformal inference methods to estimate the prediction intervals for regression projects.
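To give a feel for what a conformal method does, here is a minimal sketch of generic split conformal prediction with absolute residuals. It is a textbook variant and is not claimed to be DataRobot's implementation: the helper names `conformal_half_width` and `prediction_interval` and the calibration residuals are hypothetical.

```python
import math


def conformal_half_width(calibration_residuals, p):
    """Half-width q such that [prediction - q, prediction + q] targets coverage p.

    Uses the standard finite-sample rule: take the k-th smallest absolute
    residual with k = ceil((n + 1) * p). If k exceeds n, the calibration set
    is too small for this p and the interval is unbounded.
    """
    scores = sorted(abs(r) for r in calibration_residuals)
    n = len(scores)
    k = math.ceil((n + 1) * p)
    if k > n:
        return math.inf
    return scores[k - 1]


def prediction_interval(prediction, half_width):
    """Symmetric interval of constant length around a point prediction."""
    return (prediction - half_width, prediction + half_width)


# Hypothetical residuals (actual minus predicted) on a held-out calibration set.
residuals = [-1.2, 0.3, 0.8, -0.5, 2.0, -0.1, 0.6, -0.9, 1.1, 0.2]

half_width = conformal_half_width(residuals, p=0.8)
lower, upper = prediction_interval(15.0, half_width)
print(f"80% interval for a prediction of 15: [{lower:.1f}, {upper:.1f}]")
```

In this variant the half-width depends only on the calibration residuals and p, so every prediction receives an interval of the same length.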

Prediction intervals in MLOps deployments

The predictions of regression models deployed in DataRobot MLOps can be optionally supplemented with prediction intervals. Once enabled, every newly created regression model deployment will also produce prediction intervals.

To view or modify the prediction interval settings, select a deployment from the MLOps deployments list and navigate to the Humility > Prediction Intervals tab.

The configuration section of the Prediction Intervals tab allows you to enable or disable the feature, choose the estimation method, and choose the set of confidence probabilities. DataRobot currently supports one prediction interval estimation method: "Online Conformal".

The trumpet chart

The trumpet chart, displayed below the configuration section, visualizes the lengths of the prediction intervals calculated for the holdout set (or, if no holdout is available, for an equivalent dataset). For each data point in the holdout set, DataRobot calculates a prediction and a prediction interval. The prediction intervals are then normalized by subtracting the predicted value, which centers them around zero. Finally, the prediction intervals are normalized by length. The x-axis of the chart is the percentage; the y-axis is the length of the normalized prediction intervals.

Currently, the Online Conformal method produces prediction intervals of constant length. For this reason, this chart displays a rectangle.
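The centering step and the resulting rectangle can be sketched as follows. This is an illustration under stated assumptions, not the chart's actual implementation: the function names are hypothetical, and ordering the centered intervals by length with the x-axis read as the percentage of holdout rows is one possible interpretation of the chart.

```python
def center_intervals(predictions, intervals):
    """Subtract each prediction from its interval so the interval is centered around zero."""
    return [
        (lower - pred, upper - pred)
        for pred, (lower, upper) in zip(predictions, intervals)
    ]


def trumpet_points(predictions, intervals):
    """Return (percentage, lower, upper) triples ordered by interval length.

    Assumption: the x value is the percentage of holdout rows covered so far.
    """
    centered = sorted(
        center_intervals(predictions, intervals),
        key=lambda bounds: bounds[1] - bounds[0],
    )
    n = len(centered)
    return [
        (100.0 * (i + 1) / n, lower, upper)
        for i, (lower, upper) in enumerate(centered)
    ]


# Hypothetical holdout predictions with constant-length intervals (half-width 1.5).
predictions = [10.0, 20.0, 30.0, 40.0]
intervals = [(pred - 1.5, pred + 1.5) for pred in predictions]

points = trumpet_points(predictions, intervals)
for percentage, lower, upper in points:
    print(f"{percentage:5.1f}%  [{lower:+.1f}, {upper:+.1f}]")
```

Because every interval has the same length, the lower and upper bounds are identical at every percentage, which traces out a rectangle rather than a widening trumpet shape.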

Updated December 11, 2021