Output format¶
DataRobot returns predictions in a columnar table format. Each example value is followed by the data type it belongs to. The columns returned are determined by model type, as described below.
Note
DataRobot can write prediction output to many different databases, each of which has its own name for a string type (e.g., some call it TEXT while others call it VARCHAR). As a result, DataRobot cannot provide implementation-specific data types.
Regression models¶
Prediction label | |
---|---|
Column name | <target_name>_PREDICTION |
Data type | Numeric |
Example name | revenue_PREDICTION |
Example value | 493822.12 |
Description | The predicted value. |
Binary classification models¶
Positive label | |
---|---|
Column name | <target_name>_<positive_label>_PREDICTION |
Data type | Numeric |
Example name | isbadbuy_1_PREDICTION |
Example value | 0.28 |
Description | The float probability of the positive label. |
Negative label | |
---|---|
Column name | <target_name>_<negative_label>_PREDICTION |
Data type | Numeric |
Example name | isbadbuy_0_PREDICTION |
Example value | 0.72 |
Description | The float probability of the negative label. |
Prediction label | |
---|---|
Column name | <target_name>_PREDICTION |
Data type | Text |
Example name | isbadbuy_PREDICTION |
Example value | 0 |
Description | The predicted label of the classification. |
Threshold label | |
---|---|
Column name | THRESHOLD |
Data type | Numeric |
Example name | THRESHOLD |
Example value | 0.5 |
Description | The float prediction threshold used for determining the label. |
Positive class label | |
---|---|
Column name | POSITIVE_CLASS |
Data type | Text |
Example name | POSITIVE_CLASS |
Example value | 1 |
Description | The label configured as the positive class. |
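As a rough illustration of how the binary classification columns relate to each other, the sketch below derives the label from the positive-class probability and the threshold. The helper name `label_from_probability` is hypothetical, and treating a probability equal to the threshold as positive (`>=`) is an assumption that this page does not state.

```python
def label_from_probability(positive_probability, threshold,
                           positive_class, negative_class):
    """Return the label for one row.

    Assumption: a probability at or above the threshold maps to the
    positive class. The exact comparison is not documented on this page.
    """
    if positive_probability >= threshold:
        return positive_class
    return negative_class


# One output row, using the example values from the tables above.
row = {
    "isbadbuy_1_PREDICTION": 0.28,
    "isbadbuy_0_PREDICTION": 0.72,
    "isbadbuy_PREDICTION": "0",
    "THRESHOLD": 0.5,
    "POSITIVE_CLASS": "1",
}

derived = label_from_probability(
    row["isbadbuy_1_PREDICTION"],
    row["THRESHOLD"],
    positive_class=row["POSITIVE_CLASS"],
    negative_class="0",
)
print(derived == row["isbadbuy_PREDICTION"])
```

With the example values, 0.28 is below the 0.5 threshold, so the derived label matches the `isbadbuy_PREDICTION` value of 0.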
Multiclass classification models¶
Prediction label | |
---|---|
Column name | <target_name>_PREDICTION |
Data type | Text |
Example name | species_PREDICTION |
Example value | lion |
Description | The predicted label of the classification. |
Prediction class label (for each class) | |
---|---|
Column name | <target_name>_<class_label>_PREDICTION |
Data type | Numeric |
Description | The float probability for each class. |
Example classifications | |
---|---|
Example name | Example value |
species_cat_PREDICTION | 0.28 |
species_lion_PREDICTION | 0.24 |
species_lynx_PREDICTION | 0.48 |
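When consuming multiclass output, it can help to gather the per-class probability columns into one mapping. The following is a minimal sketch that relies only on the `<target_name>_<class_label>_PREDICTION` naming pattern described above; the helper name `class_probabilities` is hypothetical.

```python
def class_probabilities(row, target_name):
    """Collect per-class probabilities from one output row.

    Relies only on the <target_name>_<class_label>_PREDICTION pattern.
    The label column <target_name>_PREDICTION is skipped explicitly.
    """
    prefix = f"{target_name}_"
    suffix = "_PREDICTION"
    label_column = f"{target_name}_PREDICTION"
    probabilities = {}
    for column, value in row.items():
        if column == label_column:
            continue
        if column.startswith(prefix) and column.endswith(suffix):
            class_label = column[len(prefix):-len(suffix)]
            probabilities[class_label] = value
    return probabilities


# Probability columns taken from the example table above.
row = {
    "species_cat_PREDICTION": 0.28,
    "species_lion_PREDICTION": 0.24,
    "species_lynx_PREDICTION": 0.48,
}
probabilities = class_probabilities(row, "species")
print(probabilities)
```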
Time series models¶
Note
These output columns are available for time series regression, classification, and anomaly detection models.
Time series model columns | Description | Data type |
---|---|---|
<SERIES_ID_COLUMN_NAME> | Contains the series ID the row belongs to. Functions as a passthrough column and returns the unaltered column name and values provided in the scoring data. | Text |
FORECAST_POINT | Contains the forecast point timestamp. Unless you request historical time series predictions, the output value is the same for all rows with the same forecast point (but different for each unique forecast distance). | Date |
<TIME_COLUMN_NAME> | Contains the time series timestamp. Functions as a passthrough column and returns the unaltered column name and values provided in the scoring data. (This returns the same value as the originalFormatTimestamp field returned by time series models.) | Date |
FORECAST_DISTANCE | Contains the numeric forecast distance returned by time series models. | Numeric |
Prediction status¶
Prediction status label | |
---|---|
Column name | prediction_status |
Data type | Text |
Description | A row-by-row status containing either OK or a string error message describing why the prediction did not succeed. |
Example value | Could not convert date field to date format YYYY-MM-DD |
Example value | OK |
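Because the status is reported row by row, a consumer will typically want to separate successful rows from failed ones. The sketch below shows one way to do that in plain Python; the helper name `split_by_status` and the sample rows (including the empty prediction on the failed row) are made up for illustration.

```python
def split_by_status(rows):
    """Split output rows into successful rows and failed rows.

    A row counts as successful only when prediction_status is exactly "OK";
    any other value is treated as an error message.
    """
    succeeded, failed = [], []
    for row in rows:
        if row["prediction_status"] == "OK":
            succeeded.append(row)
        else:
            failed.append(row)
    return succeeded, failed


# Illustrative rows; the None prediction on the failed row is an assumption.
rows = [
    {"revenue_PREDICTION": 493822.12, "prediction_status": "OK"},
    {
        "revenue_PREDICTION": None,
        "prediction_status": "Could not convert date field to date format YYYY-MM-DD",
    },
]
succeeded, failed = split_by_status(rows)
for row in failed:
    print("Failed row:", row["prediction_status"])
```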
Prediction warnings¶
If prediction warnings are enabled for your job, DataRobot returns an additional column.
Prediction warnings label | |
---|---|
Column name | IS_OUTLIER_PREDICTION |
Data type | Text |
Description | Whether the prediction is outside the calculated prediction boundaries. |
Example values | |
---|---|
Column | Example value |
IS_OUTLIER_PREDICTION | True |
IS_OUTLIER_PREDICTION | False |
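Since the column is typed as Text, the values arrive as the strings True and False and not as native booleans. A minimal sketch of converting them, assuming only those two spellings occur (the helper name `is_outlier` is hypothetical):

```python
def is_outlier(row):
    """Convert the text value of IS_OUTLIER_PREDICTION to a Python bool.

    Assumption: the column only ever contains "True" or "False".
    Anything else raises an error instead of being silently coerced.
    """
    value = row["IS_OUTLIER_PREDICTION"]
    if value == "True":
        return True
    if value == "False":
        return False
    raise ValueError(f"Unexpected IS_OUTLIER_PREDICTION value: {value!r}")


flags = [
    is_outlier({"IS_OUTLIER_PREDICTION": "True"}),
    is_outlier({"IS_OUTLIER_PREDICTION": "False"}),
]
print(flags)
```

Raising on unknown values avoids the common pitfall where `bool("False")` evaluates to `True`.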
Deployment approval status¶
If the approval workflow is enabled for your deployment, the output schema will contain an extra column showing the deployment approval status.
Deployment status label | |
---|---|
Column name | DEPLOYMENT_APPROVAL_STATUS |
Data type | Text |
Description | Whether the deployment was approved. |
Example value | PENDING |
Prediction Explanations¶
You can request that Prediction Explanations be returned with your predictions by setting the maxExplanations job parameter to a non-zero value. You can also set thresholds for computing explanations. If you do not configure a threshold, DataRobot computes explanations for every row.
Prediction Explanation parameters | |||
---|---|---|---|
Job parameter | Description | Example value | Data type |
maxExplanations | (Optional) Compute up to this number of explanations. | 10 | Integer |
thresholdHigh | (Optional) Limit explanations to predictions above this threshold. | 0.5 | Float |
thresholdLow | (Optional) Limit explanations to predictions below this threshold. | 0.15 | Float |
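To make the threshold behavior concrete, here is a small sketch of the three parameters and of a check for whether a given prediction would receive explanations. The parameter names and example values come from the table above. Everything else is an assumption: where the parameters sit in a job request, the strict comparisons, and the rule that a prediction qualifies when it is above thresholdHigh or below thresholdLow. The helper name `needs_explanation` is hypothetical.

```python
import json

# Parameter names and example values from the table above. Where this
# fragment belongs in a job request is not specified on this page.
explanation_parameters = {
    "maxExplanations": 10,
    "thresholdHigh": 0.5,
    "thresholdLow": 0.15,
}


def needs_explanation(prediction, threshold_low=None, threshold_high=None):
    """Return True if explanations would be computed for this prediction.

    Assumptions: with no thresholds, every row qualifies (stated above);
    otherwise a row qualifies when it is strictly above threshold_high
    or strictly below threshold_low.
    """
    if threshold_low is None and threshold_high is None:
        return True
    above = threshold_high is not None and prediction > threshold_high
    below = threshold_low is not None and prediction < threshold_low
    return above or below


print(json.dumps(explanation_parameters, indent=2))
for prediction in (0.10, 0.28, 0.72):
    print(prediction, needs_explanation(prediction, 0.15, 0.5))
```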
If Prediction Explanations are requested, DataRobot returns four extra columns for each explanation in the format EXPLANATION_<n>_IDENTIFIER (where n is the feature explanation index, from 1 to the maximum number of explanations requested). The returned columns are:
Prediction Explanation columns | ||
---|---|---|
Column | Description | Data type |
EXPLANATION_<n>_FEATURE_NAME | The feature name this explanation covers. | Text |
EXPLANATION_<n>_STRENGTH | The feature strength as a float. | Numeric |
EXPLANATION_<n>_QUALITATIVE_STRENGTH | The feature strength as a string, a plus or minus indicator from +++ to ---. | Text |
EXPLANATION_<n>_ACTUAL_VALUE | The feature value associated with this explanation. | Text |
Prediction Explanation examples¶
Name | Value |
---|---|
EXPLANATION_1_FEATURE_NAME | loan_status |
EXPLANATION_1_ACTUAL_VALUE | Charged Off |
EXPLANATION_1_STRENGTH | 1.380291221709652 |
EXPLANATION_1_QUALITATIVE_STRENGTH | +++ |
Name | Value |
---|---|
EXPLANATION_1_FEATURE_NAME | loan_status |
EXPLANATION_1_ACTUAL_VALUE | Fully Paid |
EXPLANATION_1_STRENGTH | -1.2145340858375335 |
EXPLANATION_1_QUALITATIVE_STRENGTH | --- |
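Because each explanation is spread across four flat columns, downstream code often regroups them by index. The following is a minimal sketch of that regrouping, based on the column names shown in the examples above; the helper name `collect_explanations` is hypothetical.

```python
import re

# Matches the four explanation column names shown in the examples above.
EXPLANATION_COLUMN = re.compile(
    r"^EXPLANATION_(\d+)_"
    r"(FEATURE_NAME|ACTUAL_VALUE|STRENGTH|QUALITATIVE_STRENGTH)$"
)


def collect_explanations(row):
    """Group EXPLANATION_<n>_* columns of one row into a list ordered by n."""
    grouped = {}
    for column, value in row.items():
        match = EXPLANATION_COLUMN.match(column)
        if match:
            index = int(match.group(1))
            grouped.setdefault(index, {})[match.group(2)] = value
    return [grouped[index] for index in sorted(grouped)]


# Row built from the first example table above.
row = {
    "EXPLANATION_1_FEATURE_NAME": "loan_status",
    "EXPLANATION_1_ACTUAL_VALUE": "Charged Off",
    "EXPLANATION_1_STRENGTH": 1.380291221709652,
    "EXPLANATION_1_QUALITATIVE_STRENGTH": "+++",
}
explanations = collect_explanations(row)
print(explanations[0]["FEATURE_NAME"], explanations[0]["QUALITATIVE_STRENGTH"])
```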
Passthrough columns¶
Passthrough columns you request are passed verbatim. If they conflict with any of the above names, the job is rejected.
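To avoid a rejected job, you could check requested passthrough columns against the expected output column names before submitting. The sketch below is a client-side pre-check only, not DataRobot's own validation; the helper name `find_conflicts` is hypothetical, and exact, case-sensitive matching is an assumption.

```python
def find_conflicts(passthrough_columns, output_columns):
    """Return passthrough column names that collide with output column names.

    Assumption: names conflict only on an exact, case-sensitive match.
    """
    reserved = set(output_columns)
    return [name for name in passthrough_columns if name in reserved]


# Output columns for the binary classification example on this page.
output_columns = [
    "isbadbuy_1_PREDICTION",
    "isbadbuy_0_PREDICTION",
    "isbadbuy_PREDICTION",
    "THRESHOLD",
    "POSITIVE_CLASS",
]
conflicts = find_conflicts(["vehicle_id", "THRESHOLD"], output_columns)
print(conflicts)
```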
Association ID¶
If your deployment was configured with an association ID for accuracy, all result sets will have that column passed through from the source data automatically.
Output filters¶
Use the following job configuration properties to control whether to display only specific class probabilities or none at all.
Output filter parameters | |||
---|---|---|---|
Job parameter | Description | Example value | Data type |
includeProbabilities | (Optional) Include probabilities for all classes; defaults to true. | true | Boolean |
includeProbabilitiesClasses | (Optional) Include only probabilities for classes listed in the given array; defaults to an empty array []. | ['setosa', 'versicolor'] | Array |
includePredictionStatus | (Optional) Include the prediction_status column in the output; defaults to false. | true | Boolean |
Note
For binary classification, includeProbabilities also controls the THRESHOLD and POSITIVE_CLASS columns.
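The effect of these filters on a binary classification output can be sketched as a function that lists the expected columns. This is an illustration of the rules described in this section, not DataRobot's implementation; the helper name `expected_binary_columns` is hypothetical and the column order is an assumption.

```python
def expected_binary_columns(target_name, positive_label, negative_label,
                            include_probabilities=True,
                            include_prediction_status=False):
    """List the output columns expected for a binary classification job.

    Mirrors the rules in this section: includeProbabilities also controls
    THRESHOLD and POSITIVE_CLASS. Column order is an assumption.
    """
    columns = []
    if include_probabilities:
        columns.append(f"{target_name}_{positive_label}_PREDICTION")
        columns.append(f"{target_name}_{negative_label}_PREDICTION")
    columns.append(f"{target_name}_PREDICTION")
    if include_probabilities:
        columns.extend(["THRESHOLD", "POSITIVE_CLASS"])
    if include_prediction_status:
        columns.append("prediction_status")
    return columns


default_columns = expected_binary_columns("isbadbuy", "1", "0")
label_only = expected_binary_columns(
    "isbadbuy", "1", "0",
    include_probabilities=False,
    include_prediction_status=True,
)
print(default_columns)
print(label_only)
```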
Column name remapping¶
If your use case has a strict output schema that does not match the DataRobot output, you can rename or remove any columns from the output using the columnNamesRemapping job configuration property.
Output column name remapping parameters | ||
---|---|---|
Job parameter | Description | Example value |
columnNamesRemapping | (Optional) Provide a list of items to remap (rename or remove columns from) the output from this job. Set an outputName for the column to null or false to ignore it. | [{'inputName': 'isbadbuy_1_PREDICTION', 'outputName':'prediction'}, {'inputName': 'isbadbuy_0_PREDICTION', 'outputName': null}] |
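The remapping semantics can be illustrated by applying the example value to a single row locally. This sketch only mimics the behavior described in the table (rename when outputName is a string, drop when it is null or false); the helper name `apply_remapping` is hypothetical and the code is not DataRobot's implementation.

```python
def apply_remapping(row, remapping):
    """Rename or drop columns of one row according to a remapping list.

    An outputName of None (JSON null) or False drops the column;
    columns not mentioned in the remapping pass through unchanged.
    """
    renames = {item["inputName"]: item["outputName"] for item in remapping}
    result = {}
    for column, value in row.items():
        if column not in renames:
            result[column] = value
            continue
        output_name = renames[column]
        if output_name is None or output_name is False:
            continue  # column removed from the output
        result[output_name] = value
    return result


# The example value from the table above, written as Python literals.
remapping = [
    {"inputName": "isbadbuy_1_PREDICTION", "outputName": "prediction"},
    {"inputName": "isbadbuy_0_PREDICTION", "outputName": None},
]
row = {
    "isbadbuy_1_PREDICTION": 0.28,
    "isbadbuy_0_PREDICTION": 0.72,
    "isbadbuy_PREDICTION": "0",
}
remapped = apply_remapping(row, remapping)
print(remapped)
```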