Includes only jobs whose status matches this value. Repeat the parameter to filter on multiple statuses.
source
query
any
false
Includes only jobs whose source matches this value. Repeat the parameter to filter on multiple sources. Prefix values with a dash (-) to exclude those sources.
deploymentId
query
string
false
Includes only jobs for this particular deployment
modelId
query
string
false
ID of the Leaderboard model used by the job to process the predictions dataset.
jobId
query
string
false
Includes only the job with this specific ID.
orderBy
query
string
false
Sort order applied to the batch prediction list. Prefix the attribute name with a dash to sort in descending order, e.g. "-created".
allJobs
query
boolean
false
[DEPRECATED - replaced with RBAC permission model] - No effect
cutoffHours
query
integer
false
Only list jobs created at most this many hours ago.
startDateTime
query
string(date-time)
false
ISO-formatted datetime of the earliest time the job was added (inclusive). For example "2008-08-24T12:00:00Z". If set, cutoffHours is ignored.
endDateTime
query
string(date-time)
false
ISO-formatted datetime of the latest time the job was added (inclusive). For example "2008-08-24T12:00:00Z".
batchPredictionJobDefinitionId
query
string
false
Includes only jobs for this particular definition
hostname
query
any
false
Includes only jobs for this particular prediction instance hostname
batchJobType
query
any
false
Includes only jobs whose batch job type matches this value. Repeat the parameter to filter on multiple types.
intakeType
query
any
false
Includes only jobs with these particular intake types.
outputType
query
any
false
Includes only jobs with these particular output types.
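As an illustration of how these filters combine, here is a minimal sketch using Python and requests; the host, token, status values, and IDs are placeholders rather than values taken from this reference::

    import requests

    API = "https://app.datarobot.com/api/v2"  # placeholder host
    HEADERS = {"Authorization": "Bearer <token>"}

    # Repeat `status` to match several statuses; prefix a `source` value with
    # a dash to exclude it. A list of tuples yields repeated query parameters.
    params = [
        ("status", "RUNNING"),
        ("status", "INITIALIZING"),   # illustrative status values
        ("source", "-manual"),        # exclude this source (value is illustrative)
        ("deploymentId", "<deploymentId>"),
        ("orderBy", "-created"),      # newest first
    ]
    jobs = requests.get(f"{API}/batchPredictions/", headers=HEADERS, params=params).json()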
If the job is running, it will be aborted. Then it will be removed, meaning all underlying data will be deleted and the job is removed from the list of jobs.
Create a Batch Prediction Job definition. A configuration for a Batch Prediction job which can either be executed manually upon request or on scheduled intervals, if enabled. The API payload is the same as for /batchPredictions along with optional enabled and schedule items.
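A minimal sketch of such a definition payload follows; the intake and output settings are illustrative, and the cron-style shape of the schedule object is an assumption to be checked against the schema below::

    import requests

    API = "https://app.datarobot.com/api/v2"  # placeholder host
    HEADERS = {"Authorization": "Bearer <token>"}

    definition = {
        # Same payload as for POST /batchPredictions ...
        "deploymentId": "<deploymentId>",
        "intakeSettings": {"type": "s3", "url": "s3://bucket/to_score.csv"},
        "outputSettings": {"type": "s3", "url": "s3://bucket/scored.csv"},
        # ... plus the optional scheduling items:
        "enabled": True,
        "schedule": {  # assumed cron-like fields: run daily at 06:00 UTC
            "minute": [0], "hour": [6],
            "dayOfMonth": ["*"], "month": ["*"], "dayOfWeek": ["*"],
        },
    }
    requests.post(f"{API}/batchPredictionJobDefinitions/", headers=HEADERS, json=definition)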
Includes only jobs whose status matches this value. Repeat the parameter to filter on multiple statuses.
source
query
any
false
Includes only jobs whose source matches this value. Repeat the parameter to filter on multiple sources. Prefix values with a dash (-) to exclude those sources.
deploymentId
query
string
false
Includes only jobs for this particular deployment
modelId
query
string
false
ID of the Leaderboard model used by the job to process the predictions dataset.
jobId
query
string
false
Includes only the job with this specific ID.
orderBy
query
string
false
Sort order applied to the batch prediction list. Prefix the attribute name with a dash to sort in descending order, e.g. "-created".
allJobs
query
boolean
false
[DEPRECATED - replaced with RBAC permission model] - No effect
cutoffHours
query
integer
false
Only list jobs created at most this many hours ago.
startDateTime
query
string(date-time)
false
ISO-formatted datetime of the earliest time the job was added (inclusive). For example "2008-08-24T12:00:00Z". If set, cutoffHours is ignored.
endDateTime
query
string(date-time)
false
ISO-formatted datetime of the latest time the job was added (inclusive). For example "2008-08-24T12:00:00Z".
batchPredictionJobDefinitionId
query
string
false
Includes only jobs for this particular definition
hostname
query
any
false
Includes only jobs for this particular prediction instance hostname
intakeType
query
any
false
Includes only jobs with these particular intake types.
outputType
query
any
false
Includes only jobs with these particular output types.
If the job is running, it will be aborted. Then it will be removed, meaning all underlying data will be deleted and the job is removed from the list of jobs.
If a job has finished execution, regardless of the result, its parameters can be changed to ensure better filtering in the job list upon retrieval. Another use case is updating a job's scoring status externally.
To perform this operation, you must be authenticated by means of one of the following methods:
BearerAuth
GET /api/v2/projects/{projectId}/models/{modelId}/predictionExplanationsInitialization/¶
Retrieve the current PredictionExplanationsInitialization.
A PredictionExplanationsInitialization is a pre-requisite for successfully computing prediction explanations using a particular model, and can be used to preview the prediction explanations that would be generated for a complete dataset.
If provided, only jobs with the same status will be included in the results; otherwise, queued and inprogress jobs (but not errored jobs) will be returned.
To perform this operation, you must be authenticated by means of one of the following methods:
BearerAuth
POST /api/v2/projects/{projectId}/predictionExplanations/¶
Create a new PredictionExplanations object (and its accompanying PredictionExplanationsRecord).
In order to successfully create PredictionExplanations for a particular model and dataset, you must first
- Compute feature impact for the model via POST /api/v2/projects/{projectId}/models/{modelId}/featureImpact/
- Compute a PredictionExplanationsInitialization for the model via POST /api/v2/projects/{projectId}/models/{modelId}/predictionExplanationsInitialization/
- Compute predictions for the model and dataset via POST /api/v2/projects/{projectId}/predictions/
thresholdHigh and thresholdLow are optional filters applied to speed up computation. When at least one is specified, only the selected outlier rows will have prediction explanations computed. Rows are considered outliers if their predicted value (for regression projects) or probability of being the positive class (for classification projects) is less than thresholdLow or greater than thresholdHigh. If neither is specified, prediction explanations will be computed for all rows.
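Putting those prerequisites in order, a hedged sketch of the whole sequence (each POST is asynchronous in practice; polling of the returned jobs is omitted, and all IDs and the host are placeholders)::

    import requests

    API = "https://app.datarobot.com/api/v2"  # placeholder host
    HEADERS = {"Authorization": "Bearer <token>"}
    pid, mid = "<projectId>", "<modelId>"

    # 1. Feature impact for the model
    requests.post(f"{API}/projects/{pid}/models/{mid}/featureImpact/", headers=HEADERS)
    # 2. Prediction explanations initialization for the model
    requests.post(
        f"{API}/projects/{pid}/models/{mid}/predictionExplanationsInitialization/",
        headers=HEADERS,
    )
    # 3. Predictions for the model and dataset
    requests.post(f"{API}/projects/{pid}/predictions/", headers=HEADERS,
                  json={"modelId": mid, "datasetId": "<datasetId>"})
    # 4. The explanations themselves; thresholds restrict computation to outlier rows
    requests.post(f"{API}/projects/{pid}/predictionExplanations/", headers=HEADERS,
                  json={"modelId": mid, "datasetId": "<datasetId>",
                        "thresholdLow": 0.1, "thresholdHigh": 0.9})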
To perform this operation, you must be authenticated by means of one of the following methods:
BearerAuth
GET /api/v2/projects/{projectId}/predictionExplanations/{predictionExplanationsId}/¶
Retrieve stored Prediction Explanations.
Each PredictionExplanationsRow retrieved corresponds to a row of the prediction dataset, although some rows may not have had prediction explanations computed depending on the thresholds selected.
To perform this operation, you must be authenticated by means of one of the following methods:
BearerAuth
GET /api/v2/projects/{projectId}/predictionExplanationsRecords/¶
List PredictionExplanationsRecord objects for a project.
These contain metadata about the computed prediction explanations and the location at which the PredictionExplanations can be retrieved.
To perform this operation, you must be authenticated by means of one of the following methods:
BearerAuth
GET /api/v2/projects/{projectId}/predictionExplanationsRecords/{predictionExplanationsId}/¶
Retrieve a PredictionExplanationsRecord object.
A PredictionExplanationsRecord contains metadata about the computed prediction explanations and the location at which the PredictionExplanations can be retrieved.
There are two ways of making predictions. The recommended way is to first upload your
dataset to the project, and then using the corresponding datasetId, predict against
that dataset. To follow that pattern, send the json request body.
Note that requesting prediction intervals will automatically trigger backtesting if
backtests were not already completed for this model.
The legacy method which is deprecated is to send the file
directly with the predictions request. If you need to predict against a file 10MB in
size or larger, you will be required to use the above workflow for uploaded datasets.
However, the following multipart/form-data can be used with small files:
:form file: a dataset to make predictions on
:form modelId: the model to use to make predictions
.. note:: If using the legacy method of uploading data to this endpoint, a new dataset
will be created behind the scenes. For performance reasons, it is much better
to create the dataset first and then use the supported method
of making predictions with this endpoint. However, to preserve the functionality of
existing workflows, the legacy method still exists.
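A sketch of the recommended workflow; the upload endpoint path and the response field holding the dataset ID are assumptions (in practice the upload is asynchronous and the ID becomes available once it finishes)::

    import requests

    API = "https://app.datarobot.com/api/v2"  # placeholder host
    HEADERS = {"Authorization": "Bearer <token>"}
    pid = "<projectId>"

    # Upload the dataset once, then predict against its datasetId.
    with open("to_score.csv", "rb") as f:
        upload = requests.post(
            f"{API}/projects/{pid}/predictionDatasets/fileUploads/",  # assumed path
            headers=HEADERS, files={"file": f},
        )
    dataset_id = upload.json()["id"]  # assumed field; available once the upload job finishes

    requests.post(f"{API}/projects/{pid}/predictions/", headers=HEADERS,
                  json={"modelId": "<modelId>", "datasetId": dataset_id})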
To perform this operation, you must be authenticated by means of one of the following methods:
BearerAuth
GET /api/v2/projects/{projectId}/predictions/{predictionId}/¶
Retrieve predictions that have previously been computed.
Training predictions encoded either as JSON or CSV.
If CSV output was requested, the returned CSV data will contain the following columns:
For regression projects: row_id and prediction.
For binary classification projects: row_id, prediction,
class_<positive_class_label> and class_<negative_class_label>.
For multiclass projects: row_id, prediction and a
class_<class_label> for each class.
For multilabel projects: row_id and for each class
prediction_<class_label> and class_<class_label>.
For time-series, these additional columns will be added: forecast_point,
forecast_distance, timestamp, and series_id.
.. minversion:: v2.21
* If `explanationAlgorithm` = 'shap', these additional columns will be added:
triplets of (`Explanation_<i>_feature_name`,
`Explanation_<i>_feature_value`, and `Explanation_<i>_strength`) for `i` ranging
from 1 to `maxExplanations`, `shap_remaining_total` and `shap_base_value`. Binary
classification projects will also have `explained_class`, the class for which
positive SHAP values imply an increased probability.
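A small sketch of consuming the CSV form of the response; requesting CSV via the Accept header is an assumption, and the host, token, and IDs are placeholders::

    import csv
    import io

    import requests

    API = "https://app.datarobot.com/api/v2"  # placeholder host
    HEADERS = {"Authorization": "Bearer <token>", "Accept": "text/csv"}  # assumed CSV negotiation

    resp = requests.get(
        f"{API}/projects/<projectId>/predictions/<predictionId>/", headers=HEADERS
    )
    rows = list(csv.DictReader(io.StringIO(resp.text)))
    # Columns vary by project type, as listed above; row_id and prediction are common.
    print(rows[0]["row_id"], rows[0]["prediction"])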
The number of scheduled jobs to skip. Defaults to 0.
limit
query
integer
true
The number of scheduled jobs (max 100) to return. Defaults to 20.
orderBy
query
string
false
The order to sort the scheduled jobs. Defaults to order by last successful run timestamp in descending order.
search
query
string
false
Case-insensitive search against the scheduled job's name or type name.
deploymentId
query
string
false
Filter by the prediction integration deployment ID. Ignored for non-prediction-integration type IDs.
typeId
query
string
false
Filter by scheduled job type ID.
queryByUser
query
string
false
Which user field to filter with.
filterEnabled
query
string
false
Filter jobs using the enabled field. If true, only enabled jobs are returned, otherwise if false, only disabled jobs are returned. The default returns both enabled and disabled jobs.
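For example, a paged listing of enabled scheduled jobs matched case-insensitively by name might look like the following sketch; the /scheduledJobs/ path is an assumption inferred from these parameters::

    import requests

    API = "https://app.datarobot.com/api/v2"  # placeholder host
    HEADERS = {"Authorization": "Bearer <token>"}

    params = {
        "offset": 0,
        "limit": 20,
        "search": "nightly scoring",  # case-insensitive match on name or type name
        "filterEnabled": "true",      # only enabled jobs
    }
    resp = requests.get(f"{API}/scheduledJobs/", headers=HEADERS, params=params)  # assumed path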
For Parquet directory-scoring only. The column names of the intake data by which to partition the dataset. Columns are partitioned in the order they are given. At least one value is required if scoring to a directory (meaning the output URL ends with a slash, "/").
Either the populated value of the field or [redacted] due to permission settings
oneOf
Name
Type
Required
Restrictions
Description
» anonymous
string,null
false
Use the specified credential to access the url
xor
Name
Type
Required
Restrictions
Description
» anonymous
string
false
none
continued
Name
Type
Required
Restrictions
Description
format
string
false
Type of output file format
partitionColumns
[string]
false
maxItems: 100
For Parquet directory-scoring only. The column names of the intake data by which to partition the dataset. Columns are partitioned in the order they are given. At least one value is required if scoring to a directory (meaning the output URL ends with a slash, "/").
ID of the monitoring batch created by this job. Only present if the job runs on a deployment with batch monitoring enabled.
percentageCompleted
number
true
maximum: 100 minimum: 0
Indicates job progress, based on the number of already-processed rows in the dataset.
queuePosition
integer,null
false
minimum: 0
To ensure a dedicated prediction instance is not overloaded, only one job will be run against it at a time. This is the number of jobs awaiting processing before this job starts running. May not be available in all environments.
queued
boolean
true
The job has been put on the queue for execution.
resultsDeleted
boolean
false
Indicates if the job was subject to garbage collection and had its artifacts deleted (output files, if any, and scoring data on local storage)
scoredRows
integer
true
minimum: 0
Number of rows that have been used in prediction computation
skippedRows
integer
true
minimum: 0
Number of rows that have been skipped during scoring. May be non-zero only for time series predictions, when the provided dataset contains more than the required historical rows.
ID of the deployment used by the job to process the predictions dataset.
disableRowLevelErrorHandling
boolean
true
Skip row-by-row error handling.
explanationAlgorithm
string
false
Which algorithm will be used to calculate prediction explanations
explanationClassNames
[string]
false
maxItems: 10 minItems: 1
List of class names that will be explained for each row for multiclass. Mutually exclusive with explanationNumTopClasses. If neither is specified, explanationNumTopClasses=1 is assumed.
explanationNumTopClasses
integer
false
maximum: 10 minimum: 1
Number of top predicted classes for each row that will be explained for multiclass. Mutually exclusive with explanationClassNames. If neither is specified, explanationNumTopClasses=1 is assumed.
includePredictionStatus
boolean
true
Include prediction status column in the output
includeProbabilities
boolean
true
Include probabilities for all classes
includeProbabilitiesClasses
[string]
true
maxItems: 100
Include only probabilities for these specific class names.
Override the default prediction instance from the deployment when scoring this job.
predictionWarningEnabled
boolean,null
false
Enable prediction warnings.
redactedFields
[string]
true
A list of qualified field names from intake- and/or outputSettings that were redacted due to permissions and sharing settings. For example: intakeSettings.dataStoreId
skipDriftTracking
boolean
true
Skip drift tracking for this job.
thresholdHigh
number
false
Compute explanations for predictions above this threshold
thresholdLow
number
false
Compute explanations for predictions below this threshold
timeseriesSettings
any
false
Time series settings, included if this job is a time series job.
Used for forecast predictions to override the forecast point inferred from the dataset.
relaxKnownInAdvanceFeaturesCheck
boolean
false
If activated, missing values in the known in advance features are allowed in the forecast window at prediction time. If omitted or false, missing values are not allowed.
type
string
true
Forecast mode makes predictions using the forecastPoint or the rows in the dataset without a target.
Used for historical predictions to override the date up to which predictions should be calculated. By default, the value is inferred automatically from the dataset.
predictionsStartDate
string(date-time)
false
Used for historical predictions to override the date from which predictions should be calculated. By default, the value is inferred automatically from the dataset.
relaxKnownInAdvanceFeaturesCheck
boolean
false
If activated, missing values in the known in advance features are allowed in the forecast window at prediction time. If omitted or false, missing values are not allowed.
type
string
true
Historical mode enables bulk predictions, calculating predictions for all possible forecast points and forecast distances in the dataset within the predictionsStartDate/predictionsEndDate range.
ID of the deployment used by the job to process the predictions dataset.
disableRowLevelErrorHandling
boolean
true
Skip row-by-row error handling.
explanationAlgorithm
string
false
Which algorithm will be used to calculate prediction explanations
explanationClassNames
[string]
false
maxItems: 10 minItems: 1
List of class names that will be explained for each row for multiclass. Mutually exclusive with explanationNumTopClasses. If neither is specified, explanationNumTopClasses=1 is assumed.
explanationNumTopClasses
integer
false
maximum: 10 minimum: 1
Number of top predicted classes for each row that will be explained for multiclass. Mutually exclusive with explanationClassNames. If neither is specified, explanationNumTopClasses=1 is assumed.
includePredictionStatus
boolean
true
Include prediction status column in the output
includeProbabilities
boolean
true
Include probabilities for all classes
includeProbabilitiesClasses
[string]
true
maxItems: 100
Include only probabilities for these specific class names.
Override the default prediction instance from the deployment when scoring this job.
predictionThreshold
number
false
maximum: 1 minimum: 0
Threshold is the point that sets the class boundary for a predicted value. The model classifies an observation below the threshold as FALSE, and an observation above the threshold as TRUE. In other words, DataRobot automatically assigns the positive class label to any prediction exceeding the threshold. This value can be set between 0.0 and 1.0.
predictionWarningEnabled
boolean,null
false
Enable prediction warnings.
secondaryDatasetsConfigId
string
false
Configuration ID for secondary datasets to use when making a prediction.
skipDriftTracking
boolean
true
Skip drift tracking for this job.
thresholdHigh
number
false
Compute explanations for predictions above this threshold
thresholdLow
number
false
Compute explanations for predictions below this threshold
timeseriesSettings
any
false
Time series settings, included if this job is a time series job.
ID of the deployment used by the job to process the predictions dataset.
disableRowLevelErrorHandling
boolean
true
Skip row-by-row error handling.
explanationAlgorithm
string
false
Which algorithm will be used to calculate prediction explanations
explanationClassNames
[string]
false
maxItems: 10 minItems: 1
List of class names that will be explained for each row for multiclass. Mutually exclusive with explanationNumTopClasses. If neither is specified, explanationNumTopClasses=1 is assumed.
explanationNumTopClasses
integer
false
maximum: 10 minimum: 1
Number of top predicted classes for each row that will be explained for multiclass. Mutually exclusive with explanationClassNames. If neither is specified, explanationNumTopClasses=1 is assumed.
includePredictionStatus
boolean
true
Include prediction status column in the output
includeProbabilities
boolean
true
Include probabilities for all classes
includeProbabilitiesClasses
[string]
true
maxItems: 100
Include only probabilities for these specific class names.
Override the default prediction instance from the deployment when scoring this job.
predictionWarningEnabled
boolean,null
false
Enable prediction warnings.
redactedFields
[string]
true
A list of qualified field names from intake- and/or outputSettings that were redacted due to permissions and sharing settings. For example: intakeSettings.dataStoreId
skipDriftTracking
boolean
true
Skip drift tracking for this job.
thresholdHigh
number
false
Compute explanations for predictions above this threshold
thresholdLow
number
false
Compute explanations for predictions below this threshold
timeseriesSettings
any
false
Time series settings, included if this job is a time series job.
ID of the deployment used by the job to process the predictions dataset.
disableRowLevelErrorHandling
boolean
true
Skip row-by-row error handling.
enabled
boolean
false
If this job definition is enabled as a scheduled job. Optional if no schedule is supplied.
explanationAlgorithm
string
false
Which algorithm will be used to calculate prediction explanations
explanationClassNames
[string]
false
maxItems: 10 minItems: 1
List of class names that will be explained for each row for multiclass. Mutually exclusive with explanationNumTopClasses. If neither is specified, explanationNumTopClasses=1 is assumed.
explanationNumTopClasses
integer
false
maximum: 10 minimum: 1
Number of top predicted classes for each row that will be explained for multiclass. Mutually exclusive with explanationClassNames. If neither is specified, explanationNumTopClasses=1 is assumed.
includePredictionStatus
boolean
true
Include prediction status column in the output
includeProbabilities
boolean
true
Include probabilities for all classes
includeProbabilitiesClasses
[string]
true
maxItems: 100
Include only probabilities for these specific class names.
Override the default prediction instance from the deployment when scoring this job.
predictionThreshold
number
false
maximum: 1 minimum: 0
Threshold is the point that sets the class boundary for a predicted value. The model classifies an observation below the threshold as FALSE, and an observation above the threshold as TRUE. In other words, DataRobot automatically assigns the positive class label to any prediction exceeding the threshold. This value can be set between 0.0 and 1.0.
ID of the deployment used by the job to process the predictions dataset.
disableRowLevelErrorHandling
boolean
false
Skip row-by-row error handling.
enabled
boolean
false
If this job definition is enabled as a scheduled job. Optional if no schedule is supplied.
explanationAlgorithm
string
false
Which algorithm will be used to calculate prediction explanations
explanationClassNames
[string]
false
maxItems: 10 minItems: 1
List of class names that will be explained for each row for multiclass. Mutually exclusive with explanationNumTopClasses. If neither is specified, explanationNumTopClasses=1 is assumed.
explanationNumTopClasses
integer
false
maximum: 10 minimum: 1
Number of top predicted classes for each row that will be explained for multiclass. Mutually exclusive with explanationClassNames. If neither is specified, explanationNumTopClasses=1 is assumed.
includePredictionStatus
boolean
false
Include prediction status column in the output
includeProbabilities
boolean
false
Include probabilities for all classes
includeProbabilitiesClasses
[string]
false
maxItems: 100
Include only probabilities for these specific class names.
Override the default prediction instance from the deployment when scoring this job.
predictionThreshold
number
false
maximum: 1 minimum: 0
Threshold is the point that sets the class boundary for a predicted value. The model classifies an observation below the threshold as FALSE, and an observation above the threshold as TRUE. In other words, DataRobot automatically assigns the positive class label to any prediction exceeding the threshold. This value can be set between 0.0 and 1.0.
ID of the monitoring batch created by this job. Only present if the job runs on a deployment with batch monitoring enabled.
percentageCompleted
number
true
maximum: 100 minimum: 0
Indicates job progress, based on the number of already-processed rows in the dataset.
queuePosition
integer,null
false
minimum: 0
To ensure a dedicated prediction instance is not overloaded, only one job will be run against it at a time. This is the number of jobs awaiting processing before this job starts running. May not be available in all environments.
queued
boolean
true
The job has been put on the queue for execution.
resultsDeleted
boolean
false
Indicates if the job was subject to garbage collection and had its artifacts deleted (output files, if any, and scoring data on local storage)
scoredRows
integer
true
minimum: 0
Number of rows that have been used in prediction computation
skippedRows
integer
true
minimum: 0
Number of rows that have been skipped during scoring. May be non-zero only for time series predictions, when the provided dataset contains more than the required historical rows.
ID of the deployment used by the job to process the predictions dataset.
disableRowLevelErrorHandling
boolean
true
Skip row-by-row error handling.
explanationAlgorithm
string
false
Which algorithm will be used to calculate prediction explanations
explanationClassNames
[string]
false
maxItems: 10 minItems: 1
List of class names that will be explained for each row for multiclass. Mutually exclusive with explanationNumTopClasses. If neither is specified, explanationNumTopClasses=1 is assumed.
explanationNumTopClasses
integer
false
maximum: 10 minimum: 1
Number of top predicted classes for each row that will be explained for multiclass. Mutually exclusive with explanationClassNames. If neither is specified, explanationNumTopClasses=1 is assumed.
includePredictionStatus
boolean
true
Include prediction status column in the output
includeProbabilities
boolean
true
Include probabilities for all classes
includeProbabilitiesClasses
[string]
true
maxItems: 100
Include only probabilities for these specific class names.
Override the default prediction instance from the deployment when scoring this job.
predictionThreshold
number
false
maximum: 1 minimum: 0
Threshold is the point that sets the class boundary for a predicted value. The model classifies an observation below the threshold as FALSE, and an observation above the threshold as TRUE. In other words, DataRobot automatically assigns the positive class label to any prediction exceeding the threshold. This value can be set between 0.0 and 1.0.
predictionWarningEnabled
boolean,null
false
Enable prediction warnings.
redactedFields
[string]
true
A list of qualified field names from intake- and/or outputSettings that were redacted due to permissions and sharing settings. For example: intakeSettings.dataStoreId
secondaryDatasetsConfigId
string
false
Configuration ID for secondary datasets to use when making a prediction.
skipDriftTracking
boolean
true
Skip drift tracking for this job.
thresholdHigh
number
false
Compute explanations for predictions above this threshold
thresholdLow
number
false
Compute explanations for predictions below this threshold
timeseriesSettings
any
false
Time series settings, included if this job is a time series job.
Used for forecast predictions to override the forecast point inferred from the dataset.
relaxKnownInAdvanceFeaturesCheck
boolean
false
If activated, missing values in the known in advance features are allowed in the forecast window at prediction time. If omitted or false, missing values are not allowed.
type
string
true
Forecast mode makes predictions using the forecastPoint or the rows in the dataset without a target.
If activated, missing values in the known in advance features are allowed in the forecast window at prediction time. If omitted or false, missing values are not allowed.
type
string
true
Forecast mode makes predictions using the forecastPoint or the rows in the dataset without a target.
Used for historical predictions to override the date up to which predictions should be calculated. By default, the value is inferred automatically from the dataset.
predictionsStartDate
string(date-time)
false
Used for historical predictions to override the date from which predictions should be calculated. By default, the value is inferred automatically from the dataset.
relaxKnownInAdvanceFeaturesCheck
boolean
false
If activated, missing values in the known in advance features are allowed in the forecast window at prediction time. If omitted or false, missing values are not allowed.
type
string
true
Historical mode enables bulk predictions, calculating predictions for all possible forecast points and forecast distances in the dataset within the predictionsStartDate/predictionsEndDate range.
If activated, missing values in the known in advance features are allowed in the forecast window at prediction time. If omitted or false, missing values are not allowed.
type
string
true
Forecast mode used for making predictions on subsets of training data.
Number of bytes in the intake dataset for this job
jobOutputSize
integer,null
false
Number of bytes in the output dataset for this job
logs
[string]
false
The job log.
scoredRows
integer
false
Number of rows that have been used in prediction computation
skippedRows
integer
false
Number of rows that have been skipped during scoring. May be non-zero only for time series predictions, when the provided dataset contains more than the required historical rows.
For time series projects only. Actual value column name, valid for the prediction files if the project is unsupervised and the dataset is considered a bulk predictions dataset. This value is optional.
datasetId
string
true
The dataset to compute predictions for - must have previously been uploaded.
explanationAlgorithm
string
false
If set to shap, the response will include prediction explanations based on the SHAP explainer (SHapley Additive exPlanations). Defaults to null (no prediction explanations).
forecastPoint
string(date-time)
false
For time series projects only. The time in the dataset relative to which predictions are generated. This value is optional. If not specified the default value is the value in the row with the latest specified timestamp. Specifying this value for a project that is not a time series project will result in an error.
includeFdwCounts
boolean
false
For time series projects with partial history only. Indicates whether feature derivation window counts (featureDerivationWindowCounts) will be part of the response.
includePredictionIntervals
boolean
false
Specifies whether prediction intervals should be calculated for this request. Defaults to True if predictionIntervalsSize is specified, otherwise defaults to False.
maxExplanations
integer
false
maximum: 100 minimum: 1
Specifies the maximum number of explanation values that should be returned for each row, ordered by absolute value, greatest to least. In the case of 'shap': If not set, explanations are returned for all features. If the number of features is greater than the 'maxExplanations', the sum of remaining values will also be returned as 'shapRemainingTotal'. Defaults to null for datasets narrower than 100 columns, defaults to 100 for datasets wider than 100 columns. Cannot be set if 'explanationAlgorithm' is omitted.
modelId
string
true
The model to make predictions on.
predictionIntervalsSize
integer
false
maximum: 100 minimum: 1
Represents the percentile to use for the size of the prediction intervals. Defaults to 80 if includePredictionIntervals is True.
predictionThreshold
number
false
maximum: 1 minimum: 0
Threshold used for binary classification in predictions. Accepts values from 0.0 to 1.0. If not specified, model default prediction threshold will be used.
predictionsEndDate
string(date-time)
false
The end date for bulk predictions, exclusive. Used for time series projects only. Note that this parameter is used for generating historical predictions using the training data, not for future predictions. If not specified, the dataset is not considered as a bulk predictions dataset. This parameter should be provided in conjunction with a predictionsStartDate, and cannot be provided with the forecastPoint parameter.
predictionsStartDate
string(date-time)
false
The start date for bulk predictions. Used for time series projects only. Note that this parameter is used for generating historical predictions using the training data, not for future predictions. If not specified, the dataset is not considered as a bulk predictions dataset. This parameter should be provided in conjunction with a predictionsEndDate, and cannot be provided with the forecastPoint parameter.
Subset of data predicted on: The value "all" returns predictions for all rows in the dataset including data used for training, validation, holdout and any rows discarded. This is not available for large datasets or projects created with Date/Time partitioning. The value "validationAndHoldout" returns predictions for the rows used to calculate the validation score and the holdout score. Not available for large projects or Date/Time projects for models trained into the validation set. The value "holdout" returns predictions for the rows used to calculate the holdout score. Not available for projects created without a holdout or for models trained into holdout for large datasets or created with Date/Time partitioning. The value "allBacktests" returns predictions for the rows used to calculate the backtesting scores for Date/Time projects. The value "validation" returns predictions for the rows used to calculate the validation score.
explanationAlgorithm
string
false
If set to "shap", the response will include prediction explanations based on the SHAP explainer (SHapley Additive exPlanations). Defaults to null (no prediction explanations)
maxExplanations
integer
false
maximum: 100 minimum: 1
Specifies the maximum number of explanation values that should be returned for each row, ordered by absolute value, greatest to least. In the case of "shap": If not set, explanations are returned for all features. If the number of features is greater than the "maxExplanations", the sum of remaining values will also be returned as "shapRemainingTotal". Defaults to null for datasets narrower than 100 columns, defaults to 100 for datasets wider than 100 columns. Cannot be set if "explanationAlgorithm" is omitted.
If true, known-in-advance features in this dataset have missing values in the forecast window. Absence of the known-in-advance values can negatively impact prediction quality. Only applies for time series projects.
insufficientRowsForEvaluatingModels
boolean
false
If true, the dataset has a target column present, indicating it can be used to evaluate model performance, but has too few rows for such evaluation to be trustworthy. If false, either it has no target column at all or it has sufficient rows for model evaluation. Only applies for regression, binary classification, multiclass classification projects and time series unsupervised projects.
singleClassActualValueColumn
boolean
false
If true, the actual value column has only one class, and insights such as the ROC curve cannot be calculated. Only applies for binary classification projects or unsupervised projects.
The Google Cloud Platform (GCP) key. Output is the downloaded JSON resulting from creating a service account User Managed Key (in the IAM & admin > Service accounts section of GCP). Required if googleConfigId/configId is not specified. Cannot include this parameter if googleConfigId/configId is specified.
For Parquet directory-scoring only. The column names of the intake data by which to partition the dataset. Columns are partitioned in the order they are given. At least one value is required if scoring to a directory (meaning the output URL ends with a slash, "/").
Either the populated value of the field or [redacted] due to permission settings
oneOf
Name
Type
Required
Restrictions
Description
» anonymous
string,null
false
Use the specified credential to access the url
xor
Name
Type
Required
Restrictions
Description
» anonymous
string
false
none
continued
Name
Type
Required
Restrictions
Description
format
string
false
Type of input file format
partitionColumns
[string]
false
maxItems: 100
For Parquet directory-scoring only. The column names of the intake data by which to partition the dataset. Columns are partitioned in the order they are given. At least one value is required if scoring to a directory (meaning the output URL ends with a slash, "/").
The Google Cloud Platform (GCP) key. Output is the downloaded JSON resulting from creating a service account User Managed Key (in the IAM & admin > Service accounts section of GCP). Required if googleConfigId/configId is not specified. Cannot include this parameter if googleConfigId/configId is specified.
googleConfigId
string
false
ID of secure configurations shared by an admin. This is deprecated. Please use configId instead. If specified, cannot include gcpKey.
The name of the specified database catalog to read input data from.
credentialId
any
false
Either the populated value of the field or [redacted] due to permission settings
oneOf
Name
Type
Required
Restrictions
Description
» anonymous
string,null
false
The ID of the credential holding information about a user with read access to the JDBC data source.
xor
Name
Type
Required
Restrictions
Description
» anonymous
string
false
none
continued
Name
Type
Required
Restrictions
Description
dataStoreId
any
true
Either the populated value of the field or [redacted] due to permission settings
oneOf
Name
Type
Required
Restrictions
Description
» anonymous
string
false
ID of the data store to connect to
xor
Name
Type
Required
Restrictions
Description
» anonymous
string
false
none
continued
Name
Type
Required
Restrictions
Description
fetchSize
integer
false
maximum: 1000000 minimum: 1
A user-specified fetch size. Changing it can be used to balance throughput and memory usage. Deprecated and ignored since v2.21.
query
string
false
A self-supplied SELECT statement of the dataset you wish to score. Helpful for supplying a more fine-grained selection of data not achievable through specification of "table" and/or "schema" parameters exclusively. If this job is executed from a job definition, template variables are available which will be substituted with timestamps: {{ current_run_timestamp }}, {{ last_completed_run_time }}, {{ last_scheduled_run_time }}, {{ next_scheduled_run_time }}, {{ current_run_time }}
schema
string
false
The name of the specified database schema to read input data from.
table
string
false
The name of the specified database table to read input data from.
The name of the specified database catalog to read input data from.
credentialId
string,null
false
The ID of the credential holding information about a user with read access to the JDBC data source.
dataStoreId
string
true
ID of the data store to connect to
fetchSize
integer
false
maximum: 1000000 minimum: 1
A user-specified fetch size. Changing it can be used to balance throughput and memory usage. Deprecated and ignored since v2.21.
query
string
false
A self-supplied SELECT statement of the dataset you wish to score. Helpful for supplying a more fine-grained selection of data not achievable through specification of "table" and/or "schema" parameters exclusively. If this job is executed from a job definition, template variables are available which will be substituted with timestamps: {{ current_run_timestamp }}, {{ last_completed_run_time }}, {{ last_scheduled_run_time }}, {{ next_scheduled_run_time }}, {{ current_run_time }}
schema
string
false
The name of the specified database schema to read input data from.
table
string
false
The name of the specified database table to read input data from.
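A sketch of JDBC intake settings using the template variables described above; IDs are placeholders and the surrounding batch prediction payload is omitted::

    # Passed as `intakeSettings` in the batch prediction job payload.
    intake_settings = {
        "type": "jdbc",
        "dataStoreId": "<dataStoreId>",
        "credentialId": "<credentialId>",
        # When run from a job definition, the template variable below is
        # substituted with a timestamp at execution time.
        "query": (
            "SELECT * FROM scoring.new_rows "
            "WHERE updated_at > '{{ last_completed_run_time }}'"
        ),
    }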
The name of the specified database catalog to write output data to.
commitInterval
integer
false
maximum: 86400 minimum: 0
Defines the time interval, in seconds, between each commit to the JDBC source. If set to 0, the batch prediction operation will write the entire job before committing.
createTableIfNotExists
boolean
false
Attempt to create the table first if no existing one is detected, before writing data with the strategy defined in the statementType parameter.
credentialId
string,null
false
The ID of the credential holding information about a user with write access to the JDBC data source.
dataStoreId
string
true
ID of the data store to connect to
schema
string
false
The name of the specified database schema to write the results to.
statementType
string
true
The statement type to use when writing the results. Deprecation warning: use of create_table is now discouraged. Use one of the other statement types along with the parameter createTableIfNotExists set to true.
table
string
true
The name of the specified database table to write the results to. If this job is executed from a job definition, template variables are available which will be substituted with timestamps: {{ current_run_timestamp }}, {{ last_completed_run_time }}, {{ last_scheduled_run_time }}, {{ next_scheduled_run_time }}, {{ current_run_time }}
type
string
true
Type name for this output type
updateColumns
[string]
false
maxItems: 100
The column names to be updated if statementType is set to either update or upsert.
whereColumns
[string]
false
maxItems: 100
The column names to be used in the where clause if statementType is set to update or upsert.
The name of the specified database catalog to write output data to.
commitInterval
integer
false
maximum: 86400 minimum: 0
Defines the time interval, in seconds, between each commit to the JDBC source. If set to 0, the batch prediction operation will write the entire job before committing.
createTableIfNotExists
boolean
false
Attempt to create the table first if no existing one is detected, before writing data with the strategy defined in the statementType parameter.
credentialId
any
false
Either the populated value of the field or [redacted] due to permission settings
oneOf
Name
Type
Required
Restrictions
Description
» anonymous
string,null
false
The ID of the credential holding information about a user with write access to the JDBC data source.
xor
Name
Type
Required
Restrictions
Description
» anonymous
string
false
none
continued
Name
Type
Required
Restrictions
Description
dataStoreId
any
true
Either the populated value of the field or [redacted] due to permission settings
oneOf
Name
Type
Required
Restrictions
Description
» anonymous
string
false
ID of the data store to connect to
xor
Name
Type
Required
Restrictions
Description
» anonymous
string
false
none
continued
Name
Type
Required
Restrictions
Description
schema
string
false
The name of the specified database schema to write the results to.
statementType
string
true
The statement type to use when writing the results. Deprecation warning: use of create_table is now discouraged. Use one of the other statement types along with the parameter createTableIfNotExists set to true.
table
string
true
The name of the specified database table to write the results to. If this job is executed from a job definition, template variables are available which will be substituted with timestamps: {{ current_run_timestamp }}, {{ last_completed_run_time }}, {{ last_scheduled_run_time }}, {{ next_scheduled_run_time }}, {{ current_run_time }}
type
string
true
Type name for this output type
updateColumns
[string]
false
maxItems: 100
The column names to be updated if statementType is set to either update or upsert.
whereColumns
[string]
false
maxItems: 100
The column names to be used in the where clause if statementType is set to update or upsert.
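A sketch of JDBC output settings following the guidance above: a plain insert statement type combined with createTableIfNotExists rather than the discouraged create_table. IDs and names are placeholders::

    # Passed as `outputSettings` in the batch prediction job payload.
    output_settings = {
        "type": "jdbc",
        "dataStoreId": "<dataStoreId>",
        "credentialId": "<credentialId>",
        "schema": "analytics",
        "table": "scored_results",
        "statementType": "insert",        # preferred over create_table
        "createTableIfNotExists": True,   # create the table on first run if missing
        "commitInterval": 600,            # commit every 10 minutes; 0 = one commit at the end
    }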
Customize whether the forecast point derived from the job run time needs to be shifted.
type
string
true
Type of the forecast point policy. The forecast point is based on the scheduled run time of the job, or on the current moment in UTC if the job was launched manually. The run time can be adjusted backwards or forwards.
Offset to apply to the scheduled run time of the job, in ISO-8601 format, to obtain a relative forecast point. Example of a positive offset: 'P2DT5H3M'; example of a negative offset: '-P2DT5H4M'.
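A sketch of how such an offset shifts a run time into a forecast point, using the third-party isodate package (an assumption; any ISO-8601 duration parser would do)::

    from datetime import datetime, timezone

    import isodate  # third-party: pip install isodate

    def shifted_forecast_point(run_time: datetime, offset: str) -> datetime:
        """Apply an ISO-8601 duration offset such as 'P2DT5H3M' or '-P2DT5H4M'."""
        negative = offset.startswith("-")
        delta = isodate.parse_duration(offset.lstrip("-"))
        return run_time - delta if negative else run_time + delta

    run = datetime(2024, 1, 10, 12, 0, tzinfo=timezone.utc)
    print(shifted_forecast_point(run, "-P2DT5H4M"))  # 2024-01-08 06:56:00+00:00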
The default behavior (async: true) will still submit the job to the queue and start processing as soon as the upload is started. Setting it to false will postpone submitting the job to the queue until all data has been uploaded. This is helpful if the user is on a bad connection and bottlenecked by the upload speed. Instead of blocking the queue, this will allow others to submit to the queue until the upload has finished.
multipart
boolean
false
Specify whether the data will be uploaded in multiple parts instead of as a single file.
The default behavior (async: true) will still submit the job to the queue and start processing as soon as the upload is started. Setting it to false will postpone submitting the job to the queue until all data has been uploaded. This is helpful if the user is on a bad connection and bottlenecked by the upload speed. Instead of blocking the queue, this will allow others to submit to the queue until the upload has finished.
multipart
boolean
false
Specify whether the data will be uploaded in multiple parts instead of as a single file.
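A sketch of a localFile intake with async set to false, so the job is only queued once the upload completes; the host, IDs, and the csvUpload link field are assumptions::

    import requests

    API = "https://app.datarobot.com/api/v2"  # placeholder host
    HEADERS = {"Authorization": "Bearer <token>"}

    # Create the job first; a localFile intake is uploaded separately.
    job = requests.post(f"{API}/batchPredictions/", headers=HEADERS, json={
        "deploymentId": "<deploymentId>",
        "intakeSettings": {"type": "localFile", "async": False},  # queue only after upload
        "outputSettings": {"type": "localFile"},
    }).json()

    # Stream the dataset to the upload link referenced by the job.
    upload_url = job["links"]["csvUpload"]  # assumed response field
    with open("to_score.csv", "rb") as f:
        requests.put(upload_url, headers={**HEADERS, "Content-Type": "text/csv"}, data=f)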
The ID of the latest version of the catalog entry.
password
string
true
The password (in cleartext) for database authentication. The password will be encrypted on the server side within the scope of the HTTP request and is never saved or stored.
url
string
false
The link to retrieve more detailed information about the entity that uses this catalog dataset.
For regression problems, this will be the name of the target column, 'Anomaly Score', or an ignored field. For classification projects, this will be the name of the class.
oneOf
Name
Type
Required
Restrictions
Description
» anonymous
string
false
none
xor
Name
Type
Required
Restrictions
Description
» anonymous
number
false
none
continued
Name
Type
Required
Restrictions
Description
threshold
number
false
maximum: 1 minimum: 0
Threshold used in multilabel classification for this class.
value
number
true
The predicted probability of the class identified by the label.
For time series projects only. The time in the dataset relative to which predictions are generated. This value is optional. If not specified the default value is the value in the row with the latest specified timestamp. Specifying this value for a project that is not a time series project will result in an error.
password
string
false
The password (in cleartext) for database authentication. The password will be encrypted on the server side within the scope of the HTTP request and is never saved or stored. DEPRECATED: please use credentialId or credentialData instead.
predictionsEndDate
string(date-time)
false
The end date for bulk predictions, exclusive. Used for time series projects only. Note that this parameter is used for generating historical predictions using the training data, not for future predictions. If not specified, the dataset is not considered as a bulk predictions dataset. This parameter should be provided in conjunction with a predictionsStartDate, and cannot be provided with the forecastPoint parameter.
predictionsStartDate
string(date-time)
false
The start date for bulk predictions. Used for time series projects only. Note that this parameter is used for generating historical predictions using the training data, not for future predictions. If not specified, the dataset is not considered as a bulk predictions dataset. This parameter should be provided in conjunction with a predictionsEndDate, and cannot be provided with the forecastPoint parameter.
relaxKnownInAdvanceFeaturesCheck
boolean
false
For time series projects only. If true, missing values in the known in advance features are allowed in the forecast window at the prediction time. This value is optional. If omitted or false, missing values are not allowed.
secondaryDatasetsConfigId
string
false
For feature discovery projects only. The ID of the alternative secondary dataset config to use during prediction.
useKerberos
boolean
false
If true, use kerberos authentication for database authentication. Default is false.
user
string
false
The username for database authentication. DEPRECATED: please use credentialId or credentialData instead.
Optional; only available for unsupervised projects when the dataset was uploaded with an actual value column specified. Name of the column that will be used to calculate the classification metrics and insights.
catalogId
string,null
true
The ID of the AI catalog entry used to create the prediction dataset, or None if not created from the AI catalog.
catalogVersionId
string,null
true
The ID of the AI catalog version used to create the prediction dataset, or None if not created from the AI catalog.
containsTargetValues
boolean,null
false
If true, the dataset contains target values and can be used to calculate the classification metrics and insights. Only applies for supervised projects.
created
string(date-time)
true
The date string of when the dataset was created, of the format YYYY-mm-ddTHH:MM:SS.ssssssZ, like 2016-06-09T11:32:34.170338Z.
dataEndDate
string(date-time)
false
Only available for time series projects, a date string representing the maximum primary date of the prediction dataset.
Only available for unsupervised projects, a list of detected actualValueColumnInfo objects which can be used to calculate the classification metrics and insights.
forecastPoint
string,null
true
The date string of the forecastPoint of this prediction dataset. Only non-null for time series projects.
forecastPointRange
[string]
false
Only available for time series projects, the start and end of the range of dates available for use as the forecast point, detected based on the uploaded prediction dataset.
id
string
true
The ID of this dataset.
maxForecastDate
string(date-time)
false
Only available for time series projects, a date string representing the maximum forecast date of this prediction dataset.
name
string
true
The name of the dataset when it was uploaded.
numColumns
integer
true
The number of columns in this dataset.
numRows
integer
true
The number of rows in this dataset.
predictionsEndDate
string,null(date-time)
true
The date string of the prediction end date of this prediction dataset. Used for bulk predictions. Note that this parameter is for generating historical predictions using the training data. Only non-null for time series projects.
predictionsStartDate
string,null(date-time)
true
The date string of the prediction start date of this prediction dataset. Used for bulk predictions. Note that this parameter is for generating historical predictions using the training data. Only non-null for time series projects.
projectId
string
true
The project ID that owns this dataset.
secondaryDatasetsConfigId
string
false
Only available for feature discovery projects. ID of the secondary dataset config used by the dataset for the prediction.
The name of the feature contributing to the prediction.
featureValue
string
true
The value the feature took on for this row. For image features, this value is the URL of the input image (New in v2.21).
imageExplanationUrl
string,null
true
For image features, the URL of the image containing the input image overlaid by the activation heatmap. For non-image features, this field is null.
label
string
true
Describes what this model output corresponds to. For regression projects, it is the name of the target feature. For classification projects, it is a level from the target feature. For Anomaly Detection models it is an Anomaly Score.
For text features, an array of JSON objects containing the per-ngram text prediction explanations.
qualitativeStrength
string
true
A human-readable description of how strongly the feature affected the prediction. A large positive effect is denoted '+++', medium '++', small '+', very small '<+'. A large negative effect is denoted '---', medium '--', small '-', very small '<-'.
strength
number
true
The amount this feature's value affected the prediction.
List of class names that will be explained for each row for multiclass. Mutually exclusive with numTopClasses. If neither is specified, numTopClasses=1 is assumed.
datasetId
string
true
The dataset ID.
maxExplanations
integer
false
maximum: 10 minimum: 0
The maximum number of prediction explanations to supply per row of the dataset.
modelId
string
true
The model ID.
numTopClasses
integer
false
maximum: 10 minimum: 1
Number of top predicted classes for each row that will be explained for multiclass. Mutually exclusive with classNames. If neither is specified, numTopClasses=1 is assumed.
thresholdHigh
number,null
false
The high threshold, above which a prediction must score in order for prediction explanations to be computed. If neither thresholdHigh nor thresholdLow is specified, prediction explanations will be computed for all rows.
thresholdLow
number,null
false
The lower threshold, below which a prediction must score in order for prediction explanations to be computed for a row in the dataset. If neither thresholdHigh nor thresholdLow is specified, prediction explanations will be computed for all rows.
The maximum number of prediction explanations to supply per row of the dataset.
thresholdHigh
number,null
false
The high threshold, above which a prediction must score in order for prediction explanations to be computed. If neither thresholdHigh nor thresholdLow is specified, prediction explanations will be computed for all rows.
thresholdLow
number,null
false
The lower threshold, below which a prediction must score in order for prediction explanations to be computed for a row in the dataset. If neither thresholdHigh nor thresholdLow is specified, prediction explanations will be computed for all rows.
Each is a PredictionExplanationsRow. They represent a small sample of the prediction explanations that could be generated for a particular dataset. They will have the same schema as the data array in the response from GET /api/v2/projects/{projectId}/predictionExplanations/{predictionExplanationsId}/. As of v2.21, the only difference is that there is no forecastPoint in the response for time series projects.
Will be present only if explanationAlgorithm = 'shap' and maxExplanations is nonzero. The total of SHAP values for features beyond the maxExplanations. This can be identically 0 in all rows, if maxExplanations is greater than the number of features and thus all features are returned.
The name of the feature contributing to the prediction.
featureValue
any
true
The value the feature took on for this row. The type corresponds to the feature (bool, int, float, str, etc.).
oneOf
Name
Type
Required
Restrictions
Description
» anonymous
integer
false
none
xor
Name
Type
Required
Restrictions
Description
» anonymous
boolean
false
none
xor
Name
Type
Required
Restrictions
Description
» anonymous
string
false
none
xor
Name
Type
Required
Restrictions
Description
» anonymous
number
false
none
continued
Name
Type
Required
Restrictions
Description
label
any
true
Describes what output was driven by this prediction explanation. For regression projects, it is the name of the target feature. For classification projects, it is the class whose probability increasing would correspond to a positive strength of this prediction explanation. For predictions made using anomaly detection models, it is the Anomaly Score.
oneOf
Name
Type
Required
Restrictions
Description
» anonymous
string
false
none
xor
Name
Type
Required
Restrictions
Description
» anonymous
number
false
none
continued
Name
Type
Required
Restrictions
Description
strength
number,null
false
Algorithm-specific explanation value attributed to the feature in this row. If explanationAlgorithm = shap, this is the SHAP value.
Describes what this model output corresponds to. For regression projects, it is the name of the target feature. For classification projects, it is a level from the target feature. For Anomaly Detection models it is an Anomaly Score.
value
number
true
The output of the prediction. For regression projects, it is the predicted value of the target. For classification projects, it is the predicted probability the row belongs to the class identified by the label.
Timestamp referencing when computation for these prediction explanations finished.
id
string
true
The PredictionExplanationsRecord ID.
maxExplanations
integer
true
The maximum number of codes generated per prediction.
modelId
string
true
The model ID.
numColumns
integer
true
The number of columns prediction explanations were computed for.
predictionExplanationsLocation
string
true
Where to retrieve the prediction explanations.
predictionThreshold
number,null
true
The threshold value used for binary classification prediction.
projectId
string
true
The project ID.
thresholdHigh
number,null
true
The prediction explanation high threshold. Predictions must be above this value (or below the thresholdLow value) to have PredictionExplanations computed.
thresholdLow
number,null
true
The prediction explanation low threshold. Predictions must be below this value (or above the thresholdHigh value) to have PredictionExplanations computed.
'exposureNormalized' (for regression projects with exposure) or 'N/A' (for classification projects). The value of 'exposureNormalized' indicates that prediction outputs are adjusted (or divided) by exposure. The value of 'N/A' indicates that no adjustments are applied to the adjusted predictions and they are identical to the unadjusted predictions.
count
integer
true
How many rows of prediction explanations were returned.
Actual value column name, valid for the prediction files if the project is unsupervised and the dataset is considered a bulk predictions dataset.
credentials
string
false
A list of credentials for the secondary datasets used in a feature discovery project.
file
string(binary)
true
The dataset file to upload for prediction.
forecastPoint
string(date-time)
false
For time series projects only. The time in the dataset relative to which predictions are generated. If not specified the default value is the value in the row with the latest specified timestamp. Specifying this value for a project that is not a time series project will result in an error.
predictionsEndDate
string(date-time)
false
Used for time series projects only. The end date for bulk predictions. Note that this parameter is used for generating historical predictions using the training data, not for future predictions. If not specified, the dataset is not considered as a bulk predictions dataset. This parameter should be provided in conjunction with a predictionsStartDate, and cannot be provided with the forecastPoint parameter.
predictionsStartDate
string(date-time)
false
Used for time series projects only. The start date for bulk predictions. Note that this parameter is used for generating historical predictions using the training data, not for future predictions. If not specified, the dataset is not considered as a bulk predictions dataset. This parameter should be provided in conjunction with a predictionsEndDate, and cannot be provided with the forecastPoint parameter.
relaxKnownInAdvanceFeaturesCheck
string
false
A boolean flag. If true, missing values in the known in advance features are allowed in the forecast window at the prediction time. If omitted or false, missing values are not allowed. For time series projects only.
secondaryDatasetsConfigId
string
false
Optional, for feature discovery projects only. The ID of the alternative secondary dataset config to use during prediction.
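To make this multipart upload schema concrete, here is a minimal sketch using Python's requests library. The base URL, route, and token are illustrative assumptions; only the form-field names come from the schema above, and forecastPoint must not be combined with the predictionsStartDate/predictionsEndDate pair.

```python
# Minimal sketch: upload a prediction dataset as multipart/form-data.
# Assumptions: base URL, route, and token are placeholders for illustration;
# the form-field names (file, forecastPoint, ...) come from the schema above.
import requests

API = "https://app.datarobot.com/api/v2"  # assumed base URL
HEADERS = {"Authorization": "Bearer YOUR_API_TOKEN"}  # assumed auth scheme
project_id = "PROJECT_ID"

with open("to_score.csv", "rb") as f:
    resp = requests.post(
        f"{API}/projects/{project_id}/predictionDatasets/fileUploads/",  # assumed route
        headers=HEADERS,
        files={"file": ("to_score.csv", f)},
        data={
            # Time series only; mutually exclusive with predictionsStartDate/EndDate:
            "forecastPoint": "2008-08-24T12:00:00Z",
        },
    )
resp.raise_for_status()
print(resp.json())
```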
The ID of the dataset entry to use as the prediction dataset.
datasetVersionId
string
false
The ID of the dataset version to use for the prediction dataset. If not specified, the latest version associated with datasetId is used.
forecastPoint
string(date-time)
false
For time series projects only. The time in the dataset relative to which predictions are generated. This value is optional. If not specified, the default value is the value in the row with the latest specified timestamp. Specifying this value for a project that is not a time series project will result in an error.
password
string
false
The password (in cleartext) for database authentication. The password is encrypted server-side within the scope of the HTTP request and is never saved or stored. DEPRECATED: please use credentialId or credentialData instead.
predictionsEndDate
string(date-time)
false
The end date for bulk predictions, exclusive. Used for time series projects only. Note that this parameter is used for generating historical predictions using the training data, not for future predictions. If not specified, the dataset is not considered as a bulk predictions dataset. This parameter should be provided in conjunction with a predictionsStartDate, and cannot be provided with the forecastPoint parameter.
predictionsStartDate
string(date-time)
false
The start date for bulk predictions. Used for time series projects only. Note that this parameter is used for generating historical predictions using the training data, not for future predictions. If not specified, the dataset is not considered as a bulk predictions dataset. This parameter should be provided in conjunction with a predictionsEndDate, and cannot be provided with the forecastPoint parameter.
relaxKnownInAdvanceFeaturesCheck
boolean
false
For time series projects only. If True, missing values in the known in advance features are allowed in the forecast window at the prediction time. If omitted or False, missing values are not allowed.
secondaryDatasetsConfigId
string
false
For feature discovery projects only. The ID of the alternative secondary dataset config to use during prediction.
useKerberos
boolean
false
If true, Kerberos authentication is used for the database connection. Default is false.
user
string
false
The username for database authentication. DEPRECATED: please use credentialId or credentialData instead.
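For the catalog-backed variant above, the request body is plain JSON. A minimal sketch follows, assuming the same base URL and a hypothetical route; the key names come from the schema, and the date pair cannot be combined with forecastPoint.

```python
# Minimal sketch: register a catalog dataset for scoring via a JSON body.
# The route is an assumption; key names come from the schema above.
import requests

payload = {
    "datasetId": "DATASET_ID",
    "datasetVersionId": "DATASET_VERSION_ID",  # optional; latest version if omitted
    # Time series bulk predictions: use the date pair, not forecastPoint.
    "predictionsStartDate": "2008-08-01T00:00:00Z",
    "predictionsEndDate": "2008-08-24T12:00:00Z",  # exclusive
    "credentialId": "CREDENTIAL_ID",  # preferred over deprecated user/password
}
resp = requests.post(
    "https://app.datarobot.com/api/v2/projects/PROJECT_ID/predictionDatasets/datasetUploads/",  # assumed route
    headers={"Authorization": "Bearer YOUR_API_TOKEN"},
    json=payload,
)
resp.raise_for_status()
```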
In the case of an unsupervised time series project with a dataset using predictionsStartDate and predictionsEndDate for bulk predictions and a specified actual value column, the predictions will be a JSON array in the same format as with a forecast point, with one additional element, actualValues: the actual value in the row.
forecastDistance
integer,null
false
(if time series project) The number of time units this prediction is away from the forecastPoint. The unit of time is determined by the timeUnit of the datetime partition column.
forecastPoint
string,null(date-time)
false
(if time series project) The forecastPoint of the predictions. Either provided or inferred.
originalFormatTimestamp
string
false
The timestamp of this row in the prediction dataset. Unlike the timestamp field, this field will keep the same DateTime formatting as the uploaded prediction dataset. (This column is shown if enabled by your administrator.)
positiveProbability
number,null
false
minimum: 0
For binary classification, the probability the row belongs to the positive class.
prediction
any
true
The prediction of the model.
oneOf
Name
Type
Required
Restrictions
Description
» anonymous
number
false
If using a regressor model, will be the numeric value of the target.
xor
Name
Type
Required
Restrictions
Description
» anonymous
string
false
If using a binary or multiclass classifier model, will be the predicted class.
xor
Name
Type
Required
Restrictions
Description
» anonymous
[string]
false
If using a multilabel classifier model, will be a list of predicted classes.
An array of predictionExplanation objects. The number of elements in the array is bounded by maxExplanations and the feature count. Present only if explanationAlgorithm is not null (prediction explanations were requested).
predictionIntervalLowerBound
number
false
Present if includePredictionIntervals is True. Indicates a lower bound of the estimate of error based on test data.
predictionIntervalUpperBound
number
false
Present if includePredictionIntervals is True. Indicates an upper bound of the estimate of error based on test data.
predictionThreshold
number
false
maximum: 1 minimum: 0
Threshold used for binary classification in predictions.
The row in the prediction dataset this prediction corresponds to.
segmentId
string
false
The ID of the segment value for a segmented project.
seriesId
string,null
false
The ID of the series value for a multiseries project. For time series projects that are not multiseries, this will be NaN.
target
string,null
false
In the case of a time series project with a dataset using predictionsStartDate and predictionsEndDate for bulk predictions, the predictions will be a JSON array in the same format as with a forecast point, with one additional element, target: the target value in the row.
timestamp
string,null(date-time)
false
(if time series project) The timestamp of this row in the prediction dataset.
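Because prediction is a oneOf (number, string, or list of strings), client code should branch on its runtime type. A minimal sketch, assuming the parsed response rows are available and that the row-index field is named rowId:

```python
# Minimal sketch: branch on the three documented shapes of `prediction`.
# Assumptions: `rows` is the parsed predictions array and the row-index
# field is named rowId.
def describe(row: dict) -> str:
    pred = row["prediction"]
    if isinstance(pred, list):   # multilabel classifier: list of classes
        return f"row {row['rowId']}: classes {pred}"
    if isinstance(pred, str):    # binary/multiclass classifier: class label
        return f"row {row['rowId']}: class {pred} (p={row.get('positiveProbability')})"
    return f"row {row['rowId']}: value {pred}"  # regression: numeric target

for row in rows:
    print(describe(row))
```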
For time series unsupervised projects only. Will be present only if the prediction dataset has an actual value column. The name of the column with actuals that was used to calculate the scores and insights.
explanationAlgorithm
string,null
false
The selected algorithm to use for prediction explanations. At present, the only acceptable value is 'shap', which selects the SHapley Additive exPlanations (SHAP) explainer. Defaults to null (no prediction explanations).
featureDerivationWindowCounts
integer,null
false
For time series projects with partial history only. Indicates how many points were used during feature derivation within the feature derivation window.
includesPredictionIntervals
boolean
false
For time series projects only. Indicates if prediction intervals will be part of the response. Defaults to False.
maxExplanations
integer,null
false
The maximum number of prediction explanations values to be returned with each row in the predictions json array. Null indicates 'no limit'. Will be present only if explanationAlgorithm was set.
positiveClass
any
true
For binary classification, the class of the target deemed the positive class. For all other project types this field will be null.
oneOf
Name
Type
Required
Restrictions
Description
» anonymous
string
false
none
xor
Name
Type
Required
Restrictions
Description
» anonymous
integer
false
none
xor
Name
Type
Required
Restrictions
Description
» anonymous
number
false
none
continued
Name
Type
Required
Restrictions
Description
predictionIntervalsSize
integer,null
false
For time series projects only. Will be present only if includePredictionIntervals is True. Indicates the percentile used for prediction intervals calculation. Defaults to 80.
The JSON array of predictions. The predictions in the response will have slightly different formats depending on the project type.
shapBaseValue
number,null
false
Will be present only if explanationAlgorithm = 'shap'. The model's average prediction over the training data. SHAP values are deviations from the base value.
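The SHAP-related fields above are driven by the compute request. A minimal sketch of such a request follows, assuming the predictions route and payload shape; the parameter names explanationAlgorithm and maxExplanations come from the schema.

```python
# Minimal sketch: request predictions with SHAP explanations enabled.
# The route and payload shape are assumptions; explanationAlgorithm and
# maxExplanations come from the schema above.
import requests

resp = requests.post(
    "https://app.datarobot.com/api/v2/projects/PROJECT_ID/predictions/",  # assumed route
    headers={"Authorization": "Bearer YOUR_API_TOKEN"},
    json={
        "modelId": "MODEL_ID",
        "datasetId": "PREDICTION_DATASET_ID",
        "explanationAlgorithm": "shap",  # only accepted value at present
        "maxExplanations": 5,            # null would mean no limit
    },
)
resp.raise_for_status()
```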
Actual value column name, valid for the prediction files if the project is unsupervised and the dataset is considered a bulk predictions dataset. This value is optional.
credentials
[oneOf]
false
maxItems: 30
A list of credentials for the secondary datasets used in a feature discovery project.
For time series projects only. The time in the dataset relative to which predictions are generated. If not specified, the default value is the value in the row with the latest specified timestamp. Specifying this value for a project that is not a time series project will result in an error.
predictionsEndDate
string(date-time)
false
Used for time series projects only. The end date for bulk predictions, exclusive. Note that this parameter is used for generating historical predictions using the training data, not for future predictions. If not specified, the dataset is not considered as a bulk predictions dataset. This parameter should be provided in conjunction with a predictionsStartDate, and cannot be provided with the forecastPoint parameter.
predictionsStartDate
string(date-time)
false
Used for time series projects only. The start date for bulk predictions. Note that this parameter is used for generating historical predictions using the training data, not for future predictions. If not specified, the dataset is not considered as a bulk predictions dataset. This parameter should be provided in conjunction with a predictionsEndDate, and cannot be provided with the forecastPoint parameter.
relaxKnownInAdvanceFeaturesCheck
boolean
false
For time series projects only. If true, missing values in the known in advance features are allowed in the forecast window at the prediction time. This value is optional. If omitted or false, missing values are not allowed.
secondaryDatasetsConfigId
string
false
For feature discovery projects only. The ID of the alternative secondary dataset config to use during prediction.
For time series unsupervised projects only. The actual value column can be used to calculate classification metrics and insights.
datasetId
string,null
false
Deprecated alias for predictionDatasetId.
explanationAlgorithm
string,null
false
The selected algorithm to use for prediction explanations. At present, the only acceptable value is shap, which selects the SHapley Additive exPlanations (SHAP) explainer. Defaults to null (no prediction explanations).
featureDerivationWindowCounts
integer,null
false
For time series projects with partial history only. Indicates how many points were used during feature derivation.
forecastPoint
string,null(date-time)
false
For time series projects only. The time in the dataset relative to which predictions were generated.
id
string
true
The ID of the prediction record.
includesPredictionIntervals
boolean
true
Whether the predictions include prediction intervals.
maxExplanations
integer,null
false
The maximum number of prediction explanations values to be returned with each row in the predictions json array. Null indicates no limit. Will be present only if explanationAlgorithm was set.
modelId
string
true
The model ID used for predictions.
predictionDatasetId
string,null
false
The dataset ID where the prediction data comes from. The field is available via the /api/v2/projects/<projectId>/predictionsMetadata/ route and replaces datasetId in the deprecated /api/v2/projects/<projectId>/predictions/ endpoint.
predictionIntervalsSize
integer,null
true
For time series projects only. If prediction intervals were computed, what percentile they represent. Will be None if includePredictionIntervals is False.
predictionThreshold
number,null
false
Threshold used for binary classification in predictions.
predictionsEndDate
string,null(date-time)
false
For time series projects only. The end date for bulk predictions, exclusive. Note that this parameter was used for generating historical predictions using the training data, not for future predictions.
predictionsStartDate
string,null(date-time)
false
For time series projects only. The start date for bulk predictions. Note that this parameter was used for generating historical predictions using the training data, not for future predictions.
Endpoint URL for the S3 connection (omit to use the default)
format
string
false
Type of output file format
partitionColumns
[string]
false
maxItems: 100
For Parquet directory scoring only. The column names of the intake data by which to partition the dataset. Columns are partitioned in the order they are given. At least one value is required when scoring to a directory (that is, when the output URL ends with a slash, "/").
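A minimal sketch of an S3 output-settings fragment for Parquet directory scoring, per the fields above. The "type" value and the surrounding job payload are assumptions; the other keys come from the schema.

```python
# Minimal sketch: S3 output settings for Parquet directory scoring.
# "type" is an assumed adapter name; other keys come from the schema above.
output_settings = {
    "type": "s3",                            # assumption
    "url": "s3://my-bucket/scored/",         # trailing slash => directory scoring
    "format": "parquet",
    "partitionColumns": ["region", "date"],  # partitioned in the order given
    "endpointUrl": "https://s3.eu-west-1.amazonaws.com",  # optional override
}
```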
Either the populated value of the field or [redacted] due to permission settings
oneOf
Name
Type
Required
Restrictions
Description
» anonymous
string,null
false
Use the specified credential to access the url
xor
Name
Type
Required
Restrictions
Description
» anonymous
string
false
none
continued
Name
Type
Required
Restrictions
Description
endpointUrl
string(url)
false
Endpoint URL for the S3 connection (omit to use the default)
format
string
false
Type of output file format
partitionColumns
[string]
false
maxItems: 100
For Parquet directory scoring only. The column names of the intake data by which to partition the dataset. Columns are partitioned in the order they are given. At least one value is required when scoring to a directory (that is, when the output URL ends with a slash, "/").
The date(s) of the month that the job will run. Allowed values are either [1 ... 31] or ["*"] for all days of the month. This field is additive with dayOfWeek, meaning the job will run both on the date(s) defined in this field and the day specified by dayOfWeek (for example, dates 1st, 2nd, 3rd, plus every Tuesday). If dayOfMonth is set to ["*"] and dayOfWeek is defined, the scheduler will trigger on every day of the month that matches dayOfWeek (for example, Tuesday the 2nd, 9th, 16th, 23rd, 30th). Invalid dates such as February 31st are ignored.
dayOfWeek
[number,string]
true
maxItems: 7
The day(s) of the week that the job will run. Allowed values are [0 .. 6] (where Sunday=0) or ["*"] for all days of the week. Strings, either 3-letter abbreviations or the full name of the day, can be used interchangeably (e.g., "sunday", "Sunday", "sun", or "Sun" all map to [0]). This field is additive with dayOfMonth, meaning the job will run both on the date specified by dayOfMonth and the day defined in this field.
hour
[number,string]
true
maxItems: 24
The hour(s) of the day that the job will run. Allowed values are either ["*"] meaning every hour of the day or [0 ... 23].
minute
[number,string]
true
maxItems: 60
The minute(s) of the hour that the job will run. Allowed values are either ["*"] meaning every minute of the hour or [0 ... 59].
month
[number,string]
true
maxItems: 12
The month(s) of the year that the job will run. Allowed values are either [1 ... 12] or ["*"] for all months of the year. Strings, either 3-letter abbreviations or the full name of the month, can be used interchangeably (e.g., "jan" or "october"). Months that are not compatible with dayOfMonth are ignored, for example {"dayOfMonth": [31], "month": ["feb"]}.
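Putting the five schedule fields together, here is a minimal sketch of a schedule that runs at 09:00 on the first of each month and, additively, every Monday (field names and value ranges per the definitions above).

```python
# Minimal sketch: a schedule object per the field definitions above.
# dayOfMonth and dayOfWeek are additive: this runs on the 1st of each
# month and on every Monday.
schedule = {
    "minute": [0],
    "hour": [9],
    "dayOfMonth": [1],
    "dayOfWeek": ["mon"],
    "month": ["*"],
}
```

Setting dayOfMonth to ["*"] alongside a dayOfWeek would instead restrict runs to that weekday, as described above.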
Position of the job in the queue. The value is 0 if the job is about to run, greater than 0 if the job is currently queued, or None if the job is not currently running.
running
boolean
true
True if the job is currently running, false otherwise.
The server-side encryption algorithm used when storing this object in Amazon S3 (for example, AES256, aws:kms).
customerAlgorithm
string
false
Specifies the algorithm to use when encrypting the object (for example, AES256).
customerKey
string
false
Specifies the customer-provided encryption key for Amazon S3 to use in encrypting data. This value is used to store the object and is then discarded; Amazon S3 does not store the encryption key. The key must be appropriate for use with the algorithm specified in customerAlgorithm and must be sent as a base64-encoded string.
kmsEncryptionContext
string
false
Specifies the Amazon Web Services KMS Encryption Context to use for object encryption. The value of this header is a base64-encoded UTF-8 string holding JSON with the encryption context key-value pairs.
kmsKeyId
string
false
Specifies the ID of the symmetric customer managed key to use for object encryption.
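For the customer-provided-key fields, the key itself must be base64-encoded before it is sent. A minimal sketch follows; the key material here is generated on the spot purely for illustration.

```python
# Minimal sketch: building SSE-C settings per the fields above. S3 uses the
# customer-provided key to encrypt the object, then discards it; the key
# must be sent base64-encoded. Key material here is illustrative only.
import base64
import os

raw_key = os.urandom(32)  # 256-bit customer-provided key (illustrative)
sse_settings = {
    "customerAlgorithm": "AES256",
    "customerKey": base64.b64encode(raw_key).decode("ascii"),
}
```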
The name of the specified database catalog to read input data from.
cloudStorageCredentialId
any
false
Either the populated value of the field or [redacted] due to permission settings
oneOf
Name
Type
Required
Restrictions
Description
» anonymous
string,null
false
The ID of the credential holding information about a user with read access to the cloud storage.
xor
Name
Type
Required
Restrictions
Description
» anonymous
string
false
none
continued
Name
Type
Required
Restrictions
Description
cloudStorageType
string
false
Type name for cloud storage
credentialId
any
false
Either the populated value of the field or [redacted] due to permission settings
oneOf
Name
Type
Required
Restrictions
Description
» anonymous
string,null
false
The ID of the credential holding information about a user with read access to the Snowflake data source.
xor
Name
Type
Required
Restrictions
Description
» anonymous
string
false
none
continued
Name
Type
Required
Restrictions
Description
dataStoreId
any
true
Either the populated value of the field or [redacted] due to permission settings
oneOf
Name
Type
Required
Restrictions
Description
» anonymous
string
false
ID of the data store to connect to
xor
Name
Type
Required
Restrictions
Description
» anonymous
string
false
none
continued
Name
Type
Required
Restrictions
Description
externalStage
string
true
External storage
query
string
false
A self-supplied SELECT statement for the dataset you wish to score. Helpful for supplying a more fine-grained selection of data than is achievable through the "table" and/or "schema" parameters alone. If this job is executed with a job definition, template variables are available and will be substituted with timestamps: {{ current_run_timestamp }}, {{ last_completed_run_time }}, {{ last_scheduled_run_time }}, {{ next_scheduled_run_time }}, {{ current_run_time }}.
schema
string
false
The name of the specified database schema to read input data from.
table
string
false
The name of the specified database table to read input data from.
The name of the specified database catalog to read input data from.
cloudStorageCredentialId
string,null
false
The ID of the credential holding information about a user with read access to the cloud storage.
cloudStorageType
string
false
Type name for cloud storage
credentialId
string,null
false
The ID of the credential holding information about a user with read access to the Snowflake data source.
dataStoreId
string
true
ID of the data store to connect to
externalStage
string
true
External storage
query
string
false
A self-supplied SELECT statement for the dataset you wish to score. Helpful for supplying a more fine-grained selection of data than is achievable through the "table" and/or "schema" parameters alone. If this job is executed with a job definition, template variables are available and will be substituted with timestamps: {{ current_run_timestamp }}, {{ last_completed_run_time }}, {{ last_scheduled_run_time }}, {{ next_scheduled_run_time }}, {{ current_run_time }}.
schema
string
false
The name of the specified database schema to read input data from.
table
string
false
The name of the specified database table to read input data from.
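A minimal sketch of a Snowflake intake fragment using one of the template variables listed above; the "type" value is an assumed adapter name, and the other keys come from the schema.

```python
# Minimal sketch: Snowflake intake settings with a scheduled-run template
# variable in the query. "type" is an assumed adapter name; other keys
# come from the schema above.
intake_settings = {
    "type": "snowflake",              # assumption
    "dataStoreId": "DATA_STORE_ID",
    "credentialId": "CREDENTIAL_ID",
    "externalStage": "MY_S3_STAGE",
    "query": (
        "SELECT * FROM SCORING.EVENTS "
        "WHERE updated_at > '{{ last_completed_run_time }}'"
    ),
}
```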
Either the populated value of the field or [redacted] due to permission settings
oneOf
Name
Type
Required
Restrictions
Description
» anonymous
string,null
false
The ID of the Azure credential holding information about a user with read access to the cloud storage.
xor
Name
Type
Required
Restrictions
Description
» anonymous
string
false
none
continued
Name
Type
Required
Restrictions
Description
credentialId
any
false
Either the populated value of the field or [redacted] due to permission settings
oneOf
Name
Type
Required
Restrictions
Description
» anonymous
string,null
false
The ID of the credential holding information about a user with read access to the JDBC data source.
xor
Name
Type
Required
Restrictions
Description
» anonymous
string
false
none
continued
Name
Type
Required
Restrictions
Description
dataStoreId
any
true
Either the populated value of the field or [redacted] due to permission settings
oneOf
Name
Type
Required
Restrictions
Description
» anonymous
string
false
ID of the data store to connect to
xor
Name
Type
Required
Restrictions
Description
» anonymous
string
false
none
continued
Name
Type
Required
Restrictions
Description
externalDataSource
string
true
External datasource name
query
string
false
A self-supplied SELECT statement for the dataset you wish to score. Helpful for supplying a more fine-grained selection of data than is achievable through the "table" and/or "schema" parameters alone. If this job is executed with a job definition, template variables are available and will be substituted with timestamps: {{ current_run_timestamp }}, {{ last_completed_run_time }}, {{ last_scheduled_run_time }}, {{ next_scheduled_run_time }}, {{ current_run_time }}.
schema
string
false
The name of the specified database schema to read input data from.
table
string
false
The name of the specified database table to read input data from.
The ID of the Azure credential holding information about a user with read access to the cloud storage.
credentialId
string,null
false
The ID of the credential holding information about a user with read access to the JDBC data source.
dataStoreId
string
true
ID of the data store to connect to
externalDataSource
string
true
External datasource name
query
string
false
A self-supplied SELECT statement for the dataset you wish to score. Helpful for supplying a more fine-grained selection of data than is achievable through the "table" and/or "schema" parameters alone. If this job is executed with a job definition, template variables are available and will be substituted with timestamps: {{ current_run_timestamp }}, {{ last_completed_run_time }}, {{ last_scheduled_run_time }}, {{ next_scheduled_run_time }}, {{ current_run_time }}.
schema
string
false
The name of the specified database schema to read input data from.
table
string
false
The name of the specified database table to read input data from.
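The same schema supports whole-table intake via schema/table instead of a query. A minimal sketch follows; the "type" value is an assumed adapter name (the Azure cloud-storage credential above suggests a Synapse-style adapter), and the other keys come from the schema.

```python
# Minimal sketch: table-based intake per the schema above. "type" is an
# assumed adapter name; other keys come from the schema.
intake_settings = {
    "type": "synapse",                          # assumption
    "dataStoreId": "DATA_STORE_ID",
    "credentialId": "JDBC_CREDENTIAL_ID",
    "cloudStorageCredentialId": "AZURE_CREDENTIAL_ID",
    "externalDataSource": "MY_EXTERNAL_DATA_SOURCE",
    "schema": "dbo",
    "table": "events_to_score",
    # Alternatively, supply "query" for a finer-grained selection.
}
```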
(if time series project) The number of time units this prediction is away from the forecastPoint. The unit of time is determined by the timeUnit of the datetime partition column.
forecastPoint
string,null(date-time)
false
(if time series project) The forecastPoint of the predictions. Either provided or inferred.
partitionId
string
true
The partition used for the prediction record
prediction
any
true
The prediction of the model.
oneOf
Name
Type
Required
Restrictions
Description
» anonymous
number
false
If using a regressor model, will be the numeric value of the target.
xor
Name
Type
Required
Restrictions
Description
» anonymous
string
false
If using a binary or multiclass classifier model, will be the predicted class.
xor
Name
Type
Required
Restrictions
Description
» anonymous
[string]
false
If using a multilabel classifier model, will be a list of predicted classes.
An array of predictionExplanation objects. The number of elements in the array is bounded by maxExplanations and the feature count. Present only if explanationAlgorithm is not null (prediction explanations were requested).
predictionThreshold
number
false
maximum: 1 minimum: 0
Threshold used for binary classification in predictions.
Additional information necessary to understand SHAP-based prediction explanations. Only present if explanationAlgorithm="shap" was set in the compute request.
timestamp
string,null(date-time)
false
(if time series project) The timestamp of this row in the prediction dataset.
Additional information necessary to understand SHAP-based prediction explanations. Only present if explanationAlgorithm="shap" was set in the compute request.
The model's average prediction over the training data. SHAP values are deviations from the base value.
shapRemainingTotal
integer
true
The total of SHAP values for features beyond maxExplanations. This can be identically 0 in all rows if maxExplanations is greater than the number of features, in which case all features are returned.
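Together, shapBaseValue, the per-row SHAP strengths, and shapRemainingTotal should reconstruct the prediction. A minimal sketch of that sanity check; where exactly these fields sit on the response is assumed here, and for models with a link function the identity may hold on the link scale rather than on the raw prediction.

```python
# Minimal sketch: reconstruct a prediction from its SHAP pieces, per the
# fields above. Assumptions: each explanation carries its SHAP value in
# `strength`, and the identity may hold on the link scale for some models.
def shap_reconstruction(shap_base_value, explanations, shap_remaining_total):
    explained = sum(e["strength"] for e in explanations)
    return shap_base_value + explained + shap_remaining_total
```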
The method used for calculating prediction explanations
id
string
true
ID of the training prediction job
maxExplanations
integer,null
false
maximum: 100 minimum: 0
The number of top contributors included in prediction explanations. Defaults to null for datasets narrower than 100 columns and to 100 for datasets wider than 100 columns.