# Portable batch predictions

> Portable batch predictions - How to use the portable batch predictions (PBP) with PPS and score data
> in a batch in an isolated environment.

This Markdown file sits beside the HTML page at the same path (with a `.md` suffix). It summarizes the topic and lists links for tools and LLM context.

Companion generated at `2026-04-24T16:03:56.620669+00:00` (UTC).

## Primary page

- [Portable batch predictions](https://docs.datarobot.com/en/docs/classic-ui/predictions/port-pred/pps/portable-batch-predictions.html): Full documentation for this topic (HTML).

## Sections on this page

- [Scoring methods](https://docs.datarobot.com/en/docs/classic-ui/predictions/port-pred/pps/portable-batch-predictions.html#scoring-methods): In-page section heading.
- [Job definitions](https://docs.datarobot.com/en/docs/classic-ui/predictions/port-pred/pps/portable-batch-predictions.html#job-definitions): In-page section heading.
- [Credentials environment variables](https://docs.datarobot.com/en/docs/classic-ui/predictions/port-pred/pps/portable-batch-predictions.html#credentials-environment-variables): In-page section heading.
- [Run portable batch predictions](https://docs.datarobot.com/en/docs/classic-ui/predictions/port-pred/pps/portable-batch-predictions.html#run-portable-batch-predictions): In-page section heading.
- [More examples](https://docs.datarobot.com/en/docs/classic-ui/predictions/port-pred/pps/portable-batch-predictions.html#more-examples): In-page section heading.
- [Filesystem scoring with single-model mode PPS](https://docs.datarobot.com/en/docs/classic-ui/predictions/port-pred/pps/portable-batch-predictions.html#filesystem-scoring-with-single-model-mode-pps): In-page section heading.
- [Filesystem scoring with multi-model mode PPS](https://docs.datarobot.com/en/docs/classic-ui/predictions/port-pred/pps/portable-batch-predictions.html#filesystem-scoring-with-multi-model-mode-pps): In-page section heading.
- [Filesystem scoring with multi-model mode PPS and integration with DR job status tracking](https://docs.datarobot.com/en/docs/classic-ui/predictions/port-pred/pps/portable-batch-predictions.html#filesystem-scoring-with-multi-model-mode-pps-and-integration-with-dr-job-status-tracking): In-page section heading.
- [JDBC scoring with single-model mode PPS](https://docs.datarobot.com/en/docs/classic-ui/predictions/port-pred/pps/portable-batch-predictions.html#jdbc-scoring-with-single-model-mode-pps): In-page section heading.
- [S3 scoring with single-model mode PPS](https://docs.datarobot.com/en/docs/classic-ui/predictions/port-pred/pps/portable-batch-predictions.html#s3-scoring-with-single-model-mode-pps): In-page section heading.
- [Snowflake scoring with multi-model mode PPS](https://docs.datarobot.com/en/docs/classic-ui/predictions/port-pred/pps/portable-batch-predictions.html#snowflake-scoring-with-multi-model-mode-pps): In-page section heading.
- [Time series scoring over Azure Blob with multi-model mode PPS](https://docs.datarobot.com/en/docs/classic-ui/predictions/port-pred/pps/portable-batch-predictions.html#ts-azure-scoring-with-multi-model-mode-pps): In-page section heading.

## Related documentation

- [Classic UI documentation](https://docs.datarobot.com/en/docs/classic-ui/index.html): Linked from this page.
- [Predictions](https://docs.datarobot.com/en/docs/classic-ui/predictions/index.html): Linked from this page.
- [Portable prediction methods](https://docs.datarobot.com/en/docs/classic-ui/predictions/port-pred/index.html): Linked from this page.
- [Portable Prediction Server](https://docs.datarobot.com/en/docs/classic-ui/predictions/port-pred/pps/index.html): Linked from this page.
- [Portable Prediction Server](https://docs.datarobot.com/en/docs/classic-ui/predictions/port-pred/pps/portable-pps.html): Linked from this page.
- [A JDBC driver](https://docs.datarobot.com/en/docs/platform/admin/manage-cluster/manage-drivers.html): Linked from this page.
- [batch predictions](https://docs.datarobot.com/en/docs/api/reference/public-api/batch_predictions.html#post-apiv2batchpredictions): Linked from this page.
- [intake options](https://docs.datarobot.com/en/docs/api/reference/batch-prediction-api/intake-options.html): Linked from this page.
- [output options](https://docs.datarobot.com/en/docs/api/reference/batch-prediction-api/output-options.html): Linked from this page.
- [Prediction Explanations](https://docs.datarobot.com/en/docs/classic-ui/modeling/analyze-models/understand/pred-explain/predex-overview.html): Linked from this page.
- [XEMP Prediction Explanations](https://docs.datarobot.com/en/docs/classic-ui/modeling/analyze-models/understand/pred-explain/xemp-pe.html): Linked from this page.

## Documentation content

# Portable batch predictions

> [!NOTE] Availability information
> The Portable Prediction Server is a premium feature exclusive to DataRobot MLOps. Contact your DataRobot representative or administrator for information on enabling this feature.

Portable batch predictions (PBP) let you score large amounts of data in disconnected environments. Before you can use portable batch predictions, you need to configure the [Portable Prediction Server](https://docs.datarobot.com/en/docs/classic-ui/predictions/port-pred/pps/portable-pps.html) (PPS), a DataRobot execution environment for DataRobot model packages (`.mlpkg` files) distributed as a self-contained Docker image. Portable batch predictions use the same Docker image as the PPS but run it in a different mode.

## Scoring methods

Portable batch predictions can use the following adapters to score datasets:

- Filesystem
- JDBC
- AWS S3
- Azure Blob
- GCS
- Snowflake
- Synapse

To run portable batch predictions, you need the following artifacts:

**SaaS:**

- A Portable Prediction Server Docker image
- A defined batch prediction job
- An ENV config file with credentials (optional)

**Self-Managed:**

- A Portable Prediction Server Docker image
- A defined batch prediction job
- An ENV config file with credentials (optional)
- [A JDBC driver](https://docs.datarobot.com/en/docs/platform/admin/manage-cluster/manage-drivers.html) (optional)


After you prepare these artifacts, you can [run portable batch predictions](https://docs.datarobot.com/en/docs/classic-ui/predictions/port-pred/pps/portable-batch-predictions.html#run-portable-batch-predictions). See also [additional examples](https://docs.datarobot.com/en/docs/classic-ui/predictions/port-pred/pps/portable-batch-predictions.html#more-examples) of running portable batch predictions.

## Job definitions

You can define jobs using a `JSON` config file in which you describe `prediction_endpoint`, `intake_settings`, `output_settings`, `timeseries_settings` (optional) for time series scoring, and `jdbc_settings` (optional) for JDBC scoring.

The `prediction_endpoint` describes how to access the PPS and is constructed as `<schema>://<hostname>:<port>`, where you define the following parameters:

| Parameter | Type | Description |
| --- | --- | --- |
| schema | string | http or https |
| hostname | string | The hostname of the instance where your PPS is running. |
| port | string | The port of the prediction API running inside the PPS. |
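As a minimal sketch of how the three parts combine, the following shell snippet assembles a `prediction_endpoint` value. The variable names and values are illustrative assumptions, not names that the PPS itself reads.

```shell
#!/bin/bash

# Illustrative values (assumptions): adjust to where your PPS is listening.
PPS_SCHEMA='http'
PPS_HOSTNAME='127.0.0.1'
PPS_PORT='8080'

# Compose <schema>://<hostname>:<port>
PREDICTION_ENDPOINT="${PPS_SCHEMA}://${PPS_HOSTNAME}:${PPS_PORT}"
echo "$PREDICTION_ENDPOINT"
```

The resulting string is what goes into the `prediction_endpoint` field of the job definition.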

The `jdbc_settings` object has the following attributes:

| Parameter | Type | Description |
| --- | --- | --- |
| url | string | The URL to connect via the JDBC interface. |
| class_name | string | The class name used as an entry point for JDBC communication. |
| driver_path | string | The path to the JDBC driver on your filesystem (available inside the PBP container). |
| template_name | string | The name of the template in case of write-back. To obtain the names of the supported templates, contact your DataRobot representative. |

The other parameters are similar to those available for standard [batch predictions](https://docs.datarobot.com/en/docs/api/reference/public-api/batch_predictions.html#post-apiv2batchpredictions); however, they are written in `snake_case`, not `camelCase`:

| Parameter | Type | Description |
| --- | --- | --- |
| abort_on_error | boolean | Enable or disable cancelling the portable batch prediction job if an error occurs. Example: true |
| chunk_size | string | Chunk the dataset for scoring in sequence as asynchronous tasks. In most cases, the default value will produce the best performance. Bigger chunks can be used to score very fast models and smaller chunks can be used to score very slow models. Example: "auto" |
| column_names_remapping | array | Rename or remove columns from the output for this job. Set an output_name for the column to null or false to remove it. Example: [{'input_name': 'isbadbuy_1_PREDICTION', 'output_name':'prediction'}, {'input_name': 'isbadbuy_0_PREDICTION', 'output_name': null}] |
| csv_settings | object | Set the delimiter, character encoding, and quote character for comma separated value (CSV) files. Example: { "delimiter": ",", "encoding": "utf-8", "quotechar": "\"" } |
| deployment_id | string | Define the ID of the deployment associated with the portable batch predictions. Example: 61f05aaf5f6525f43ed79751 |
| disable_row_level_error_handling | boolean | Enable or disable error handling by prediction row. Example: false |
| include_prediction_status | boolean | Enable or disable including the prediction_status column in the output; defaults to false. Example: false |
| include_probabilities | boolean | Enable or disable returning probabilities for all classes. Example: true |
| include_probabilities_classes | array | Define the classes to provide class probabilities for. Example: [ 'setosa', 'versicolor', 'virginica' ] |
| intake_settings | object | Set the intake options required for the input type. Example: { "type": "localFile" } |
| num_concurrent | integer | Set the maximum number of chunks to score concurrently on the prediction instance specified by the deployment. Example: 1 |
| output_settings | object | Set the output options required for the output type. Example: { "credential_id": "string", "format": "csv", "partitionColumns": [ "string" ], "type": "azure", "url": "string" } |
| passthrough_columns | array | Define the scoring dataset columns to include in the prediction response. This option is mutually exclusive with passthrough_columns_set. Example: [ "column1", "column2" ] |
| passthrough_columns_set | string | Enable including all scoring dataset columns in the prediction response. The only option is all. This option is mutually exclusive with passthrough_columns. Example: "all" |
| prediction_warning_enabled | boolean | Enable or disable prediction warnings. Example: true |
| skip_drift_tracking | boolean | Enable or disable drift tracking for this batch of predictions. This allows you to make test predictions without affecting deployment stats. Example: false |
| timeseries_settings | object | Define the settings required for time series predictions. Example: { "forecast_point": "2019-08-24T14:15:22Z", "relax_known_in_advance_features_check": false, "type": "forecast" } |
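To show how several of these options fit together in `snake_case`, here is a hedged sketch that writes a filesystem job definition from a shell script. The temporary output location, the column names, and the chosen option values are illustrative assumptions; only the parameter names come from the table above.

```shell
#!/bin/bash

# Write the job definition to a temporary directory (illustrative location).
JOB_DEFINITION_PATH="$(mktemp -d)/job_definition.json"

# The quoted 'EOF' keeps the JSON literal: no shell expansion inside the heredoc.
cat > "$JOB_DEFINITION_PATH" <<'EOF'
{
    "prediction_endpoint": "http://127.0.0.1:8080",
    "intake_settings": {
        "type": "filesystem",
        "path": "/tmp/portable_batch_predictions/datasets/intake_dataset.csv"
    },
    "output_settings": {
        "type": "filesystem",
        "path": "/tmp/portable_batch_predictions/output/results.csv"
    },
    "abort_on_error": true,
    "chunk_size": "auto",
    "num_concurrent": 1,
    "include_prediction_status": false,
    "passthrough_columns": ["column1", "column2"],
    "csv_settings": {"delimiter": ",", "encoding": "utf-8", "quotechar": "\""}
}
EOF

echo "Wrote $JOB_DEFINITION_PATH"
```

If Python is available on the host, `python3 -m json.tool "$JOB_DEFINITION_PATH"` is one way to confirm that the file parses as JSON before you mount it into the container.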

You can also configure [Prediction Explanations](https://docs.datarobot.com/en/docs/classic-ui/modeling/analyze-models/understand/pred-explain/predex-overview.html) for portable batch predictions:

| Parameter | Type | Description |
| --- | --- | --- |
| max_explanations | int/str | Set the number of explanations returned by the prediction server. For SHAP explanations, a special constant all is also accepted. Example: 1 |
| explanation_algorithm | string | Define the algorithm used for Prediction Explanations, either SHAP or XEMP. Example: "shap" |
| explanation_class_names | array | Define the class names to explain for each row. This setting is only applicable to XEMP Prediction Explanations for multiclass models and it is mutually exclusive with explanation_num_top_classes. Example: [ "class1", "class2" ] |
| explanation_num_top_classes | integer | Set the number of top predicted classes, by prediction value, to explain for each row. This setting is only applicable to XEMP Prediction Explanations for multiclass models and it is mutually exclusive with explanation_class_names. Example: 1 |
| threshold_low | float | Set the lower threshold for requiring a Prediction Explanation. Predictions must be below this value (or above the threshold_high value) for Prediction Explanations to compute. Example: 0.678 |
| threshold_high | float | Set the upper threshold for requiring a Prediction Explanation. Predictions must be above this value (or below the threshold_low value) for Prediction Explanations to compute. Example: 0.345 |
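Several options in the tables above are mutually exclusive (`passthrough_columns` with `passthrough_columns_set`, and `explanation_class_names` with `explanation_num_top_classes`). The following is a rough pre-flight check, not part of the PPS tooling: the function name `check_exclusive` is hypothetical, and the check is purely textual, so it would also trip on a key name that happens to appear as a quoted string value.

```shell
#!/bin/bash

# check_exclusive <job_definition.json> <key_a> <key_b>
# Returns 1 if both quoted key names appear in the file, 0 otherwise.
check_exclusive() {
    if grep -q "\"$2\"" "$1" && grep -q "\"$3\"" "$1"; then
        echo "error: $2 and $3 are mutually exclusive" >&2
        return 1
    fi
}
```

A call such as `check_exclusive job_definition.json passthrough_columns passthrough_columns_set` could run before the job is started, so a conflicting definition is caught before the job is submitted.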

The following JDBC example scores to and from Snowflake using a single-model mode PPS running locally. You can save it as a `job_definition_jdbc.json` file:

```
{
    "prediction_endpoint": "http://127.0.0.1:8080",
    "intake_settings": {
        "type": "jdbc",
        "table": "SCORING_DATA",
        "schema": "PUBLIC"
    },
    "output_settings": {
        "type": "jdbc",
        "table": "SCORED_DATA",
        "statement_type": "create_table",
        "schema": "PUBLIC"
    },
    "passthrough_columns_set": "all",
    "include_probabilities": true,
    "jdbc_settings": {
        "url": "jdbc:snowflake://my_account.snowflakecomputing.com/?warehouse=WH&db=DB&schema=PUBLIC",
        "class_name": "net.snowflake.client.jdbc.SnowflakeDriver",
        "driver_path": "/tmp/portable_batch_predictions/jdbc/snowflake-jdbc-3.12.0.jar",
        "template_name": "Snowflake"
    }
}
```

## Credentials environment variables

If you are using JDBC or private containers in cloud storage, you can specify the required
credentials as environment variables. The following table lists the variable names to use:

| Name | Type | Description |
| --- | --- | --- |
| AWS_ACCESS_KEY_ID | string | AWS Access key ID |
| AWS_SECRET_ACCESS_KEY | string | AWS Secret access key |
| AWS_SESSION_TOKEN | string | AWS token |
| GOOGLE_STORAGE_KEYFILE_PATH | string | Path to GCP credentials file |
| AZURE_CONNECTION_STRING | string | Azure connection string |
| JDBC_USERNAME | string | Username for JDBC |
| JDBC_PASSWORD | string | Password for JDBC |
| SNOWFLAKE_USERNAME | string | Username for Snowflake |
| SNOWFLAKE_PASSWORD | string | Password for Snowflake |
| SYNAPSE_USERNAME | string | Username for Azure Synapse |
| SYNAPSE_PASSWORD | string | Password for Azure Synapse |

Here's an example of the `credentials.env` file used for JDBC scoring:

```
# docker run --env-file expects plain NAME=value lines (no export prefix)
JDBC_USERNAME=TEST_USER
JDBC_PASSWORD=SECRET
```
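Before starting a job, it can help to confirm that the credentials file defines the variables your adapter needs. The sketch below is one way to do that; the function name `require_vars` is hypothetical, and the check only looks for non-empty `NAME=value` assignments on uncommented lines.

```shell
#!/bin/bash

# require_vars <env_file> <NAME>...
# Returns 1 and reports each variable that has no non-empty assignment in the file.
require_vars() {
    env_file="$1"
    shift
    missing=0
    for name in "$@"; do
        # Match NAME=value at the start of a line or after whitespace, ignoring '#' comments.
        if ! grep -Eq "^([^#]*[[:space:]])?${name}=.+" "$env_file"; then
            echo "missing: ${name}" >&2
            missing=1
        fi
    done
    return "$missing"
}
```

For JDBC scoring, for example, `require_vars credentials.env JDBC_USERNAME JDBC_PASSWORD` would fail if either variable is absent.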

## Run portable batch predictions

Portable batch predictions run inside a Docker container. Mount the job definition, any supporting files, and the datasets into the container (if you score from the host filesystem, the job definition must reference the paths as they appear inside the container). Using the JDBC job definition and credentials from the previous examples, the following is a complete example of starting a portable batch predictions job that scores to and from Snowflake.

```
docker run --rm \
    -v /host/filesystem/path/job_definition_jdbc.json:/docker/container/filesystem/path/job_definition_jdbc.json \
    --network host \
    --env-file /host/filesystem/path/credentials.env \
    datarobot/datarobot-portable-prediction-api:<version> batch /docker/container/filesystem/path/job_definition_jdbc.json
```

Here is another example that runs a complete end-to-end flow: it starts the PPS, runs the batch job, and
writes the job status back to the DataRobot platform so that you can monitor progress.

```
#!/bin/bash

# This snippet starts both the PPS service and PBP job using the same PPS docker image
# available from Developer Tools.

#################
# Configuration #
#################

# Specify path to directory with mlpkg(s) which you can download from deployment
MLPKG_DIR='/host/filesystem/path/mlpkgs'
# Specify job definition path
JOB_DEFINITION_PATH='/host/filesystem/path/job_definition.json'
# Specify path to file with credentials if needed (for cloud storage adapters or JDBC)
CREDENTIALS_PATH='/host/filesystem/path/credentials.env'
# For DataRobot integration, specify API host and Token
API_HOST='https://app.datarobot.com'
API_TOKEN='XXXXXXXX'

# Run PPS service in the background
PPS_CONTAINER_ID=$(docker run --rm -d -p 127.0.0.1:8080:8080 -v "$MLPKG_DIR":/opt/ml/model datarobot/datarobot-portable-prediction-api:<version>)
# Give the PPS some time to start up
sleep 15
# Run PPS in batch mode to start PBP job
docker run --rm -v "$JOB_DEFINITION_PATH":/tmp/job_definition.json \
    --network host \
    --env-file "$CREDENTIALS_PATH" \
    datarobot/datarobot-portable-prediction-api:<version> batch /tmp/job_definition.json \
        --api_host "$API_HOST" --api_token "$API_TOKEN"
# Stop PPS service
docker stop "$PPS_CONTAINER_ID"
```
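A fixed `sleep 15` can be too short on a slow host and needlessly long on a fast one. As an alternative sketch, the hypothetical helper below polls the PPS address until it answers any HTTP request or a timeout passes. It assumes `curl` is installed on the host and deliberately requests only the base URL, so it checks that the server accepts connections rather than relying on a specific PPS route.

```shell
#!/bin/bash

# wait_for_pps <base_url> [timeout_seconds]
# Returns 0 once the server answers, 1 if it has not answered before the timeout.
wait_for_pps() {
    url="$1"
    timeout="${2:-60}"
    elapsed=0
    # Without --fail, curl exits 0 on any HTTP response, including error statuses.
    until curl --silent --output /dev/null --max-time 2 "$url"; do
        if [ "$elapsed" -ge "$timeout" ]; then
            echo "PPS did not respond at $url within ${timeout}s" >&2
            return 1
        fi
        sleep 1
        # Counts sleep time only, so the effective timeout is approximate.
        elapsed=$((elapsed + 1))
    done
}
```

In the script above, a call such as `wait_for_pps http://127.0.0.1:8080 60 || exit 1` could stand in for the `sleep 15` line.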

## More examples

In all of the following examples, assume that PPS is running locally on port `8080`, and the filesystem structure has the following format:

```
/host/filesystem/path/portable_batch_predictions/
├── job_definition.json
├── credentials.env
├── datasets
|   └── intake_dataset.csv
├── output
└── jdbc
    └── snowflake-jdbc-3.12.0.jar
```
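If you are starting from scratch, the directories in this layout can be created with a few commands. The helper below is a small sketch (the function name `init_pbp_dir` is hypothetical); it creates only the directories, leaving `job_definition.json`, `credentials.env`, the dataset, and the JDBC driver for you to copy in.

```shell
#!/bin/bash

# init_pbp_dir <root>
# Creates the datasets, output, and jdbc subdirectories under <root>.
init_pbp_dir() {
    root="$1"
    mkdir -p "$root/datasets" "$root/output" "$root/jdbc"
}
```

For the layout shown here, the call would be `init_pbp_dir /host/filesystem/path/portable_batch_predictions`.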

### Filesystem scoring with single-model mode PPS

`job_definition.json` file:

```
{
    "prediction_endpoint": "http://127.0.0.1:8080",
    "intake_settings": {
        "type": "filesystem",
        "path": "/tmp/portable_batch_predictions/datasets/intake_dataset.csv"
    },
    "output_settings": {
        "type": "filesystem",
        "path": "/tmp/portable_batch_predictions/output/results.csv"
    }
}
```

```
#!/bin/bash

docker run --rm \
    --network host \
    -v /host/filesystem/path/portable_batch_predictions:/tmp/portable_batch_predictions \
    datarobot/datarobot-portable-prediction-api:<version> batch \
        /tmp/portable_batch_predictions/job_definition.json
```
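The `docker run` invocations in this section follow one pattern and differ mainly in whether a credentials file is passed. The wrapper below is a sketch of that pattern, not a DataRobot-provided tool: `run_pbp` and the `DRY_RUN` switch are hypothetical names, and the image reference is whatever tag you loaded locally.

```shell
#!/bin/bash

# run_pbp <host_dir> <image> [env_file]
# Builds the docker command used in these examples; with DRY_RUN=1 it only prints it.
run_pbp() {
    host_dir="$1"
    image="$2"
    env_file="${3:-}"
    # Build the argument list in the positional parameters to keep quoting intact.
    set -- docker run --rm --network host \
        -v "${host_dir}:/tmp/portable_batch_predictions"
    if [ -n "$env_file" ]; then
        set -- "$@" --env-file "$env_file"
    fi
    set -- "$@" "$image" batch /tmp/portable_batch_predictions/job_definition.json
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$@"
    else
        "$@"
    fi
}
```

Running it with `DRY_RUN=1` first lets you inspect the exact command before any container is started.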

### Filesystem scoring with multi-model mode PPS

`job_definition.json` file:

```
{
    "prediction_endpoint": "http://127.0.0.1:8080",
    "deployment_id": "lending_club",
    "intake_settings": {
        "type": "filesystem",
        "path": "/tmp/portable_batch_predictions/datasets/intake_dataset.csv"
    },
    "output_settings": {
        "type": "filesystem",
        "path": "/tmp/portable_batch_predictions/output/results.csv"
    }
}
```

```
#!/bin/bash

docker run --rm \
    --network host \
    -v /host/filesystem/path/portable_batch_predictions:/tmp/portable_batch_predictions \
    datarobot/datarobot-portable-prediction-api:<version> batch \
        /tmp/portable_batch_predictions/job_definition.json
```

### Filesystem scoring with multi-model mode PPS and integration with DR job status tracking

`job_definition.json` file:

```
{
    "prediction_endpoint": "http://127.0.0.1:8080",
    "deployment_id": "lending_club",
    "intake_settings": {
        "type": "filesystem",
        "path": "/tmp/portable_batch_predictions/datasets/intake_dataset.csv"
    },
    "output_settings": {
        "type": "filesystem",
        "path": "/tmp/portable_batch_predictions/output/results.csv"
    }
}
```

For the PPS MLPKG, in `config.yaml`, specify the deployment ID of the deployment for which you are running the portable batch prediction job.

```
#!/bin/bash

docker run --rm \
    --network host \
    -v /host/filesystem/path/portable_batch_predictions:/tmp/portable_batch_predictions \
    datarobot/datarobot-portable-prediction-api:<version> batch \
        /tmp/portable_batch_predictions/job_definition.json \
        --api_host https://app.datarobot.com --api_token XXXXXXXXXXXXXXXXXXX
```

### JDBC scoring with single-model mode PPS

`job_definition.json` file:

```
{
    "prediction_endpoint": "http://127.0.0.1:8080",
    "deployment_id": "lending_club",
    "intake_settings": {
        "type": "jdbc",
        "table": "INTAKE_TABLE"
    },
    "output_settings": {
        "type": "jdbc",
        "table": "OUTPUT_TABLE",
        "statement_type": "create_table"
    },
    "passthrough_columns_set": "all",
    "include_probabilities": true,
    "jdbc_settings": {
        "url": "jdbc:snowflake://your_account.snowflakecomputing.com/?warehouse=SOME_WH&db=MY_DB&schema=MY_SCHEMA",
        "class_name": "net.snowflake.client.jdbc.SnowflakeDriver",
        "driver_path": "/tmp/portable_batch_predictions/jdbc/snowflake-jdbc-3.12.0.jar",
        "template_name": "Snowflake"
    }
}
```

`credentials.env` file:

```
JDBC_USERNAME=TEST
JDBC_PASSWORD=SECRET
```

```
#!/bin/bash

docker run --rm \
    --network host \
    -v /host/filesystem/path/portable_batch_predictions:/tmp/portable_batch_predictions \
    --env-file /host/filesystem/path/credentials.env \
    datarobot/datarobot-portable-prediction-api:<version> batch \
        /tmp/portable_batch_predictions/job_definition.json
```

### S3 scoring with single-model mode PPS

`job_definition.json` file:

```
{
    "prediction_endpoint": "http://127.0.0.1:8080",
    "intake_settings": {
        "type": "s3",
        "url": "s3://intake/dataset.csv",
        "format": "csv"
    },
    "output_settings": {
        "type": "s3",
        "url": "s3://output/result.csv",
        "format": "csv"
    }
}
```

`credentials.env` file:

```
AWS_ACCESS_KEY_ID=XXXXXXXXXXXX
AWS_SECRET_ACCESS_KEY=XXXXXXXXXXX
```

```
#!/bin/bash

docker run --rm \
    --network host \
    -v /host/filesystem/path/portable_batch_predictions:/tmp/portable_batch_predictions \
    --env-file /path/to/credentials.env \
    datarobot/datarobot-portable-prediction-api:<version> batch \
        /tmp/portable_batch_predictions/job_definition.json
```

### Snowflake scoring with multi-model mode PPS

`job_definition.json` file:

```
{
    "prediction_endpoint": "http://127.0.0.1:8080",
    "deployment_id": "lending_club",
    "intake_settings": {
        "type": "snowflake",
        "table": "INTAKE_TABLE",
        "schema": "MY_SCHEMA",
        "external_stage": "MY_S3_STAGE_IN_SNOWFLAKE"
    },
    "output_settings": {
        "type": "snowflake",
        "table": "OUTPUT_TABLE",
        "schema": "MY_SCHEMA",
        "external_stage": "MY_S3_STAGE_IN_SNOWFLAKE",
        "statement_type": "insert"
    },
    "passthrough_columns_set": "all",
    "include_probabilities": true,
    "jdbc_settings": {
        "url": "jdbc:snowflake://your_account.snowflakecomputing.com/?warehouse=SOME_WH&db=MY_DB&schema=MY_SCHEMA",
        "class_name": "net.snowflake.client.jdbc.SnowflakeDriver",
        "driver_path": "/tmp/portable_batch_predictions/jdbc/snowflake-jdbc-3.12.0.jar",
        "template_name": "Snowflake"
    }
}
```

`credentials.env` file:

```
# Snowflake creds for JDBC connectivity
SNOWFLAKE_USERNAME=TEST
SNOWFLAKE_PASSWORD=SECRET
# AWS creds needed to access external stage
AWS_ACCESS_KEY_ID=XXXXXXXXXXXX
AWS_SECRET_ACCESS_KEY=XXXXXXXXXXX
```

```
#!/bin/bash

docker run --rm \
    --network host \
    -v /host/filesystem/path/portable_batch_predictions:/tmp/portable_batch_predictions \
    --env-file /host/filesystem/path/credentials.env \
    datarobot/datarobot-portable-prediction-api:<version> batch \
        /tmp/portable_batch_predictions/job_definition.json
```

### Time series scoring over Azure Blob with multi-model mode PPS

`job_definition.json` file:

```
{
    "prediction_endpoint": "http://127.0.0.1:8080",
    "deployment_id": "euro_date_ts_mlpkg",
    "intake_settings": {
        "type": "azure",
        "url": "https://batchpredictionsdev.blob.core.windows.net/datasets/euro_date.csv",
        "format": "csv"
    },
    "output_settings": {
        "type": "azure",
        "url": "https://batchpredictionsdev.blob.core.windows.net/results/output_ts.csv",
        "format": "csv"
    },
    "timeseries_settings":{
        "type": "forecast",
        "forecast_point": "2007-11-14",
        "relax_known_in_advance_features_check": true
    }
}
```

`credentials.env` file:

```
# Azure Blob connection string (left unquoted: docker run --env-file keeps quotes as part of the value)
AZURE_CONNECTION_STRING=DefaultEndpointsProtocol=https;AccountName=myaccount;AccountKey=XXX;EndpointSuffix=core.windows.net
```

```
#!/bin/bash

docker run --rm \
    --network host \
    -v /host/filesystem/path/portable_batch_predictions:/tmp/portable_batch_predictions \
    --env-file /host/filesystem/path/credentials.env \
    datarobot/datarobot-portable-prediction-api:<version> batch \
        /tmp/portable_batch_predictions/job_definition.json
```