Portable Prediction Server

The Portable Prediction Server (PPS) is a DataRobot execution environment for DataRobot model packages (.mlpkg files) distributed as a self-contained Docker image.

Availability information

The Portable Prediction Server is a feature exclusive to DataRobot MLOps. Contact your DataRobot representative for information on enabling it.

After you configure the Portable Prediction Server, you can begin running portable batch prediction jobs.

Configure the Portable Prediction Server

The sections below describe the steps to configure the Portable Prediction Server for a deployment.

Note

In order to set up the Portable Prediction Server, you must first add an external prediction environment.

The general configuration steps are:

  • Download the Docker image.
  • Download the model package.
  • Copy the Docker snippet DataRobot provides to run the Portable Prediction Server in your Docker container.

To configure the Portable Prediction Server, open a deployment on the Deployments tab (the deployment inventory) and navigate to the Predictions > Portable Predictions tab:

| Element | Description |
| --- | --- |
| Portable Prediction Server | Helps you configure a REST API-based prediction server as a Docker image. |
| Portable Prediction Server Usage | Links to the Developer Tools tab, where you can obtain the Portable Prediction Server Docker image. |
| Download model package (.mlpkg) | Downloads the model package for your deployed model. Alternatively, you can download the model package from the Leaderboard. |
| Docker snippet | After you download your model package, use the Docker snippet to launch the Portable Prediction Server for the model with monitoring enabled. You will need to specify your API key, local filenames, paths, and monitoring settings before launching. |
| Copy to clipboard | Copies the Docker snippet to your clipboard so that you can paste it on the command line. |

After completing the setup, you can use the Docker snippet to run portable batch predictions. See also the additional examples of prediction jobs that use PPS.

The PPS can run disconnected from the main installation environment. Once started, the image serves its HTTP API on port 8080.
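For example, a minimal launch maps the container port to the host and checks the /ping endpoint described later in this topic (a sketch only; the model path and image tag are placeholders):

docker run \
    -v /path/to/mlpkgdir:/opt/ml/model \
    -p 8080:8080 \
    datarobot/datarobot-portable-prediction-api:<version>

curl http://localhost:8080/ping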

Obtain the PPS Docker image

Navigate to the Developer Tools tab to download the Portable Prediction Server Docker image. Depending on your DataRobot environment and version, options for accessing the latest image differ, as described in the table below.

| Deployment type | Software version | Access method |
| --- | --- | --- |
| On-premise or private/hybrid cloud | v6.3 or older | Contact your DataRobot representative. The image is provided upon request. |
| On-premise or private/hybrid cloud | v7.0 or later | Download the image from Developer Tools and install as described below. If the image is not available, contact your DataRobot representative. |
| Managed AI Cloud | Jan 2021 and later | Download the image from Developer Tools and install as described below. |

Load the image to Docker

Warning

DataRobot is working to reduce image size; however, the compressed Docker image can exceed 6GB (Docker-loaded image layers can exceed 14GB). Consider these sizes when downloading and importing PPS images.

Before proceeding, download the image from Developer Tools. It is a tar-gzipped file that can be loaded by Docker.
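For example, you might verify the archive's integrity before loading it (a sketch; compare the output against the checksum provided with your download, if one is available):

sha256sum datarobot-portable-prediction-api-7.0.0-r1736.tar.gz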

Once the file is downloaded and its checksum is verified, use docker load to load the image. You do not need to uncompress the downloaded file; Docker loads tar-gzipped images natively:

docker load < datarobot-portable-prediction-api-7.0.0-r1736.tar.gz
b990432f3b4b: Loading layer [==================================================>]    343kB/343kB
789a8e70b147: Loading layer [==================================================>]   5.12kB/5.12kB
7db3df02f5f7: Loading layer [==================================================>]  1.775MB/1.775MB
4742b98a91bf: Loading layer [==================================================>]  4.096kB/4.096kB
112ec58ef2cf: Loading layer [==================================================>]  4.165GB/4.165GB
5f9c034e5766: Loading layer [==================================================>]  9.728kB/9.728kB
2f4cf6c33976: Loading layer [==================================================>]  71.17kB/71.17kB
01812df8f7a8: Loading layer [==================================================>]  3.072kB/3.072kB
87bfa8db37b4: Loading layer [==================================================>]  5.978MB/5.978MB
993ea9726377: Loading layer [==================================================>]  1.958GB/1.958GB
e2d9aaaf8de6: Loading layer [==================================================>]  70.66kB/70.66kB
005786d38507: Loading layer [==================================================>]  4.096kB/4.096kB
85b6c0dfcb4e: Loading layer [==================================================>]  4.096kB/4.096kB
f0524ee4003d: Loading layer [==================================================>]  3.584kB/3.584kB
92f123a1e860: Loading layer [==================================================>]  3.072kB/3.072kB
Loaded image: datarobot/datarobot-portable-prediction-api:7.0.0-r1736

Optionally, to save disk space, delete the compressed image archive datarobot-portable-prediction-api-<version>.tar.gz after successful load.

Download the model package

You can download a model package (.mlpkg file) for a DataRobot model running on a remote prediction environment directly from the model's deployment or from the model Leaderboard. You can then run prediction jobs with the Portable Prediction Server outside of DataRobot.

Deployment download

Before proceeding, ensure that your deployment supports model package downloads. The deployment must have a DataRobot build environment and an external prediction environment, which you can verify using the Governance Lens in the deployment inventory:

In the Predictions > Portable Predictions tab, click Download model package. The download appears in the downloads bar when complete.

Once downloaded, use the provided code snippet to launch the Portable Prediction Server with the downloaded model package.

The Portable Prediction Server monitors your model's performance and tracks prediction statistics. Copy the snippet and specify your API key, local file names, and file paths.

Leaderboard download

Availability information

The ability to download a model package from the Leaderboard depends on the MLOps configuration for your organization.

If you have built a model with AutoML and want to download its model package for use with the Portable Prediction Server, navigate to the model in the Leaderboard and select the Predict > Portable Predictions tab.

Click Download .mlpkg. Once downloaded, use the provided code snippet to launch the Portable Prediction Server with the downloaded model package.

Running modes

The server supports two running modes: single-model (SM) and multi-model (MM). Use SM mode when only a single model package is mounted into the Docker container's /opt/ml/model directory; use MM mode in all other cases. Both modes produce the same predictions, but SM mode provides a simplified HTTP API that does not require identifying a model package on disk, and it preloads the model into memory on start.

The Docker container's /opt/ml/model directory should match one of the following layouts.

For SM mode:

/opt/ml/model/
└── model_5fae9a023ba73530157ebdae.mlpkg

For MM mode:

/opt/ml/model/
├── fraud
|   └── model_5fae9a023ba73530157ebdae.mlpkg
└── revenue
    ├── config.yml
    └── revenue-estimate.mlpkg
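For example, mounting a host directory that matches one of these layouts into /opt/ml/model starts the server in the corresponding mode (a sketch; the host path and image tag are placeholders):

docker run \
    -v /path/to/models:/opt/ml/model \
    -p 8080:8080 \
    datarobot/datarobot-portable-prediction-api:<version>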

HTTP API (single-model)

When running in single-model mode, the Docker image exposes three HTTP endpoints:

  • POST /predictions scores a given dataset.
  • GET /info returns information about the loaded model.
  • GET /ping ensures the tech stack is up and running.

Note

Prediction routes support only comma-delimited CSV and JSON records as scoring datasets. The maximum payload size is 50 MB.

curl -X POST http://<ip>:8080/predictions \
    -H "Content-Type: text/csv" \
    --data-binary @path/to/scoring.csv
{
  "data": [
    {
      "predictionValues": [
        {"value": 0.250833758, "label": "yes"},
        {"value": 0.749166242, "label": "no"},
      ],
      "predictionThreshold": 0.5,
      "prediction": 0.0,
      "rowId": 0
    }
  ]
}

If CSV is the preferred output, request it using the Accept: text/csv HTTP header.

curl -X POST http://<ip>:8080/predictions \
    -H "Accept: text/csv" \
    -H "Content-Type: text/csv" \
    --data-binary @path/to/scoring.csv
<target>_yes_PREDICTION,<target>_no_PREDICTION,<target>_PREDICTION,THRESHOLD,POSITIVE_CLASS
0.250833758,0.749166242,0,0.5,yes
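The informational endpoints can be queried the same way; the exact shape of the /info response depends on the loaded model:

curl http://<ip>:8080/info
curl http://<ip>:8080/ping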

HTTP API (multi-model)

In multi-model mode, the Docker image exposes the following endpoints:

  • POST /deployments/:id/predictions scores a given dataset.
  • GET /deployments/:id/info returns information about the loaded model.
  • POST /deployments/:id uploads a model package to the container.
  • DELETE /deployments/:id deletes a model package from the container.
  • GET /deployments returns a list of model packages that are in the container.
  • GET /ping ensures the tech stack is up and running.

The :id in the /deployments routes above is the unique identifier of a model package on disk; it is the name of the directory containing the model package. For example, if you have the following /opt/ml/model layout:

/opt/ml/model/
├── fraud
|   └── model_5fae9a023ba73530157ebdae.mlpkg
└── revenue
    ├── config.yml
    └── revenue-estimate.mlpkg

You can use fraud and revenue in place of :id in the /deployments routes.

Note

Prediction routes support only comma-delimited CSV and JSON records as scoring datasets. The maximum payload size is 50 MB.

curl -X POST http://<ip>:8080/deployments/revenue/predictions \
    -H "Content-Type: text/csv" \
    --data-binary @path/to/scoring.csv
{
  "data": [
    {
      "predictionValues": [
        {"value": 0.250833758, "label": "yes"},
        {"value": 0.749166242, "label": "no"},
      ],
      "predictionThreshold": 0.5,
      "prediction": 0.0,
      "rowId": 0
    }
  ]
}
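The management routes follow the same pattern. For example, using the layout above, you can list the model packages in the container and delete one (a sketch of the routes listed earlier):

# List model packages currently in the container
curl http://<ip>:8080/deployments

# Delete the "fraud" model package from the container
curl -X DELETE http://<ip>:8080/deployments/fraud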

Monitoring

Note

Before proceeding, be sure to configure monitoring for the PPS container. See the Environment Variables and Examples sections for details. To use the monitoring agent, you need to configure the agent spoolers as well.

You can monitor prediction statistics such as data drift and accuracy by creating an external deployment in DataRobot's deployment inventory.

To connect your model package to a specific deployment, provide the ID of the deployment that you want to host your prediction statistics.

In single-model (SM) mode, provide the deployment ID via the MLOPS_DEPLOYMENT_ID environment variable. In multi-model (MM) mode, prepare a config.yml file with the desired deployment_id value and place it alongside the model package:

deployment_id: 5fc92906ad764dde6c3264fa

If you want to track accuracy, configure it for the deployment, and then provide extra settings for the running model:

For SM mode, set the following environment variables:

  • MLOPS_ASSOCIATION_ID_COLUMN=transaction_country (required)
  • MLOPS_ASSOCIATION_ID_ALLOW_MISSING_VALUES=false (optional, default=false)
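For example, an SM-mode launch with monitoring and accuracy tracking enabled might look like the following sketch (the IDs, spooler settings, and paths are placeholders; per the environment variables table below, MLOPS_MODEL_ID is also required in SM mode):

docker run \
    -v /path/to/mlpkgdir:/opt/ml/model \
    -e PREDICTION_API_MONITORING_ENABLED='true' \
    -e PREDICTION_API_MONITORING_SETTINGS='<settings>' \
    -e MLOPS_DEPLOYMENT_ID='5fc92906ad764dde6c3264fa' \
    -e MLOPS_MODEL_ID='<model_id>' \
    -e MLOPS_ASSOCIATION_ID_COLUMN='transaction_country' \
    datarobot/datarobot-portable-prediction-api:<version>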

For MM mode, set the following properties in config.yml:

association_id_settings:
  column_name: transaction_country
  allow_missing_values: false
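In MM mode, these settings live in the same config.yml as the deployment ID, so a complete file for one model directory might look like this sketch:

deployment_id: 5fc92906ad764dde6c3264fa
association_id_settings:
  column_name: transaction_country
  allow_missing_values: false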

HTTPS support

Availability information

If you are running PPS images that were downloaded previously, these parameters will not be available until the PPS image is manually updated:

  • Managed AI Cloud: starting Aug 2021
  • On-premise or private/hybrid cloud: starting v7.2

By default, PPS serves predictions over an insecure listener on port 8080 (clear-text HTTP over TCP). You can also serve predictions over a secure listener on port 8443 (HTTP over TLS/SSL, or HTTPS). When the secure listener is enabled, the insecure listener becomes unavailable.

Note

You cannot configure PPS to be available on both ports simultaneously; it is either HTTP on 8080 or HTTPS on 8443.

The configuration is accomplished using the environment variables described below:

  • PREDICTION_API_TLS_ENABLED: The master flag that enables the HTTPS listener on port 8443 and disables the HTTP listener on port 8080.

    • Default: false (HTTPS disabled)
    • Valid values (case-insensitive):

      | Parameter value | Interpretation |
      | --- | --- |
      | true, t, yes, y, 1 | true |
      | false, f, no, n, 0 | false |

    Note

    The flag value must evaluate to true to enable TLS. If this setting is not enabled, all other PREDICTION_API_TLS_* environment variables (if passed) are ignored.

  • PREDICTION_API_TLS_CERTIFICATE: PEM-formatted content of the TLS/SSL certificate.

  • PREDICTION_API_TLS_CERTIFICATE_KEY: PEM-formatted content of the private key for the TLS/SSL certificate.

  • PREDICTION_API_TLS_CERTIFICATE_KEY_PASSWORD: Passphrase for the secret certificate key passed in PREDICTION_API_TLS_CERTIFICATE_KEY.

    • Required: Yes, only if a certificate key was created with a passphrase.
  • PREDICTION_API_TLS_PROTOCOLS: Encryption protocol implementation(s) to use.

    • Default: TLSv1.2 TLSv1.3
    • Valid values: SSLv2|SSLv3|TLSv1|TLSv1.1|TLSv1.2|TLSv1.3, or any space-separated combination of these values.

    Warning

    As of August 2021, all implementations except TLSv1.2 and TLSv1.3 are considered deprecated and/or insecure. DataRobot highly recommends using only these implementations. New installations may consider using TLSv1.3 exclusively as it is the most recent and secure TLS version.

  • PREDICTION_API_TLS_CIPHERS: List of cipher suites to use.

    Warning

    TLS support is an advanced feature. The cipher suites list has been carefully selected to follow the latest recommendations and current best practices. DataRobot does not recommend overriding it.
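For a quick functional test of the TLS configuration, you might generate a self-signed certificate with openssl and check the secure listener (a sketch only; the hostname is a placeholder, and production deployments should use a certificate issued by your CA):

# Generate a throwaway self-signed certificate and key
openssl req -x509 -newkey rsa:4096 -sha256 -days 365 -nodes \
    -keyout key.pem -out cert.pem -subj "/CN=pps.example.com"

# Pass the PEM contents via the variables above, then verify the listener
# (-k skips certificate verification; acceptable for self-signed test certs only)
curl -k https://<ip>:8443/ping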

Environment variables

| Variable | Description | Default |
| --- | --- | --- |
| PREDICTION_API_WORKERS | Sets the number of workers to spin up. This option controls the number of HTTP requests the Prediction API can process simultaneously. Typically, set this to the number of CPU cores available to the container. | 1 |
| PREDICTION_API_MODEL_REPOSITORY_PATH | Sets the path to the directory where DataRobot looks for model packages. If the path points to a directory containing a single model package at its root, PPS assumes single-model mode; otherwise, it assumes multi-model mode. | /opt/ml/model/ |
| PREDICTION_API_PRELOAD_MODELS_ENABLED | Requires every worker to proactively preload all mounted models on start. This helps eliminate cache misses for the first requests after the server starts, while the cache is still "cold." See also PREDICTION_API_SCORING_MODEL_CACHE_MAXSIZE to eliminate cache misses completely. | false for multi-model mode; true for single-model mode |
| PREDICTION_API_SCORING_MODEL_CACHE_MAXSIZE | The maximum number of scoring models to keep in each worker's RAM cache, avoiding on-demand loading for each request. In practice, the default setting is low. If the server running PPS has enough RAM, set this to a value greater than the total number of premounted models to fully leverage caching and avoid cache misses. Note that each worker's cache is independent, so each model is copied to each worker's cache. Also consider enabling PREDICTION_API_PRELOAD_MODELS_ENABLED for multi-model mode to avoid cache misses. | 4 |
| PREDICTION_API_DEPLOYED_MODEL_RESOLVER_CACHE_TTL_SEC | By default, PPS periodically attempts to re-read deployment information from an .mlpkg in case the package was re-uploaded via HTTP. If you do not plan to update the .mlpkg after PPS starts, consider setting this to 0 to disable deployment info cache invalidation; this helps reduce latency for some requests. | 60 |
| PREDICTION_API_MONITORING_ENABLED | Sets whether DataRobot offloads data monitoring. If true, the Prediction API offloads monitoring data to the monitoring agent. | false |
| PREDICTION_API_MONITORING_SETTINGS | Controls how monitoring data is offloaded from the Prediction API to the monitoring agent. Specify a list of spooler configuration settings as key=value pairs separated by semicolons. Example for a filesystem spooler: spooler_type=filesystem;directory=/tmp;max_files=50;file_max_size=102400000. Example for an SQS spooler: spooler_type=sqs;sqs_queue_url=<SQS_URL>. In single-model mode, the MLOPS_DEPLOYMENT_ID and MLOPS_MODEL_ID variables are required; they are not required in multi-model mode. | None |
| MONITORING_AGENT | Sets whether the monitoring agent runs alongside the Prediction API. To use the monitoring agent, you must also configure the agent spoolers. | false |
| MONITORING_AGENT_DATAROBOT_APP_URL | Sets the URL of the DataRobot installation (e.g., https://app.datarobot.com). | None |
| MONITORING_AGENT_DATAROBOT_APP_TOKEN | Sets a user token to use with the DataRobot API. | None |
| PREDICTION_API_TLS_ENABLED | Sets the TLS listener master flag. Must be enabled for the TLS listener to work. | false |
| PREDICTION_API_TLS_CERTIFICATE | Adds inline content of the certificate, in PEM format. | None |
| PREDICTION_API_TLS_CERTIFICATE_KEY | Adds inline content of the certificate key, in PEM format. | None |
| PREDICTION_API_TLS_CERTIFICATE_KEY_PASSWORD | Adds the plaintext passphrase for the certificate key. | None |
| PREDICTION_API_TLS_PROTOCOLS | Overrides the TLS/SSL protocols. | TLSv1.2 TLSv1.3 |
| PREDICTION_API_TLS_CIPHERS | Overrides the default cipher suites. | Mandatory TLSv1.3 and recommended TLSv1.2 cipher suites |
| PREDICTION_API_RPC_DUAL_COMPUTE_ENABLED | Runs both Python2 and Python3 interpreters in the PPS; the PPS then automatically determines the required interpreter based on the Python version the model was trained with. When this setting is enabled, PYTHON3_SERVICES is redundant and ignored. Note that running both interpreters requires additional RAM. | true |
| PYTHON3_SERVICES | Enable this setting only when PREDICTION_API_RPC_DUAL_COMPUTE_ENABLED is disabled and every model was trained on Python3. You can save approximately 400MB of RAM by excluding the Python2 interpreter service from the container. | None |

Important

As of June 2022, the PPS runs in "dual-compute mode" by default, running both Python2 and Python3 interpreter services and automatically routing prediction requests to the appropriate interpreter. As a result, setting PYTHON3_SERVICES to true is no longer necessary to support Python3 models; however, this configuration requires an extra 400MB of RAM. To reduce the RAM footprint (when all models are either Python2 or Python3), first disable dual-compute mode (PREDICTION_API_RPC_DUAL_COMPUTE_ENABLED='false'). Then, if all models are trained on Python3, enable Python3 services (PYTHON3_SERVICES='true'). If all models are trained on Python2, no additional environment variable is needed, as the default interpreter is still Python2.

Request parameters

Headers

The PPS does not support authorization; therefore, the DataRobot-Key and Authorization headers are not needed.

| Key | Type | Description | Example(s) |
| --- | --- | --- | --- |
| Content-Type | string | Required. Defines the request format. | text/plain; charset=UTF-8, text/csv, application/json, multipart/form-data (for files with data, e.g., .csv and .txt files) |
| Content-Encoding | string | Optional. Currently supports only gzip encoding with the default data extension. | gzip |
| Accept | string | Optional. Controls the shape of the response schema. Currently JSON (default) and CSV are supported. See the examples. | application/json (default), text/csv (for CSV output) |
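For example, a gzip-compressed CSV can be scored by declaring the encoding in the headers (single-model route shown; compression happens on the fly here):

gzip -c path/to/scoring.csv | curl -X POST http://<ip>:8080/predictions \
    -H "Content-Type: text/csv" \
    -H "Content-Encoding: gzip" \
    --data-binary @-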

Query arguments

The prediction routes (POST /predictions in single-model mode and POST /deployments/:id/predictions in multi-model mode) support the same query arguments and HTTP headers as their standard route counterparts, with a few exceptions. As with the regular Dedicated Prediction API, the exact list of supported arguments depends on the deployed model. Below is the list of general query arguments supported by every deployment.

| Key | Type | Description | Example(s) |
| --- | --- | --- | --- |
| passthroughColumns | list of strings | Optional. Controls which columns from a scoring dataset to expose (copy over) in a prediction response. The request may contain zero, one, or more columns; there is no limit on how many column names you can pass. Column names must be passed as UTF-8 bytes and must be percent-encoded (see the HTTP standard for this requirement). Make sure to use the exact name of a column as a value. | /v1.0/deployments/<deploymentId>/predictions?passthroughColumns=colA&passthroughColumns=colB |
| passthroughColumnsSet | string | Optional. Controls which columns from a scoring dataset to expose (copy over) in a prediction response. The only possible value is all; if passed, all columns from the scoring dataset are exposed. | /v1.0/deployments/<deploymentId>/predictions?passthroughColumnsSet=all |
| decimalsNumber | integer | Optional. Configures the precision of floats in prediction results by setting the number of digits after the decimal point. Trailing zeros are not added, so the actual precision can be less than decimalsNumber. | ?decimalsNumber=15 |

Note the following:

  • You can't pass the passthroughColumns and passthroughColumnsSet parameters in the same request.
  • While there is no limit on the number of column names you can pass with the passthroughColumns query parameter, there is a limit on the size of the HTTP request line (currently 8192 bytes).
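For example, a single-model request that passes two columns through to the response might look like this sketch (colA and colB are placeholder column names that must exactly match the scoring dataset):

curl -X POST "http://<ip>:8080/predictions?passthroughColumns=colA&passthroughColumns=colB" \
    -H "Content-Type: text/csv" \
    --data-binary @path/to/scoring.csv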

Prediction Explanation parameters

You can parametrize the Prediction Explanations prediction request with the following query parameters:

Note

To trigger Prediction Explanations, you must send maxExplanations=N, where N is greater than 0.

| Key | Type | Description | Example(s) |
| --- | --- | --- | --- |
| maxExplanations | int or string | Optional. Limits the number of explanations returned by the server. Previously called maxCodes (deprecated). For SHAP explanations only, the special constant all is also accepted. | ?maxExplanations=5, ?maxExplanations=all |
| thresholdLow | float | Optional. The Prediction Explanation low threshold. Predictions must be below this value (or above the thresholdHigh value) for Prediction Explanations to compute. | ?thresholdLow=0.678 |
| thresholdHigh | float | Optional. The Prediction Explanation high threshold. Predictions must be above this value (or below the thresholdLow value) for Prediction Explanations to compute. | ?thresholdHigh=0.345 |
| excludeAdjustedPredictions | bool | Optional. Includes or excludes exposure-adjusted predictions in prediction responses if exposure was used during model building. The default value is true (exclude exposure-adjusted predictions). | ?excludeAdjustedPredictions=true |
| explanationNumTopClasses | int | Optional. Multiclass models only. The number of top predicted classes for each row that will be explained. Defaults to 1. Mutually exclusive with explanationClassNames. | ?explanationNumTopClasses=5 |
| explanationClassNames | list of strings | Optional. Multiclass models only. A list of class names that will be explained for each row. Class names must be passed as UTF-8 bytes and must be percent-encoded (see the HTTP standard for this requirement). Mutually exclusive with explanationNumTopClasses. By default, explanationNumTopClasses=1 is assumed. | ?explanationClassNames=classA&explanationClassNames=classB |
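For example, a single-model request asking for up to five explanations per row, computed only outside the thresholds, might look like this sketch (the threshold values are placeholders):

curl -X POST "http://<ip>:8080/predictions?maxExplanations=5&thresholdLow=0.2&thresholdHigh=0.8" \
    -H "Content-Type: text/csv" \
    --data-binary @path/to/scoring.csv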

Time series parameters

You can parametrize the time series prediction request using the following query parameters:

| Key | Type | Description | Example(s) |
| --- | --- | --- | --- |
| forecastPoint | ISO-8601 string | An ISO 8601 formatted DateTime string, without timezone, representing the forecast point. This parameter cannot be used if predictionsStartDate and predictionsEndDate are passed. | ?forecastPoint=2013-12-20T01:30:00Z |
| relaxKnownInAdvanceFeaturesCheck | bool | true or false. When true, missing values for known-in-advance features are allowed in the forecast window at prediction time. The default value is false. Note that the absence of known-in-advance values can negatively impact prediction quality. | ?relaxKnownInAdvanceFeaturesCheck=true |
| predictionsStartDate | ISO-8601 string | The time in the dataset at which bulk predictions start generating. This parameter must be defined together with predictionsEndDate. The forecastPoint parameter cannot be used if predictionsStartDate and predictionsEndDate are passed. | ?predictionsStartDate=2013-12-20T01:30:00Z&predictionsEndDate=2013-12-20T01:40:00Z |
| predictionsEndDate | ISO-8601 string | The time in the dataset at which bulk predictions stop generating. This parameter must be defined together with predictionsStartDate. The forecastPoint parameter cannot be used if predictionsStartDate and predictionsEndDate are passed. | See above. |
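For example, a bulk prediction request over a date range might look like this sketch (the dates are placeholders; note that the query string must be quoted so the shell does not interpret the &):

curl -X POST "http://<ip>:8080/predictions?predictionsStartDate=2013-12-20T01:30:00Z&predictionsEndDate=2013-12-20T01:40:00Z" \
    -H "Content-Type: text/csv" \
    --data-binary @path/to/scoring.csv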

External configuration

You can also supply the configuration options listed in the table above through a configuration file that the Docker image reads from /opt/ml/config. The file must contain <key>=<value> pairs, where each key is the name of the corresponding environment variable.
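For example, a config file and launch might look like the following sketch (the file contents and host path are hypothetical; key names match the environment variables above):

# Hypothetical contents of the config file:
#   PREDICTION_API_WORKERS=4
#   PREDICTION_API_MONITORING_ENABLED=true
docker run \
    -v /path/to/config:/opt/ml/config \
    -v /path/to/mlpkgdir:/opt/ml/model \
    datarobot/datarobot-portable-prediction-api:<version>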

Examples

  1. Run with two workers:

    docker run \
        -v /path/to/mlpkgdir:/opt/ml/model \
        -e PREDICTION_API_WORKERS=2 \
        -e PREDICTION_API_SCORING_MODEL_CACHE_MAXSIZE=32 \
        -e PREDICTION_API_PRELOAD_MODELS_ENABLED='true' \
        -e PREDICTION_API_DEPLOYED_MODEL_RESOLVER_CACHE_TTL_SEC=0 \
        datarobot/datarobot-portable-prediction-api:<version>
    
  2. Run with external monitoring configured:

    docker run \
        -v /path/to/mlpkgdir:/opt/ml/model \
        -e PREDICTION_API_MONITORING_ENABLED='true' \
        -e PREDICTION_API_MONITORING_SETTINGS='<settings>' \
        datarobot/datarobot-portable-prediction-api:<version>
    
  3. Run with internal monitoring configured:

    docker run \
        -v /path/to/mlpkgdir:/opt/ml/model \
        -e PREDICTION_API_MONITORING_ENABLED='true' \
        -e PREDICTION_API_MONITORING_SETTINGS='<settings>' \
        -e MONITORING_AGENT='true' \
        -e MONITORING_AGENT_DATAROBOT_APP_URL='https://app.datarobot.com/' \
        -e MONITORING_AGENT_DATAROBOT_APP_TOKEN='<token>' \
        datarobot/datarobot-portable-prediction-api:<version>
    
  4. Run with HTTPS support using default protocols and ciphers:

    docker run \
        -v /path/to/mlpkgdir:/opt/ml/model \
        -p 8443:8443 \
        -e PREDICTION_API_TLS_ENABLED='true' \
        -e PREDICTION_API_TLS_CERTIFICATE="$(cat /path/to/cert.pem)" \
        -e PREDICTION_API_TLS_CERTIFICATE_KEY="$(cat /path/to/key.pem)" \
        datarobot/datarobot-portable-prediction-api:<version>
    
  5. Run with Python3 interpreter only to minimize RAM footprint:

    docker run \
        -v /path/to/my_python3_model.mlpkg:/opt/ml/model \
        -e PREDICTION_API_RPC_DUAL_COMPUTE_ENABLED='false' \
        -e PYTHON3_SERVICES='true' \
        datarobot/datarobot-portable-prediction-api:<version>
    
  6. Run with Python2 interpreter only to minimize RAM footprint:

    docker run \
        -v /path/to/my_python2_model.mlpkg:/opt/ml/model \
        -e PREDICTION_API_RPC_DUAL_COMPUTE_ENABLED='false' \
        datarobot/datarobot-portable-prediction-api:<version>
    

Updated June 28, 2022