Make batch predictions with deployed models

Use the Deployments > Make Predictions tab to score datasets with a deployed model by making batch predictions. Batch predictions are a method for scoring large datasets: you pass input data, DataRobot returns a prediction for each row, and the predictions are written to output files. You can also schedule batch predictions by specifying the prediction data source and destination and when the predictions should run. For more information about batch predictions and how to use them, see the batch prediction API documentation.

This section explains how to make batch predictions for standard AutoML deployments and for time series deployments.

Make predictions with a deployment

To make batch predictions with a deployed model:

  1. Navigate to the deployment's Predictions > Make Predictions tab.

  2. Upload the data to be scored by the model. You can drag-and-drop a file onto the screen, click Choose File to browse locally, or select a dataset stored in the AI Catalog.

    Note

    A prediction dataset is automatically stored in the AI Catalog once it is fully uploaded. Do not navigate away from the page before the upload completes, or the dataset will not be stored in the Catalog. If the dataset is still processing after the upload, DataRobot is running EDA on it; the dataset becomes available for use when EDA finishes.

  3. Once the file is uploaded, configure the Prediction options.

    You can choose to:

    • Add input features. These are columns from the prediction dataset that, once selected, write to the output file alongside predictions. You can only append a column that was present in the original dataset, although the column does not have to have been part of the feature list used to build the model. Enter the column name and select the feature to include it in the prediction results.
    • Include prediction explanations.
    • Include warnings for outlier prediction values (only available for regression model deployments).
    • Track data drift and accuracy (if enabled for the deployment).
  4. Once configured, click Compute and download predictions to start scoring the data. When scoring completes, the predictions become available for download for the next 48 hours.

Tip

To abandon a prediction job, click the orange "X" while the job is running. Once the job is cancelled, click the arrow to view its logs.
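
The Prediction options in step 3 can also be thought of as fields on a batch prediction job request. The following is a minimal sketch of that mapping, not the platform's definitive request format: the helper `build_job_config` is hypothetical, and the field names (`deploymentId`, `passthroughColumns`, `maxExplanations`, `predictionWarningEnabled`, `skipDriftTracking`) are assumptions that should be checked against the batch prediction API documentation for your version.

```python
from typing import Any, Dict, List, Optional


def build_job_config(
    deployment_id: str,
    passthrough_columns: Optional[List[str]] = None,
    max_explanations: int = 0,
    outlier_warnings: bool = False,
    track_drift: bool = True,
) -> Dict[str, Any]:
    """Collect the Prediction options into one request body (field names assumed)."""
    if max_explanations < 0:
        raise ValueError("max_explanations must be non-negative")

    config: Dict[str, Any] = {
        "deploymentId": deployment_id,
        # Drift and accuracy tracking only apply if enabled for the deployment.
        "skipDriftTracking": not track_drift,
    }
    # "Add input features": columns copied from the input into the output file.
    if passthrough_columns:
        config["passthroughColumns"] = list(passthrough_columns)
    # "Include prediction explanations": only sent when explanations are requested.
    if max_explanations:
        config["maxExplanations"] = max_explanations
    # Outlier warnings are only available for regression model deployments.
    if outlier_warnings:
        config["predictionWarningEnabled"] = True
    return config
```

For example, `build_job_config("abc123", passthrough_columns=["customer_id"], max_explanations=3)` yields a body that appends the `customer_id` column and requests up to three explanations per row; both the deployment ID and the column name are placeholders.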

Make predictions with a time series deployment

To make batch predictions with a deployed time series model:

  1. Navigate to the deployment's Predictions > Make Predictions tab.

  2. Upload the data to be scored by the model. You can drag-and-drop a file onto the screen, click Choose File to browse locally, or select a dataset stored in the AI Catalog.

  3. After uploading the scoring data, configure the Time Series options by choosing the prediction method: forecast point or forecast range.

    • Select Forecast point to choose the specific date from which you want to begin making predictions. You can select the automatically determined forecast point (chosen by DataRobot based on the scoring data) or select Manually to expose a date selector and choose a date.

    • Select Forecast range if you intend to make bulk, historical predictions (instead of forecasting future rows from the forecast point). By default, predictions use all forecast distances within the selected time range. Alternatively, you can choose a specific date range using the date selector.

  4. Configure the Prediction options.

    You can choose to:

    • Add input features. These are columns from the prediction dataset that, once selected, write to the output file alongside predictions. You can only append a column that was present in the original dataset, although the column does not have to have been part of the feature list used to build the model. To add features, toggle the option on and begin typing feature names, or select All features to include every feature from the dataset. Derived features are not included.
    • Include prediction explanations.
    • Include warnings for outlier prediction values (only available for regression model deployments).
    • Track data drift and accuracy (if enabled for the deployment).

    You can also access advanced options for the prediction:

    • Adjust the chunk size. The size is automatically calculated for each prediction, and DataRobot recommends modifying this setting only if advised by your DataRobot representative.
    • Limit the number of concurrent prediction requests. By default, prediction jobs utilize all available prediction server cores. To reserve bandwidth for real-time predictions, set a cap for the maximum number of concurrent prediction requests.
  5. Once configured, click Compute and download predictions to start scoring the data. When scoring completes, the predictions become available for download for the next 48 hours.
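
The choice between forecast point and forecast range, together with the advanced options, can be sketched as two small settings builders. This is an illustrative sketch under stated assumptions: both helper functions are hypothetical, and the field names (`type`, `forecastPoint`, `predictionsStartDate`, `predictionsEndDate`, `numConcurrent`, `chunkSize`) are assumed rather than taken from this page, so verify them against the batch prediction API documentation.

```python
from datetime import datetime
from typing import Any, Dict, Optional


def build_timeseries_settings(
    method: str,
    forecast_point: Optional[datetime] = None,
    range_start: Optional[datetime] = None,
    range_end: Optional[datetime] = None,
) -> Dict[str, Any]:
    """Translate the Time Series options into a settings dict (field names assumed)."""
    if method == "forecast_point":
        settings = {"type": "forecast"}
        # Leaving the forecast point out mirrors the automatically determined option.
        if forecast_point is not None:
            settings["forecastPoint"] = forecast_point.isoformat()
        return settings
    if method == "forecast_range":
        settings = {"type": "historical"}
        if (range_start is None) != (range_end is None):
            raise ValueError("give both range_start and range_end, or neither")
        # Without explicit dates, the whole time range is used.
        if range_start is not None and range_end is not None:
            if range_start >= range_end:
                raise ValueError("range_start must precede range_end")
            settings["predictionsStartDate"] = range_start.isoformat()
            settings["predictionsEndDate"] = range_end.isoformat()
        return settings
    raise ValueError(f"unknown prediction method: {method!r}")


def build_advanced_options(
    max_concurrent: Optional[int] = None,
    chunk_size: Optional[int] = None,
) -> Dict[str, Any]:
    """Only emit advanced options that were explicitly set."""
    options: Dict[str, Any] = {}
    if max_concurrent is not None:
        if max_concurrent < 1:
            raise ValueError("max_concurrent must be at least 1")
        # Capping concurrency reserves capacity for real-time predictions.
        options["numConcurrent"] = max_concurrent
    if chunk_size is not None:
        # Change only if advised; the size is otherwise calculated automatically.
        options["chunkSize"] = chunk_size
    return options
```

Keeping the advanced options empty by default reflects the guidance above: chunk size is calculated automatically, and concurrency is capped only when real-time predictions need reserved capacity.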

Considerations

  • The Make Predictions tab is not available for external deployments.

  • Larger datasets (up to 5 GB) take longer to score because multiple prediction jobs must be run. The jobs continue to run if you navigate away from the predictions interface.

  • To write predictions back to a cloud location or database, you must use the Prediction API.
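
The size limit noted above and the 48-hour download window from the scoring steps are easy to check before and after a job. The sketch below is illustrative only: both helpers are hypothetical, and reading "5GB" as 5 × 1024³ bytes is an assumption, since this page does not say whether the limit is binary or decimal.

```python
from datetime import datetime, timedelta

# Assumption: the 5 GB limit is interpreted as binary gigabytes.
MAX_UPLOAD_BYTES = 5 * 1024**3
# Predictions stay available for download for 48 hours after scoring completes.
DOWNLOAD_WINDOW = timedelta(hours=48)


def fits_upload_limit(size_bytes: int) -> bool:
    """Return True if a dataset of this size is within the assumed limit."""
    if size_bytes < 0:
        raise ValueError("size_bytes must be non-negative")
    return size_bytes <= MAX_UPLOAD_BYTES


def download_deadline(completed_at: datetime) -> datetime:
    """Return the time after which the predictions are no longer downloadable."""
    return completed_at + DOWNLOAD_WINDOW
```

In practice the size would come from something like `os.path.getsize("scoring_data.csv")`, where the file name is a placeholder.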


Updated December 6, 2021