Schedule recurring batch prediction jobs

You can run a one-time batch prediction, or you can schedule batch prediction jobs to run at regular intervals. This section shows how to create and schedule batch prediction jobs.

Create a prediction job definition

Job definitions are flexible templates for creating batch prediction jobs. You can store definitions inside DataRobot and run new jobs with a single click, API call, or automatically via a schedule. Scheduled jobs do not require you to provide connection, authentication, and prediction options for each request.

To create a job definition for a deployment, navigate to the Predictions > Job Definitions tab. The following table describes the information and actions available on the New Prediction Job Definition tab.

| Field name | Description |
|---|---|
| Prediction job definition name | Enter the name of the prediction job that you are creating for the deployment. |
| Prediction source | Set the source type and define the connection for the data to be scored. |
| Prediction options | Configure the prediction options. |
| Time series options | Specify and configure a time series prediction method. |
| Prediction destination | Indicate the output destination for predictions. Set the destination type and define the connection. |
| Jobs schedule | Toggle whether to run the job immediately and whether to schedule the job. |
| Save prediction job definition | Click this button to save the job definition. The button changes to Save and run prediction job definition if the Run this job immediately toggle is turned on. Note that this button is disabled if there are validation errors. |

Once the definition is fully configured, click Save prediction job definition (or Save and run prediction job definition if Run this job immediately is enabled).

Note

Completing the New Prediction Job Definition tab configures the details required by the Batch Prediction API. Reference the Batch Prediction API documentation for details.
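
As a rough illustration of what the tab collects, the sketch below assembles a job definition as a plain Python dictionary. The field names (`batchPredictionJob`, `deploymentId`, `intakeSettings`, `outputSettings`, `schedule`, `enabled`) and the adapter types are assumptions modeled on the Batch Prediction API and should be checked against the API reference for your version. The IDs are placeholders, and `build_job_definition` is a hypothetical helper, not part of any DataRobot library.

```python
import json


def build_job_definition(name, deployment_id, intake, output, schedule=None):
    """Assemble a job definition payload (field names are assumed, not authoritative)."""
    definition = {
        "name": name,
        # Assumption: a definition is only "enabled" when it carries a schedule.
        "enabled": schedule is not None,
        "batchPredictionJob": {
            "deploymentId": deployment_id,
            "intakeSettings": intake,
            "outputSettings": output,
        },
    }
    if schedule is not None:
        definition["schedule"] = schedule
    return definition


definition = build_job_definition(
    name="nightly-scoring",
    deployment_id="<deployment-id>",  # placeholder
    intake={"type": "dataset", "datasetId": "<dataset-id>"},  # hypothetical source
    output={"type": "localFile"},  # hypothetical destination
)
print(json.dumps(definition, indent=2))
```

Keeping the source, options, and destination in one stored object is what lets a scheduled run proceed without re-entering connection or authentication details.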

Set up prediction sources

Select a prediction source (also called an intake adapter), then complete the appropriate authentication workflow for the source type.

For AI Catalog sources, the job definition displays the modification date, the user who set the source, and a badge that represents the state of the asset (for example, STATIC).

After you set your prediction source, DataRobot validates that the data is applicable to the deployed model.

Note

DataRobot validates that a data source is applicable to the deployed model when possible, but not in all cases. Validation is available for AI Catalog, most JDBC connections, Snowflake, and Synapse.

Source connection types

Select a connection type below to view field descriptions.

Note

When browsing for connections, invalid adapters are not shown.

  • Database connections

  • Cloud storage connections

  • Data warehouse connections

  • Other

For information about supported data sources, see Data sources supported for batch predictions.

Set prediction options

Specify what information to include in the prediction results:

| Element | Description |
|---|---|
| Include input features | Writes input features to the prediction results file alongside predictions. To add specific features, enable the Include input features toggle, select Specific features, and type feature names to filter for and then select features. To include every feature from the dataset, select All features. You can only append a feature (column) present in the original dataset, although the feature does not have to have been part of the feature list used to build the model. Derived features are not included. |
| Include prediction explanations | Adds columns for prediction explanations to your prediction output. |
| Include prediction outlier warning | Includes warnings for outlier prediction values (only available for regression model deployments). |
| Track data drift, accuracy, and fairness for predictions | Tracks data drift, accuracy, and fairness (if enabled for the deployment). |
| Chunk size | Adjusts the chunk size selection strategy. By default, DataRobot automatically calculates the chunk size; only modify this setting if advised by your DataRobot representative. For more information, see What is chunk size? |
| Concurrent prediction requests | Limits the number of concurrent prediction requests. By default, prediction jobs utilize all available prediction server cores. To reserve bandwidth for real-time predictions, set a cap for the maximum number of concurrent prediction requests. |
| Include prediction status | Adds a column containing the status of the prediction. |
| Use default prediction instance | Lets you change the prediction instance. Turn the toggle off to select a prediction instance. |
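
The options in the table above could be expressed in an API payload roughly as follows. This is a minimal sketch: the keys (`passthroughColumns`, `passthroughColumnsSet`, `maxExplanations`, `numConcurrent`, `includePredictionStatus`, `skipDriftTracking`) are assumptions modeled on the Batch Prediction API, and `build_prediction_options` is a hypothetical helper.

```python
def build_prediction_options(passthrough=None, max_explanations=0,
                             num_concurrent=None, include_status=False,
                             track_drift=True):
    """Map the UI toggles to an options dict (key names are assumed)."""
    options = {
        "includePredictionStatus": include_status,
        "skipDriftTracking": not track_drift,
    }
    # "All features" and "Specific features" are mutually exclusive choices.
    if passthrough == "all":
        options["passthroughColumnsSet"] = "all"
    elif passthrough:
        options["passthroughColumns"] = list(passthrough)
    if max_explanations > 0:
        options["maxExplanations"] = max_explanations
    if num_concurrent is not None:
        if num_concurrent < 1:
            raise ValueError("num_concurrent must be at least 1")
        # Capping concurrency leaves server cores free for real-time requests.
        options["numConcurrent"] = num_concurrent
    return options


options = build_prediction_options(
    passthrough=["customer_id", "region"],  # hypothetical column names
    max_explanations=3,
    num_concurrent=4,
)
print(options)
```
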
What is chunk size?

The batch prediction process chunks your data into smaller pieces and scores those pieces one by one, allowing DataRobot to score large batches. The Chunk size setting determines the strategy DataRobot uses to chunk your data. DataRobot recommends the default setting of Auto chunking, as it performs the best overall; however, other options are available:

  • Fixed: DataRobot identifies an initial, effective chunk size and continues to use it for the rest of the model scoring process.

  • Dynamic: DataRobot increases the chunk size while model scoring speed is acceptable and decreases the chunk size if the scoring speed falls.

  • Custom: A data scientist sets the chunk size, and DataRobot continues to use it for the rest of the model scoring process.
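
To make the idea of chunking concrete, the sketch below splits rows into fixed-size chunks and shows one possible rule for adjusting the size dynamically. It illustrates the general technique only and is not DataRobot's implementation; the doubling and halving rule, the throughput target, and the size limits are all hypothetical.

```python
from itertools import islice


def fixed_chunks(rows, chunk_size):
    """Yield successive lists of at most chunk_size rows."""
    if chunk_size < 1:
        raise ValueError("chunk_size must be at least 1")
    iterator = iter(rows)
    while True:
        chunk = list(islice(iterator, chunk_size))
        if not chunk:
            return
        yield chunk


def next_dynamic_size(current_size, rows_per_second, target_rows_per_second,
                      minimum=1, maximum=100_000):
    """Grow the chunk while throughput meets the target; shrink it otherwise."""
    if rows_per_second >= target_rows_per_second:
        return min(current_size * 2, maximum)
    return max(current_size // 2, minimum)


chunks = list(fixed_chunks(range(10), 4))
print([len(chunk) for chunk in chunks])  # [4, 4, 2]
```

A Fixed or Custom strategy corresponds to calling `fixed_chunks` with one size throughout, while a Dynamic strategy would recompute the size between chunks with something like `next_dynamic_size`.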

Set time series options

Configure the Time series options by choosing the prediction method: forecast point or forecast range.

  • Select Forecast point to choose the specific date from which you want to begin making predictions.

    • If you choose Automatically, DataRobot selects the forecast point for you based on the scoring data.
    • If you choose Manual, you can select the forecast point date.

  • Select Forecast range if you intend to make bulk, historical predictions (instead of forecasting future rows from the forecast point). By default, predictions use all forecast distances within the selected time range. Alternatively, you can select a specific date range using the date selector.
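
The two prediction methods above might translate into request settings along these lines. This is a sketch under assumptions: the `type` values (`forecast`, `historical`) and the keys `forecastPoint`, `predictionsStartDate`, and `predictionsEndDate` are modeled on the Batch Prediction API and should be verified against its reference; `build_timeseries_settings` is a hypothetical helper.

```python
from datetime import datetime


def build_timeseries_settings(method, forecast_point=None, start=None, end=None):
    """Build settings for a forecast point ('forecast') or range ('historical')."""
    if method == "forecast":
        settings = {"type": "forecast"}
        # Manual: send an explicit date. Automatically: omit it.
        if forecast_point is not None:
            settings["forecastPoint"] = forecast_point.isoformat()
        return settings
    if method == "historical":
        settings = {"type": "historical"}
        if (start is None) != (end is None):
            raise ValueError("provide both start and end, or neither")
        if start is not None:
            if start >= end:
                raise ValueError("start must be before end")
            settings["predictionsStartDate"] = start.isoformat()
            settings["predictionsEndDate"] = end.isoformat()
        return settings
    raise ValueError(f"unknown method: {method!r}")


automatic = build_timeseries_settings("forecast")
manual = build_timeseries_settings("forecast", forecast_point=datetime(2022, 8, 1))
print(automatic)  # {'type': 'forecast'}
print(manual)     # {'type': 'forecast', 'forecastPoint': '2022-08-01T00:00:00'}
```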

Set up prediction destinations

Select a prediction destination (also called an output adapter), then complete the appropriate authentication workflow for the destination type.

Destination connection types

Select a connection type below to view field descriptions.

Note

When browsing for connections, invalid adapters are not shown.

  • Database connections

  • Cloud storage connections

  • Data warehouse connections

  • Other

Schedule prediction jobs

You can schedule prediction jobs to run automatically. When configuring a job definition, turn on the Jobs schedule toggle, then specify the frequency (hourly, daily, monthly, etc.) and the time of day at which the job runs.

For further granularity, select Use advanced scheduler. You can specify the exact time for the prediction job to run, down to the minute.
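
A minute-level schedule of this kind can be modeled as a cron-like set of fields. The sketch below is an illustration only: the field names (`minute`, `hour`, `dayOfMonth`, `month`, `dayOfWeek`) and the convention that 0 means Sunday are assumptions, and `build_schedule` and `is_due` are hypothetical helpers.

```python
from datetime import datetime


def build_schedule(minute, hour, day_of_month=("*",), month=("*",),
                   day_of_week=("*",)):
    """Return a cron-like schedule; each field lists allowed values or "*"."""
    return {
        "minute": list(minute),
        "hour": list(hour),
        "dayOfMonth": list(day_of_month),
        "month": list(month),
        "dayOfWeek": list(day_of_week),
    }


def is_due(schedule, moment):
    """Check whether `moment` matches every field of the schedule.

    Simplification: all fields must match, unlike classic cron, which treats
    restricted day-of-month and day-of-week fields as alternatives.
    """
    actual = {
        "minute": moment.minute,
        "hour": moment.hour,
        "dayOfMonth": moment.day,
        "month": moment.month,
        "dayOfWeek": moment.isoweekday() % 7,  # assumed convention: 0 = Sunday
    }
    return all("*" in schedule[key] or value in schedule[key]
               for key, value in actual.items())


daily = build_schedule(minute=[30], hour=[2])  # every day at 02:30
print(is_due(daily, datetime(2022, 8, 1, 2, 30)))  # True
print(is_due(daily, datetime(2022, 8, 1, 2, 31)))  # False
```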

After setting all applicable options, click Save prediction job definition.

Manage prediction job definitions

To view and manage the job definitions, select a deployment on the Deployments tab and navigate to the Predictions > Job Definitions tab.

Click the action menu for a job definition and select one of the actions described below:

| Element | Description |
|---|---|
| View job history | Displays the Deployments > Prediction Jobs tab listing all prediction jobs generated from the job definition. |
| Run now | Runs the job definition immediately. Go to the Deployments > Prediction Jobs tab to view progress. |
| Edit definition | Displays the job definition so that you can update and save it. |
| Disable definition | Suspends a job definition. Any scheduled batch runs from the job definition are suspended. From the action menu of a job definition, click Disable definition. After you select Disable definition, the menu item becomes Enable definition. Click Enable definition to re-enable batch runs from this job definition. |
| Clone definition | Creates a new job definition populated with the values from an existing job definition. From the action menu of the existing job definition, click Clone definition, update the fields as needed, and click Save prediction job definition. Note that the Jobs schedule settings are turned off by default. |
| Delete definition | Deletes the job definition. Click Delete definition, and in the confirmation window, click Delete definition again. All scheduled jobs are cancelled. |

Manage prediction jobs

To view prediction jobs, navigate to Deployments > Prediction Jobs. You can view all jobs that are currently running or have completed. Any predictions made on deployments appear on this page. Filter jobs by status, type, start and end time, deployment, job definition ID, job ID, and prediction environment.

The following table describes the information displayed in the Prediction Jobs list.

| Category | Description |
|---|---|
| Job definition | The job definition used to create the prediction job. |
| Job type | Specifies the type of job: Make Predictions, Scheduled Run, Manual Run, Integration, Ad hoc API, Insights, Portable, and Challengers. |
| Added to queue | Time at which the prediction job was initialized. |
| Created by | User who triggered the job. |
| Status | State of the job. |
| Source | Intake adapter for this prediction job. |
| Destination | Output adapter for the prediction job. |

To manage a prediction job, select from the action menu on the right:

| Element | Definition | When to use |
|---|---|---|
| View logs | Displays the log in progress and lets you copy the log to your clipboard. | Jobs that do not use streaming intake |
| Run again | Restarts the run. | Jobs that have finished running |
| Go to deployment | Opens the Overview tab for the deployment. | Any job (completed successfully, aborted, or in progress) |
| Edit job definition | Opens the Edit Prediction Job Definition tab. Update and save the job definition. | Any job |
| Create job definition | Creates a new job definition populated with the settings from the existing prediction job. The new job definition displays, and you can edit and save it. (Alternatively, you can select the Clone definition command for a job on the Job Definitions tab.) | Any job except Challenger jobs |

Filter prediction jobs

To filter the prediction jobs:

  1. Select the Filter link on the Prediction Jobs tab.

  2. Set filters and click Apply filters. Click Clear filters to reset the fields.

    | Element | Description |
    |---|---|
    | Status | Select job status types to filter by: Queued, Running, Succeeded, Aborted, and Failed. |
    | Job type | Select types of jobs to filter by. |
    | Added to queue | Filter by a time range: Before or After a date you select. |
    | Deployment | Select a deployment to filter by. Start typing and select a deployment from the dropdown list. |
    | Job Definition ID | Filter by the jobs generated from a specific job definition. Start typing and select a job definition ID from the dropdown list. |
    | Prediction Job ID | Enter a specific prediction job ID. |
    | Prediction Environment | Select from your configured prediction environments. |
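
The same filtering logic can be applied to job records retrieved by other means. The sketch below is a minimal illustration over plain dictionaries; the record keys (`id`, `status`, `added`, `deployment_id`, `definition_id`) and the `filter_jobs` helper are hypothetical and do not reflect a DataRobot response format.

```python
from datetime import datetime


def filter_jobs(jobs, statuses=None, added_after=None, added_before=None,
                deployment_id=None, definition_id=None):
    """Return the jobs that satisfy every filter that is set."""
    def keep(job):
        if statuses and job["status"] not in statuses:
            return False
        if added_after and job["added"] <= added_after:
            return False
        if added_before and job["added"] >= added_before:
            return False
        if deployment_id and job["deployment_id"] != deployment_id:
            return False
        if definition_id and job["definition_id"] != definition_id:
            return False
        return True

    return [job for job in jobs if keep(job)]


jobs = [
    {"id": "job-1", "status": "Succeeded", "added": datetime(2022, 7, 30),
     "deployment_id": "dep-a", "definition_id": "def-1"},
    {"id": "job-2", "status": "Failed", "added": datetime(2022, 8, 1),
     "deployment_id": "dep-a", "definition_id": "def-1"},
    {"id": "job-3", "status": "Running", "added": datetime(2022, 8, 1),
     "deployment_id": "dep-b", "definition_id": "def-2"},
]
recent_failures = filter_jobs(jobs, statuses={"Failed", "Aborted"},
                              added_after=datetime(2022, 7, 31))
print([job["id"] for job in recent_failures])  # ['job-2']
```

Filters that are left unset match every job, which mirrors clicking Clear filters in the UI.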

Updated August 1, 2022