Deployment

MLOps is designed to make model deployment easy. Whether you are a business analyst, data scientist, data engineer, or member of an Operations team, you can quickly create a deployment in MLOps. Deploy models built in DataRobot, as well as models written in various programming languages like Python and R.

The following sections describe how to deploy models to a production environment of your choice and use MLOps to monitor and manage those models.
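For example, a deployment can be created programmatically with the DataRobot Python client. The following is a minimal sketch, not a definitive recipe: the endpoint, API token, and model ID are placeholders, and the prediction server selection assumes at least one server is available to you.

```python
import datarobot as dr

# Connect to DataRobot (endpoint and token are placeholders).
dr.Client(endpoint="https://app.datarobot.com/api/v2", token="YOUR_API_TOKEN")

# Choose a prediction server to host the deployment's predictions.
prediction_server = dr.PredictionServer.list()[0]

# Deploy a trained model from the Leaderboard (the model ID is hypothetical).
deployment = dr.Deployment.create_from_learning_model(
    model_id="YOUR_MODEL_ID",
    label="My first deployment",
    description="Created with the Python client",
    default_prediction_server_id=prediction_server.id,
)
print(deployment.id)
```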

See the associated deployment and custom model deployment considerations for additional information.

  • Deployment workflows: How to deploy and monitor DataRobot AutoML models, custom inference models, and external models in various prediction environments.

  • Register models: How to register DataRobot AutoML models, custom inference models, and external models in the Model Registry.

  • Prepare custom models for deployment: How to create, test, and prepare custom inference models for deployment.

  • Prepare for external model deployment: How to create and manage external models and prediction environments in preparation for deployment.

  • Manage prediction environments: How to view DataRobot prediction environments and create, edit, delete, or share external prediction environments.

  • Deploy models: How to deploy DataRobot models, custom inference models, and external models to DataRobot MLOps.

  • MLOps agents: How to configure the monitoring and management agent for external models.

Feature considerations

When curating a prediction request/response dataset from an external source (a minimal sketch follows this list):

  • Include the 25 most important features.

  • Follow the CSV file size requirements.

  • For classification projects, classes must have a value of 0 or 1 or be text strings.
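As an illustration of curating such a dataset, the sketch below uses pandas. The log format and all column names (timestamp, prediction, feature_*) are assumptions for this example; substitute your deployment's own feature list and prediction output.

```python
import pandas as pd

# The 25 most important features (names here are hypothetical).
top_features = ["feature_1", "feature_2", "feature_3"]  # ...extend to 25

# Load the raw request/response log captured from the external source.
log = pd.read_csv("prediction_log.csv")

# Keep only the important features, the timestamp, and the prediction output.
curated = log[top_features + ["timestamp", "prediction"]].copy()

# For classification projects, class values must be 0/1 or text strings.
curated["prediction"] = curated["prediction"].map({False: 0, True: 1})

# Write the curated dataset, keeping the CSV file size requirements in mind.
curated.to_csv("curated_predictions.csv", index=False)
```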

Additionally, note that:

  • Self-Managed AI Platform only: By default, the 25 most important features and the target are tracked for data drift.

  • The Make Predictions tab is not available for external deployments.

  • DataRobot deployments track only predictions made against dedicated prediction servers using a deployment_id.

    • For other prediction methods to be analyzed by model management, record their requests and predictions to a CSV file, then upload the file to DataRobot as an external deployment.

    • As of Self-Managed AI Platform version 7.0, the previously deprecated endpoints that use project_id and model_id instead of deployment_id return HTTP 404 Not Found (unless otherwise configured with a DataRobot representative).

  • The first 1,000,000 predictions per deployment per hour are tracked for data drift analysis and used to compute accuracy. Once this limit is reached, further predictions within that hour are not processed for either metric; however, there is no limit on the number of predictions you can make.

  • Scoring larger datasets (up to 5 GB) requires multiple prediction jobs, so expect a longer wait before the predictions become available. If you navigate away from the predictions interface, the jobs continue to run.

  • After making prediction requests, it can take approximately 30 seconds for data drift and accuracy metrics to update. The speed at which the metrics update depends on the model type (e.g., time series), the deployment configuration (e.g., segment attributes, number of forecast distances), and system stability.

  • DataRobot recommends that you do not submit multiple prediction rows that use the same association ID (the unique identifier for a prediction row). If multiple such rows are submitted, only the latest prediction is paired with the reported actual value; all prior prediction rows are, in effect, unpaired from that actual value. Note, however, that all predictions, including the unpaired rows, are included in data drift statistics. (See the sketch following this list for an example of scoring with association IDs and submitting actuals.)

  • If you want to write your predictions back to a cloud location or database, you must use the Batch Prediction API.
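To illustrate deployment_id-based scoring and association ID pairing, here is a minimal sketch using the Prediction API and the DataRobot Python client. The prediction server host, API token, DataRobot key, deployment ID, and the txn-* association IDs are all placeholders.

```python
import requests
import datarobot as dr

DEPLOYMENT_ID = "YOUR_DEPLOYMENT_ID"  # placeholder

# Real-time scoring against a dedicated prediction server, keyed by deployment_id.
# scoring_data.csv should include the association ID column configured for the deployment.
with open("scoring_data.csv", "rb") as f:
    response = requests.post(
        f"https://example.datarobot.com/predApi/v1.0/deployments/{DEPLOYMENT_ID}/predictions",
        headers={
            "Content-Type": "text/csv; charset=UTF-8",
            "Authorization": "Bearer YOUR_API_TOKEN",
            "DataRobot-Key": "YOUR_DATAROBOT_KEY",
        },
        data=f,
    )
response.raise_for_status()

# Later, report actual outcomes keyed by association ID so accuracy can be computed.
dr.Client(endpoint="https://app.datarobot.com/api/v2", token="YOUR_API_TOKEN")
deployment = dr.Deployment.get(DEPLOYMENT_ID)
deployment.submit_actuals([
    {"association_id": "txn-0001", "actual_value": 1},  # hypothetical IDs and values
    {"association_id": "txn-0002", "actual_value": 0},
])
```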

Time series deployments

  • To make predictions with a time series deployment, the amount of history needed depends on the model used:

    • Traditional time series (ARIMA family) models require the full history between training time and prediction time. DataRobot recommends scoring these models with the Prediction API.

    • All other time series models only require enough history to fill the feature derivation window, which varies by project. For cross series, all series must be provided at prediction time.

    Both categories of models are supported for real-time predictions, with a maximum payload size of 50 MB.

  • ARIMA family and non-ARIMA cross-series models do not support batch predictions.

  • All other time series models support batch predictions. For multiseries, input data must be sorted by series ID and timestamp.

  • There is no data limit for time series batch predictions on supported models, other than that a single series cannot exceed 50 MB.

  • When scoring regression time series models using integrated enterprise databases, you may receive a warning that the target database is expected to contain the following column, which was not found: DEPLOYMENT_APPROVAL_STATUS. The column, which is optional, records whether the deployed model has been approved by an administrator. If your organization has configured a deployment approval workflow, you can:

    • Add the column to the target database.

    • Redirect the data to another column by using the columnNamesRemapping parameter (demonstrated in the sketch following this list).

    After taking either of the above actions, run the prediction job again; the approval status then appears in the prediction results. If you are not recording approval status, ignore the message; the prediction job continues regardless.

  • To ensure DataRobot can process your time series data for deployment predictions, configure the dataset to meet the following requirements:

    • Sort prediction rows by their timestamps, with the earliest row first.

    • For multiseries, sort prediction rows by series ID and then by timestamp.

    • There is no limit on the number of series DataRobot supports. The only limit is the job timeout. For more information, see the batch prediction limits.

    For dataset examples, see the requirements for the scoring dataset.
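The following sketch, referenced above, prepares a multiseries scoring file (sorted by series ID, then timestamp) and runs a batch prediction job that remaps the DEPLOYMENT_APPROVAL_STATUS output column via columnNamesRemapping (exposed as column_names_remapping in the Python client). The column names, file paths, and deployment ID are assumptions for this example.

```python
import pandas as pd
import datarobot as dr

dr.Client(endpoint="https://app.datarobot.com/api/v2", token="YOUR_API_TOKEN")

# Sort multiseries scoring data by series ID, then timestamp (column names are hypothetical).
df = pd.read_csv("scoring_data.csv")
df = df.sort_values(["series_id", "timestamp"])
df.to_csv("scoring_data_sorted.csv", index=False)

# Run a batch prediction job; remap the approval status column to match the target schema.
job = dr.BatchPredictionJob.score(
    deployment="YOUR_DEPLOYMENT_ID",  # placeholder
    intake_settings={"type": "localFile", "file": "scoring_data_sorted.csv"},
    output_settings={"type": "localFile", "path": "predictions.csv"},
    column_names_remapping={"DEPLOYMENT_APPROVAL_STATUS": "approval_status"},
)
job.wait_for_completion()
```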

Multiclass deployments

  • Multiclass deployments of up to 100 classes support monitoring for target, accuracy, and data drift, as well as retraining.

  • Multiclass deployments created before Self-Managed AI Platform version 7.0 with feature drift enabled do not have historical feature drift data for the target; only new data is tracked.

  • DataRobot uses holdout data as a baseline for target drift. For multiclass deployments using certain datasets, rare class values could be missing in the holdout data and, as a result, in the baseline for drift. In this scenario, these rare values are treated as new values.

Prediction results cleanup

For each deployment, DataRobot periodically performs a cleanup job to delete the deployment's predicted and actual values from its corresponding prediction results table in Postgres. DataRobot does this to keep the size of these tables reasonable while allowing you to consistently generate accuracy metrics for all deployments and schedule replays for challenger models without the danger of hitting table size limits.

The cleanup job prevents a deployment from reaching its "hard" limit for prediction results tables; when the table is full, predicted and actual values are no longer stored, and additional accuracy metrics for the deployment cannot be produced. The cleanup job triggers when a deployment reaches its "soft" limit, which serves as a buffer to prevent the deployment from reaching the "hard" limit. The cleanup prioritizes deleting the oldest prediction rows already tied to a corresponding actual value. Note that the aggregated data used to power data drift and accuracy over time is unaffected.

Managed AI Platform

Managed AI Platform users have the following hourly limitations. Each deployment is allowed:

  • Data drift analysis: 1,000,000 predictions or, for each individual prediction instance, 100 MB of total prediction requests. If either limit is reached, data drift analysis is halted for the remainder of the hour.

  • Prediction row storage: the first 100 MB of total prediction requests per deployment, for each individual prediction instance. If the limit is reached, no prediction data is collected for the remainder of the hour.


Updated March 13, 2024