Create model packages

How you create a model package (an archived model artifact with associated metadata) depends on the type of model used to create the package.

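Manual model package creation:

  • Custom inference models
  • External models

Automatic model package creation:

  • Deployed custom models
  • Models deployed from the inventory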

Manual creation

The following sections describe the steps for manually creating model packages for custom inference models and external models. Custom inference models are created and tested in the Custom Model Workshop; external models operate outside of DataRobot and are monitored by the MLOps agent.

Add a custom inference model

You can create a model package for a custom inference model to replace the model package in existing deployments with a new one, or to share with another user who wants to deploy your custom model package.

When you have successfully created and tested a custom inference model, you have the option to add it to the Model Registry as a model package. To do so, navigate to Model Registry > Custom Model Workshop and select the custom model you wish to add.

Under the Test and Deploy tab, click Add to registry.

The custom model is then added to the Model Registry. The Add to registry link is replaced with a View registry package link, which takes you to your newly created model package in the registry under the Model Packages tab.

Note that although a model package can be created without testing the custom model, DataRobot recommends that you confirm the model passes testing before proceeding. Untested custom models prompt a dialog box warning that the custom model is not tested.

For untested custom models, click Test now to start a model test, or Create package without testing to proceed with the creation of a model package.

Register external model packages

To create a model package for an external model that is monitored by the MLOps agent, navigate to Model Registry > Model Packages. Click Add New Package and select New external model package.

In the resulting dialog box, complete the fields for the model that the MLOps agent monitors and from which you are retrieving statistics.

The following table describes the fields:

| Field | Description |
|---|---|
| Package Name | The name of the model package. |
| Package Description (optional) | Information to describe the model package. |
| Model location (optional) | The location of the model running outside of DataRobot. Describe the location as a filepath, such as folder1/opt/model.tar. |
| Build environment | The programming language in which the model was built. |
| Training data (optional) | The filename of the training data, uploaded locally or via the AI Catalog. Click Clear selection to upload and use a different file. |
| Holdout data (optional) | The filename of the holdout data, uploaded locally or via the AI Catalog. Use holdout data to set an accuracy baseline and enable support for target drift and challenger models. |
| Target | The dataset column name the model will predict on. |
| Prediction type | The type of prediction the model is making, either binary classification or regression. For a classification model, you must also provide the positive and negative class labels and a prediction threshold. |
| Prediction column | The column name in the holdout dataset containing the prediction result. |
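As a rough illustration of how the fields in the table relate to one another, the sketch below collects them in a plain Python dataclass and reports which entries are still missing. The class, field names, and the "binary"/"regression" labels are hypothetical and chosen for this example; this is not a DataRobot API object. The rule that the prediction column is needed only when holdout data is supplied is an assumption drawn from the table's wording.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ExternalPackageFields:
    """Hypothetical container mirroring the dialog fields (not a DataRobot class)."""
    package_name: str
    build_environment: str
    target: str
    prediction_type: str  # "binary" or "regression" (labels assumed for this sketch)
    package_description: Optional[str] = None
    model_location: Optional[str] = None
    training_data: Optional[str] = None
    holdout_data: Optional[str] = None
    prediction_column: Optional[str] = None
    positive_class_label: Optional[str] = None
    negative_class_label: Optional[str] = None
    prediction_threshold: Optional[float] = None


def missing_fields(fields: ExternalPackageFields) -> List[str]:
    """Return the names of fields that still need a value."""
    missing = []
    # Fields the table does not mark as optional.
    for name in ("package_name", "build_environment", "target", "prediction_type"):
        if not getattr(fields, name):
            missing.append(name)
    # Classification models also need class labels and a threshold.
    if fields.prediction_type == "binary":
        for name in ("positive_class_label", "negative_class_label", "prediction_threshold"):
            if getattr(fields, name) is None:
                missing.append(name)
    # Assumption: the prediction column matters only when holdout data is provided.
    if fields.holdout_data and not fields.prediction_column:
        missing.append("prediction_column")
    return missing
```

Running a check like this before opening the dialog can help you gather the class labels and threshold for a classification model ahead of time.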

If registering a time series model, mark the checkbox This is a time series model. You must complete additional fields:

| Field | Description |
|---|---|
| Forecast date feature | The column in the training dataset that contains date/time values used by DataRobot to detect the range of dates (the valid forecast range) available for use as the forecast point. |
| Date/time format | The format used by the date/time features in the training dataset. |
| Forecast point feature | The column in the training dataset that contains the point from which you are making a prediction. |
| Forecast unit | The time unit (seconds, days, months, etc.) that makes up the time step. |
| Forecast distance feature | The column in the training dataset containing a unique time step (a relative position) within the forecast window. A time series model outputs one row for each forecast distance. |
| Series identifier (optional, used for multiseries models) | The column in the training dataset that identifies which series each row belongs to. |
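To make the relationship between the forecast point, forecast date, date/time format, and forecast unit more concrete, here is a minimal sketch that derives a forecast distance from two timestamps. The function name and the unit table are hypothetical, and the sketch handles only fixed-length units; calendar units such as months would need different handling. It does not describe how DataRobot computes these values internally.

```python
from datetime import datetime, timedelta

# Fixed-length units only; "months" is omitted because its length varies.
UNIT_LENGTHS = {
    "seconds": timedelta(seconds=1),
    "minutes": timedelta(minutes=1),
    "hours": timedelta(hours=1),
    "days": timedelta(days=1),
}


def forecast_distance(forecast_point: str, forecast_date: str, fmt: str, unit: str) -> int:
    """Number of whole time steps between the forecast point and the forecast date."""
    point = datetime.strptime(forecast_point, fmt)
    date = datetime.strptime(forecast_date, fmt)
    step = UNIT_LENGTHS[unit]
    # divmod on two timedeltas yields (whole steps, leftover timedelta).
    distance, remainder = divmod(date - point, step)
    if remainder:
        raise ValueError("forecast date does not fall on a whole time step")
    return distance


# A prediction made on September 1 for September 4 is three daily steps ahead.
print(forecast_distance("2021-09-01", "2021-09-04", "%Y-%m-%d", "days"))
```

The `fmt` argument plays the role of the Date/time format field: if it does not match the values in the dataset, `strptime` raises a `ValueError`.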

Once all fields for the model package are defined, click Create package. The package populates in the Model Registry and is available for use.

Set an accuracy baseline

To set an accuracy baseline for external models (which enables target drift and challenger models when deployed), you must provide holdout data. This is because DataRobot cannot use the model to generate predictions that typically serve as a baseline, as the model is hosted in a remote prediction environment outside of the application. Provide holdout data when registering an external model package and specify the column containing predictions.
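The sketch below illustrates why holdout data with a prediction column is enough to establish a baseline: with actual and predicted values side by side, an error metric can be computed without running the model. It uses root mean squared error for a regression model as an example metric; the metric choice, function name, and column names are assumptions for illustration and do not reflect how DataRobot computes its baseline.

```python
import csv
import io
import math


def rmse_baseline(holdout_csv: str, target_col: str, prediction_col: str) -> float:
    """Root mean squared error between the target and prediction columns."""
    reader = csv.DictReader(io.StringIO(holdout_csv))
    squared_errors = [
        (float(row[target_col]) - float(row[prediction_col])) ** 2
        for row in reader
    ]
    if not squared_errors:
        raise ValueError("holdout data has no rows")
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Tiny illustrative holdout set; the column names are hypothetical.
SAMPLE_HOLDOUT = "actual,predicted\n3,2\n5,6\n"
print(rmse_baseline(SAMPLE_HOLDOUT, "actual", "predicted"))
```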

Deploy external model packages

This section outlines how to create a deployment with an external model package. Before proceeding, make sure you have registered your external model package in the Model Registry.

Note

To send predictions, you must first configure the MLOps agent. Refer to the agent's internal documentation for configuration information.

  1. Navigate to Model Registry > Model Packages and select Deploy from the action menu for the external model package you wish to deploy.

  2. This opens the deployment information page, where you can configure the data drift settings for your deployment under the Inference header. Once data drift is enabled, you can activate additional features for the deployment: prediction row storage, challenger models, and segmented analysis. Data drift compares the model's training data to its scoring data in order to analyze the model's performance over time. You must upload training data to a deployment to enable data drift monitoring.

    • For time series model packages, use the forecast date as the primary prediction timestamping method.
  3. Click Create deployment at the top of the screen.

  4. Once you create an external deployment, there are two options for additional configuration. You can upload historical prediction data to the deployment to analyze data drift and accuracy in the past. You can also instrument the deployment with the MLOps agent to monitor future predictions. To do so, navigate to the Predictions tab to access the monitoring snippet.
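Step 2 describes data drift as a comparison between a model's training data and its scoring data. One widely used statistic for that kind of comparison is the Population Stability Index (PSI), sketched below for a single numeric feature. This is a generic illustration under assumed binning and smoothing choices; it is not presented as the calculation DataRobot performs.

```python
import math
from typing import List, Sequence


def population_stability_index(training: Sequence[float],
                               scoring: Sequence[float],
                               bins: int = 10) -> float:
    """PSI between two samples, using equal-width bins over the training range."""
    lo, hi = min(training), max(training)
    if lo == hi:
        raise ValueError("training values must not all be identical")
    width = (hi - lo) / bins

    def proportions(values: Sequence[float]) -> List[float]:
        counts = [0] * bins
        for v in values:
            # Clamp so out-of-range scoring values land in the edge bins.
            idx = min(max(int((v - lo) / width), 0), bins - 1)
            counts[idx] += 1
        # Floor at a small epsilon to avoid log(0) for empty bins.
        return [max(c / len(values), 1e-6) for c in counts]

    p = proportions(training)
    q = proportions(scoring)
    return sum((qi - pi) * math.log(qi / pi) for pi, qi in zip(p, q))


training_values = [i / 100 for i in range(100)]
shifted_values = [0.5 + i / 200 for i in range(100)]
print(population_stability_index(training_values, training_values))  # identical samples
print(population_stability_index(training_values, shifted_values))   # shifted sample
```

Identical samples yield a PSI of zero, and the value grows as the scoring distribution moves away from the training distribution.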

If you add prediction data for scoring in the Predictions tab, you must include the required features for time series predictions in the prediction dataset:

  • Forecast Distance: Supplied by DataRobot when you download the .mlpkg file.
  • dr_forecast_point: Supplied by DataRobot when you download the .mlpkg file.
  • Datetime_column_name: Defines the date/time feature to use for time-stamping prediction rows.
  • Series_column_name: Defines the feature (series ID) used for multiseries deployments (if applicable).
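Before uploading a prediction dataset, it can be useful to confirm that the required columns listed above are present. The sketch below is one way to do that against a CSV header; the function name and the example column names `date` and `store` are hypothetical stand-ins for your own date/time and series ID columns.

```python
from typing import List, Optional, Sequence


def missing_time_series_columns(header: Sequence[str],
                                datetime_column: str,
                                series_column: Optional[str] = None) -> List[str]:
    """Return the required time series columns that are absent from the header."""
    required = ["Forecast Distance", "dr_forecast_point", datetime_column]
    # The series ID column applies only to multiseries deployments.
    if series_column is not None:
        required.append(series_column)
    present = set(header)
    return [column for column in required if column not in present]


# Hypothetical header from a prediction file.
header = ["Forecast Distance", "date", "sales"]
print(missing_time_series_columns(header, datetime_column="date", series_column="store"))
```

An empty list means every required column was found.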

Create model packages automatically

The following sections describe the steps necessary to trigger the automatic creation of model packages for custom inference models and models provided when adding a new deployment.

Deploy custom models

When you deploy a custom model, DataRobot automatically creates a model package, which you can access in the Model Registry under the Model Packages tab. The deployment you create also uses this model package.

Deployment from the inventory

When you create a new deployment with any type of model, DataRobot automatically creates a model package for the model being deployed. You can access it in the Model Registry under the Model Packages tab.


Updated September 2, 2021