Create model packages¶
How you create a model package (an archived model artifact with associated metadata) depends on the type of model used to create the package.
Manual model package creation:
- Add a custom inference model to the registry
- Register an external model
- Deploy an external model package
Automatic model package creation:
- Deploy a custom model
- Create a deployment via the “Add Deployment” action in the Deployment Inventory
The following sections describe the steps necessary for manually creating model packages for custom inference models and external models. Custom inference models are created and tested in the Custom Model Workshop, while external models operate outside of DataRobot and are monitored by the MLOps agent.
Add a custom inference model¶
You can create a model package for a custom inference model to replace the model package in existing deployments with a new one, or to share with another user who wants to deploy your custom model package.
When you have successfully created and tested a custom inference model, you have the option to add it to the Model Registry as a model package. To do so, navigate to Model Registry > Custom Model Workshop and select the custom model you wish to add.
Under the Test and Deploy tab, click Add to registry.
The custom model is then added to the Model Registry. The Add to registry link is replaced with a View registry package link; click it to open your newly created model package in the registry under the Model Packages tab.
Note that although a model package can be created without testing the custom model, DataRobot recommends that you confirm the model passes testing before proceeding. Untested custom models prompt a dialog box warning that the custom model is not tested.
For untested custom models, click Test now to start a model test, or Create package without testing to proceed with the creation of a model package.
Register external model packages¶
To create a model package for an external model that is monitored by the MLOps agent, navigate to Model Registry > Model Packages. Click Add New Package and select New external model package.
In the resulting dialog box, complete the fields pertaining to the MLOps Agent-monitored model from which you are retrieving statistics.
The following table describes the fields:
|Package Name||The name of the model package.|
|Package Description (optional)||Information to describe the model package.|
|Model location (optional)||The location of the model running outside of DataRobot. Describe the location as a filepath, such as folder1/opt/model.tar.|
|Build environment||The programming language in which the model was built.|
|Training data (optional)||The filename of the training data, uploaded locally or via the AI Catalog. Click Clear selection to upload and use a different file.|
|Holdout data (optional)||The filename of the holdout data, uploaded locally or via the AI Catalog. Use holdout data to set an accuracy baseline and enable support for target drift and challenger models.|
|Target||The name of the dataset column that the model predicts.|
|Prediction type||The type of prediction the model is making, either binary classification or regression. For a classification model, you must also provide the positive and negative class labels and a prediction threshold.|
|Prediction column||The column name in the holdout dataset containing the prediction result.|
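The field rules in the table above can be sketched as a local pre-submission checklist. This is a minimal illustration only: `ExternalPackageFields` and `missing_fields` are hypothetical names, not part of any DataRobot API, and the rule that a prediction column is needed whenever holdout data is supplied is an assumption drawn from the field descriptions.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ExternalPackageFields:
    """Hypothetical local mirror of the dialog fields; not a DataRobot object."""
    package_name: str
    build_environment: str
    target: str
    prediction_type: str  # "binary classification" or "regression"
    package_description: Optional[str] = None
    model_location: Optional[str] = None
    training_data: Optional[str] = None
    holdout_data: Optional[str] = None
    prediction_column: Optional[str] = None
    positive_class_label: Optional[str] = None
    negative_class_label: Optional[str] = None
    prediction_threshold: Optional[float] = None


def missing_fields(fields: ExternalPackageFields) -> List[str]:
    """Return the names of fields that still need a value."""
    missing = []
    # Fields the table does not mark as optional.
    for name in ("package_name", "build_environment", "target", "prediction_type"):
        if not getattr(fields, name):
            missing.append(name)
    # Classification models also need class labels and a threshold.
    if fields.prediction_type == "binary classification":
        for name in ("positive_class_label", "negative_class_label",
                     "prediction_threshold"):
            if getattr(fields, name) is None:
                missing.append(name)
    # Assumption: the prediction column only matters when holdout data is given.
    if fields.holdout_data and not fields.prediction_column:
        missing.append("prediction_column")
    return missing
```

Calling `missing_fields` on a filled-in instance returns an empty list when nothing is outstanding, which makes it easy to review the values before typing them into the dialog box.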
If you are registering a time series model, select the This is a time series model checkbox and complete the following additional fields:
|Forecast date feature||The column in the training dataset that contains date/time values used by DataRobot to detect the range of dates (the valid forecast range) available for use as the forecast point.|
|Date/time format||The format used by the date/time features in the training dataset.|
|Forecast point feature||The column in the training dataset that contains the point from which you are making a prediction.|
|Forecast unit||The time unit (seconds, days, months, etc.) that makes up the time step.|
|Forecast distance feature||The column in the training dataset containing a unique time step—a relative position—within the forecast window. A time series model outputs one row for each forecast distance.|
|Series identifier (optional, used for multiseries models)||The column in the training dataset that identifies which series each row belongs to.|
Once all fields for the model package are defined, click Create package. The package populates in the Model Registry and is available for use.
Set an accuracy baseline¶
To set an accuracy baseline for external models (which enables target drift and challenger models when deployed), you must provide holdout data. This is because DataRobot cannot use the model to generate predictions that typically serve as a baseline, as the model is hosted in a remote prediction environment outside of the application. Provide holdout data when registering an external model package and specify the column containing predictions.
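Before uploading holdout data, it can help to confirm locally that the file contains the columns you plan to name in the dialog box. The sketch below uses only the Python standard library; `missing_holdout_columns` is a hypothetical helper, and the expectation that the holdout file holds both the target column and the prediction column is an assumption based on the need to compare predictions against actual values.

```python
import csv
import io
from typing import List


def missing_holdout_columns(csv_text: str, target: str,
                            prediction_column: str) -> List[str]:
    """Return whichever of the two expected columns is absent from the CSV header."""
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader, [])  # an empty file yields an empty header
    return [col for col in (target, prediction_column) if col not in header]


# Example with made-up data: both expected columns are present.
sample = "id,churned,prediction\n1,0,0.2\n2,1,0.9\n"
print(missing_holdout_columns(sample, "churned", "prediction"))
```

For a file on disk, read its contents first (for example with `open(path, newline="").read()`) and pass the text to the helper.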
Deploy external model packages¶
This section outlines how to create a deployment with an external model package. Before proceeding, make sure you have registered your external model package in the Model Registry.
To send predictions, you must first configure the MLOps agent. Refer to the agent's internal documentation for configuration information.
Navigate to Model Registry > Model Packages and select Deploy from the action menu for the external model package you wish to deploy.
This opens the deployment information page, where you can configure the data drift settings for your deployment under the Inference header. Data drift compares the model's training data to its scoring data in order to analyze the model's performance over time. You must upload training data to the deployment to enable data drift monitoring. Once data drift is enabled, you can activate additional features for the deployment: prediction row storage, challenger models, and segmented analysis.
- For time series model packages, use the forecast date as the primary prediction timestamping method.
Click Create deployment at the top of the screen.
Once you create an external deployment, there are two options for additional configuration. You can upload historical prediction data to the deployment to analyze past data drift and accuracy. You can also instrument the deployment with the MLOps agent to monitor future predictions. To do so, navigate to the Predictions tab to access the monitoring snippet.
If you add prediction data for scoring in the Predictions tab, you must include the required features for time series predictions in the prediction dataset:
- Forecast Distance: Supplied by DataRobot when you download the .mlpkg file.
- dr_forecast_point: Supplied by DataRobot when you download the .mlpkg file.
- Datetime_column_name: Defines the date/time feature to use for time-stamping prediction rows.
- Series_column_name: Defines the feature (series ID) used for multiseries deployments (if applicable).
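The required features listed above can be checked against a prediction dataset's header before you upload it. This is a minimal sketch: `missing_time_series_columns` is a hypothetical helper, and it assumes the first two columns appear in the file under the literal names `Forecast Distance` and `dr_forecast_point`, while the date/time and series columns use whatever names your dataset defines.

```python
from typing import List, Optional, Sequence


def missing_time_series_columns(header: Sequence[str],
                                datetime_column: str,
                                series_column: Optional[str] = None) -> List[str]:
    """Return the required time series columns that are absent from the header."""
    required = ["Forecast Distance", "dr_forecast_point", datetime_column]
    # The series ID column applies only to multiseries deployments.
    if series_column is not None:
        required.append(series_column)
    return [col for col in required if col not in header]


# Example with made-up column names for a multiseries dataset.
header = ["Forecast Distance", "dr_forecast_point", "sale_date", "store_id", "sales"]
print(missing_time_series_columns(header, "sale_date", "store_id"))
```

An empty list means every required column was found; any names returned are the ones to add before scoring.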
Create model packages automatically¶
Deploy custom models¶
When you deploy a custom model, DataRobot automatically creates a model package, which you can access in the Model Registry under the Model Packages tab. The deployment you create also uses this model package.
Deployment from the inventory¶
When you create a new deployment with any type of model, DataRobot automatically creates a model package for the model being deployed. You can access it in the Model Registry under the Model Packages tab.