Define custom model metadata
For structured inference models, model-metadata.yaml must declare an inferenceModel section: binary models need targetName, positiveClassLabel, and negativeClassLabel; multiclass models need targetName and classLabels, with the labels listed in the same order as the model’s probability outputs. See Inference model metadata for more information.
To define metadata, create a model-metadata.yaml file in the top level of the task or model directory, in the same folder as custom.py. In most cases the file is optional, but it is required for custom transform tasks that output non-numeric data.
The sections below show how to define metadata for custom models and tasks. For more information, you can review complete examples in the DRUM repository for custom models and tasks.
General metadata parameters
The following table describes options that are available to tasks and/or inference models. The parameters are required when using drum push to supply information about the model/task/version to create. Some of the parameters are also required outside of drum push for compatibility reasons.
Note
The modelID parameter adds a new version to a pre-existing custom model or task with the specified ID. Because of this, all options that configure a new base-level custom model or task are ignored when passed alongside this parameter. However, at this time, these parameters still must be included.
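As a rough illustration of this note, a metadata file that adds a version to an existing custom model might look like the sketch below. Only the key names come from the options table on this page; the model name and both IDs are hypothetical placeholders.

```yaml
# Sketch only: every value here is a hypothetical placeholder.
name: my-custom-model                  # ignored when modelID is set, but still must be included
type: inference                        # "inference" for custom inference models
targetType: regression
environmentID: "<your-environment-id>"
modelID: "<existing-custom-model-id>"  # add a version to this existing model
majorVersion: False                    # minor update, for example 2.3 -> 2.4
```

Leaving out modelID would instead ask drum push to create a new base-level custom model from the same settings.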
| Option | When required | Task or inference model | Description |
|---|---|---|---|
| name | Always | Both | A string, preferably unique for easy searching, that drum push uses as the custom model title. |
| type | Always | Both | A string, either training (for custom tasks) or inference (for custom inference models). |
| environmentID | Always | Both | A hash of the execution environment to use while running your custom model or task. You can find a list of available execution environments in Model Registry > Custom Model Workshop > Environments. Expand the environment and click on the Environment Info tab to view and copy the file ID. Required for drum push only. |
| targetType | Always | Both | A string indicating the type of target. Must be a supported target type, such as binary, regression, multiclass, anomaly, unstructured, or textgeneration. |
| modelID | Optional | Both | After creating a model or task, it is best practice to use versioning to add code while iterating. To create a new version instead of a new model or task, use this field to link the custom model/task you created. The ID (hash) is available from the UI, via the URL of the custom model or task. Used with drum push only. |
| description | Optional | Both | A searchable field. If modelID is set, use the UI to change a model/task description. Used with drum push only. |
| majorVersion | Optional | Both | Specifies whether the model version you are creating should be a major (True, the default) or minor (False) version update. For example, if the previous model version is 2.3, a major version update would create version 3.0; a minor version update would create version 2.4. Used for drum push only. |
| targetName | For binary and multiclass (in inferenceModel) | Model | In inferenceModel, the name of the column the model predicts. For multiclass, use the same name as Target name in the Workshop and the same order of classes as Target classes for classLabels. |
| positiveClassLabel / negativeClassLabel | For binary classification models | Model | In inferenceModel, when your model predicts probability, the positiveClassLabel dictates what class the prediction corresponds to. |
| classLabels | For multiclass classification models | Model | In inferenceModel, a list of class names (strings). The list order must match the order of predicted class probabilities your model returns (for example, the column order of probability outputs). Use the same labels as the Target classes you configure for the custom model in the Workshop. |
| predictionThreshold | Optional (binary classification models only) | Model | In inferenceModel, the cutoff point between 0 and 1 that dictates which label will be chosen as the predicted label. |
| trainOnProject | Optional | Task | A hash with the ID of the project (PID) to train the model or version on. When using drum push to test and upload a custom estimator task, you have an option to train a single-task blueprint immediately after the estimator is successfully uploaded into DataRobot. The trainOnProject option specifies the project on which to train that blueprint. |
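To make the task-side options concrete, here is a minimal sketch of metadata for a custom estimator task that is trained right after upload. The key names are taken from the table above; the task name, description, and both IDs are hypothetical placeholders.

```yaml
# Sketch only: placeholder values, not a tested configuration.
name: my-custom-estimator
type: training                         # "training" for custom tasks
targetType: regression
environmentID: "<your-environment-id>"
description: Example estimator task uploaded with drum push
trainOnProject: "<your-project-id>"    # project (PID) on which to train the single-task blueprint
```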
Inference model metadata (inferenceModel)
For structured inference models, target and class-label settings belong under the top-level key inferenceModel in model-metadata.yaml. If you omit fields that DataRobot or DRUM require for your targetType, builds, tests, or deployments can fail.
| targetType | Required under inferenceModel | Notes |
|---|---|---|
| binary | targetName, positiveClassLabel, negativeClassLabel | Optional: predictionThreshold. |
| multiclass | targetName, classLabels | classLabels is a YAML list of class names in the same order as your model’s probability outputs. |
| regression | (often none) | Many regression templates work without an inferenceModel block; follow your environment and DRUM requirements. |
| anomaly, unstructured, textgeneration, … | Follow template / DRUM | See examples for your target type. |
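For the binary row, a metadata file might look like the following sketch. The keys are those listed in the table; the model name, target column, labels, and environment ID are hypothetical.

```yaml
# Sketch only: hypothetical binary classifier.
name: churn-classifier
type: inference
targetType: binary
environmentID: "<your-environment-id>"
inferenceModel:
  targetName: churned
  positiveClassLabel: "yes"   # quoted so YAML 1.1 parsers read a string, not a boolean
  negativeClassLabel: "no"
  predictionThreshold: 0.5    # optional cutoff between 0 and 1
```

Quoting labels such as yes, no, true, or false is a defensive choice: many YAML parsers would otherwise load them as booleans rather than class names.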
Workshop-generated file: On the Registry Workshop Assemble tab, Create model-metadata.yaml produces a starter file for your model’s target type. For multiclass, that file includes inferenceModel with targetName and classLabels (aligned with your Target classes), matching what you need for a successful deployment.
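A multiclass file of the kind described above could resemble this sketch. It is not the exact output of the Workshop; the model name, target name, and class labels are hypothetical and would need to match your own Target name and Target classes.

```yaml
# Sketch only: hypothetical three-class model.
name: iris-classifier
type: inference
targetType: multiclass
environmentID: "<your-environment-id>"
inferenceModel:
  targetName: species
  classLabels:        # order must match the model's probability output columns
    - setosa
    - versicolor
    - virginica
```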
In the model-metadata.yaml file, you can also define runtime parameters to make your custom model code easier to reuse.
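The runtime parameter syntax is not shown on this page. As a sketch only, assuming the runtimeParameterDefinitions key with fieldName, type, defaultValue, and description entries as described in DataRobot's runtime parameters documentation, a definition might look like this. The parameter name and values are hypothetical.

```yaml
# Sketch only: key names are assumed from the runtime parameters documentation.
runtimeParameterDefinitions:
  - fieldName: output_prefix          # hypothetical parameter name
    type: string
    defaultValue: prediction_
    description: Prefix added to output column names.
```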