Create a retraining job

Add a job, manually or from a template, implementing a code-based retraining policy. To view and add retraining jobs, navigate to the Jobs > Retraining tab, and then:

  • To add a new retraining job manually, click + Add new retraining job (or the minimized add button when the job panel is open).

  • To create a retraining job from a template, next to the add button, click the template gallery icon, and then, under Retraining, click Create new from template.

The new job opens to the Assemble tab. Depending on the creation option you selected, proceed to the configuration steps linked in the table below.

Retraining job type Description
Add new retraining job Manually add a job implementing a code-based retraining policy.
Create new from template Add a job, from a template provided by DataRobot, implementing a code-based retraining policy.

Retraining jobs require metadata

All retraining jobs require a metadata.yaml file to associate the retraining job with a deployment and a retraining policy.
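
As a minimal sketch, a retraining job's metadata.yaml can declare runtime parameters that identify the deployment and retraining policy the job acts on. The parameter names below are illustrative, not required by DataRobot; use whatever names your job's scripts expect.

name: retraining-job-example

runtimeParameterDefinitions:
# Hypothetical parameter names; match them to what run.sh and job.py read.
- fieldName: DEPLOYMENT
  type: deployment
  description: The deployment this retraining job retrains.

- fieldName: RETRAINING_POLICY_ID
  type: string
  description: The retraining policy associated with this job.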

Add a new retraining job

To manually add a job for code-based retraining:

  1. On the Assemble tab for the new job, click the job name (or the edit icon) to enter a new job name, and then click confirm.

  2. In the Environment section, select a Base environment for the job.

    The available drop-in environments depend on your DataRobot installation; however, the table below lists commonly available public drop-in environments with templates in the DRUM repository. Depending on your DataRobot installation, the Python version of these environments may vary, and additional non-public environments may be available for use.

    Drop-in environment security

    Starting with the March 2025 Managed AI Platform release, most general purpose DataRobot custom model drop-in environments are security-hardened container images. In a security-hardened environment, custom jobs support only shell code that follows the POSIX shell standard, and only a limited set of shell utilities is available.

    Drop-in environment security

    Starting with the 11.0 Self-Managed AI Platform release, most general purpose DataRobot custom model drop-in environments are security-hardened container images. In a security-hardened environment, custom jobs support only shell code that follows the POSIX shell standard, and only a limited set of shell utilities is available.

    Environment name & example Compatibility & artifact file extension
    Python 3.X Python-based custom models and jobs. You are responsible for installing all required dependencies through the inclusion of a requirements.txt file in your model files.
    Python 3.X GenAI Generative AI models (Text Generation or Vector Database target type)
    Python 3.X ONNX Drop-In ONNX models and jobs (.onnx)
    Python 3.X PMML Drop-In PMML models and jobs (.pmml)
    Python 3.X PyTorch Drop-In PyTorch models and jobs (.pth)
    Python 3.X Scikit-Learn Drop-In Scikit-Learn models and jobs (.pkl)
    Python 3.X XGBoost Drop-In Native XGBoost models and jobs (.pkl)
    Python 3.X Keras Drop-In Keras models and jobs backed by tensorflow (.h5)
    Java Drop-In DataRobot Scoring Code models (.jar)
    R Drop-in Environment R models trained using CARET (.rds)
    Due to the time required to install all libraries recommended by CARET, only model types that are also package names are installed (e.g., brnn, glmnet). Make a copy of this environment and modify the Dockerfile to install the additional required packages. To decrease build times when you customize this environment, you can also remove unnecessary lines in the # Install caret models section, installing only what you need (see the sketch below). Review the CARET documentation to check if your model's method matches its package name. (Log in to GitHub before clicking this link.)
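
    For example, a copy of the R drop-in environment's Dockerfile might add a line like the following; the package names are placeholders for whichever CARET methods your model actually needs:

    # Install only the additional CARET packages your model requires (placeholder names).
    RUN Rscript -e 'install.packages(c("earth", "kernlab"), repos="https://cloud.r-project.org")'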

    scikit-learn

    All Python environments contain scikit-learn to help with preprocessing (if necessary), but only the Scikit-Learn environment can make predictions on sklearn models.

  3. In the Files section, assemble the custom job. Drag files into the box, or use the options in this section to create or upload the files required to assemble a custom job:

    Option Description
    Choose from source / Upload Upload existing custom job files (run.sh, metadata.yaml, etc.) as Local Files or a Local Folder.
    Create Create a new file, empty or containing a template, and save it to the custom job:
    • Create run.sh: Creates a basic, editable example of an entry point file.
    • Create metadata.yaml: Creates a basic, editable example of a runtime parameters file.
    • Create README.md: Creates a basic, editable README file.
    • Create job.py: Creates a basic, editable Python job file to print runtime parameters and deployments.
    • Create example job: Combines all template files to create a basic, editable custom job. You can quickly configure the runtime parameters and run this example job.
    • Create blank file: Creates an empty file. Click the edit icon next to Untitled to provide a file name and extension, and then add your custom contents. In the next step, you can select a file created this way, with a custom name and content, as the entry point. After you configure the new file, click Save.

    File replacement

    If you add a new file with the same name as an existing file, when you click Save, the old file is replaced in the Files section.

  4. In the Settings section, configure the Entry point shell (.sh) file for the job. If you've added a run.sh file, that file is the entry point; otherwise, you must select the entry point shell file from the drop-down list. The entry point file allows you to orchestrate multiple job files.
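
    As a minimal sketch of what an entry point might look like, run.sh can simply invoke the other files assembled for the job. The file name job.py is illustrative here; substitute the scripts you actually uploaded:

    #!/bin/sh
    # Entry point: orchestrate the job files assembled above.
    # Exit on the first failing command so the job reports failure.
    set -e

    # Run the Python job script included with the job's files (hypothetical file name).
    python job.py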

  5. In the Resources section, next to the section header, click Edit and configure the following:

    Preview

    Custom job resource bundles are off by default. Contact your DataRobot representative or administrator for information on enabling this feature.

    Feature flag: Enable Resource Bundles

    Setting Description
    Resource bundle Preview feature. Configure the resources the custom job uses to run.
    Network access Configure the egress traffic of the custom job. Under Network access, select one of the following:
    • Public: The default setting. The custom job can access any fully qualified domain name (FQDN) in a public network to leverage third-party services.
    • None: The custom job is isolated from the public network and cannot access third-party services.
    Default network access

    For the Managed AI Platform, the Network access setting is set to Public by default and the setting is configurable. For the Self-Managed AI Platform, the Network access setting is set to None by default and the setting is restricted; however, an administrator can change this behavior during DataRobot platform configuration. Contact your DataRobot representative or administrator for more information.

  6. (Optional) If you uploaded a metadata.yaml file, define the Runtime parameters by clicking the edit icon for each key-value row you want to configure.
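
    Inside the job's code, these values are typically read through the RuntimeParameters helper from the datarobot-drum package, assuming that package is available in the selected environment. The parameter names below are the illustrative ones from the metadata.yaml sketch earlier:

    # Assumes the datarobot-drum package is installed in the job's environment.
    from datarobot_drum import RuntimeParameters

    # Parameter names are hypothetical and must match metadata.yaml.
    deployment = RuntimeParameters.get("DEPLOYMENT")
    policy_id = RuntimeParameters.get("RETRAINING_POLICY_ID")

    print(f"Retraining deployment: {deployment}")
    print(f"Retraining policy: {policy_id}")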

  7. (Optional) Configure additional Key values for Tags, Metrics, Training parameters, and Artifacts.

Create a retraining job from a template

To add a pre-made retraining job from a template:

Preview

The jobs template gallery is on by default.

Feature flags: Enable Custom Jobs Template Gallery, Enable Custom Templates

  1. In the Add custom job from gallery panel, click the job template you want to create a job from.

  2. Review the job description, Execution environment, Metadata, and Files, and then click Create custom job.

    The job opens to the Assemble tab.

  3. On the Assemble tab for the new job, click the job name (or the edit icon) to enter a new job name, and then click confirm.

  4. In the Environment section, review the Base environment for the job, set by the template.

  5. In the Files section, review the files added to the job by the template:

    • Click the edit icon to modify the files added by the template.

    • Click the delete icon to remove files added by the template.

  6. If you need to add new files, use the options in this section to create or upload the files required to assemble a custom job:

    Option Description
    Upload Upload existing custom job files (run.sh, metadata.yaml, etc.) as Local Files or a Local Folder.
    Create Create a new file, empty or containing a template, and save it to the custom job:
    • Create run.sh: Creates a basic, editable example of an entry point file.
    • Create metadata.yaml: Creates a basic, editable example of a runtime parameters file.
    • Create README.md: Creates a basic, editable README file.
    • Create job.py: Creates a basic, editable Python job file to print runtime parameters and deployments.
    • Create example job: Combines all template files to create a basic, editable custom job. You can quickly configure the runtime parameters and run this example job.
    • Create blank file: Creates an empty file. Click the edit icon next to Untitled to provide a file name and extension, and then add your custom contents. In the next step, you can select a file created this way, with a custom name and content, as the entry point. After you configure the new file, click Save.

    File replacement

    If you add a new file with the same name as an existing file, when you click Save, the old file is replaced in the Files section.

  7. In the Settings section, review the Entry point shell (.sh) file for the job, added by the template (usually run.sh). The entry point file allows you to orchestrate multiple job files.

  8. In the Resources section, review the default resource settings for the job. To modify the settings, next to the section header, click Edit and configure the following:

    Availability information

    Custom job resource bundles are off by default. Contact your DataRobot representative or administrator for information on enabling this feature.

    Feature flag: Enable Resource Bundles

    Setting Description
    Resource bundle Preview feature. Configure the resources the custom job uses to run.
    Network access Configure the egress traffic of the custom job. Under Network access, select one of the following:
    • Public: The default setting. The custom job can access any fully qualified domain name (FQDN) in a public network to leverage third-party services.
    • None: The custom job is isolated from the public network and cannot access third-party services.
    Default network access

    For the Managed AI Platform, the Network access setting is set to Public by default and the setting is configurable. For the Self-Managed AI Platform, the Network access setting is set to None by default and the setting is restricted; however, an administrator can change this behavior during DataRobot platform configuration. Contact your DataRobot representative or administrator for more information.

  9. If the template included a metadata.yaml file, define the Runtime parameters by clicking the edit icon for each key-value row you want to configure.

  10. Configure additional Key values for Tags, Metrics, Training parameters, and Artifacts.

After you create a retraining job, you can add it to a deployment as a retraining policy.

Define runtime parameters

By including runtime parameters in a metadata.yaml file, you can supply different values to the scripts and tasks a custom job uses at runtime, making the job easier to reuse. A template for this file is available from the Files > Create dropdown.

To define runtime parameters, you can add the following runtimeParameterDefinitions in metadata.yaml:

Key Description
fieldName Define the name of the runtime parameter.
type Define the data type the runtime parameter contains: string, boolean, numeric, credential, or deployment.
defaultValue (Optional) Set the default value for the runtime parameter (the credential type doesn't support default values). If you define a runtime parameter without specifying a defaultValue, the default value is None.
minValue (Optional) For numeric runtime parameters, set the minimum numeric value allowed in the runtime parameter.
maxValue (Optional) For numeric runtime parameters, set the maximum numeric value allowed in the runtime parameter.
credentialType (Optional) For credential runtime parameters, set the type of credentials the parameter must contain.
allowEmpty (Optional) Set the empty field policy for the runtime parameter.
  • True: (Default) Allows an empty runtime parameter.
  • False: Enforces providing a value for the runtime parameter before deployment.
description (Optional) Provide a description of the purpose or contents of the runtime parameter.
Example: metadata.yaml
name: runtime-parameter-example

runtimeParameterDefinitions:
- fieldName: my_first_runtime_parameter
  type: string
  description: My first runtime parameter.

- fieldName: runtime_parameter_with_default_value
  type: string
  defaultValue: Default
  description: A string-type runtime parameter with a default value.

- fieldName: runtime_parameter_boolean
  type: boolean
  defaultValue: true
  description: A boolean-type runtime parameter with a default value of true.

- fieldName: runtime_parameter_numeric
  type: numeric
  defaultValue: 0
  minValue: -100
  maxValue: 100
  description: A numeric-type runtime parameter with a default value of 0, a minimum value of -100, and a maximum value of 100.

- fieldName: runtime_parameter_for_credentials
  type: credential
  allowEmpty: false
  description: A runtime parameter containing a dictionary of credentials.

The credential runtime parameter type supports any credentialType value available in the DataRobot REST API. The credential information included depends on the credentialType, as shown in the examples below:

Note

For more information on the supported credential types, see the API reference documentation for credentials.

Credential Type Example
basic
basic:
  credentialType: basic
  description: string
  name: string
  password: string
  user: string
        
azure
azure:
  credentialType: azure
  description: string
  name: string
  azureConnectionString: string
        
gcp
gcp:
  credentialType: gcp
  description: string
  name: string
  gcpKey: string
        
s3
s3:
  credentialType: s3
  description: string
  name: string
  awsAccessKeyId: string
  awsSecretAccessKey: string
  awsSessionToken: string
        
api_token
api_token:
  credentialType: api_token
  apiToken: string
  name: string
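
As a rough sketch of how a job script might consume one of these credentials at runtime, the example below reads a hypothetical s3-type credential parameter named AWS_CREDENTIAL through the datarobot-drum package's RuntimeParameters helper. The parameter name, and the assumption that the value arrives as a dictionary with the fields shown in the s3 example above, are both illustrative:

# Assumes the datarobot-drum package is installed in the job's environment.
from datarobot_drum import RuntimeParameters

# Hypothetical credential-type parameter name; it must match metadata.yaml.
credential = RuntimeParameters.get("AWS_CREDENTIAL")

# For an s3-type credential, the value is expected to include the fields
# shown in the s3 example above.
access_key_id = credential["awsAccessKeyId"]
secret_access_key = credential["awsSecretAccessKey"]
session_token = credential.get("awsSessionToken")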