Set up Automated Retraining policies

To maintain model performance after deployment without extensive manual work, DataRobot provides an automatic retraining capability for deployments. Upon providing a retraining dataset registered in the AI Catalog, you can define up to five retraining policies on each deployment, each consisting of a trigger, a modeling strategy, modeling settings, and a replacement action. When triggered, retraining will produce a new model based on these settings and notify you to consider promoting it.

Important

Before you can configure an Automated Retraining policy, you must configure the deployment's Retraining Settings.

Create a retraining policy

To create and define a retraining policy:

  1. Click Deployments and select a deployment from the inventory.

  2. On the Retraining > Summary tab, click + Add Retraining Policy.

    If you haven't set up retraining, click Configure Retraining and configure the Retraining Settings.

  3. Enter a Policy name and, optionally, a Policy description.

  4. Configure the following retraining policy settings:

    • Retraining trigger: Select the time or deployment status event DataRobot uses to determine when to run retraining.

    • Model selection: Configure the methods DataRobot should use to build the new model on the updated data.

    • Model action: Select the replacement strategy DataRobot should use for the model trained during a successful retraining policy run.

    • Modeling strategy: Configure how DataRobot should set up the new Autopilot project.

  5. Click Save policy.
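The parts of a policy described in the steps above can be pictured as a small data model. The following is only an illustrative sketch in plain Python: `RetrainingPolicy`, `Deployment`, `add_policy`, and the string values in the comments are hypothetical names chosen for this example, not part of any DataRobot client library. The limit of five policies per deployment and the Retraining Settings prerequisite are taken from the text above.

```python
from dataclasses import dataclass, field
from typing import List

MAX_POLICIES_PER_DEPLOYMENT = 5  # limit stated in this documentation


@dataclass
class RetrainingPolicy:
    """Hypothetical container for the four parts of a retraining policy."""
    name: str
    trigger: str            # e.g. "schedule", "drift_status", "accuracy_status"
    model_selection: str    # e.g. "same_blueprint", "autopilot"
    model_action: str       # e.g. "add_challenger", "replace_model", "save_model"
    modeling_strategy: str  # e.g. "champion_settings", "custom_settings"
    description: str = ""   # optional, like the Policy description field


@dataclass
class Deployment:
    """Hypothetical stand-in for a deployment that holds retraining policies."""
    retraining_configured: bool
    policies: List[RetrainingPolicy] = field(default_factory=list)

    def add_policy(self, policy: RetrainingPolicy) -> None:
        # Retraining Settings must be configured before any policy is added.
        if not self.retraining_configured:
            raise ValueError("Configure the Retraining Settings first.")
        # A deployment holds at most five retraining policies.
        if len(self.policies) >= MAX_POLICIES_PER_DEPLOYMENT:
            raise ValueError("A deployment can have at most five policies.")
        self.policies.append(policy)
```

In this sketch, adding a sixth policy or adding a policy to a deployment without configured Retraining Settings raises a `ValueError`.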

Retraining trigger

Retraining policies can be triggered manually or in response to three types of conditions:

  • Automatic schedule: Pick a time for the retraining policy to trigger automatically. Choose from increments ranging from every three months to every day. Note that DataRobot uses your local time zone.

  • Drift status: Initiates retraining when the deployment's data drift status declines to the level(s) you select.

  • Accuracy status: Triggers when the deployment's accuracy status changes from a better status to the levels you select (green to yellow, yellow to red, etc.).

Note

Data drift and accuracy triggers are based on the definitions configured on the Data Drift > Settings and Accuracy > Settings tabs.
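The status-based triggers above fire when a status moves from a better level to one of the levels you selected. A minimal sketch of that rule follows. It assumes only the three levels mentioned in the example (green, yellow, red) and a simple severity ordering; `SEVERITY` and `should_trigger` are hypothetical names for illustration, and the platform's actual evaluation may differ.

```python
# Assumed ordering of status levels, from best to worst.
SEVERITY = {"green": 0, "yellow": 1, "red": 2}


def should_trigger(previous: str, current: str, selected_levels: set) -> bool:
    """Return True when the status declined into one of the selected levels."""
    declined = SEVERITY[current] > SEVERITY[previous]
    return declined and current in selected_levels
```

Under these assumptions, a change from green to yellow triggers a policy that selected yellow, while an unchanged status or an improvement (red to yellow) does not.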

Once initiated, a retraining policy cannot be triggered again until it completes. For example, if a retraining policy is set to run every hour but takes more than an hour to complete, it will complete the first run rather than start over or queue with the second scheduled trigger. Only one trigger condition can be chosen for each retraining policy.
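The hourly example above can be simulated to see which scheduled triggers actually start a run. This is a simplified sketch, not the platform's scheduler: `simulate_runs` is a hypothetical helper, and it assumes every run takes the same fixed duration and that triggers arriving during a run are dropped rather than queued, as the paragraph above describes.

```python
from datetime import datetime, timedelta
from typing import List


def simulate_runs(
    first_trigger: datetime,
    interval: timedelta,
    run_duration: timedelta,
    n_triggers: int,
) -> List[datetime]:
    """Return the scheduled trigger times that actually start a run."""
    started = []
    busy_until = None
    for i in range(n_triggers):
        trigger_time = first_trigger + i * interval
        if busy_until is not None and trigger_time < busy_until:
            # The previous run is still in progress: this trigger is
            # ignored, not queued.
            continue
        started.append(trigger_time)
        busy_until = trigger_time + run_duration
    return started
```

With four hourly triggers starting at 00:00 and a 90-minute run, only the 00:00 and 02:00 triggers start a run; the 01:00 and 03:00 triggers fall inside a run that is still in progress.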

Model selection

Choose a model selection method for the retraining policy. The method controls how DataRobot builds the new model on the updated data.

  • Use same blueprint as champion at time of retraining: Fits the same blueprint as the champion model at the time of triggering on the new data snapshot. Select one of the following options:

    • Use current hyperparameters: Use the same hyperparameters and blueprint as the champion model. Uses the champion's hyperparameter search and strategy for each task in the blueprint. Note that if you select this option, the champion model's feature list is used for retraining. The Informative Features list cannot be used.

    • Automatically tune hyperparameters: Use the same blueprint but optimize the hyperparameters for retraining.

  • Use best Autopilot model (recommended): Run Autopilot on the new data snapshot and use the resulting recommended model. Choose from DataRobot's three modeling modes: Quick, Autopilot, and Comprehensive.

If you select this option, you can also toggle additional Autopilot options, such as Include only blueprints with Scoring Code support.

Model action

The model action determines what happens to the model produced by a successful retraining policy run. In all scenarios, deployment owners are notified of the new model's creation and the new model is added as a model package to the Model Registry. Apply one of three actions for each policy:

  • Add new model as a challenger model: This action, which is the default, adds the new model as a challenger if one of the deployment's five challenger model slots is free. It replaces any model that was previously added by this policy. If no slots are available and no challenger was previously added by this policy, the model is only saved to the Model Registry, and the retraining policy run fails because the model could not be added as a challenger.

  • Initiate model replacement with new model: Suitable for high-frequency (e.g., daily) replacement scenarios, this option automatically requests a model replacement as soon as the new model is created. This replacement is subject to defined approval policies and their applicability to the given deployment, based on its owners and importance level. Depending on that approval policy, reviewers may need to approve the replacement manually before it occurs.

  • Save model: In this case, no action is taken with the model other than adding it to the Model Registry.
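The slot logic for the default "add as challenger" action can be summarized in a few lines. The sketch below is illustrative only: `add_as_challenger` and its arguments are hypothetical, and the challenger slots are modeled as a plain dictionary, not as anything from a DataRobot API. The five-slot limit and the outcomes follow the description above.

```python
from typing import Dict, List

MAX_CHALLENGER_SLOTS = 5  # slot count stated in this documentation


def add_as_challenger(
    challengers: Dict[str, str],
    registry: List[str],
    policy_id: str,
    new_model_id: str,
) -> bool:
    """Apply the "add as challenger" action; return True if the run succeeds.

    `challengers` maps whatever added each challenger (a policy ID, or any
    other key for manually added challengers) to a model ID.
    """
    # The new model is always added to the Model Registry.
    registry.append(new_model_id)

    if policy_id in challengers:
        # Replace the challenger this policy added on an earlier run.
        challengers[policy_id] = new_model_id
        return True
    if len(challengers) < MAX_CHALLENGER_SLOTS:
        # A free slot exists: add the new model as a challenger.
        challengers[policy_id] = new_model_id
        return True
    # No free slot and nothing to replace: registry only, and the run fails.
    return False
```

Note that the model reaches the registry in every branch, which mirrors the statement that the new model is added as a model package in all scenarios.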

Modeling strategy

The modeling strategy for retraining defines how DataRobot should set up the new Autopilot project. Define the features, optimization metric, partitioning strategies, sampling strategies, weights, and other advanced settings that instruct DataRobot on how to build models for a given problem.

You can either reuse the same features as the champion model uses (when the trigger initiates) or allow DataRobot to identify the informative features from the new data.

By default, DataRobot reuses the same settings as the champion model (at the time of the trigger initiating). Alternatively, you can define new partitioning settings, choosing from a subset of options available in the project Start screen.

Manage retraining policies

After creating a retraining policy, you can start it manually, cancel it, or update it, as explained in the table below.

|   | Element | Definition |
|---|---------|------------|
| 1 | Retraining policy row | Click on a retraining policy row to expand it. Once expanded, view or edit the retraining settings. |
| 2 | Run | Click the run button to start a policy manually. Alternatively, edit the policy by clicking the policy row and scheduling a run using the retraining trigger. |
| 3 | Remove | Click the remove button to delete a policy. Click Remove in the confirmation window. |
| 4 | Cancel | Click the cancel button to cancel a policy that is in progress or scheduled to run. You can't cancel a policy if it has finished successfully, reached the "Creating challenger" or "Replacing model" step, failed, or has already been canceled. |

Retraining history

You can view all previous runs of a retraining policy, successful or failed. Each run includes a start time, end time, duration, and, if the run succeeded, links to the resulting project and model package. While only the DataRobot-recommended model for each project is added automatically to the deployment, you may want to explore the project's Leaderboard to find or build alternative models.

Note

Policies cannot be deleted or interrupted while they are running. If the retraining user and organization have sufficient workers, multiple policies on the same deployment can be running at once.

Retraining strategies

The Challengers and Retraining tab allows for simple performance comparison, meaning retraining strategies can be evaluated empirically and customized for different use cases. You may benefit from initial experimentation, using various time frames for the "same-blueprint" and Autopilot strategies. For example, consider running "same-blueprint" retraining strategies using both a nightly and a weekly pattern and comparing the results.

Typical strategies for implementing automatic retraining policies in a deployment include:

  • High-frequency automatic schedule: Frequently (e.g., daily) retrain the currently deployed blueprint on the newest data to stabilize the deployed model selection.
  • Low-frequency automatic schedule: Periodically (e.g., weekly, monthly) run Autopilot to explore alternative modeling techniques and potentially optimize performance. You can restrict this process to only Scoring Code-supported models if that is how you deploy. See the Include only blueprints with Scoring Code support advanced option for more information.
  • Drift status trigger: Monitor data drift and trigger Autopilot to prepare an alternative model when the champion model has shown data drift due to changing situations.
  • Accuracy status trigger: Monitor accuracy drift and trigger Autopilot to search for a better-performing model after the champion model has shown accuracy decay. This strategy is most effective for use cases with fast access to actuals.

Retraining availability

Only binary, multiclass, and regression target types support retraining. The Challengers and Retraining tab doesn't appear when a deployment's champion has a multilabel target type.

Unsupported models and projects

Retraining is not supported for the following DataRobot models and project types. In those cases, the Challengers and Retraining tab doesn't appear when a deployment's champion uses any of the listed functionality:

Partially supported models

The following model types partially support retraining. For each partially supported model, only the supported (✔) options are available in retraining policies on the Challengers and Retraining tab:

Note

Only some retraining policy options are model-dependent. If the support matrix below doesn't include a model type, all options of a retraining policy are available for configuration.

Model type Same blueprint as champion Champion model's feature list Project options from champion model Custom project options
Custom inference
External (agent)
Blender
Time series

Retraining for time series

Time series deployments support retraining, but there are limitations when configuring policies due to the time series feature derivation process. This process generates features such as lags and moving averages and creates a new modeling dataset.

Time series model selection

Same blueprint as champion: The retraining policy uses the same engineered features as the champion model's blueprint. The search for newly derived features does not occur because it could potentially generate features that are not captured in the champion's blueprint.

Autopilot: When using Autopilot instead of the same blueprint, the time series feature derivation process does occur. However, Comprehensive Autopilot mode is not supported. Additionally, time series Autopilot does not support the options to only include Scoring Code blueprints and models with SHAP value support.

Time series modeling strategy

Same blueprint as champion: When creating a "same-blueprint" retraining policy for a time series deployment, you must use the champion model's feature list and advanced modeling options. The only option that you can override is the calendar used because, for example, a new holiday or event may be included in an updated calendar that you want to account for during retraining.

Autopilot: When creating an Autopilot retraining policy for a time series deployment, you must use the informative features modeling strategy. This strategy allows Autopilot to derive a new set of feature lists based on the informative features generated by new or different data. You cannot use the model's original feature list because time series Autopilot uses a feature extraction and reduction process by default. You can, however, override additional modeling options from the champion's project:

| Option | Description |
|--------|-------------|
| Treat as exponential trend | Apply a log-transformation to the target feature. |
| Exponentially weighted moving average (EWMA) | Set a smoothing factor for EWMA. |
| Apply differencing | Set DataRobot to apply differencing to make the target stationary prior to modeling. |
| Add calendar | Upload, add from the catalog, or generate an event file that specifies dates or events that require additional attention. |
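The time series restrictions described in the two sections above can be collected into a single validation routine. This is a hedged sketch in plain Python: `validate_time_series_policy`, its parameters, and the string identifiers are hypothetical names invented for this example, and the rules encode only what this page states.

```python
from typing import Iterable, List, Optional

# Options an Autopilot policy may override, per the table above
# (identifiers are illustrative, not platform names).
AUTOPILOT_OVERRIDES = {"exponential_trend", "ewma", "differencing", "calendar"}


def validate_time_series_policy(
    model_selection: str,
    feature_list: str,
    autopilot_mode: Optional[str] = None,
    overrides: Iterable[str] = (),
) -> List[str]:
    """Return a list of problems; an empty list means the policy is valid."""
    errors = []
    extra = set(overrides)
    if model_selection == "same_blueprint":
        if feature_list != "champion":
            errors.append("Same-blueprint policies must use the champion's feature list.")
        # Only the calendar can be overridden for same-blueprint policies.
        extra -= {"calendar"}
    elif model_selection == "autopilot":
        if feature_list != "informative_features":
            errors.append("Autopilot policies must use the informative features strategy.")
        if autopilot_mode == "comprehensive":
            errors.append("Comprehensive Autopilot mode is not supported.")
        extra -= AUTOPILOT_OVERRIDES
    else:
        errors.append(f"Unknown model selection: {model_selection}")
    if extra:
        errors.append(f"Options that cannot be overridden: {sorted(extra)}")
    return errors
```

For example, a same-blueprint policy that overrides only the calendar passes, while one that also tries to override differencing is reported as invalid.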

Time-aware retraining

For time-aware retraining, if you choose to reuse options from the champion model or override the champion model's project options, consider the following:

  • If the champion's project used the holdout start date and end date, the retraining project does not use these settings but instead uses holdout duration, the difference between these two dates.
  • If the champion project used the holdout duration with either the holdout start date or end date, the holdout start/end date is dropped, and holdout duration is used in the retraining project. A new holdout start date is computed (the end of the retraining dataset minus the holdout duration).
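The holdout arithmetic in the two cases above can be worked through with standard date math. The helper names `resolve_holdout_duration` and `new_holdout_start` are hypothetical, and the dates in the usage note are made-up values for illustration.

```python
from datetime import datetime, timedelta
from typing import Optional


def resolve_holdout_duration(
    start: Optional[datetime] = None,
    end: Optional[datetime] = None,
    duration: Optional[timedelta] = None,
) -> timedelta:
    """Return the holdout duration carried over to the retraining project."""
    if duration is not None:
        # An explicit duration wins; any start or end date is dropped.
        return duration
    if start is not None and end is not None:
        # Only dates were given: the duration is their difference.
        return end - start
    raise ValueError("Need either a duration or both start and end dates.")


def new_holdout_start(retraining_data_end: datetime, duration: timedelta) -> datetime:
    """New holdout start: end of the retraining dataset minus the duration."""
    return retraining_data_end - duration
```

For instance, a champion holdout from 2024-01-01 to 2024-01-31 gives a duration of 30 days. If the retraining dataset ends on 2024-06-30, the new holdout start is 2024-05-31.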

Your customizations to backtests are not retained; however, the number of backtests is retained. At retraining time, the training start and end dates will likely differ from the champion's start and end dates. The data used for retraining might have shifted so that it no longer contains all of the data from a specific backtest on the champion model.


Updated September 6, 2024