Challengers tab

Availability information

The Challengers tab is a feature exclusive to DataRobot MLOps users. Contact your DataRobot representative for information on enabling it.

During model development, many models are often compared to one another until one is chosen to be deployed into a production environment. The Challengers tab provides a way to continue model comparison post-deployment. You can submit challenger models that shadow a deployed model and replay predictions made against the deployed model. This allows you to compare the predictions made by the challenger models to the currently deployed model (the "champion") to determine if there is a superior DataRobot model that would be a better fit.

Enable challenger models

To enable Challenger models for a deployment, you must enable the Challengers tab and prediction row storage. To do so, adjust the deployment's data drift settings either when creating a deployment or on the Settings > Data tab. If you enable Challenger models, prediction row storage will automatically be enabled for the deployment as well and cannot be turned off, as it is required for challengers.
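
The same settings can also be changed programmatically. Below is a minimal sketch using the public REST API; the endpoint path and the challengerModels and predictionsDataCollection payload keys are assumptions based on the workflow described above, so verify them against the API reference for your DataRobot version.

```python
# Minimal sketch: enabling challengers and prediction row storage via the
# public REST API. The endpoint path and payload keys are assumptions based
# on the workflow described above; confirm them against the API reference
# for your DataRobot version.
import requests

API_BASE = "https://app.datarobot.com/api/v2"   # adjust for your installation
API_TOKEN = "YOUR_API_TOKEN"                    # placeholder
DEPLOYMENT_ID = "YOUR_DEPLOYMENT_ID"            # placeholder

# Challengers require prediction row storage, so both settings are enabled together.
payload = {
    "challengerModels": {"enabled": True},
    "predictionsDataCollection": {"enabled": True},
}

resp = requests.patch(
    f"{API_BASE}/deployments/{DEPLOYMENT_ID}/settings/",
    headers={"Authorization": f"Bearer {API_TOKEN}"},
    json=payload,
)
resp.raise_for_status()
```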

Select a challenger model

Before adding a challenger model to a deployment, you must first build and select the model to be added as a challenger. Complete the modeling process and choose a model from the Leaderboard, or deploy a custom model as a model package. When selecting a challenger model, consider the following:

  • It must have the same target type as the champion model.
  • It does not need to be trained on the same feature list as the champion model, but it must share some features. However, to successfully replay predictions, prediction requests must include the union of all features required by the champion and the challengers.
  • It does not have to be built from the same project as the champion model.

When you have selected a model to serve as a challenger, navigate from the Leaderboard to Predict > Deploy and select Add model package to registry. This creates a model package for the selected model in the Model Registry so that you can add the model to a deployment as a challenger.
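
If you script this step, the registration can also be done through the public REST API. The sketch below assumes a modelPackages/fromLearningModel/ endpoint that accepts a Leaderboard model ID; treat the path, payload key, and response shape as assumptions and confirm them in the API reference for your DataRobot version.

```python
# Minimal sketch: registering a Leaderboard model as a model package so it
# can later be attached to a deployment as a challenger. Endpoint path,
# payload key, and response shape are assumptions; verify them against the
# API reference for your DataRobot version.
import requests

API_BASE = "https://app.datarobot.com/api/v2"
API_TOKEN = "YOUR_API_TOKEN"        # placeholder
MODEL_ID = "LEADERBOARD_MODEL_ID"   # placeholder

resp = requests.post(
    f"{API_BASE}/modelPackages/fromLearningModel/",
    headers={"Authorization": f"Bearer {API_TOKEN}"},
    json={"modelId": MODEL_ID},
)
resp.raise_for_status()

# Assumes the response body contains the new model package's id.
model_package_id = resp.json()["id"]
print(f"Created model package: {model_package_id}")
```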

Add challengers to a deployment

To add a challenger model to a deployment, navigate to the Challengers tab and select Add challenger model. You can add up to 5 challengers to each deployment.

Note

The selection list contains only model packages where the target type and name are the same as the champion model.

The modal prompts you to select a model package from the registry to serve as a challenger model. Choose the model to add and click Select model package.

DataRobot verifies that the model shares features and a target type with the champion model. Once verified, click Add Challenger. The model is now added to the deployment as a challenger.
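
Programmatically, the same step would be a request against the deployment's challengers collection. In the sketch below, the endpoint path, payload keys, and the predictionEnvironmentId field are assumptions based on the workflow above; confirm them in the API reference for your DataRobot version.

```python
# Minimal sketch: adding a registered model package to a deployment as a
# challenger. Endpoint path and payload keys are assumptions; confirm them
# in the API reference for your DataRobot version.
import requests

API_BASE = "https://app.datarobot.com/api/v2"
API_TOKEN = "YOUR_API_TOKEN"                          # placeholder
DEPLOYMENT_ID = "YOUR_DEPLOYMENT_ID"                  # placeholder
MODEL_PACKAGE_ID = "YOUR_MODEL_PACKAGE_ID"            # placeholder
PREDICTION_ENVIRONMENT_ID = "YOUR_PRED_ENV_ID"        # placeholder; may be optional

resp = requests.post(
    f"{API_BASE}/deployments/{DEPLOYMENT_ID}/challengers/",
    headers={"Authorization": f"Bearer {API_TOKEN}"},
    json={
        "name": "reduced feature list",    # display name shown on the Challengers tab
        "modelPackageId": MODEL_PACKAGE_ID,
        "predictionEnvironmentId": PREDICTION_ENVIRONMENT_ID,
    },
)
resp.raise_for_status()
```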

Replay predictions

After adding a challenger model, you can replay stored predictions made with the champion model for all challengers. This allows you to compare performance metrics such as predicted values, accuracy, and data errors across each model.

To replay predictions, select Update challenger predictions.

The champion model computes and stores up to 100,000 prediction rows per hour. For each hour within the time range specified by the date slider, the challengers replay the first 10,000 rows of the prediction requests made during that hour. Note that this limit does not apply to time series deployments, where challengers use all prediction data to compute comparison statistics.

After predictions are made, click Refresh on the date slider to view an updated display of performance metrics for the challenger models.
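
If you want to trigger a replay outside the UI, a request along the following lines may work. The replay endpoint path is a hypothetical illustration of the action described above, not a confirmed API, so check the API reference for your DataRobot version before relying on it.

```python
# Hypothetical sketch: triggering a challenger prediction replay via the
# REST API. The endpoint path below is an assumption; verify it against the
# API reference for your DataRobot version.
import requests

API_BASE = "https://app.datarobot.com/api/v2"
API_TOKEN = "YOUR_API_TOKEN"          # placeholder
DEPLOYMENT_ID = "YOUR_DEPLOYMENT_ID"  # placeholder

resp = requests.post(
    f"{API_BASE}/deployments/{DEPLOYMENT_ID}/challengers/replay/",
    headers={"Authorization": f"Bearer {API_TOKEN}"},
)
resp.raise_for_status()
```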

Scheduled replay of predictions

You can replay predictions with challengers on a periodic schedule instead of doing so manually. Navigate to a deployment's Settings > Challengers tab. Turn on the toggle to automatically replay challengers. Scheduled replay can only be configured by the Owner of a deployment.

Configure the preferred cadence and time of day for replaying predictions.

Once enabled, the replay triggers at the configured time for all challengers. Note that if a deployment already has prediction requests made in the past when you add a challenger, the scheduled job scores the newly added challenger models on its next run cycle.
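
A scheduled replay configuration could be scripted in a similar way. The sketch below assumes a challengerReplaySettings endpoint that accepts a cron-style schedule object; both the path and the payload shape are assumptions, so confirm them in the API reference for your DataRobot version.

```python
# Hypothetical sketch: enabling scheduled challenger replay via the REST API.
# The endpoint path and schedule fields are assumptions; confirm them against
# the API reference for your DataRobot version.
import requests

API_BASE = "https://app.datarobot.com/api/v2"
API_TOKEN = "YOUR_API_TOKEN"          # placeholder
DEPLOYMENT_ID = "YOUR_DEPLOYMENT_ID"  # placeholder

payload = {
    "enabled": True,
    # Cron-style schedule: replay challengers every day at 01:00.
    "schedule": {
        "minute": [0],
        "hour": [1],
        "dayOfMonth": ["*"],
        "month": ["*"],
        "dayOfWeek": ["*"],
    },
}

resp = requests.patch(
    f"{API_BASE}/deployments/{DEPLOYMENT_ID}/challengerReplaySettings/",
    headers={"Authorization": f"Bearer {API_TOKEN}"},
    json=payload,
)
resp.raise_for_status()
```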

Challenger models overview

The Challengers tab displays information about the champion model and each challenger.

  • Display Name: The display name for each model. Use the pencil icon to edit the display name. This field is useful for describing the purpose or strategy of each challenger (e.g., "reference model," "former champion," "reduced feature list").
  • Challenger models: The list of challenger models. Each model is associated with a color; the colors allow you to compare the models using visualization tools.
  • Model data: The metadata for each model, including the project name, model name, and the execution environment type.
  • Training Data: The filename of the data used to train the model.
  • Actions: The actions available for each model:
      • Replace: Promotes a challenger to the champion (the currently deployed model) and demotes the current champion to a challenger model.
      • Remove: Removes the model from the deployment as a challenger. Only challengers can be deleted; a champion must be demoted before it can be deleted.

Challenger performance metrics

After prediction data is replayed for challenger models, you can examine the charts below, which capture the performance metrics recorded for each model.

Each model is listed with its corresponding color. Uncheck a model's box to stop displaying the model's performance data on the charts.

Predictions chart

The Predictions chart records the average predicted value of the target for each model over time. Hover over a point to compare the average value for each model at the specific point in time.

For binary classification projects, use the Class dropdown to select the class for which you want to analyze the average predicted values. The chart also includes a toggle that allows you to switch between continuous and binary modes. Continuous mode shows the positive class predictions as probabilities between 0 and 1, without taking the prediction threshold into account. Binary mode takes the prediction threshold into account and shows, of all predictions made, the percentage for each possible class.
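
As a simple illustration of the difference between the two modes, consider a handful of stored positive-class probabilities (the values and threshold below are made up):

```python
# Illustration of the two display modes for a binary classification deployment.
# Continuous mode averages the raw positive-class probabilities; binary mode
# applies the prediction threshold first and reports the share of each class.
predicted_probabilities = [0.91, 0.42, 0.67, 0.05, 0.73]  # made-up positive-class scores
threshold = 0.5                                           # made-up prediction threshold

# Continuous mode: average predicted probability, threshold ignored.
continuous_value = sum(predicted_probabilities) / len(predicted_probabilities)

# Binary mode: apply the threshold, then report the percentage of positive predictions.
positive_share = sum(p >= threshold for p in predicted_probabilities) / len(predicted_probabilities)

print(f"Continuous mode average: {continuous_value:.2f}")   # 0.56
print(f"Binary mode positive class: {positive_share:.0%}")  # 60%
```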

Accuracy chart

The Accuracy chart records the change in a selected accuracy metric value (LogLoss in this example) over time. These metrics are identical to those used for the evaluation of the model before deployment. Use the dropdown to change the accuracy metric. You can select from any of the supported metrics for the deployment's modeling type.

Data Errors chart

The Data Errors chart records the data error rate for each model over time. Data error rate measures the percentage of requests that result in a 4xx error (problems with the prediction request submission).
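
As an illustration, the rate is simply the share of requests whose responses fall in the 4xx range (the status codes below are made-up values):

```python
# Illustration of how a data error rate is computed: the share of prediction
# requests that returned a 4xx status code.
status_codes = [200, 200, 422, 200, 404, 200, 200, 400, 200, 200]  # made-up responses

data_errors = sum(1 for code in status_codes if 400 <= code < 500)
data_error_rate = data_errors / len(status_codes)

print(f"Data error rate: {data_error_rate:.0%}")  # 30%
```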

Challengers for external deployments

External deployments with remote prediction environments can also use the Challengers tab. Remote models can serve as the champion model, and you can compare them to DataRobot and custom models serving as challengers.

The workflow for adding challenger models is largely the same; however, there are differences unique to external deployments, outlined below.

Add challenger models to external deployments

To enable challenger support, access an external deployment (one created with an external model package). In the Settings tab, under the Data Drift header, enable challenger models and prediction row storage.

The Challengers tab is now accessible. To add challenger models to the deployment, navigate to the tab and select Add challenger model.

Select a model package for the challenger you want to add (custom and DataRobot models only). Additionally, you must indicate a prediction environment used by the model package; this determines where the model runs predictions. DataRobot and custom models serving as challengers can only use a DataRobot prediction environment (unlike the champion model, which is deployed to an external prediction environment). When you have chosen the desired prediction environment, click Select.

The tab updates to display the model package you want to add and verifies that the features used in the model package match those of the deployed model. Select Add challenger.

The model package is now serving as a challenger model for the remote deployment.

Manage challengers for external deployments

You can manage challenger models for remote deployments with various actions:

  • To edit the prediction environment used by a challenger, select the pencil icon and choose a new prediction environment from the dropdown.

  • To replace the deployed model with a challenger, the challenger must have a compatible prediction environment. Once replaced, the former champion does not become a challenger because remote models are ineligible to serve as challengers.

Note

When replacing the champion model, the challenger's model package type determines the subsequent behavior. When a custom model becomes the champion, the model remains remote and maintains the "External" prediction type. If you replace the champion with a DataRobot model, the prediction environment also remains external; to continue making predictions, you must download the .mlpkg file for the new champion and deploy it to the external prediction environment used by the remote model.


Updated November 16, 2021