
Model Leaderboard

Tile description: Opens a list of all built models and overview information for each, with access to the model's available insights.

Once you start modeling, Workbench begins to construct a performance-ranked model Leaderboard to help with quick model evaluation. The Leaderboard provides summary information, including scores, for each model built in an experiment. From the Leaderboard, you can click a model to access visualizations for further exploration. Using these tools can help you assess what to do in your next experiment.

DataRobot populates the Leaderboard as it builds, initially displaying up to 50 models. Click Load more models to load 50 more models with each click.

If you ran Quick mode, after Workbench completes the 64% sample size phase, the most accurate model is selected and trained on 100% of the data. That model is marked with the Prepared for Deployment badge.

Why isn't the prepared for deployment model at the top of the Leaderboard?

When Workbench prepares a model for deployment, it trains the model on 100% of the data. While the most accurate model was selected to be prepared, it was selected based on its score at the 64% sample size. As part of preparing that model for deployment, Workbench unlocks Holdout, so the prepared model is trained on different data than the original. Unless you change the Leaderboard to sort by Holdout, the validation score in the left bar can make it appear that the prepared model is not the most accurate.

Two elements make up the Leaderboard:

  • The Leaderboard itself, a manageable listing of all built models in the experiment.
  • A model overview page that provides summary information and access to model insights.

Model list

By default, the Leaderboard opens in an expanded, full-width view that shows all models in the experiment, with a summary of their training settings and their validation, cross-validation, and holdout scores. Badges provide quick identification and scoring information, while icons in front of the model name indicate the model type.

Click any model to show additional scoring information and to access the model insights. Click Close to return to the full-width view.

If a model has more badges than the Leaderboard can display, use the dropdown to view them all.

Model list display

The Leaderboard offers a variety of ways to filter and sort the model list, making it easier to view and focus on relevant models. In addition to using the search function, you can filter, sort, and "favorite" models.

Combine any of the filters with search filtering. For example, first search for a model type or blueprint number, and then select Filters to find only the models of that type that meet the additional criteria.

Model filtering

Filtering makes it easier to view and focus on relevant models. Click Filter to set the criteria for the models that Workbench displays on the Leaderboard. The choices available for each filter depend on the experiment and/or model type (a value appears only if at least one Leaderboard model used it) and can change as models are added to the experiment. For example:

  • Labeled models: Displays models that have been assigned the listed tag, either starred models or models recommended for deployment.
  • Feature list: Displays models that were built with the selected feature list.
  • Sample size (random or stratified partitioning): Displays models that were trained on the selected sample size.
  • Training period (date/time partitioning): Displays models that were trained on backtests defined by the selected duration mechanism.
  • Model family: Displays models that are part of the selected model family:
      • GBM (Gradient Boosting Machine), such as Light Gradient Boosting on ElasticNet Predictions or eXtreme Gradient Boosted Trees Classifier
      • GLMNET (Lasso and ElasticNet regularized generalized linear models), such as Elastic-Net Classifier or Generalized Additive2
      • RI (Rule induction), such as RuleFit Classifier
      • RF (Random Forest), such as RandomForest Classifier or Regressor
      • NN (Neural Network), such as Keras
  • Properties: Displays models that were built using GPUs.

Available fields, and the settings for each field, depend on the project and/or model type. For example, non-date/time models offer sample size filtering, while time-aware models offer training period filtering.

Note

Filters are inclusive. That is, results show models that match any of the selected filters, not all of them. Also, the options available for selection include only values for which at least one matching model is on the Leaderboard.
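To illustrate the "match any" behavior described in the note, here is a minimal sketch in plain Python; the model records and filter values are hypothetical and this is not DataRobot code:

```python
# Hypothetical Leaderboard entries and filter selections.
models = [
    {"name": "eXtreme Gradient Boosted Trees Classifier", "family": "GBM", "feature_list": "Informative Features"},
    {"name": "Elastic-Net Classifier", "family": "GLMNET", "feature_list": "Raw Features"},
    {"name": "RandomForest Classifier", "family": "RF", "feature_list": "Informative Features"},
]
selected_families = {"GBM", "GLMNET"}
selected_feature_lists = {"Raw Features"}

# Inclusive filtering: a model is shown if it matches ANY selected filter.
matches = [
    m["name"]
    for m in models
    if m["family"] in selected_families or m["feature_list"] in selected_feature_lists
]
print(matches)  # the GBM and GLMNET models match; the RF model matches neither filter
```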

Model sorting

By default, the Leaderboard sorts models based on their score on the validation partition, using the selected optimization metric. You can, however, use the Sort models by control to change the metric used to order models when evaluating them.

Note that although Workbench built the project using the most appropriate metric for your data, it computes many applicable metrics for each model. After the build completes, you can redisplay the Leaderboard based on a different metric. Doing so does not change any values within the models; it simply reorders the model listing based on performance against the alternate metric.

See the page on optimization metrics for detailed information on each.
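To make the reordering concrete, the following sketch re-sorts a small set of models by different metrics using plain Python; the model names and scores are made up, and nothing is recomputed when the sort key changes:

```python
# Hypothetical Leaderboard entries with precomputed scores for two metrics.
leaderboard = [
    {"model": "Light Gradient Boosting on ElasticNet Predictions", "LogLoss": 0.412, "AUC": 0.871},
    {"model": "Elastic-Net Classifier", "LogLoss": 0.438, "AUC": 0.883},
    {"model": "RandomForest Classifier", "LogLoss": 0.455, "AUC": 0.852},
]

# Re-sorting only changes the order of the listing; the stored scores stay the same.
by_logloss = sorted(leaderboard, key=lambda m: m["LogLoss"])        # lower is better
by_auc = sorted(leaderboard, key=lambda m: m["AUC"], reverse=True)  # higher is better

print([m["model"] for m in by_logloss])
print([m["model"] for m in by_auc])
```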

Filter by favorites

Tag or "star" one or more models on the Leaderboard, making it easier to refer back to them when navigating through the application. click to star and then use Filter to show only starred models

Model Overview

When you select a model from the Leaderboard listing, it opens to the Model Overview where you can:

  • See specific details about metric scores and settings.
  • Retrain models on new feature lists or sample sizes (see the sketch after this list). Note that you cannot change the feature list on the model prepared for deployment as it is "frozen".
  • Access model insights.
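As a rough illustration of retraining from outside the UI, the following sketch uses the DataRobot Python client; it assumes the `datarobot` package, placeholder credentials, a known project ID, and a feature list name that exists in your project, so treat all identifiers as hypothetical:

```python
import datarobot as dr

# Placeholder credentials and project ID.
dr.Client(token="YOUR_API_TOKEN", endpoint="https://app.datarobot.com/api/v2")
project = dr.Project.get("PROJECT_ID")

# Take the top Leaderboard model and an alternate feature list (name is a placeholder).
model = project.get_models()[0]
feature_list = next(
    fl for fl in project.get_featurelists() if fl.name == "REDUCED_FEATURES"
)

# Queue a retraining job on the new feature list at a 64% sample size;
# train() returns the id of the queued modeling job.
job_id = model.train(sample_pct=64, featurelist_id=feature_list.id)
print(job_id)
```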

Model build failure

If a model fails to build, you will see that in the job queue as Autopilot runs. Once the run completes, the failed model(s) are still listed on the Leaderboard, but the entry indicates the failure. Click the model to display a log of the issues that caused the failure.

Use the Delete failed model button to remove the model from the Leaderboard.

Experiment tools

In addition to the model access available from the Leaderboard, you can also:

Blend models

A blender model can increase accuracy by combining the predictions of two to eight models.

Optimizing for response time

To improve response times for blender models, DataRobot stores predictions for all models trained at the highest sample size used by Autopilot (typically 64%) and creates blenders from those results. Storing only the largest sample size (and therefore predictions from the best-performing models) limits the disk space required.

To create a blender model:

  1. From the Leaderboard, select at least two, and up to eight, models to blend.

  2. From the Actions menu, select Blend selected models. The option is not available for models that cannot be included in a blender (see Feature considerations below).

  3. The Blend models modal opens, providing a list of methods that are supported for the selected models. Select a method to create a new blender model from the selected Leaderboard models, then click to train. See the blender method reference for information on the available methods.

    When training is complete, the new blended model displays in the list on the Leaderboard.
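The same operation can be done programmatically. This is a minimal sketch using the DataRobot Python client; it assumes the `datarobot` package, placeholder credentials and project ID, and that the two selected models support blending:

```python
import datarobot as dr

# Placeholder credentials and project ID.
dr.Client(token="YOUR_API_TOKEN", endpoint="https://app.datarobot.com/api/v2")
project = dr.Project.get("PROJECT_ID")

# Blend the top two Leaderboard models with a simple average (AVG).
models = project.get_models()
blend_job = project.blend(
    [models[0].id, models[1].id], dr.enums.BLENDER_METHOD.AVERAGE
)

# Wait for the blender to train, then inspect the resulting model.
blender = blend_job.get_result_when_complete()
print(blender.model_type)
```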

Blend methods

The list below describes each blend method; methods available only for time-aware experiments are noted as such.

  • Average Blend (AVG): Combine predictions from multiple base models by taking the simple arithmetic mean of their outputs (for regression) or the average of predicted probabilities (for classification).
  • ElasticNet Blend (ENET): Learn optimal weights for combining, rather than simply averaging, predictions from multiple base models. ElasticNet's L1 and L2 regularization helps select the most valuable models while preventing overfitting.
  • Generalized Linear Model Blend (GLM): Combine predictions from multiple base models, where the base model predictions serve as input features and the GLM learns linear weights.
  • Maximum Blend (MAX): Combine predictions from multiple base models by taking, for each row, the maximum of their outputs (for regression) or predicted probabilities (for classification).
  • Mean Absolute Error (MAE): Learn optimal weights for combining base model predictions by minimizing the Mean Absolute Error loss function during blending. This differs from typical squared-error-based blenders by being more robust to outliers.
  • Mean Absolute Error 1 (MAE1): Learn optimal weights for combining base model predictions by minimizing the Mean Absolute Error loss function during blending, with L1 regularization applied. This differs from typical squared-error-based blenders by being more robust to outliers.
  • Minimum Blend (MIN): Combine predictions from multiple base models by taking, for each row, the minimum of their outputs (for regression) or predicted probabilities (for classification).
  • Median Blend (MED): Combine predictions from multiple base models by taking the median value of their outputs (for regression) or the median of predicted probabilities (for classification). MED is more robust to outliers than mean-based blending because the median is less sensitive to extreme predictions.
  • Partial Least Squares Blend (PLS): Use PLS regression to combine predictions from multiple base models by finding linear combinations of the base model predictions that maximize covariance with the target variable. Unlike simple linear blending, which treats base model predictions as independent features, PLS looks for latent components that capture the shared predictive information across models while being maximally correlated with the target.
  • Average Blend by Forecast Distance: Time-aware. Create a model that blends, for each forecast distance, the average of each model's predictions.
  • ENET Blend by Forecast Distance: Time-aware. Create a model that trains, for each forecast distance, an ElasticNet model on the predictions of the selected models.
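The simple combiners are row-wise arithmetic, and the weighted blends fit a small regularized model on top of the base predictions. The sketch below uses NumPy and scikit-learn with made-up prediction arrays to illustrate the arithmetic; it is an illustration only, not DataRobot's implementation:

```python
import numpy as np
from sklearn.linear_model import ElasticNet

# Rows = validation rows, columns = base models (hypothetical regression predictions).
preds = np.array([
    [10.2,  9.8, 11.0],
    [ 5.1,  4.7,  5.6],
    [ 7.9,  8.4,  7.5],
])
y_true = np.array([10.0, 5.0, 8.0])

avg_blend = preds.mean(axis=1)        # AVG: arithmetic mean of base predictions
med_blend = np.median(preds, axis=1)  # MED: per-row median, robust to outliers
min_blend = preds.min(axis=1)         # MIN: per-row minimum
max_blend = preds.max(axis=1)         # MAX: per-row maximum

# ENET-style blend: learn weights over base model predictions with L1/L2 regularization.
enet = ElasticNet(alpha=0.01, l1_ratio=0.5).fit(preds, y_true)
enet_blend = enet.predict(preds)

print(avg_blend, med_blend, enet_blend)
```

In practice, weighted blends such as ENET or GLM are typically fit on out-of-sample (stacked) predictions rather than in-sample ones to avoid target leakage; that concern is behind the note below about models that do not support stacked predictions.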

Note

DataRobot has special logic in place for natural language processing (NLP) and image fine-tuner models. For example, fine-tuners do not support stacked predictions. As a result, when blending a combination of stacked and non-stack-enabled models, the available blender methods are: AVG, MED, MIN, or MAX. DataRobot does not support other methods in this case because they may introduce target leakage.

Feature considerations

The following model types or circumstances prevent a model from inclusion in a blender:

Duplicate experiments

Use the link at the top of the Leaderboard to duplicate the current experiment. Duplicating creates a new experiment and can be a faster way to work with your data than re-uploading it.

When you click to duplicate, a modal opens with an option to provide a new experiment name. Then, select whether to copy only the dataset or to copy the dataset and experiment settings. If you select to include settings, DataRobot clones the target as well as any advanced settings and custom feature lists associated with the original project.

When complete, DataRobot opens to the new experiment setup page where you can begin the model building process.