External Predictions

Through the External Predictions advanced option tab, you can bring external models into the DataRobot AutoML environment, view them on the Leaderboard, and run a subset of DataRobot's evaluative insights to compare them against DataRobot models. This feature:

  • Helps you understand, based on the prediction values, how the accuracy of a model trained outside of DataRobot compares with that of DataRobot-trained models.

  • Provides DataRobot’s trust and explainability visualizations for externally trained models, supporting better model understanding, compliance, and fairness analysis.

Workflow overview

To bring external models into DataRobot, follow this workflow:

  1. Prepare the dataset.
  2. Set advanced options.
  3. Add an external model.
  4. Evaluate the external model.
  5. Enable bias testing (binary classification only).

Prepare the dataset

To set up the project, ensure the uploaded dataset has the following two columns:

  • A partition column containing the values that assign each row to a partition, using either cross-validation or train/validation/holdout (TVH). With cross-validation, the values represent the folds (for example, with five CV folds: 1, 2, 3, 4, and 5). For TVH, the values are typically T, V, and H.

    This column will later be referenced in the advanced option Partition Feature strategy. In the following example, the column is named partition_column.

  • A column of external model prediction values (the "external predictions column"). The descriptions below use the name Model1_Output as an example of the prediction values.

Note

External model prediction values must be numeric. For binary classification projects, the prediction values must fall within [0.0, 1.0]. For regression projects, any finite numeric value is valid.
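
The snippet below is a minimal sketch of this preparation step using pandas. The file names (training_data.csv, external_model_predictions.csv) and the random fold assignment are illustrative assumptions; in practice, reuse the folds the external model was actually validated on.

```python
import numpy as np
import pandas as pd

# Hypothetical inputs for illustration: the training data and a file with
# one external model prediction per training row.
df = pd.read_csv("training_data.csv")
preds = pd.read_csv("external_model_predictions.csv")

# Partition column for 5-fold cross-validation: fold labels 1-5, assigned
# at random here purely for illustration.
rng = np.random.default_rng(seed=42)
df["partition_column"] = rng.integers(1, 6, size=len(df))
# For TVH, use labels such as "T", "V", and "H" instead.

# External predictions column; for a binary classification project the
# values must be probabilities in [0.0, 1.0].
df["Model1_Output"] = preds["prediction"].to_numpy()
assert df["Model1_Output"].between(0.0, 1.0).all(), "predictions out of range"

df.to_csv("training_data_with_external_predictions.csv", index=False)
```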

Set advanced options

To prepare for modeling:

  1. Open the External Predictions tab in advanced options. Enter the external predictions column name(s) from your dataset (up to 100 columns). You are prompted to ensure that Partitioning is set.

  2. Click Set Partition Feature to open the appropriate tab. From the Partitioning tab:

    • Set Select partitioning method to Partition Feature.
    • Set Partition Feature to the column name partition_column.
    • Set Run models using to either cross validation or TVH.
    • If using TVH, specify which values in partition_column represent the training, validation, and holdout partitions.
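
For scripted workflows, the partitioning portion of these steps can be expressed with the DataRobot Python client, as sketched below. The target name ("target") is an assumption, and this sketch configures only the Partition Feature strategy; entering the external predictions column itself is done in the UI as described in step 1.

```python
import datarobot as dr

# Create a project from the prepared dataset (file name from the sketch above).
project = dr.Project.create(
    "training_data_with_external_predictions.csv",
    project_name="External predictions demo",
)

# Cross-validation folds taken from partition_column; no fold is reserved
# as holdout in this sketch.
partitioning = dr.UserCV(user_partition_col="partition_column",
                         cv_holdout_level=None)

# For TVH instead:
# partitioning = dr.UserTVH(user_partition_col="partition_column",
#                           training_level="T", validation_level="V",
#                           holdout_level="H")

project.set_target(
    target="target",                  # assumed label column name
    mode=dr.AUTOPILOT_MODE.MANUAL,
    partitioning_method=partitioning,
)
```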

Add an external model

You can add an external model on the Leaderboard either:

  • As an individual model using Manual mode.
  • As one of many models using full Autopilot, Quick, or Comprehensive mode. In this case, the external model is added at the end of the model recommendation process.

For example, to add a single external model:

  1. From the Start page, change the modeling mode to Manual. (This allows you to select your external model from the Repository.) Click Start to begin EDA2.

  2. Once EDA2 finishes, open the Data page. In the Importance column, the external model prediction values column, Model1_Output, is labeled External and the partition feature, partition_column, is labeled Partition.

  3. Open the model Repository, search for Model1_Output, and select it. Notice that in the resulting task setup fields, the feature list and sample size are not available for modification. This is because DataRobot cannot know which features from the training data, or what sample size, were used to train the external model.

  4. Click Run Task.
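
Continuing the client sketch above, the Repository search and Run Task steps might look roughly like the following. That the external model surfaces as a blueprint named after its predictions column is an assumption based on the UI behavior described here.

```python
# Find the external-model task in the Repository and run it (Manual mode).
blueprints = project.get_blueprints()
external = [bp for bp in blueprints if "Model1_Output" in bp.model_type]
if external:
    model_job_id = project.train(external[0])  # equivalent of clicking Run Task
```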

Evaluate the external model

When model building finishes, the model becomes available on the Leaderboard for comparison and further investigation. It is marked with the EXTERNAL PREDICTIONS label.

Note

The Leaderboard metric score (such as LogLoss) will be consistent with the equivalent validation, cross validation, and holdout metric scores calculated by scikit-learn.
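
As a sanity check, you can reproduce the score outside DataRobot with scikit-learn. The sketch below assumes the column names used in this guide and a 0/1 label column named target.

```python
import pandas as pd
from sklearn.metrics import log_loss

# Recompute LogLoss for one CV fold by hand; it should match the
# corresponding Leaderboard score for the external model.
df = pd.read_csv("training_data_with_external_predictions.csv")
fold_1 = df[df["partition_column"] == 1]
score = log_loss(fold_1["target"], fold_1["Model1_Output"])
print(f"LogLoss on fold 1: {score:.5f}")
```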

The following insights are supported:

| Insight | Project type |
|---------|--------------|
| Lift Chart | All |
| Residuals | Regression |
| ROC Curve | Classification |
| Profit Curve | Classification |
| Model comparison | All |
| Model compliance documentation | All; only a subset of sections is generated due to DataRobot's limited knowledge of the external model. |
| Bias and Fairness | Classification; see below. |

Bias and fairness testing

Additionally, if the dataset produces a binary classification project, you can set up Bias and Fairness options to run bias testing on the external model.

  1. Complete the fields on the Bias and Fairness > Settings page. Click Save and DataRobot retrieves the necessary data.

  2. Open the Per-Class Bias tab to help identify whether a model is biased, and if so, by how much and which classes it is biased toward or against.

  3. Open the Cross-Class Accuracy tab to view calculated evaluation metrics and ROC curve-related scores, segmented by class, for each protected feature.
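
To make the idea concrete, the sketch below computes one simple fairness measure (proportional parity of favorable-outcome rates) by hand with pandas. The protected feature name (gender) and the 0.5 decision threshold are assumptions; DataRobot's Per-Class Bias tab supports several such metrics.

```python
import pandas as pd

# Favorable-outcome rate per class of a hypothetical protected feature.
df = pd.read_csv("training_data_with_external_predictions.csv")
df["favorable"] = df["Model1_Output"] >= 0.5  # assumed decision threshold

rates = df.groupby("gender")["favorable"].mean()
# Proportional parity: each class's rate relative to the best-off class.
parity = rates / rates.max()
print(parity)  # values well below 1.0 suggest the model disfavors a class
```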


Updated May 12, 2022