Test with external datasets¶
Compute metric scores and insights on an external test dataset to help validate a model's performance and its generalization capabilities before deployment. Evaluating the model against data it hasn't seen during training ("external") provides insights into how well and consistently the model will perform in real-world scenarios. Note that this testing is not available for time series models.
Request external scores and insights¶
To compute scores and insights on a dataset:
- Upload a prediction dataset that contains the target column (PredictionDataset.contains_target_values == True).
- The dataset must have the same structure as the original project data.
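The two requirements above can be roughly checked on local files before anything is uploaded. The sketch below is not part of the DataRobot client: `check_external_test_columns` is a hypothetical helper, and it assumes that "same structure" means the same column names as the training data, which is only one possible reading.

```python
import csv
import io


def check_external_test_columns(training_csv, test_csv, target):
    """Return a list of problems found in the test file's header (empty if none).

    Hypothetical helper: compares column names only, not types or values.
    """
    # Read only the header row of each file-like object.
    training_header = next(csv.reader(training_csv))
    test_header = next(csv.reader(test_csv))

    problems = []
    if target not in test_header:
        problems.append(f"target column {target!r} is missing")
    # The target is reported separately above, so skip it here.
    missing = [c for c in training_header if c not in test_header and c != target]
    if missing:
        problems.append(f"columns missing from test set: {missing}")
    extra = [c for c in test_header if c not in training_header]
    if extra:
        problems.append(f"columns not in training data: {extra}")
    return problems


# Example with in-memory files; real code would pass open() file handles.
training = io.StringIO("age,income,churned\n34,52000,0\n")
test = io.StringIO("age,income\n41,61000\n")
print(check_external_test_columns(training, test, target="churned"))
```

An empty list only means the headers line up; the server-side `contains_target_values` flag remains the authoritative check after upload.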
import datarobot as dr
# Upload dataset
project = dr.Project.get(project_id)
dataset = project.upload_dataset('./test_set.csv')
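# confirm that the uploaded dataset includes the target column
print(dataset.contains_target_values)  # True when the target column is present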
# Request an external test to compute metric scores and insights on the dataset.
# Select a model from project.get_models(); the first one is used here only as an example.
model = project.get_models()[0]
external_test_job = model.request_external_test(dataset.id)
# Once the job is complete, the scores and insights are ready to retrieve
external_test_job.wait_for_completion()
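The upload, request, and wait steps can be wrapped in one function so that a dataset without target values fails early. This is a minimal sketch: `run_external_test` is a hypothetical name, and it relies only on the calls shown above (`upload_dataset`, `contains_target_values`, `request_external_test`, `wait_for_completion`).

```python
def run_external_test(project, model, path):
    """Upload a dataset, request an external test, and block until it finishes.

    `project` and `model` are expected to behave like the dr.Project and
    dr.Model objects used above. Returns the uploaded dataset so that its id
    can be passed to the retrieval calls.
    """
    dataset = project.upload_dataset(path)
    # External scores need actual target values to compare against.
    if not dataset.contains_target_values:
        raise ValueError(f"{path} has no target column; external scores cannot be computed")
    job = model.request_external_test(dataset.id)
    job.wait_for_completion()
    return dataset
```

A call would look like `dataset = run_external_test(project, model, './test_set.csv')`, after which `dataset.id` serves as the `dataset_id` in the retrieval examples.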
Retrieve external metric scores and insights¶
After the external test job completes, the metric scores and insights for the external test set are ready to retrieve.
Note
- Check PredictionDataset.data_quality_warnings for dataset warnings.
- Insights are not available if the dataset has fewer than 10 rows.
- The ROC curve cannot be calculated if the dataset has only one class in the target column.
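The row-count and single-class limits in the note above can also be caught locally before a job is submitted. The following is a sketch under assumptions: `external_test_limits` is a hypothetical helper, the file is a CSV with a header row that includes the target column, and the class check is only meaningful for classification targets.

```python
import csv
import io


def external_test_limits(csv_file, target, min_rows=10):
    """Return warnings for the two dataset limits described in the note above."""
    rows = list(csv.DictReader(csv_file))
    warnings = []
    # Insights are not available for datasets with fewer than 10 rows.
    if len(rows) < min_rows:
        warnings.append(f"only {len(rows)} rows: insights need at least {min_rows}")
    # A ROC curve needs more than one class in the target column.
    classes = {row[target] for row in rows}
    if len(classes) < 2:
        warnings.append("target has fewer than two classes: no ROC curve")
    return warnings


# Example with an in-memory file holding two rows of a single class.
sample = io.StringIO("amount,is_fraud\n10,0\n25,0\n")
for warning in external_test_limits(sample, target="is_fraud"):
    print(warning)
```

These checks complement, rather than replace, `PredictionDataset.data_quality_warnings`, which reports what the platform itself detected.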
Retrieve external metric scores¶
import datarobot as dr
# retrieving list of external metric scores on multiple datasets
metric_scores_list = dr.ExternalScores.list(project_id, model_id)
# retrieving external metric scores on one dataset
metric_scores = dr.ExternalScores.get(project_id, model_id, dataset_id)
Retrieve an external lift chart¶
import datarobot as dr
# retrieving list of lift charts on multiple datasets
lift_list = dr.ExternalLiftChart.list(project_id, model_id)
# retrieving one lift chart for dataset
lift = dr.ExternalLiftChart.get(project_id, model_id, dataset_id)
Retrieve an external multiclass lift chart¶
Available for multiclass classification models only.
import datarobot as dr
# retrieving list of multiclass lift charts on multiple datasets
# (assumes the class is exposed at the package top level, like dr.ExternalLiftChart)
lift_list = dr.ExternalMulticlassLiftChart.list(project_id, model_id)
# retrieving one lift chart for dataset and a target class
lift = dr.ExternalMulticlassLiftChart.get(project_id, model_id, dataset_id, target_class)
Retrieve an external ROC curve¶
Available for binary classification models only.
import datarobot as dr
# retrieving list of ROC curves on multiple datasets
roc_list = dr.ExternalRocCurve.list(project_id, model_id)
# retrieving one ROC curve for dataset
roc = dr.ExternalRocCurve.get(project_id, model_id, dataset_id)
Retrieve a multiclass confusion matrix¶
Available for multiclass classification models only.
import datarobot as dr
# retrieving list of confusion charts on multiple datasets
# (assumes the class is exposed at the package top level, like dr.ExternalLiftChart)
confusion_list = dr.ExternalConfusionChart.list(project_id, model_id)
# retrieving one confusion chart for dataset
confusion = dr.ExternalConfusionChart.get(project_id, model_id, dataset_id)
Retrieve a residuals chart¶
Available for regression models only.
import datarobot as dr
# retrieving list of residuals charts on multiple datasets
# (assumes the class is exposed at the package top level, like dr.ExternalLiftChart)
residuals_list = dr.ExternalResidualsChart.list(project_id, model_id)
# retrieving one residuals chart for dataset
residuals = dr.ExternalResidualsChart.get(project_id, model_id, dataset_id)
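The availability rules stated in the sections above can be summarized as a small lookup. This is only an illustrative sketch: the keys `"binary"`, `"multiclass"`, and `"regression"` and the function `restricted_insights_for` are hypothetical, and `ExternalScores` and `ExternalLiftChart` are left out because this page does not state which model types they are limited to.

```python
# Insight classes named on this page, keyed by the only model type each supports.
# The keys are illustrative labels, not values defined by the DataRobot client.
RESTRICTED_INSIGHTS = {
    "binary": ["ExternalRocCurve"],
    "multiclass": ["ExternalMulticlassLiftChart", "ExternalConfusionChart"],
    "regression": ["ExternalResidualsChart"],
}


def restricted_insights_for(model_type):
    """Return the type-specific insight class names for a model type."""
    try:
        # Return a copy so callers cannot modify the table by accident.
        return list(RESTRICTED_INSIGHTS[model_type])
    except KeyError:
        raise ValueError(f"unknown model type: {model_type!r}") from None


print(restricted_insights_for("multiclass"))
```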