AI robustness testing¶
AI robustness tests validate that your generative AI models perform reliably under varied conditions and meet quality standards. Robustness testing lets you configure insights that measure different aspects of model behavior, such as toxicity, relevance, and coherence.
Configure insights for robustness testing¶
Set up insights to test AI model robustness. When creating an insight configuration, you should specify the following:
- custom_model_version_id: The ID of the custom model version to test.
- insight_name: A user-friendly name for the insight.
- insight_type: The type of insight (for example, OOTB_METRIC).
- ootb_metric_name: The name of the DataRobot-provided metric (used with the OOTB_METRIC type).
- stage: The stage at which the metric is calculated (PROMPT or RESPONSE).
import datarobot as dr

# Retrieve the custom model version to test; CustomModelVersion.get
# takes both the custom model ID and the version ID
custom_model_version = dr.CustomModelVersion.get(custom_model_id, version_id)

# Create an insight configuration for the DataRobot-provided toxicity
# metric, calculated on the model's responses
insight_config = dr.InsightsConfiguration.create(
    custom_model_version_id=custom_model_version.id,
    insight_name="Toxicity Check",
    insight_type=dr.InsightTypes.OOTB_METRIC,
    ootb_metric_name="Toxicity",
    stage=dr.InsightStage.RESPONSE,
)
insight_config
Set up test configurations¶
To associate an evaluation dataset and cost settings with an insight configuration:
# Register the dataset used to evaluate the model
eval_dataset = dr.EvaluationDatasetConfiguration.create(
    name="Robustness Test Dataset",
    dataset_id=dataset.id,
)

# Define the per-token cost used to estimate test spend
cost_config = dr.CostConfiguration.create(
    name="Test Cost Config",
    cost_per_token=0.0001,
)

# Attach both configurations to the insight
insight_config.update(
    evaluation_dataset_configuration_id=eval_dataset.id,
    cost_configuration_id=cost_config.id,
)
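To confirm that the update took effect, you can re-fetch the configuration. This is a minimal sketch; it assumes the configuration exposes an id attribute and that the linked IDs appear on the returned object under attributes named after the update() keywords:

# Re-fetch the configuration and inspect the linked IDs
# (attribute names are assumed to mirror the update() keywords)
refreshed = dr.InsightsConfiguration.get(insight_config.id)
print(refreshed.evaluation_dataset_configuration_id)
print(refreshed.cost_configuration_id)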
Run robustness tests¶
To check a test's execution status and review its results:
# Re-fetch the configuration to get its latest execution status
insight_config = dr.InsightsConfiguration.get(insight_config_id)

if insight_config.execution_status == "COMPLETED":
    results = insight_config.get_results()
    print(f"Test results: {results}")
elif insight_config.execution_status == "ERROR":
    print(f"Test failed: {insight_config.error_message}")
    print(f"Resolution: {insight_config.error_resolution}")
Configure multiple insights¶
To set up multiple tests for comprehensive validation:
insights = [
    {
        "insight_name": "Toxicity Check",
        "ootb_metric_name": "Toxicity",
        "stage": dr.InsightStage.RESPONSE,
    },
    {
        "insight_name": "Relevance Check",
        "ootb_metric_name": "Relevance",
        "stage": dr.InsightStage.RESPONSE,
    },
    {
        "insight_name": "Coherence Check",
        "ootb_metric_name": "Coherence",
        "stage": dr.InsightStage.RESPONSE,
    },
]

# Create one insight configuration per metric; all three use
# DataRobot-provided (OOTB) metrics
for params in insights:
    dr.InsightsConfiguration.create(
        custom_model_version_id=custom_model_version.id,
        insight_type=dr.InsightTypes.OOTB_METRIC,
        **params,
    )
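After creating the batch, you can wait for every test to reach a terminal status before reviewing results, using the list method described below. A minimal sketch, again assuming that any status other than COMPLETED or ERROR means a test is still running:

import time

# Wait until every insight for this model version finishes
while True:
    statuses = [
        insight.execution_status
        for insight in dr.InsightsConfiguration.list(
            custom_model_version_id=custom_model_version.id
        )
    ]
    if all(status in ("COMPLETED", "ERROR") for status in statuses):
        break
    time.sleep(10)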
Get an insight configuration¶
To retrieve a specific insight configuration:
# Fetch a single configuration by its ID and inspect it
insight_config = dr.InsightsConfiguration.get(insight_config_id)
print(f"Insight name: {insight_config.insight_name}")
print(f"Insight type: {insight_config.insight_type}")
print(f"Execution status: {insight_config.execution_status}")
List insight configurations¶
Get all insights for a custom model version:
# CustomModelVersion.get takes both the custom model ID and the version ID
custom_model_version = dr.CustomModelVersion.get(custom_model_id, version_id)
insights = dr.InsightsConfiguration.list(custom_model_version_id=custom_model_version.id)

for insight in insights:
    print(f"{insight.insight_name}: {insight.execution_status}")