Performance monitoring
To trust a model to power mission-critical operations, you must have confidence in all aspects of model deployment. Model monitoring is the close tracking of the performance of models in production; it is used to identify potential issues before they impact the business. Monitoring ranges from confirming that the service returns predictions promptly and without errors to ensuring that the predictions themselves remain reliable.
The predictive performance of a model typically starts to diminish as soon as it's deployed. For example, someone might be making live predictions on a dataset of customer data, but the customers' behavioral patterns might have changed due to an economic crisis, market volatility, a natural disaster, or even the weather. Models trained on older data that no longer represents the current reality might be not just inaccurate but irrelevant, leaving the prediction results meaningless or even harmful. Without dedicated production model monitoring, the user or business owner cannot detect when this happens. If model accuracy starts to decline without detection, the results can impact the business, expose it to risk, and destroy user trust.
DataRobot automatically monitors model deployments and offers a central hub for detecting errors and model accuracy decay as soon as possible. For each deployment, DataRobot provides a status banner; model-specific information is also available on the Deployments inventory page.
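For example, the service health statistics that DataRobot tracks for a deployment can also be read programmatically. The snippet below is a minimal sketch using the DataRobot Python client (`datarobot` package); the endpoint, API token, and deployment ID are placeholders.

```python
import datarobot as dr

# Connect to DataRobot (placeholder endpoint and token)
dr.Client(endpoint="https://app.datarobot.com/api/v2", token="YOUR_API_TOKEN")

# Look up an existing deployment by its ID (placeholder)
deployment = dr.Deployment.get(deployment_id="YOUR_DEPLOYMENT_ID")

# Service health statistics for the tracked period, such as total
# predictions, execution time, and error rates
service_stats = deployment.get_service_stats()
for metric, value in service_stats.metrics.items():
    print(metric, value)
```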
These sections describe the tools available for monitoring model deployments:
Topic | Description | Data Required for Monitoring |
---|---|---|
Deployments | Viewing deployment inventory. | N/A |
Notifications tabs on the Settings page | Configuring notifications and monitoring. | N/A |
Service Health | Tracking model-specific deployment latency, throughput, and error rate. | Prediction data |
Data Drift | Monitoring model accuracy based on data distribution. | Prediction and training data |
Accuracy | Analyzing the performance of a model over time (see the sketch after this table). | Training data, prediction data, and actuals data
Challenger Models | Comparing model performance post-deployment. | Prediction data |
Usage | Tracking prediction processing progress for use in accuracy, data drift, and predictions over time analysis. | Prediction data or actuals |
Data Exploration | Exploring a deployment's stored prediction data, actuals, and training data to compute and monitor custom business or performance metrics. | Training data, prediction data, or actuals data |
Custom Metrics | Creating and monitoring custom business or performance metrics. | Prediction data |
MLOps agent | Monitoring remote models. | Requires a remote model and an external model package deployment |
Segmented analysis | Tracking attributes for segmented analysis of training data and predictions. | Prediction data (training data also required to track data drift or accuracy) |
Batch monitoring | Viewing monitoring statistics organized into batches instead of monitoring all predictions as a whole over time. | Training data, prediction data, and actuals data (for accuracy)
Generative model monitoring | The text generation target type for DataRobot custom and external models is compatible with generative large language models (LLMs). This allows you to deploy generative models, make predictions, monitor service, usage, and data drift statistics, export data, and create custom metrics. | Generative model text data
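As a rough illustration of how the data drift and accuracy statistics listed above can be consumed outside the UI, the sketch below uses the DataRobot Python client to read per-feature drift scores and accuracy metrics and to upload actuals. The deployment ID, association IDs, and actual values are placeholders, and each statistic is only available once the deployment has the data listed in the table.

```python
import datarobot as dr

dr.Client(endpoint="https://app.datarobot.com/api/v2", token="YOUR_API_TOKEN")
deployment = dr.Deployment.get(deployment_id="YOUR_DEPLOYMENT_ID")

# Per-feature drift scores; requires training data as the baseline
for feature in deployment.get_feature_drift():
    print(feature.name, feature.drift_score)

# Accuracy metrics over the tracked period; requires uploaded actuals
accuracy = deployment.get_accuracy()
print(accuracy.metrics)

# Upload actual outcomes, matched to earlier predictions by association ID,
# so DataRobot can compute accuracy statistics (placeholder rows)
deployment.submit_actuals([
    {"association_id": "order-0001", "actual_value": 12.5},
    {"association_id": "order-0002", "actual_value": 9.0},
])
```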