Management agent deployment status and events
To monitor the status and health of management agent deployments, you can view the overall deployment status and specific deployment service health events.
Deployment status
While the management agent is performing an action on an external deployment that it manages, a badge appears under the deployment name in the deployment inventory, and on every tab within the deployment, to indicate the deployment status. Four status values are possible while an action is in progress:
Status | Badge |
---|---|
LAUNCHING | |
STOPPING | |
MODEL REPLACING | |
ERRORED | |
Deployment events
The management agent sends periodic updates about deployment health and status via the API. These are reported as MLOps events and are listed on the Service Health page.
Once an external deployment is set up with the management agent, DataRobot lets you monitor and work with its deployment events. From one place, you can:
Action | Example use case |
---|---|
Record and persist deployment-related events | Recording deployment actions, health changes, state changes, etc. |
View all related events | Auditing deployment events. |
Filter and search events | Viewing all model changes. |
Extract data | Reporting and offline storage. |
Receive notification of certain incidents | Receiving a Slack message for an outage. |
Enforce a retention policy | Ensuring that a log-retention policy is followed (90 days of retention guaranteed; older events may be purged). |
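Several of the actions in the table above (filtering, extraction, and retention) can be illustrated on events that have already been exported for offline use. The following is a minimal sketch and not the DataRobot API: the `Event` record, its field names, and the event type strings are hypothetical and chosen only for illustration. The 90-day window is the only value taken from the table.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone

# Retention window guaranteed per the table above.
RETENTION = timedelta(days=90)


@dataclass(frozen=True)
class Event:
    # Hypothetical shape of an exported event; field names are assumptions.
    timestamp: datetime
    event_type: str
    message: str


def filter_events(events, event_type):
    """Return the events of one type, most recent first."""
    matches = [e for e in events if e.event_type == event_type]
    return sorted(matches, key=lambda e: e.timestamp, reverse=True)


def within_retention(events, now):
    """Keep only events that fall inside the guaranteed retention window."""
    cutoff = now - RETENTION
    return [e for e in events if e.timestamp >= cutoff]


now = datetime(2024, 6, 1, tzinfo=timezone.utc)
events = [
    Event(now - timedelta(days=120), "model_changed", "Model replaced"),
    Event(now - timedelta(days=10), "model_changed", "Model replaced"),
    Event(now - timedelta(days=2), "health_changed", "Service health is failing"),
    Event(now - timedelta(days=1), "model_changed", "Model replaced"),
]

# "Viewing all model changes", restricted to events still under retention.
recent_model_changes = filter_events(within_retention(events, now), "model_changed")
for e in recent_model_changes:
    print(e.timestamp.date(), e.message)
```

The event that is 120 days old is dropped by the retention check before filtering, which mirrors the caveat that older events may be purged.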
To view an overview of deployment events, select the deployment from the inventory and navigate to the Service Health tab. All events are recorded under the Recent Activity > Agent Activity section.
The most recent events are listed at the top of the list. Each event shows the time it occurred, a description, and an icon indicating its status:
Icon | Description |
---|---|
Green / Passing | No action needed. |
Yellow / At risk | Concerns found but no immediate action needed; continue monitoring. |
Red / Failing | Immediate action needed. |
Gray / Unknown | The status is unknown. |
Informational | Details a deployment action (e.g., the deployment has launched). |
Note
The management agent's most recently reported service health status takes priority. For example, if data drift is green (passing) on a deployment but the management agent reports a worse status (red, failing), the list updates to reflect the failing condition.
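One way to read the note above is as a simple precedence rule: if the management agent has reported a status, its latest report is what gets shown. The sketch below assumes that reading. The function name, the `(timestamp, status)` tuple layout, and the status strings are hypothetical, and this is not a description of how DataRobot implements the behavior.

```python
from datetime import datetime, timezone


def displayed_status(agent_reports, other_status):
    """Pick the status to display.

    agent_reports: list of (timestamp, status) tuples from the management
    agent (hypothetical layout). If the agent has reported anything, its
    most recent status takes priority; otherwise fall back to other_status.
    """
    if not agent_reports:
        return other_status
    # max() on the timestamp selects the most recently reported entry.
    _, latest_status = max(agent_reports, key=lambda report: report[0])
    return latest_status


reports = [
    (datetime(2024, 6, 1, 9, 0, tzinfo=timezone.utc), "passing"),
    (datetime(2024, 6, 1, 9, 5, tzinfo=timezone.utc), "failing"),
]

# Data drift is passing, but the agent's latest report is failing.
print(displayed_status(reports, "passing"))
# No agent reports yet, so the other status is shown.
print(displayed_status([], "passing"))
```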
Select an event row to view its details in the panel on the right.