# Deployment service health

> Deployment service health - Track latency, throughput, and error rate for generative and agentic
> custom model deployments.

This Markdown file sits beside the HTML page at the same path (with a `.md` suffix). It summarizes the topic and lists links for tools and LLM context.

Companion generated at `2026-04-24T16:03:56.232295+00:00` (UTC).

## Primary page

- [Deployment service health](https://docs.datarobot.com/en/docs/agentic-ai/agentic-monitor/agent-service-health.html): Full documentation for this topic (HTML).

## Sections on this page

- [Understand metric tiles and chart](https://docs.datarobot.com/en/docs/agentic-ai/agentic-monitor/agent-service-health.html#understand-metric-tiles-and-chart): In-page section heading.
- [Service health status indicators](https://docs.datarobot.com/en/docs/agentic-ai/agentic-monitor/agent-service-health.html#service-health-status-indicators): In-page section heading.
- [Explore deployment data tracing](https://docs.datarobot.com/en/docs/agentic-ai/agentic-monitor/agent-service-health.html#explore-deployment-data-tracing): In-page section heading.
- [Filter tracing logs](https://docs.datarobot.com/en/docs/agentic-ai/agentic-monitor/agent-service-health.html#filter-tracing-logs): In-page section heading.
- [Tracing table OTel attributes](https://docs.datarobot.com/en/docs/agentic-ai/agentic-monitor/agent-service-health.html#tracing-table-otel-attributes): In-page section heading.

## Related documentation

- [Agentic AI](https://docs.datarobot.com/en/docs/agentic-ai/index.html): Linked from this page.
- [Monitor](https://docs.datarobot.com/en/docs/agentic-ai/agentic-monitor/index.html): Linked from this page.
- [Data drift](https://docs.datarobot.com/en/docs/classic-ui/mlops/monitor/data-drift.html): Linked from this page.
- [Accuracy](https://docs.datarobot.com/en/docs/classic-ui/mlops/monitor/deploy-accuracy.html): Linked from this page.
- [Service health](https://docs.datarobot.com/en/docs/classic-ui/mlops/monitor/service-health.html): Linked from this page.
- [Prediction History and Service Health](https://docs.datarobot.com/en/docs/classic-ui/mlops/deployment/deploy-methods/add-deploy-info.html#prediction-history-and-service-health): Linked from this page.
- [agent-monitored deployments](https://docs.datarobot.com/en/docs/classic-ui/mlops/deployment/mlops-agent/monitoring-agent/index.html): Linked from this page.
- [prediction monitoring job](https://docs.datarobot.com/en/docs/classic-ui/predictions/batch/pred-monitoring-jobs/index.html): Linked from this page.
- [Deployments dashboard](https://docs.datarobot.com/en/docs/workbench/nxt-console/nxt-overview/nxt-dashboard.html): Linked from this page.
- [Service health](https://docs.datarobot.com/en/docs/workbench/nxt-console/nxt-monitoring/nxt-service-health.html): Linked from this page.
- [full deployment logs](https://docs.datarobot.com/en/docs/workbench/nxt-console/nxt-activity-log/nxt-otel-logs.html): Linked from this page.
- [Implement tracing](https://docs.datarobot.com/en/docs/agentic-ai/agentic-develop/agentic-tracing-code.html#surface-tool-names-in-the-tracing-table): Linked from this page.

## Documentation content

# Deployment service health

The Service health tab tracks metrics about a deployment's ability to respond to prediction requests quickly and reliably. This helps identify bottlenecks and assess capacity, which is critical to proper provisioning. For example, if a model's response times seem to have slowed overall, the Service health tab for the model's deployment can help. You might notice in the tab that median latency rises with an increase in prediction requests. If latency increases after a new model is switched in, you can consult with your team to determine whether to replace it with a model that offers better performance.

To access Service health, select an individual deployment from the deployment inventory page and then, from the Overview, click Monitoring > Service health. The tab provides informational [tiles and a chart](https://docs.datarobot.com/en/docs/agentic-ai/agentic-monitor/agent-service-health.html#understand-metric-tiles-and-chart) to help assess the activity level and health of the deployment.

> [!NOTE] Time of Prediction
> The Time of Prediction value differs between the [Data drift](https://docs.datarobot.com/en/docs/classic-ui/mlops/monitor/data-drift.html) and [Accuracy](https://docs.datarobot.com/en/docs/classic-ui/mlops/monitor/deploy-accuracy.html) tabs and the [Service health](https://docs.datarobot.com/en/docs/classic-ui/mlops/monitor/service-health.html) tab:
> 
> - On the Service health tab, the "time of prediction request" is *always* the time the prediction server *received* the prediction request. This method of prediction request tracking accurately represents the prediction service's health for diagnostic purposes.
> - On the Data drift and Accuracy tabs, the "time of prediction request" is, *by default*, the time you *submitted* the prediction request, which you can override with the prediction timestamp in the [Prediction History and Service Health](https://docs.datarobot.com/en/docs/classic-ui/mlops/deployment/deploy-methods/add-deploy-info.html#prediction-history-and-service-health) settings.

## Understand metric tiles and chart

DataRobot displays informational statistics based on your current settings for model and time frame. That is, tile values correspond to the same units as those selected on the slider. If the slider interval values are weekly, the displayed tile metrics show values corresponding to weeks. Clicking a metric tile updates the chart below.

The Service health tab reports the following metrics on the dashboard:

> [!NOTE] Service health information for external models and monitoring jobs
> Service health information is unavailable for external [agent-monitored deployments](https://docs.datarobot.com/en/docs/classic-ui/mlops/deployment/mlops-agent/monitoring-agent/index.html) and deployments with predictions uploaded through a [prediction monitoring job](https://docs.datarobot.com/en/docs/classic-ui/predictions/batch/pred-monitoring-jobs/index.html).

| Statistic | Reports (for selected time period) |
| --- | --- |
| Total Predictions | The number of predictions the deployment has made (per prediction node). |
| Total Requests | The number of prediction requests the deployment has received (a single request can contain multiple prediction requests). |
| Requests over x ms | The number of requests where the response time was longer than the specified number of milliseconds. The default is 2000 ms; click in the box to enter a time between 10 and 100,000 ms or adjust with the controls. |
| Response Time | The time (in milliseconds) DataRobot spent receiving a prediction request, calculating the request, and returning a response to the user. The report does not include time due to network latency. Select the median prediction request time or 90th, 95th, or 99th percentile. The display reports a dash if you have made no requests against it or if it's an external deployment. |
| Execution Time | The time (in milliseconds) DataRobot spent calculating a prediction request. Select the median prediction request time or 90th, 95th, or 99th percentile. |
| Median/Peak Load | The median and maximum number of requests per minute. |
| Data Error Rate | The percentage of requests that result in a 4xx error (problems with the prediction request submission). This is a component of the value reported as the Service Health Summary on the Deployments dashboard top banner. |
| System Error Rate | The percentage of well-formed requests that result in a 5xx error (problem with the DataRobot prediction server). This is a component of the value reported as the Service Health Summary on the Deployments dashboard top banner. |
| Consumers | The number of distinct users (identified by API key) who have made prediction requests against this deployment. |
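
The Response Time and Execution Time tiles let you select the median or the 90th, 95th, or 99th percentile. The sketch below illustrates what those summary statistics mean for a set of response times; it is an illustrative nearest-rank percentile calculation, not DataRobot's internal computation.

```python
import statistics

def latency_percentiles(response_times_ms):
    """Summarize response times the way the tile selector does:
    median plus 90th/95th/99th percentiles (illustrative only)."""
    ordered = sorted(response_times_ms)

    def pct(p):
        # Nearest-rank percentile over the sorted values.
        k = max(0, min(len(ordered) - 1, round(p / 100 * len(ordered)) - 1))
        return ordered[k]

    return {
        "median": statistics.median(ordered),
        "p90": pct(90),
        "p95": pct(95),
        "p99": pct(99),
    }
```

For example, 100 response times of 1 to 100 ms yield a median of 50.5 ms and a 99th percentile of 99 ms, showing how the percentile views surface tail latency that the median hides.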

You can configure the dashboard to focus the visualized statistics on specific segments and time frames. The following controls are available:

| Control | Description |
| --- | --- |
| Model | Updates the dashboard displays to reflect the model you selected from the dropdown. |
| Range (UTC) | Sets the date range displayed for the deployment date slider. You can also drag the date slider to set the range. The range selector only allows you to select dates and times between the start date of the deployment's current version of a model and the current date. |
| Resolution | Sets the time granularity of the deployment date slider. The following resolution settings are available, based on the selected range: Hourly (range less than 7 days), Daily (range between 1 and 60 days, inclusive), Weekly (range between 1 and 52 weeks, inclusive), or Monthly (range at least 1 month and less than 120 months). |
| Refresh | Initiates an on-demand update of the dashboard with new data. Otherwise, DataRobot refreshes the dashboard every 15 minutes. |
| Reset | Reverts the dashboard controls to the default settings. |

The chart below the metric tiles displays individual metrics over time, helping to identify patterns in the quality of service. Clicking on a metric tile updates the chart to represent that information; adjusting the data range slider focuses on a specific period:

> [!TIP] Export charts
> Click Export to download a `.csv` or `.png` file of the currently selected chart, or a `.zip` archive file of both (and a `.json` file).

The Median | Peak Load (calls/minute) chart displays two lines, one for Peak load and one for Median load over time:

## Service health status indicators

Service health tracks metrics about a deployment’s ability to respond to prediction requests quickly and reliably. You can view the service health status in the [deployment inventory](https://docs.datarobot.com/en/docs/workbench/nxt-console/nxt-overview/nxt-dashboard.html#health-indicators) and visualize service health on the [Service health](https://docs.datarobot.com/en/docs/workbench/nxt-console/nxt-monitoring/nxt-service-health.html) tab. Service health monitoring represents the occurrence of 4XX and 5XX errors in your prediction requests or prediction server:

- 4xx errors indicate problems with the prediction request submission.
- 5xx errors indicate problems with the DataRobot prediction server.

| Color | Service Health | Action |
| --- | --- | --- |
| Green / Passing | Zero 4xx or 5xx errors. | No action needed. |
| Yellow / At risk | At least one 4xx error and zero 5xx errors. | Concerns found, but no immediate action needed; monitor. |
| Red / Failing | At least one 5xx error. | Immediate action needed. |
| Gray / Disabled | Unmonitored deployment. | Enable monitoring and make predictions. |
| Gray / Not started | No service health events recorded. | Make predictions. |
| Gray / Unknown | No predictions made. | Make predictions. |
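
The status logic in the table above can be sketched as a simple classifier. This is an illustrative re-statement of the table, not DataRobot's code; the gray states in particular depend on deployment configuration that is simplified here into two flags.

```python
def service_health_status(n_4xx, n_5xx, monitored=True, any_predictions=True):
    """Map error counts to the service health statuses in the table above
    (simplified sketch; gray states depend on deployment state)."""
    if not monitored:
        return "Gray / Disabled"
    if not any_predictions:
        return "Gray / Unknown"
    if n_5xx > 0:          # any 5xx error fails the deployment
        return "Red / Failing"
    if n_4xx > 0:          # 4xx errors only: at risk, keep monitoring
        return "Yellow / At risk"
    return "Green / Passing"
```

Note that a single 5xx error outweighs any number of 4xx errors: server-side failures always take precedence in the reported status.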

## Explore deployment data tracing

> [!NOTE] Premium
> Tracing is a premium feature. Contact your DataRobot representative or administrator for information on enabling this feature.

On the Service health tab of a custom model deployment (including text generation, agentic workflow, VDB, and MCP deployments), you can view the tracing table below the Total predictions chart. To view the tracing table, in the upper-right corner of the Total predictions chart, click Show tracing.

Traces represent the path taken by a request to a model or agentic workflow. DataRobot uses the [OpenTelemetry framework for tracing](https://opentelemetry.io/docs/concepts/signals/traces/). A trace follows the entire end-to-end path of a request, from origin to resolution. Each trace contains one or more spans, starting with the root span. The root span represents the entire path of the request and contains a child span for each individual step in the process. The root (or parent) span and each child span share the same Trace ID.
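
The trace structure described above (one root span, child spans per step, all sharing a Trace ID) can be modeled with a few lines of plain Python. This is a conceptual sketch of the OpenTelemetry data model, not the OpenTelemetry SDK itself; the class and function names are illustrative.

```python
import secrets
from dataclasses import dataclass, field
from typing import Optional

@dataclass
class Span:
    name: str
    trace_id: str                      # shared by every span in the trace
    span_id: str = field(default_factory=lambda: secrets.token_hex(8))
    parent_id: Optional[str] = None    # None only for the root span
    children: list = field(default_factory=list)

def start_trace(root_name):
    """Create the root span; its trace_id is inherited by all children."""
    return Span(name=root_name, trace_id=secrets.token_hex(16))

def add_child(parent, name):
    """Add one step of the request as a child span of `parent`."""
    child = Span(name=name, trace_id=parent.trace_id, parent_id=parent.span_id)
    parent.children.append(child)
    return child
```

Because every child copies the root's `trace_id`, the tracing table can group all spans of one request under a single Trace ID row, while `parent_id` links reconstruct the step-by-step hierarchy.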

> [!NOTE] Access and retention
> The tracing table is available for all custom and external model deployments. Tracing data is stored for a retention period of 30 days, after which it is automatically deleted.

In the Tracing table, you can review the following fields related to each trace:

| Column | Description |
| --- | --- |
| Timestamp | The date and time of the trace in YYYY-MM-DD HH:MM format. |
| Status | The overall status of the trace, including all spans. The Status will be Error if any dependent task fails. |
| Trace ID | A unique identifier for the trace. |
| Duration | The amount of time, in milliseconds, it took for the trace to complete. This value is equal to the duration of the root span (rounded) and includes all actions represented by child spans. |
| Spans count | The number of completed spans (actions) included in the trace. |
| Cost | If cost data is provided, the total cost of the trace. |
| Prompt | The user prompt related to the trace. |
| Completion | The agent or model response (completion) associated with the prompt for the trace. |
| Tools | The tool or tools called during the request represented by the trace. |

Click Filter to filter by Min span duration, Max span duration, Min trace cost, and Max trace cost. Note that the span duration filters use nanoseconds (ns), while the chart displays span durations in milliseconds (ms).
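
Because the chart and the filter use different units, it is easy to enter a filter value that is off by six orders of magnitude. A one-line conversion (illustrative helper, not part of any DataRobot API) avoids that:

```python
def ms_to_ns(milliseconds):
    """Convert a span duration read off the chart (ms) to the
    nanosecond value the span duration filter expects."""
    return int(milliseconds * 1_000_000)
```

For example, to filter for spans longer than the 2000 ms shown on the chart, enter 2,000,000,000 ns in the Min span duration field.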

> [!TIP] Filter accessibility
> The Filter button is hidden when a span is expanded to detail view. To return to the chart view with the filter, click Hide details panel.

To review the [spans](https://opentelemetry.io/docs/concepts/signals/traces/#spans) contained in a trace, along with trace details, click a trace row in the Tracing table. The span colors correspond to a Span service, usually a deployment. Restricted span appears when you don't have access to the deployment or service associated with the span. You can view spans in Chart format or List format.

> [!TIP] Span detail controls
> From either view, you can click Hide table to collapse the Timestamps table or Hide details panel to return to the expanded Tracing table view.

**Chart view:**
[https://docs.datarobot.com/en/docs/images/nxt-tracing-table-spans.png](https://docs.datarobot.com/en/docs/images/nxt-tracing-table-spans.png)

**List view:**
[https://docs.datarobot.com/en/docs/images/nxt-tracing-table-spans-list.png](https://docs.datarobot.com/en/docs/images/nxt-tracing-table-spans-list.png)

> [!NOTE] Trace details
> In list view, you can click Trace details to view the Input/Output ( Prompt and Completion) and Evaluation details about the trace associated with the current span.


For either view, click the Span service name to access the deployment or resource (if you have access). Additional information, dependent on the configuration of the generative AI model or agentic workflow, is available on the Info, Resources, Events, Input/Output, Error, and Logs tabs. The Error tab only appears when an error occurs in a trace.

**Chart view:**
[https://docs.datarobot.com/en/docs/images/nxt-tracing-table-span-tabs.png](https://docs.datarobot.com/en/docs/images/nxt-tracing-table-span-tabs.png)

**List view:**
[https://docs.datarobot.com/en/docs/images/nxt-tracing-table-span-tabs-list-view.png](https://docs.datarobot.com/en/docs/images/nxt-tracing-table-span-tabs-list-view.png)


### Filter tracing logs

From the list view, you can display OTel logs for a span. The results shown are a subset of the [full deployment logs](https://docs.datarobot.com/en/docs/workbench/nxt-console/nxt-activity-log/nxt-otel-logs.html), and are accessed as follows:

1. Open the list view and select a span under Trace details.
2. Click the Logs tab.
3. Click Show logs.

### Tracing table OTel attributes

For Cost, Prompt, Completion, and Tools, DataRobot reads specific span attributes across all spans that belong to the trace. Other columns (such as Timestamp and Duration) come from trace and span metadata rather than these attributes.

| Column | OpenTelemetry mapping |
| --- | --- |
| Cost | Sums numeric values from the datarobot.moderation.cost attribute on spans in the trace (when that attribute is present). |
| Prompt | Uses the gen_ai.prompt attribute. If more than one span includes gen_ai.prompt, the first value encountered in trace order is shown. |
| Completion | Uses the gen_ai.completion attribute. If more than one span includes gen_ai.completion, the last value encountered in trace order is shown. |
| Tools | Collects every distinct value of the tool_name attribute found on spans in the trace and lists those tool names in the column. |

Attribute keys must match exactly (including the underscore in `gen_ai`). Names such as `genai.prompt` or `GenAI.prompt` are not read for the Prompt and Completion columns.

Automatic instrumentation (including DataRobot agent templates) often sets `gen_ai.prompt`, `gen_ai.completion`, and sometimes `tool_name`. For custom or external models, frameworks differ: tool execution may not emit `tool_name` even when tools run (for example, some LangGraph callback flows). In that case Prompt and Completion can populate while Tools remains empty until `tool_name` is configured on a span that runs inside the tool—see [Implement tracing](https://docs.datarobot.com/en/docs/agentic-ai/agentic-develop/agentic-tracing-code.html#surface-tool-names-in-the-tracing-table).
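
The mapping table above can be sketched as a small function that derives the four attribute-driven columns from a trace's spans. Each span is represented here as a plain dict of OTel attributes in trace order; this is an illustrative re-implementation of the documented rules, not DataRobot's internal code.

```python
def derive_trace_columns(spans):
    """Derive the Cost, Prompt, Completion, and Tools columns from span
    attributes, following the mapping table above. `spans` is a list of
    attribute dicts in trace order."""
    # Cost: sum numeric datarobot.moderation.cost values across spans.
    cost = sum(
        s["datarobot.moderation.cost"]
        for s in spans
        if isinstance(s.get("datarobot.moderation.cost"), (int, float))
    )
    # Prompt: first gen_ai.prompt in trace order; Completion: last.
    prompts = [s["gen_ai.prompt"] for s in spans if "gen_ai.prompt" in s]
    completions = [s["gen_ai.completion"] for s in spans if "gen_ai.completion" in s]
    # Tools: every distinct tool_name, preserving first-seen order.
    tools = []
    for s in spans:
        name = s.get("tool_name")
        if name and name not in tools:
            tools.append(name)
    return {
        "Cost": cost,
        "Prompt": prompts[0] if prompts else None,
        "Completion": completions[-1] if completions else None,
        "Tools": tools,
    }
```

Note that the attribute keys are matched exactly, so a span attribute named `genai.prompt` would simply be ignored by this logic, which is why the tracing table leaves the Prompt column empty for such spans.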
