# Monitoring and observability

> Monitoring and observability - OpenTelemetry integration, metrics, logs, and traces.

This Markdown file sits beside the HTML page at the same path (with a `.md` suffix). It summarizes the topic and lists links for tools and LLM context.

Companion generated at `2026-05-06T18:17:09.610367+00:00` (UTC).

## Primary page

- [Monitoring and observability](https://docs.datarobot.com/en/docs/api/dev-learning/workload-api/monitoring.html): Full documentation for this topic (HTML).

## Sections on this page

- [Available metrics](https://docs.datarobot.com/en/docs/api/dev-learning/workload-api/monitoring.html#available-metrics): In-page section heading.
- [Accessing logs and traces](https://docs.datarobot.com/en/docs/api/dev-learning/workload-api/monitoring.html#accessing-logs-and-traces): In-page section heading.
- [Tracer configuration example](https://docs.datarobot.com/en/docs/api/dev-learning/workload-api/monitoring.html#tracer-configuration): In-page section heading.
- [Logger configuration example](https://docs.datarobot.com/en/docs/api/dev-learning/workload-api/monitoring.html#logger-configuration): In-page section heading.
- [API access to telemetry](https://docs.datarobot.com/en/docs/api/dev-learning/workload-api/monitoring.html#api-access-telemetry): In-page section heading.

## Related documentation

- [Developer documentation](https://docs.datarobot.com/en/docs/api/index.html): Linked from this page.

## Documentation content

The Workload API provides built-in monitoring capabilities through OpenTelemetry (OTel) integration.

## Available metrics

| Category | Metrics |
| --- | --- |
| Service health | Number of requests (succeeded / failed), latency, error rate, requests per minute. |
| Resource utilization | Number of replicas; CPU and memory consumption per container. |
| OTel metrics | OTel-compliant metrics emitted by your application. |

## Accessing logs and traces

You can instrument your application to emit OpenTelemetry-compliant traces and logs. The open-source OTel libraries support both manual instrumentation, as in the examples below, and auto-instrumentation.

### Tracer configuration example

```python
from opentelemetry import trace
from opentelemetry.sdk.resources import Resource
from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.sdk.trace.export import BatchSpanProcessor
from opentelemetry.exporter.otlp.proto.http.trace_exporter import OTLPSpanExporter

resource = Resource.create({"service.namespace": "my-service"})

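def configure_tracer() -> TracerProvider:
    # With no arguments, the exporter takes its endpoint and headers from the
    # standard OTEL_EXPORTER_OTLP_* environment variables.
    trace_exporter = OTLPSpanExporter()
    trace_provider = TracerProvider(resource=resource)
    # Batch spans in memory and export them in the background.
    trace_provider.add_span_processor(BatchSpanProcessor(trace_exporter))
    # Register the provider globally so trace.get_tracer() uses it.
    trace.set_tracer_provider(trace_provider)
    return trace_provider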

trace_provider = configure_tracer()
tracer = trace.get_tracer(__name__)

# Usage example
with tracer.start_as_current_span("Generate Text") as span:
    span.set_attribute("foo", "bar")
    span.add_event(name="ack", attributes={"john": "doe"})
```

### Logger configuration example

```python
import logging
from opentelemetry.sdk.resources import Resource
from opentelemetry.sdk._logs import LoggerProvider, LoggingHandler
from opentelemetry.sdk._logs.export import BatchLogRecordProcessor
from opentelemetry.exporter.otlp.proto.http._log_exporter import OTLPLogExporter
from opentelemetry._logs import set_logger_provider

resource = Resource.create({"service.namespace": "my-service"})

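def configure_logging() -> LoggerProvider:
    # With no arguments, the exporter takes its endpoint and headers from the
    # standard OTEL_EXPORTER_OTLP_* environment variables.
    log_exporter = OTLPLogExporter()
    log_provider = LoggerProvider(resource=resource)
    # Batch log records in memory and export them in the background.
    log_provider.add_log_record_processor(BatchLogRecordProcessor(log_exporter))
    set_logger_provider(log_provider)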
    # Bridge Python logging to OTel so logger.info() / logger.warning() are exported via OTLP
    handler = LoggingHandler(level=logging.NOTSET, logger_provider=log_provider)
    logging.getLogger().setLevel(logging.NOTSET)
    logging.getLogger().addHandler(handler)
    return log_provider

log_provider = configure_logging()
logger = logging.getLogger(__name__)

# Usage example: keys passed via `extra` are exported as log record attributes
logger.info("Logging info.", extra={"extra": "INFO details"})
logger.warning("Logging warning.", extra={"extra": "WARNING details"})
```

## API access to telemetry

You can retrieve specific traces, logs, and metrics directly via the API:

| Endpoint | Description |
| --- | --- |
| `GET /otel/workload/{workloadId}/traces` | Get traces for a workload. |
| `GET /otel/workload/{workloadId}/traces/{traceId}` | Get a specific trace by ID. |
| `GET /otel/workload/{workloadId}/logs` | Get logs for a workload. |
| `GET /otel/workload/{workloadId}/metrics/summary` | Get a metrics summary for a workload. |
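
The endpoints above can be called from any HTTP client. The following sketch uses only the Python standard library to build the endpoint URLs and send an authenticated `GET` request. The helper names `telemetry_url` and `fetch_telemetry` are hypothetical, and the base URL, bearer-token authentication, and JSON response format are assumptions to verify against your installation.

```python
import json
import urllib.request
from urllib.parse import quote

# Path suffixes for the telemetry endpoints listed in the table above.
TELEMETRY_PATHS = {
    "traces": "traces",
    "logs": "logs",
    "metrics_summary": "metrics/summary",
}


def telemetry_url(base_url, workload_id, resource, trace_id=None):
    """Build the URL for one telemetry endpoint of a workload."""
    if resource not in TELEMETRY_PATHS:
        raise ValueError(f"unknown telemetry resource: {resource!r}")
    if trace_id is not None and resource != "traces":
        raise ValueError("trace_id is only valid for the 'traces' resource")
    url = (
        f"{base_url.rstrip('/')}/otel/workload/"
        f"{quote(workload_id, safe='')}/{TELEMETRY_PATHS[resource]}"
    )
    if trace_id is not None:
        # Append the trace ID to address a single trace.
        url += f"/{quote(trace_id, safe='')}"
    return url


def fetch_telemetry(url, api_token, timeout=30.0):
    """Send an authenticated GET request and decode the JSON response body."""
    # Assumption: the API accepts a bearer token and returns JSON.
    request = urllib.request.Request(
        url,
        headers={
            "Authorization": f"Bearer {api_token}",
            "Accept": "application/json",
        },
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)
```

For example, `telemetry_url(base_url, workload_id, "traces", trace_id=trace_id)` produces the URL for a single trace, which `fetch_telemetry` can then request with your API token.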
