# Deploy a production-ready artifact
This tutorial walks you through deploying a containerized AI service via the Workload API and linking it to a DataRobot Deployment for governance.
## What you'll deploy
This tutorial deploys a FastAPI-based agent service that provides:
- OpenAI-compatible `/chat/completions`: Proxies requests to a DataRobot-deployed LLM.
- LangGraph `/agent` endpoint: A ReAct agent with arXiv search for research queries.
- Health endpoints: `/healthz` (liveness), `/readyz` (readiness), `/health` (detailed status).
- OpenTelemetry logging: Structured logs exported via OTLP for observability.
## Workflow overview
- Create a workload → Returns `workloadId` and `artifactId`.
- Monitor status → Use `workloadId` to poll until the workload is running.
- Test the workload → Invoke endpoints using `workloadId`.
- Create a deployment → Pass `workloadId`; returns `deploymentId`.
## API base URLs
| API group | Base path |
|---|---|
| Workloads | `/api/v2/console/workloads/` |
| Deployments | `/api/v2/console/deployments/` |
| Workload invoke | `/api/v2/endpoints/workloads/{workloadId}/` |
| Deployment invoke | `/api/v2/endpoints/deployments/{deploymentId}/` |
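The examples in this tutorial assume `${DATAROBOT_ENDPOINT}` already includes the `/api/v2` prefix shown in the table, along with an exported API token. A minimal setup sketch (the endpoint URL is an example; substitute your own installation):

```bash
# DATAROBOT_ENDPOINT includes the /api/v2 prefix, so
# "${DATAROBOT_ENDPOINT}/console/workloads/" expands to the
# /api/v2/console/workloads/ base path from the table above.
export DATAROBOT_ENDPOINT="https://app.datarobot.com/api/v2"  # example URL; use your own
export DATAROBOT_API_TOKEN="<your-api-token>"
```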
## Step 1: Create a workload
Deploy your container by calling the Workload API. This creates a registered artifact and schedules the container on Kubernetes. In the example below, `resourceRequest.memory` is expressed in bytes (536870912 bytes = 512 MiB).
```bash
curl -X POST "${DATAROBOT_ENDPOINT}/console/workloads/" \
  -H "Authorization: Bearer ${DATAROBOT_API_TOKEN}" \
  -H "Content-Type: application/json" \
  -d '{
    "name": "agent-service",
    "artifact": {
      "name": "agent-service-artifact",
      "status": "registered",
      "spec": {
        "containerGroups": [{
          "containers": [{
            "imageUri": "<your-registry>/agent-service:1.0.0",
            "port": 8080,
            "primary": true,
            "entrypoint": ["python", "server.py"],
            "resourceRequest": {"cpu": 1, "memory": 536870912},
            "environmentVars": [
              {"name": "MODEL", "value": "openai/gpt-oss-20b"},
              {"name": "DEPLOYMENT_ID", "value": "<your-llm-deployment-id>"}
            ],
            "readinessProbe": {"path": "/readyz", "port": 8080}
          }]
        }]
      }
    },
    "runtime": {"replicaCount": 1}
  }'
```
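The response includes the new workload's identifiers, which the remaining steps reference. A sketch of capturing them with `jq`; the `.id` and `.artifact.id` paths are assumptions about the response shape, so adjust them to the body your cluster actually returns:

```bash
# Append "| tee create_response.json" to the create call above to save
# the response, then extract the IDs. The jq paths below (.id and
# .artifact.id) are assumed; adjust them to the actual response body.
WORKLOAD_ID=$(jq -r '.id' create_response.json)
ARTIFACT_ID=$(jq -r '.artifact.id' create_response.json)
export WORKLOAD_ID ARTIFACT_ID
```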
## Step 2: Monitor status
Poll until `status` transitions to `running`:
```bash
curl -s "${DATAROBOT_ENDPOINT}/console/workloads/${WORKLOAD_ID}" \
  -H "Authorization: Bearer ${DATAROBOT_API_TOKEN}" | jq '.status'
```
## Step 3: Test the workload
Once the workload is running, test its endpoints:
Chat completions:
```bash
curl -X POST "${DATAROBOT_ENDPOINT}/endpoints/workloads/${WORKLOAD_ID}/chat/completions" \
  -H "Authorization: Bearer ${DATAROBOT_API_TOKEN}" \
  -H "Content-Type: application/json" \
  -d '{"model": "openai/gpt-oss-20b", "messages": [{"role": "user", "content": "Hello!"}]}'
```
## Step 4: Create a deployment
Link the workload to a Deployment for monitoring, sharing, and audit trails:
```bash
curl -X POST "${DATAROBOT_ENDPOINT}/console/deployments/" \
  -H "Authorization: Bearer ${DATAROBOT_API_TOKEN}" \
  -H "Content-Type: application/json" \
  -d '{
    "workloadId": "'${WORKLOAD_ID}'",
    "name": "agent-service-deployment",
    "importance": "moderate"
  }'
```
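Capture the returned `deploymentId` for the final step; as before, the `.id` path is an assumption about the response shape:

```bash
# Append "| tee deployment_response.json" to the create call above,
# then extract the ID. The .id path is assumed; adjust to the actual body.
DEPLOYMENT_ID=$(jq -r '.id' deployment_response.json)
export DEPLOYMENT_ID
```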
## Step 5: Invoke via deployment endpoint
After creating the deployment, invoke your service through the governed deployment endpoint:
```bash
curl -X POST "${DATAROBOT_ENDPOINT}/endpoints/deployments/${DEPLOYMENT_ID}/chat/completions" \
  -H "Authorization: Bearer ${DATAROBOT_API_TOKEN}" \
  -H "Content-Type: application/json" \
  -d '{"model": "openai/gpt-oss-20b", "messages": [{"role": "user", "content": "Hello!"}]}'
```