
Deploy production-ready artifact

This tutorial walks you through deploying a containerized AI service via the Workload API and linking it to a DataRobot Deployment for governance.

What you'll deploy

This tutorial deploys a FastAPI-based agent service that provides:

  • OpenAI-compatible /chat/completions endpoint—Proxies requests to a DataRobot-deployed LLM.
  • LangGraph /agent endpoint—A ReAct agent with arXiv search for research queries.
  • Health endpoints—/healthz (liveness), /readyz (readiness), /health (detailed status).
  • OpenTelemetry logging—Structured logs exported via OTLP for observability.

Workflow overview

  1. Create a workload → Returns workloadId and artifactId
  2. Monitor status → Use workloadId to poll until running
  3. Test the workload → Invoke endpoints using workloadId
  4. Create a deployment → Pass workloadId, returns deploymentId

API base URLs

API group           Base path
Workloads           /api/v2/console/workloads/
Deployments         /api/v2/console/deployments/
Workload invoke     /api/v2/endpoints/workloads/{workloadId}/
Deployment invoke   /api/v2/endpoints/deployments/{deploymentId}/
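The examples below assume two environment variables are set: DATAROBOT_ENDPOINT (the API root, including the /api/v2 prefix shown in the base paths above) and DATAROBOT_API_TOKEN. A typical setup looks like the following; adjust the host for your installation:

```shell
# DATAROBOT_ENDPOINT must include the /api/v2 prefix, since the curl
# commands below append paths like /console/workloads/ directly to it.
export DATAROBOT_ENDPOINT="https://app.datarobot.com/api/v2"
export DATAROBOT_API_TOKEN="<your-api-token>"
```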

Step 1: Create a workload

Deploy your container by calling the Workload API. This creates a registered artifact and schedules the container on Kubernetes.

curl -X POST "${DATAROBOT_ENDPOINT}/console/workloads/" \
  -H "Authorization: Bearer ${DATAROBOT_API_TOKEN}" \
  -H "Content-Type: application/json" \
  -d '{
    "name": "agent-service",
    "artifact": {
      "name": "agent-service-artifact",
      "status": "registered",
      "spec": {
        "containerGroups": [{
          "containers": [{
            "imageUri": "<your-registry>/agent-service:1.0.0",
            "port": 8080,
            "primary": true,
            "entrypoint": ["python", "server.py"],
            "resourceRequest": {"cpu": 1, "memory": 536870912},
            "environmentVars": [
              {"name": "MODEL", "value": "openai/gpt-oss-20b"},
              {"name": "DEPLOYMENT_ID", "value": "<your-llm-deployment-id>"}
            ],
            "readinessProbe": {"path": "/readyz", "port": 8080}
          }]
        }]
      }
    },
    "runtime": {"replicaCount": 1}
  }'
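The create call returns the identifiers used throughout the rest of the tutorial. Assuming the response carries workloadId and artifactId as top-level fields (per the workflow overview; verify the exact shape against your response), you can extract them with jq for scripting. The RESPONSE value below is an illustrative stand-in for the captured output of the curl command above:

```shell
# Illustrative response shape -- real IDs and fields will differ.
# In a real script, set RESPONSE=$(curl -s -X POST ... ) instead.
RESPONSE='{"workloadId": "66f0aaaa1111222233334444", "artifactId": "66f0bbbb5555666677778888", "status": "scheduling"}'

# Pull the IDs out for use in Steps 2-4 (requires jq).
WORKLOAD_ID=$(echo "$RESPONSE" | jq -r '.workloadId')
ARTIFACT_ID=$(echo "$RESPONSE" | jq -r '.artifactId')
echo "WORKLOAD_ID=${WORKLOAD_ID} ARTIFACT_ID=${ARTIFACT_ID}"
```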

Step 2: Monitor status

Poll until status transitions to running:

curl -s "${DATAROBOT_ENDPOINT}/console/workloads/${WORKLOAD_ID}" \
  -H "Authorization: Bearer ${DATAROBOT_API_TOKEN}" | jq '.status'
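For scripting, the one-shot check above can be wrapped in a polling loop with a timeout. A minimal sketch: get_status is a placeholder for the curl | jq command above, and the terminal status names (failed, errored) are assumptions—check them against the statuses your workload actually reports:

```shell
# Poll get_status (a function you define to print the current workload
# status) until it prints "running", a terminal failure, or a timeout.
wait_until_running() {
  timeout_s=${1:-300}
  interval_s=${2:-5}
  elapsed=0
  while [ "$elapsed" -lt "$timeout_s" ]; do
    status=$(get_status)
    case "$status" in
      running)
        echo "running"
        return 0
        ;;
      failed|errored)
        echo "workload entered terminal status: $status" >&2
        return 1
        ;;
    esac
    sleep "$interval_s"
    elapsed=$((elapsed + interval_s))
  done
  echo "workload not running after ${timeout_s}s" >&2
  return 1
}
```

Define get_status as the Step 2 curl command (for example, `get_status() { curl -s "${DATAROBOT_ENDPOINT}/console/workloads/${WORKLOAD_ID}" -H "Authorization: Bearer ${DATAROBOT_API_TOKEN}" | jq -r '.status'; }`), then call `wait_until_running 300 5`.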

Step 3: Test the workload

Once running, test the endpoints:

Chat completions:

curl -X POST "${DATAROBOT_ENDPOINT}/endpoints/workloads/${WORKLOAD_ID}/chat/completions" \
  -H "Authorization: Bearer ${DATAROBOT_API_TOKEN}" \
  -H "Content-Type: application/json" \
  -d '{"model": "openai/gpt-oss-20b", "messages": [{"role": "user", "content": "Hello!"}]}'

Step 4: Create a deployment

Link the workload to a Deployment for monitoring, sharing, and audit trails:

curl -X POST "${DATAROBOT_ENDPOINT}/console/deployments/" \
  -H "Authorization: Bearer ${DATAROBOT_API_TOKEN}" \
  -H "Content-Type: application/json" \
  -d '{
    "workloadId": "'${WORKLOAD_ID}'",
    "name": "agent-service-deployment",
    "importance": "moderate"
  }'

Step 5: Invoke via deployment endpoint

After creating the deployment, invoke your service through the governed deployment endpoint:

curl -X POST "${DATAROBOT_ENDPOINT}/endpoints/deployments/${DEPLOYMENT_ID}/chat/completions" \
  -H "Authorization: Bearer ${DATAROBOT_API_TOKEN}" \
  -H "Content-Type: application/json" \
  -d '{"model": "openai/gpt-oss-20b", "messages": [{"role": "user", "content": "Hello!"}]}'