Get started: Workload API

Premium feature

The Workload API is a premium feature. Contact your DataRobot representative or administrator for information on enabling it.

Run AI workloads and get production-ready endpoints. Build anything that listens on HTTP.

The Workload API runs containerized AI services on DataRobot. Bring a container image to get a stable URL with autoscaling, monitoring, sharing, importance-based prioritization, and an API-driven end-to-end lifecycle—from early development to production rollouts.

What you can build

The Workload API is intentionally container-shaped—anything that listens on HTTP is a viable Workload.

Pattern	Example
Agent services	LangGraph, CrewAI, AutoGen, and Google ADK orchestrations on top of any LLM, or your own framework.
Model inference servers	NVIDIA NIM, vLLM, TGI, and custom GPU servers with autoscaling and bundle-based GPU selection.
RAG pipelines	Retrievers, rerankers, and embedding services with shared backends.
MCP servers	Model Context Protocol servers that expose tools to agents.
Vector databases	Qdrant, Weaviate, Milvus, or anything else with an HTTP API.
Frontends and dashboards	Streamlit, Gradio, FastAPI apps, and custom UIs.

New here?

Deploy your first Workload in five minutes with Tutorial: Hello, Workload, a runnable notebook with paired curl and Python cells that create a draft Workload, wait for it to reach the running state, and invoke the endpoint.

The model in 30 seconds

The Workload API consists of three objects, flowing in one direction:

flowchart LR
    A["<b>Artifact</b><br/><i>What to run</i>"] --> W["<b>Workload</b><br/><i>Governed identity</i>"]
    W --> P["<b>Protons / Pods</b><br/><i>Runtime backing / Kubernetes execution</i>"]

Entity	Role	What it carries
Artifact	What to run	A container specification (image, ports, entrypoint, environment, resources, probes) and its lifecycle (`draft` or `locked`).
Workload	Governed identity	The identity you hand to consumers: a stable invoke endpoint plus sharing settings, importance, monitoring, and a lifetime policy that moves from ephemeral draft (8-hour TTL) to persistent production on promote.
Proton	Runtime backing	The running container instances (and their pods) that execute the artifact behind the Workload URL.

The artifact describes what. The Workload is the governed identity you hand to consumers. Protons are the execution layer—you'll work at the Workload level most of the time and only drop down to protons when you're inspecting status, debugging a failure, or validating a candidate during a replacement.

Lifecycle state flows up: from pods to protons to the Workload. A Workload always has at least one proton; during a replacement it can temporarily have two (active and candidate), and high-availability Workloads can have multiple protons permanently. A replacement adds a candidate proton alongside the active one and shifts traffic between them; Workload identity stays put. A promotion flips the artifact from draft to locked and the Workload from ephemeral (8-hour TTL) to persistent—identity preserved.
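
The promotion rule described above can be sketched as a toy in-memory model. This is an illustration only, not the DataRobot client or the API schema: the class names, the `workload_id` field, the `promote` helper, and the image name are hypothetical. Only the `draft`/`locked` statuses and the 8-hour TTL come from the text.

```python
from dataclasses import dataclass
from datetime import timedelta
from typing import Optional

DRAFT_TTL = timedelta(hours=8)  # 8-hour TTL for draft Workloads, per the text above


@dataclass
class Artifact:
    image: str
    status: str = "draft"  # "draft" or "locked"


@dataclass
class Workload:
    workload_id: str  # hypothetical field name standing in for the stable identity
    artifact: Artifact

    @property
    def ttl(self) -> Optional[timedelta]:
        # Lifetime follows artifact status: draft is ephemeral, locked is persistent.
        return DRAFT_TTL if self.artifact.status == "draft" else None


def promote(workload: Workload) -> Workload:
    """Toy promotion: lock the artifact in place; the Workload identity is unchanged."""
    workload.artifact.status = "locked"
    return workload


# Made-up ID and image name, used only for illustration.
wl = Workload(workload_id="wl-123", artifact=Artifact(image="example/whoami:latest"))
ttl_before = wl.ttl
promote(wl)
print(ttl_before, wl.ttl, wl.workload_id)
```

The point of the sketch is that promotion changes the artifact status and the lifetime, while the identifier consumers hold stays the same.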

Pick your path

Goal	Go here
Deploy your first container and see traffic flow end-to-end	Tutorial: Hello, Workload
Take a service to production with sharing, importance, and monitoring	Tutorial: Deploy a production-ready container
See what your code is doing inside each request—traces, metrics, and logs	Instrument a Workload with OpenTelemetry (Python)
Ship a new container version without dropping the endpoint	Tutorial: Replace the artifact behind a running Workload

Two ways to create a Workload

POST /workloads/ accepts either an inline artifact spec (artifact) or a reference to an existing artifact (artifactId). Exactly one of the two is required.

Mode	Purpose
Inline (`artifact: { ... }`)	Quick experiments and hello-world Workloads. One request creates the artifact and the Workload together; the artifact is created in `draft` status.
By reference (`artifactId: "..."`)	The artifact is governed separately—created via `POST /artifacts`, locked from a previous draft, or used as a shared versioned baseline. Required when the same locked artifact serves more than one Workload (for example, multi-region or A/B deployments).
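
As a minimal sketch of the "exactly one of the two" rule, the helper below builds a request body for `POST /workloads/` and rejects input that supplies both or neither source. The function name `build_create_workload_body` is hypothetical, and the keys inside the inline spec are placeholders rather than the documented artifact schema. Only the top-level `artifact` and `artifactId` field names come from the text above.

```python
import json
from typing import Any, Dict, Optional


def build_create_workload_body(
    artifact: Optional[Dict[str, Any]] = None,
    artifact_id: Optional[str] = None,
) -> Dict[str, Any]:
    """Return a JSON-ready body for POST /workloads/ with exactly one artifact source."""
    # True when both are missing or both are present; either case is invalid.
    if (artifact is None) == (artifact_id is None):
        raise ValueError("Provide exactly one of 'artifact' or 'artifactId'.")
    if artifact is not None:
        return {"artifact": artifact}
    return {"artifactId": artifact_id}


# Inline mode. The keys inside the spec are placeholders, not the documented schema.
inline_body = build_create_workload_body(artifact={"image": "example/whoami:latest"})

# By-reference mode. "artifact-123" is a made-up ID.
reference_body = build_create_workload_body(artifact_id="artifact-123")

print(json.dumps(inline_body))
print(json.dumps(reference_body))
```

Validating on the client side is a design choice for faster feedback; the server presumably enforces the same rule.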

Core concepts at a glance

Use this section as a compass. Each row points to deeper pages; details live in the linked sections.

Topic	Description	Reference
Artifact	Container spec plus draft/locked lifecycle, configuration layering, and repositories for version history.	Artifact concepts, REST: artifacts.
Workload	Stable identity, invoke URL, importance, sharing, monitoring, and lifetime rules tied to artifact status.	Workload concepts, Choose draft vs. locked.
Runtime execution	Pods back protons; Workload status aggregates pod and proton state (including worst-state-wins for replicas).	Lifecycle states, Replace and roll out.
Day zero to production	Draft Workloads for iteration; lock (or promote) for indefinite lifetime; rolling replacement for zero-downtime upgrades.	Hello, Workload, Production-ready tutorial, Promote.
Observability	OTel logs, metrics, traces; Workload stats, events, and history; retention differs for draft vs. locked.	Monitoring concepts, Health and readiness.
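
The "worst-state-wins" aggregation mentioned in the Runtime execution row can be illustrated with a short sketch. The state names and their severity ordering below are assumptions invented for this example; the real states and rules are defined in the Lifecycle states page, and the platform computes status for you.

```python
from typing import Iterable

# Hypothetical state names, ordered from healthiest to worst.
SEVERITY = ["running", "starting", "degraded", "failed"]


def aggregate_status(states: Iterable[str]) -> str:
    """Return the worst state in `states` under the assumed SEVERITY ordering."""
    states = list(states)
    if not states:
        raise ValueError("at least one state is required")
    # A higher index in SEVERITY means a worse state, so max() picks the worst.
    return max(states, key=SEVERITY.index)


pod_states = ["running", "running", "starting"]
proton_status = aggregate_status(pod_states)         # pods roll up to a proton
workload_status = aggregate_status([proton_status])  # protons roll up to the Workload
print(proton_status, workload_status)
```

Under this assumed ordering, a single replica that is still starting keeps the aggregate from reporting `running`.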

Design principles

These principles explain why the API is shaped the way it is—useful when you're deciding where a piece of configuration belongs or whether a workflow fits the model.

Principle	Description
Separation of concerns	Artifacts define what the container is (image, entrypoint, probes, resource requests, baseline env). Workloads define the governed identity and per-deployment runtime (importance, sharing, autoscaling, runtime parameter overrides). The same artifact can back many Workloads across environments.
Immutability for production	Locked artifacts are immutable and versioned within an artifact repository. The same locked artifact can back staging, production, and per-region Workloads with confidence that the binary hasn't changed underneath them.
Progressive governance	Draft artifacts and draft Workloads exist for fast iteration: 8-hour TTL, no required `importance`. Governance applies when you lock—either directly with `PATCH /artifacts/{id}` or in place via `POST /workloads/{id}/promote`.
Infrastructure abstraction	Workload status is computed from pod predicates by the platform, not surfaced from any specific underlying operator. The API surface is uniform across Workload types so the same lifecycle semantics apply whether your Workload runs as a Kubernetes Deployment, a NIM custom resource, or another execution shape.
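
The two locking paths named under Progressive governance can be written down as plain request descriptors. This is a sketch under assumptions: the `Request` type and both helper names are hypothetical, the IDs are made up, and request bodies are omitted because this page does not specify them. Only the HTTP methods and path templates come from the table above.

```python
from typing import NamedTuple


class Request(NamedTuple):
    method: str
    path: str


def lock_artifact_request(artifact_id: str) -> Request:
    """Direct path: lock the artifact itself."""
    return Request("PATCH", f"/artifacts/{artifact_id}")


def promote_workload_request(workload_id: str) -> Request:
    """In-place path: promote the Workload, which flips its artifact to locked."""
    return Request("POST", f"/workloads/{workload_id}/promote")


# Made-up IDs for illustration.
print(lock_artifact_request("artifact-123"))
print(promote_workload_request("wl-123"))
```

Either descriptor could be handed to whatever HTTP client you already use, along with your own base URL and authentication.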

Next steps

Next, learn about Workload API endpoints and run the hello-world tutorial to deploy a real container in five minutes.

Resource	Description
Hello, Workload	Deploy `whoami` as a draft Workload, following the shortest path from zero to a running container on DataRobot.
API quick reference	Authentication, endpoint groups, and links into generated API documentation.
Best practices and troubleshooting	Container design, production hardening, security, and recovery steps for common failures.
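Workload API skill	Install the `datarobot-workload-api` coding agent skill to deploy, debug, monitor, and manage Workloads in natural language from Claude Code, Cursor, and other agents.
Manage artifacts with the CLI	`dr artifact` reference: create, build, lock, and manage the code for Workload artifacts from the terminal.
Manage Workloads with the CLI	`dr workload` reference: create, start, stop, monitor, and delete Workloads from the terminal.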