Replace and roll out

A replacement transitions a Workload onto a new proton—either to pick up updated runtime settings against the same artifact, or to swap the artifact behind the Workload. The Workload identity, endpoint, and governance stay put; the platform brings up a candidate proton alongside the active one, and traffic shifts between them.

Replace rules by artifact status

When the replacement also swaps the artifact behind a Workload, the artifact statuses must match.

| Current artifact | New artifact | Allowed | Notes |
| --- | --- | --- | --- |
| draft | draft | Yes | Iterative development: swap draft for draft. |
| locked | locked | Yes | Production update: swap locked for locked. |
| draft | locked | No | Use POST /workloads/{workload_id}/promote for in-place promotion (see Promote to production). |
| locked | draft | No | Locked Workloads cannot downgrade to draft. |
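
The table reduces to one rule: a swap is allowed only when the current and new artifact statuses are the same. The following is a minimal shell sketch of a client-side pre-check, assuming you already know both statuses. The function name replacement_allowed is hypothetical and not part of the API; the platform remains the authority on what it accepts.

```shell
# Hypothetical client-side pre-check that mirrors the table above.
# $1: status of the current artifact, $2: status of the new artifact.
replacement_allowed() {
  current="$1"
  new="$2"
  case "${current}:${new}" in
    draft:draft|locked:locked)
      echo "allowed" ;;
    draft:locked)
      echo "not allowed: use POST /workloads/{workload_id}/promote" ;;
    locked:draft)
      echo "not allowed: locked Workloads cannot downgrade to draft" ;;
    *)
      echo "unknown status pair: ${current} -> ${new}" ;;
  esac
}

replacement_allowed draft draft
replacement_allowed draft locked
```

Running the check before calling the API only saves a round trip; it does not replace the server-side validation.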

Rollout strategy

A replacement specifies its strategy on the StartReplacementRequest.strategy field. The current ReplacementStrategy enum exposes a single value:

| Strategy | How it works | Rollback speed | Resource cost | Best for |
| --- | --- | --- | --- | --- |
| rolling | New proton is deployed alongside the old one; traffic switches to the new proton once it is ready, then the old proton is decommissioned. | Minutes | 1× + surge | Standard updates. |

ReplacementConfig carries two timing controls:

| Field | Purpose |
| --- | --- |
| warmupDurationMinutes | How long the candidate warms up before traffic exposure (default 0). |
| keepOldVersionMinutes | How long the platform keeps the previous proton after the switch as the rollback window (default 0). |
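
As an illustration of how the two fields fit together, here is a small shell sketch that builds a ReplacementConfig object with both values defaulting to 0. The helper name replacement_config_json is hypothetical, and the check for whole, non-negative minutes is an assumption drawn from the field names rather than a documented validation rule.

```shell
# Hypothetical helper: print a ReplacementConfig JSON object.
# $1: warmupDurationMinutes, $2: keepOldVersionMinutes (both default to 0).
replacement_config_json() {
  warmup="${1:-0}"
  keep="${2:-0}"
  for value in "$warmup" "$keep"; do
    case "$value" in
      # Assumption: values are whole, non-negative minutes without leading zeros.
      ''|*[!0-9]*|0?*)
        echo "expected a non-negative integer, got: ${value}" >&2
        return 1 ;;
    esac
  done
  printf '{"warmupDurationMinutes": %s, "keepOldVersionMinutes": %s}\n' "$warmup" "$keep"
}

replacement_config_json 5 30
```

With both values left at 0, the candidate gets no warmup period and the previous proton is not kept as a rollback window, so a non-zero keepOldVersionMinutes is worth setting whenever a quick revert matters.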

Trigger a replacement

There are two ways to start a replacement, with different scope.

| Endpoint | Use case | Body | Returns |
| --- | --- | --- | --- |
| POST /workloads/{workload_id}/replacement | Swap the artifact behind a Workload, optionally with a new runtime, choosing the rollout strategy and config. | StartReplacementRequest (artifactId and strategy required; config and runtime optional, and omitting runtime reuses the current one). | 202 with a Replacement. |
| PATCH /workloads/{workload_id}/settings | Update the runtime against the current artifact (replica count, autoscaling, resources). The platform queues a rolling replacement onto the new runtime. | WorkloadSettingsUpdate with runtime. | 202 with a Replacement. |

Start a replacement onto a new artifact, with a 5-minute warmup (warmupDurationMinutes) and a 30-minute rollback window (keepOldVersionMinutes):

curl -X POST "${DATAROBOT_ENDPOINT}/workloads/${WORKLOAD_ID}/replacement" \
  -H "Authorization: Bearer ${DATAROBOT_API_TOKEN}" \
  -H "Content-Type: application/json" \
  -d '{
    "artifactId": "'"${NEW_ARTIFACT_ID}"'",
    "strategy": "rolling",
    "config": {
      "warmupDurationMinutes": 5,
      "keepOldVersionMinutes": 30
    }
  }'

Or queue a runtime-only replacement against the current artifact (scale replicas and enable autoscaling):

curl -X PATCH "${DATAROBOT_ENDPOINT}/workloads/${WORKLOAD_ID}/settings" \
  -H "Authorization: Bearer ${DATAROBOT_API_TOKEN}" \
  -H "Content-Type: application/json" \
  -d '{
    "runtime": {
      "containerGroups": [{
        "name": "default",
        "replicaCount": 4,
        "autoscaling": {
          "enabled": true,
          "policies": [{
            "scalingMetric": "httpRequestsConcurrency",
            "target": 20,
            "minCount": 2,
            "maxCount": 10
          }]
        }
      }]
    }
  }'

For scalingMetric values and scale-to-zero behavior (httpRequestsConcurrency with minCount: 0), see Scaling metrics.

If a replacement is already in progress for the Workload, POST /workloads/{workload_id}/replacement returns 422.
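
A script that starts replacements can branch on the HTTP status code. The sketch below assumes only the two codes named in this section (202 and 422) and treats everything else generically; start_replacement and describe_replacement_response are hypothetical helper names. A 422 may also cover other rejected requests, so the message is worded cautiously.

```shell
# Send a StartReplacementRequest and print only the HTTP status code.
# $1: JSON request body.
start_replacement() {
  curl -s -o /dev/null -w '%{http_code}' -X POST \
    "${DATAROBOT_ENDPOINT}/workloads/${WORKLOAD_ID}/replacement" \
    -H "Authorization: Bearer ${DATAROBOT_API_TOKEN}" \
    -H "Content-Type: application/json" \
    -d "$1"
}

# Map a status code to a short description of the next step.
describe_replacement_response() {
  case "$1" in
    202) echo "accepted: replacement queued" ;;
    422) echo "rejected: a replacement may already be in progress" ;;
    *)   echo "unexpected status: $1" ;;
  esac
}
```

A call would look like describe_replacement_response "$(start_replacement "$BODY")", where BODY holds the JSON request shown earlier.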

Inspect or cancel an active replacement

| Verb | Path | Description |
| --- | --- | --- |
| GET | /workloads/{workload_id}/replacement | Get the current active Replacement for the Workload. |
| DELETE | /workloads/{workload_id}/replacement | Cancel an active replacement. Returns 202. |
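
Both calls can be wrapped in small shell functions, in the same curl style as the examples above. This is a sketch: the function names are hypothetical, and what GET returns when no replacement is active is not specified here, so the response is printed unmodified.

```shell
# Build the replacement URL for a Workload. $1: workload id.
replacement_url() {
  printf '%s/workloads/%s/replacement\n' "${DATAROBOT_ENDPOINT}" "$1"
}

# Print the current active Replacement as returned by the API.
get_replacement() {
  curl -s "$(replacement_url "$1")" \
    -H "Authorization: Bearer ${DATAROBOT_API_TOKEN}"
}

# Cancel the active replacement and print the HTTP status code
# (the table above says a cancel returns 202).
cancel_replacement() {
  curl -s -o /dev/null -w '%{http_code}' -X DELETE "$(replacement_url "$1")" \
    -H "Authorization: Bearer ${DATAROBOT_API_TOKEN}"
}
```

For example, get_replacement "${WORKLOAD_ID}" prints the Replacement, and cancel_replacement "${WORKLOAD_ID}" prints the status code of the cancel request.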

What gates the switch

The candidate proton reports running only when all of its pods reach pod phase Running and every container in those pods is ready (readiness probe passing). For multi-replica candidates, every replica has to pass before the candidate as a whole reports running. The traffic switch is gated by these predicates—see Lifecycle states for the full predicate table.
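
The all-replicas rule can be sketched as a small shell function. The input format (one line per replica, holding the pod phase and a true/false flag for "all containers ready") is a hypothetical simplification for illustration, not the shape of any API response, and treating an empty replica list as not running is also an assumption.

```shell
# Reads one line per replica on stdin: "<pod phase> <containers ready: true|false>".
# Prints "running" only if every replica is in phase Running and ready.
candidate_state() {
  seen=0
  while read -r phase ready; do
    [ -n "$phase" ] || continue
    seen=1
    if [ "$phase" != "Running" ] || [ "$ready" != "true" ]; then
      echo "not running"
      return 0
    fi
  done
  # Assumption: a candidate with no replicas listed is not running.
  if [ "$seen" -eq 1 ]; then echo "running"; else echo "not running"; fi
}

printf 'Running true\nRunning false\n' | candidate_state
```

In this example the second replica is in phase Running but has a container that is not ready, so the candidate as a whole does not report running.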

If the candidate hangs at warming or initializing, the cause is almost always a readiness-probe issue (wrong path or non-2xx response), an image-pull issue (often on a sidecar), or a crash loop.

Replacement status lifecycle

ReplacementStatus takes the values below. The platform drives the active transitions; clients observe.

| Status | Meaning |
| --- | --- |
| unknown | Default after creation, before the platform picks it up. |
| submitted | Accepted by the API. |
| initializing | Candidate proton is being prepared (resources allocated, pods starting). |
| promoting | Candidate is running and traffic is shifting from active to candidate. |
| deleting | Cancellation was requested via DELETE /workloads/{workload_id}/replacement. |
| finalizing | Traffic switch complete (happy path) or cancellation cleanup in progress (cancel path); resources are being torn down. |
| completed | Terminal: successful swap. |
| errored | Terminal: failed; see Replacement.message for details. |

Active states are unknown, submitted, initializing, promoting, finalizing, and deleting. Terminal states are completed and errored. Another POST /replacement while a replacement is in any active state returns 422.
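
Because clients only observe, a typical script polls until the status leaves the active set. The sketch below assumes the Replacement JSON exposes a top-level status field (suggested by Replacement.message, but not confirmed here) and that jq is installed. The function names and the POLL_INTERVAL_SECONDS variable are hypothetical. What GET returns once a replacement is terminal is not specified, so the loop stops on any value that is not an active state.

```shell
# Classify a ReplacementStatus value using the lists above.
replacement_phase() {
  case "$1" in
    unknown|submitted|initializing|promoting|finalizing|deleting) echo "active" ;;
    completed|errored) echo "terminal" ;;
    *) echo "unrecognized" ;;
  esac
}

# Assumption: the Replacement object has a top-level "status" field.
get_replacement_status() {
  curl -s "${DATAROBOT_ENDPOINT}/workloads/${WORKLOAD_ID}/replacement" \
    -H "Authorization: Bearer ${DATAROBOT_API_TOKEN}" | jq -r '.status'
}

# Poll until the status is no longer active, then print the last value seen.
wait_for_replacement() {
  interval="${POLL_INTERVAL_SECONDS:-15}"
  while :; do
    current_status="$(get_replacement_status)"
    if [ "$(replacement_phase "$current_status")" != "active" ]; then
      echo "$current_status"
      return 0
    fi
    sleep "$interval"
  done
}
```

Calling wait_for_replacement blocks until the replacement finishes; a result of errored is the cue to read Replacement.message.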

The platform drives the full transition automatically. There is no client-side switch action: once the candidate reaches promoting and the warmup window expires, traffic flips and the status moves through finalizing to completed on its own. There is also no rollback verb; to revert, issue another POST /workloads/{workload_id}/replacement with the previous artifactId. DELETE /workloads/{workload_id}/replacement is a cancel (it sets the status to deleting and reverts the candidate), not a way to "complete early."
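
Since a revert is just another replacement, it can be scripted like any other. The sketch below builds a request body that points back at a previous artifact; rollback_request_body and rollback_to are hypothetical helper names, and the artifact ID is assumed to contain no characters that need JSON escaping. The status-matching rule from the start of this page still applies to the previous artifact.

```shell
# Build a StartReplacementRequest body targeting a previous artifact.
# $1: the artifact ID to return to (assumed to need no JSON escaping).
rollback_request_body() {
  printf '{"artifactId": "%s", "strategy": "rolling"}\n' "$1"
}

# Start a replacement back onto the previous artifact.
rollback_to() {
  curl -s -X POST "${DATAROBOT_ENDPOINT}/workloads/${WORKLOAD_ID}/replacement" \
    -H "Authorization: Bearer ${DATAROBOT_API_TOKEN}" \
    -H "Content-Type: application/json" \
    -d "$(rollback_request_body "$1")"
}
```

A revert is subject to the same constraints as any other replacement, so it is rejected with 422 while another replacement is still active.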

Inspect candidate and active protons

While a replacement is active, the Workload's protons sub-resource lists both protons with their current role.

curl -s "${DATAROBOT_ENDPOINT}/workloads/${WORKLOAD_ID}/protons" \
  -H "Authorization: Bearer ${DATAROBOT_API_TOKEN}" | jq '.data[] | {id, artifactId, role, status}'

role is active or candidate (ProtonRole). To inspect per-replica readiness inside a proton—useful when a candidate is stuck—use GET /workloads/{workload_id}/protons/{proton_id}/statusDetails, which returns each replica's pod phase, container statuses, and readiness conditions.
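
To go from the protons listing to the per-replica details of a stuck candidate, the two calls can be chained. This sketch reuses the .data[], .id, and .role fields from the jq filter above; the helper names are hypothetical, and because the statusDetails response shape is not specified here, the output is pretty-printed without further filtering.

```shell
# Extract the candidate proton's id from a /protons listing read on stdin.
candidate_proton_id() {
  jq -r '.data[] | select(.role == "candidate") | .id'
}

# Fetch per-replica status details for one proton. $1: proton id.
proton_status_details() {
  curl -s "${DATAROBOT_ENDPOINT}/workloads/${WORKLOAD_ID}/protons/$1/statusDetails" \
    -H "Authorization: Bearer ${DATAROBOT_API_TOKEN}" | jq '.'
}
```

Pipe the output of the protons request into candidate_proton_id, then pass the resulting id to proton_status_details.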

Replacement history

GET /workloads/{workload_id}/history returns the replacement history for a Workload (paginated ReplacementHistoryListResponse). Each entry includes the candidate artifact ID, strategy, config, status, and the final outcome—useful for change tracking and post-incident review.