Artifact concepts

An artifact describes what a Workload runs: a container specification with image URI, port, entrypoint, environment variables, and health probes. Workloads are created from artifacts; one locked artifact can serve as the foundation for many Workloads.

Artifact types

Artifacts use a type discriminator. The API accepts service (default) and nim (NVIDIA NIM model artifacts). Both use the same multi-container containerGroups shape; NIM adds optional storage and GPU-oriented autoscaling metrics, which are documented in the generated API reference.

Type When to use
service Container-based Workloads where you supply images and the platform runs them (primary plus optional sidecar containers): inference servers, agent containers (LangGraph, CrewAI, AutoGen), APIs, and web services.
nim NVIDIA NIM model serving when you need NIM-specific scheduling, storage options, and scaling signals (gpuCacheUtilization, gpuRequestQueueDepth). See the Workload API reference for NimArtifactSpec and Scaling metrics.

NIM artifacts include an optional storage field (NimStorageConfig) with two choices: dedicatedPvc (default) gives the Workload its own PVC for model weights. Alternatively, nimCache uses a cluster-wide PVC keyed on the model image—pick nimCache when multiple Workloads share the same NIM image so that the cluster keeps a single cached copy of the weights.

Container requirements

Every container the platform runs must satisfy the following baseline requirements:

Requirement Why
Runs as non-root user The platform runs containers unprivileged.
Listens on a port between 1024 and 65535 Containers cannot bind to privileged ports (0-1023). The port is set on the artifact via containers[].port.
Exposes an HTTP server The platform proxies invoke traffic as HTTP to the primary container.
Implements a readiness probe endpoint The platform polls readinessProbe.path to determine readiness.
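As an illustration of the last three requirements, the following is a minimal sketch of a primary container process written with Python's standard library. The readiness path /ready and port 8080 are hypothetical choices, not platform defaults; whatever you choose must match readinessProbe.path and containers[].port on the artifact. Running as a non-root user is configured in the image itself and is not shown here.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer

READINESS_PATH = "/ready"  # hypothetical; must match readinessProbe.path on the artifact
PORT = 8080                # hypothetical; any unprivileged port in 1024-65535


class Handler(BaseHTTPRequestHandler):
    """Answers the readiness probe and returns 404 for every other path."""

    def do_GET(self):
        if self.path == READINESS_PATH:
            self._send(200, {"status": "ready"})
        else:
            self._send(404, {"error": "not found"})

    def _send(self, code, payload):
        body = json.dumps(payload).encode("utf-8")
        self.send_response(code)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, fmt, *args):
        pass  # keep the example quiet; real services would log requests


def make_server(port=PORT, host="0.0.0.0"):
    """Build (but do not start) an HTTP server bound to an unprivileged port."""
    return HTTPServer((host, port), Handler)

# To run inside the container: make_server().serve_forever()
```

A real service would add its own request handlers next to the readiness check and report ready only once its dependencies (for example, loaded model weights) are available.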

Primary vs. non-primary containers

A container group has exactly one primary container (primary: true) plus any number of sidecars. The primary container must define a port; non-primary containers (sidecars) must omit port. Assigning these incorrectly returns a 422 validation error at artifact create or update.
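The primary and sidecar rules can be checked locally before you send a request. The sketch below is a hypothetical client-side pre-check, not the platform's validator: it assumes each container is a plain dict carrying the name, primary, and port keys described above, and the error messages are invented for illustration.

```python
def validate_container_group(containers):
    """Return a list of problems; an empty list means these checks passed."""
    errors = []

    # Exactly one container in the group may be marked primary.
    primaries = [c for c in containers if c.get("primary")]
    if len(primaries) != 1:
        errors.append(
            f"expected exactly one primary container, found {len(primaries)}"
        )

    for c in containers:
        name = c.get("name", "<unnamed>")
        port = c.get("port")
        if c.get("primary"):
            # The primary must define a port in the unprivileged range.
            if port is None:
                errors.append(f"{name}: primary container must define port")
            elif not 1024 <= port <= 65535:
                errors.append(f"{name}: port {port} is outside 1024-65535")
        elif port is not None:
            # Sidecars must omit port entirely.
            errors.append(f"{name}: sidecar must omit port")
    return errors
```

For example, a group with a primary on port 8080 and a sidecar without a port returns an empty list, while a sidecar that sets port yields one error. Passing this check does not guarantee the API accepts the artifact; it only catches the mistakes named in this section earlier than a 422 would.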

Image build configuration

For draft service artifacts, you can set imageBuildConfig on a container instead of imageUri. The platform then builds the image from uploaded source code (POST /artifacts/{id}/builds). Provide either a pre-built imageUri or an imageBuildConfig, not both. After a successful build, the platform populates imageUri.

ImageBuildConfig.dockerfile is a discriminated union on source (defaults to provided using ./Dockerfile from your source):

source Schema Fields
provided ProvidedDockerfile path (string, default ./Dockerfile)—relative path to the Dockerfile in synced source code.
generated GeneratedDockerfile executionEnvironmentId, executionEnvironmentVersionId, entrypoint—platform generates a Dockerfile from the execution environment base image.

Example using a provided Dockerfile:

"imageBuildConfig": {
  "dockerfile": {
    "source": "provided",
    "path": "./Dockerfile"
  }
} 

Omit path to use the default ./Dockerfile. See Image builds REST.
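To make the image source rules concrete, here is a small sketch that inspects a container dict and reports which source it uses. It is a hypothetical helper, not part of any SDK, and it rests on two assumptions: that exactly one of imageUri or imageBuildConfig must be present, and that an imageBuildConfig without a dockerfile entry falls back to the provided default.

```python
DEFAULT_DOCKERFILE_PATH = "./Dockerfile"


def image_source(container):
    """Return a (kind, detail) tuple describing where the image comes from."""
    has_uri = "imageUri" in container
    has_build = "imageBuildConfig" in container
    if has_uri == has_build:
        # Both set, or neither set: the container has no single image source.
        raise ValueError("provide exactly one of imageUri or imageBuildConfig")

    if has_uri:
        return ("prebuilt", container["imageUri"])

    # Assumption: a missing dockerfile entry means the "provided" default.
    dockerfile = container["imageBuildConfig"].get("dockerfile", {})
    source = dockerfile.get("source", "provided")
    if source == "provided":
        return ("provided", dockerfile.get("path", DEFAULT_DOCKERFILE_PATH))
    if source == "generated":
        return ("generated", dockerfile["executionEnvironmentId"])
    raise ValueError(f"unknown dockerfile source: {source!r}")
```

Calling image_source({"imageBuildConfig": {"dockerfile": {"source": "provided"}}}) returns ("provided", "./Dockerfile"), which mirrors the default described above.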

Artifact lifecycle

This section defines draft and locked as they relate to artifacts. For the Workload-creation decision—when each fits and what each implies for TTL (time-to-live), importance, and replace rules—see Choose draft vs. locked. Each status differs in lifecycle and editability.

Status Editable Description
draft Yes Default status. Mutable. Update via PATCH/PUT during development.
locked No Immutable. Cannot be modified once set. Required for production Workloads.

Locking is one-way: locked artifacts cannot return to draft. To iterate further on a locked artifact, create a new draft artifact. You can lock an artifact using either of the following methods:

Method Process
Direct lock Call PATCH /artifacts/{id} with {"status": "locked"}. This automatically resets the associated Workload's statistics so production starts from a clean baseline.
Promote Call POST /workloads/{id}/promote; see Promote to production. This also resets statistics and removes the draft Workload's 8-hour TTL in a single call.
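The two calls can be sketched with Python's standard library as follows. The base URL, the bearer-token header, and the empty body on the promote call are assumptions made for illustration; only the paths, HTTP methods, and the {"status": "locked"} payload come from the table above. The functions build the requests without sending them.

```python
import json
import urllib.parse
import urllib.request

BASE_URL = "https://platform.example.com/api"  # hypothetical base URL


def build_lock_request(artifact_id, token):
    """PATCH /artifacts/{id} with {"status": "locked"} (direct lock)."""
    body = json.dumps({"status": "locked"}).encode("utf-8")
    return urllib.request.Request(
        url=f"{BASE_URL}/artifacts/{urllib.parse.quote(artifact_id, safe='')}",
        data=body,
        method="PATCH",
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {token}",  # assumed auth scheme
        },
    )


def build_promote_request(workload_id, token):
    """POST /workloads/{id}/promote; an empty body is assumed here."""
    return urllib.request.Request(
        url=f"{BASE_URL}/workloads/{urllib.parse.quote(workload_id, safe='')}/promote",
        data=b"",
        method="POST",
        headers={"Authorization": f"Bearer {token}"},  # assumed auth scheme
    )
```

To send either request, pass it to urllib.request.urlopen. Because locking is one-way, it is worth confirming the artifact ID before executing the call.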

Deletion rules

Locked artifacts cannot be deleted. Artifacts with running Workloads (draft or locked) cannot be deleted either; stop or delete the backing Workloads first.

Artifact vs. Workload: what lives where

The artifact defines what runs; the Workload runtime defines how it runs. The runtime does not accept per-Workload environment variable overrides—values that need to vary across deployments belong in the artifact's environmentVars.

Layer What lives here Mutability Key fields
Artifact (spec.containerGroups[].containers[]) Container topology—image URI or build config, port, entrypoint, environment variables, and probes. Immutable once locked. imageUri, imageBuildConfig, port, entrypoint, environmentVars, readinessProbe.
Workload (runtime.containerGroups[]) Deployment-time settings—replicas, autoscaling, per-container resource allocation, resource bundles. Always mutable. On a locked Workload, changes via PATCH /workloads/{id}/settings trigger a rolling replacement rather than taking effect immediately. (It is the artifact spec that becomes immutable at lock time, not the runtime.) replicaCount, autoscaling, per-container resourceAllocation, resourceBundles.

Container and group name rules

Each container's name (and the matching name on a runtime override) follows DNS-label syntax—lowercase letters, digits, and hyphens; must start with a lowercase letter and end with a letter or digit; up to 63 characters. The runtime matches its containerGroups[].containers[] entries to the artifact by these names, so they must agree exactly.
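The naming rule translates directly into a regular expression. The sketch below is a hypothetical local check, not the platform's validator; it also shows one way to find runtime container names that have no exact match in the artifact, assuming each container is a dict with a name key.

```python
import re

# Starts with a lowercase letter, ends with a letter or digit,
# allows hyphens in between, and is at most 63 characters long.
DNS_LABEL = re.compile(r"[a-z](?:[a-z0-9-]{0,61}[a-z0-9])?")


def is_valid_name(name):
    """True when name satisfies the DNS-label rule described above."""
    return DNS_LABEL.fullmatch(name) is not None


def unmatched_runtime_names(artifact_containers, runtime_containers):
    """Return runtime container names with no exact match in the artifact."""
    artifact_names = {c["name"] for c in artifact_containers}
    return [c["name"] for c in runtime_containers if c["name"] not in artifact_names]
```

For instance, is_valid_name("log-shipper-2") is true, while "Api", "2fast", and "api-" are all rejected. Matching is case-sensitive, so a runtime entry named "API" does not match an artifact container named "api".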

Consider the following guidance when deciding where to define a field:

  • If a field is part of the container's identity (which code runs, what ports it listens on, which env vars it needs), it belongs in the artifact spec.
  • If a field is a deployment-time knob (replica count, CPU allocation, scaling policy), it belongs in the Workload runtime.
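The guidance above can be expressed as a simple lookup. This sketch sorts a flat dict of settings into the layer where each field belongs, using only the field names listed in the table earlier in this section; the function and its flat input shape are hypothetical and exist purely to illustrate the split.

```python
# Field names taken from the artifact and Workload rows of the table above.
ARTIFACT_FIELDS = {
    "imageUri", "imageBuildConfig", "port",
    "entrypoint", "environmentVars", "readinessProbe",
}
RUNTIME_FIELDS = {
    "replicaCount", "autoscaling", "resourceAllocation", "resourceBundles",
}


def split_by_layer(settings):
    """Split settings into (artifact, runtime, unknown) dicts by field name."""
    artifact, runtime, unknown = {}, {}, {}
    for key, value in settings.items():
        if key in ARTIFACT_FIELDS:
            artifact[key] = value   # part of the container's identity
        elif key in RUNTIME_FIELDS:
            runtime[key] = value    # deployment-time knob
        else:
            unknown[key] = value    # not covered by this section
    return artifact, runtime, unknown
```

A call such as split_by_layer({"port": 8080, "replicaCount": 3}) places port in the artifact dict and replicaCount in the runtime dict.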

Artifact repositories

Artifact repositories group artifact versions and provide:

Feature What you get
Version history A traceable lineage of artifact revisions in a single location.
Shared governance A sharedRoles grant on the repository that controls who can read or modify the collection.
Discoverability Easier discovery of artifacts that belong to the same product or team.

The platform creates a repository automatically the first time you create an artifact with artifactRepositoryId set.