Artifact concepts
An artifact describes what a Workload runs: a container specification with image URI, port, entrypoint, environment variables, and health probes. Workloads are created from artifacts; one locked artifact can serve as the foundation for many Workloads.
Artifact types
Artifacts use a type discriminator. The API accepts service (default) and nim (NVIDIA NIM model artifacts). Both use the same multi-container containerGroups shape; NIM adds optional storage and GPU-oriented autoscaling metrics, which are described in the generated API reference.
| Type | When to use |
|---|---|
| service | Container-based Workloads where you supply images and the platform runs them (primary plus optional sidecar containers): inference servers, agent containers (LangGraph, CrewAI, AutoGen), APIs, and web services. |
| nim | NVIDIA NIM model serving when you need NIM-specific scheduling, storage options, and scaling signals (gpuCacheUtilization, gpuRequestQueueDepth). See the Workload API reference for NimArtifactSpec and Scaling metrics. |
NIM artifacts include an optional storage field (NimStorageConfig) with two choices: dedicatedPvc (default) gives the Workload its own PVC for model weights. Alternatively, nimCache uses a cluster-wide PVC keyed on the model image—pick nimCache when multiple Workloads share the same NIM image so that the cluster keeps a single cached copy of the weights.
Container requirements
Every container the platform runs must satisfy the following baseline requirements:
| Requirement | Why |
|---|---|
| Runs as non-root user | The platform runs containers unprivileged. |
| Listens on a port between 1024 and 65535 | Containers cannot bind to privileged ports (0-1023). The port is set on the artifact via containers[].port. |
| Exposes an HTTP server | The platform proxies invoke traffic as HTTP to the primary container. |
| Implements a readiness probe endpoint | The platform polls readinessProbe.path to determine readiness. |
Primary vs. non-primary containers
A container group has exactly one primary container (primary: true) plus any number of sidecars. The primary container must define a port; non-primary containers (sidecars) must omit port. Assigning these incorrectly returns a 422 validation error at artifact create or update.
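The port and primary/sidecar rules above can be checked locally before an artifact is submitted. The following is a minimal sketch, not the platform's own validator: the function name validate_container_group is hypothetical, and it assumes each container is a plain dictionary with the name, primary, and port keys described in this section.

```python
from typing import Any


def validate_container_group(containers: list[dict[str, Any]]) -> list[str]:
    """Return rule violations for one container group; an empty list means it passes.

    Hypothetical local pre-check that mirrors the documented rules only.
    """
    errors: list[str] = []

    # Rule: exactly one container in the group is marked primary.
    primaries = [c for c in containers if c.get("primary") is True]
    if len(primaries) != 1:
        errors.append(f"expected exactly one primary container, found {len(primaries)}")

    for c in containers:
        name = c.get("name", "<unnamed>")
        port = c.get("port")
        if c.get("primary") is True:
            # Rule: the primary container must define a non-privileged port.
            if port is None:
                errors.append(f"{name}: primary container must define a port")
            elif not 1024 <= port <= 65535:
                errors.append(f"{name}: port {port} is outside 1024-65535")
        elif port is not None:
            # Rule: sidecars must omit port.
            errors.append(f"{name}: sidecar must omit port")
    return errors
```

Running this before artifact create or update is one way to surface the same mistakes that would otherwise come back as a 422 validation error.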
Image build configuration
For draft service artifacts, you can set imageBuildConfig on a container instead of imageUri; the platform then builds the image from uploaded source code (POST /artifacts/{id}/builds). Provide either a pre-built imageUri or an imageBuildConfig, not both. After a successful build, the platform populates imageUri.
ImageBuildConfig.dockerfile is a discriminated union on source (defaults to provided using ./Dockerfile from your source):
| source | Schema | Fields |
|---|---|---|
| provided | ProvidedDockerfile | path (string, default ./Dockerfile): relative path to the Dockerfile in synced source code. |
| generated | GeneratedDockerfile | executionEnvironmentId, executionEnvironmentVersionId, entrypoint: the platform generates a Dockerfile from the execution environment base image. |
Example using a provided Dockerfile:
"imageBuildConfig": {
  "dockerfile": {
    "source": "provided",
    "path": "./Dockerfile"
  }
}
Omit path to use the default ./Dockerfile. See Image builds REST.
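As a rough illustration of the "imageUri or imageBuildConfig, not both" rule, the sketch below assembles the image-related fields of a container entry. The helper names provided_dockerfile and image_source are hypothetical, and treating "neither field supplied" as an error is an assumption drawn from the wording above rather than confirmed API behavior.

```python
from typing import Any, Optional


def provided_dockerfile(path: Optional[str] = None) -> dict[str, Any]:
    """Build an imageBuildConfig body that uses a Dockerfile from the source code.

    Omitting path leaves the documented default (./Dockerfile) in effect.
    """
    dockerfile: dict[str, Any] = {"source": "provided"}
    if path is not None:
        dockerfile["path"] = path
    return {"dockerfile": dockerfile}


def image_source(
    image_uri: Optional[str] = None,
    build_config: Optional[dict[str, Any]] = None,
) -> dict[str, Any]:
    """Return the image fields for a container, enforcing exactly one source."""
    if (image_uri is None) == (build_config is None):
        # Assumption: supplying neither is rejected, like supplying both.
        raise ValueError("provide exactly one of imageUri or imageBuildConfig")
    if image_uri is not None:
        return {"imageUri": image_uri}
    return {"imageBuildConfig": build_config}
```

For example, image_source(build_config=provided_dockerfile()) yields a container fragment that relies on the default Dockerfile path.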
Artifact lifecycle
This section defines draft and locked as they relate to artifacts. For the Workload-creation decision—when each fits and what each implies for TTL (time-to-live), importance, and replace rules—see Choose draft vs. locked. Each status differs in lifecycle and editability.
| Status | Editable | Description |
|---|---|---|
| draft | Yes | Default status. Mutable. Update via PATCH/PUT during development. |
| locked | No | Immutable. Cannot be modified once set. Required for production Workloads. |
Locking is one-way: locked artifacts cannot return to draft. To iterate further on a locked artifact, create a new draft artifact. You can lock an artifact using either of the following methods:
| Method | Process |
|---|---|
| Direct lock | Call PATCH /artifacts/{id} with {"status": "locked"}. This automatically resets the associated Workload's statistics so production starts from a clean baseline. |
| Promote | Call POST /workloads/{id}/promote; see Promote to production. This also wipes stats and removes the draft Workload's 8-hour TTL in a single call. |
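The two locking methods in the table can be sketched as HTTP requests using only the Python standard library. This is an illustrative sketch under several assumptions: the base URL, the bearer-token Authorization header, and the empty body on the promote call are all placeholders that are not specified in this section. The functions only build the requests; they do not send them.

```python
import json
import urllib.request


def build_lock_request(base_url: str, artifact_id: str, token: str) -> urllib.request.Request:
    """Build (but do not send) the direct-lock call: PATCH /artifacts/{id}."""
    body = json.dumps({"status": "locked"}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/artifacts/{artifact_id}",
        data=body,
        method="PATCH",
        headers={
            "Content-Type": "application/json",
            # Assumption: bearer-token auth; substitute your platform's scheme.
            "Authorization": f"Bearer {token}",
        },
    )


def build_promote_request(base_url: str, workload_id: str, token: str) -> urllib.request.Request:
    """Build (but do not send) the promote call: POST /workloads/{id}/promote."""
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/workloads/{workload_id}/promote",
        method="POST",
        # Assumption: no request body is required; check the API reference.
        headers={"Authorization": f"Bearer {token}"},
    )
```

To execute either request, pass it to urllib.request.urlopen. Because locking is one-way, it is worth confirming the artifact ID before sending.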
Deletion rules
Locked artifacts cannot be deleted. Artifacts with running Workloads (draft or locked) cannot be deleted either; stop or delete the backing Workloads first.
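The lifecycle and deletion rules can be summarized as a small in-memory model. This is a sketch for reasoning about the rules, not a client for the API: the ArtifactModel class and its running_workloads counter are hypothetical names introduced for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class ArtifactModel:
    """Hypothetical in-memory model of the draft/locked rules described above."""

    status: str = "draft"
    spec: dict = field(default_factory=dict)
    running_workloads: int = 0  # number of Workloads currently backed by this artifact

    def update_spec(self, changes: dict) -> None:
        # Draft artifacts are mutable; locked artifacts are immutable.
        if self.status == "locked":
            raise PermissionError("locked artifacts cannot be modified")
        self.spec.update(changes)

    def set_status(self, new_status: str) -> None:
        if new_status not in ("draft", "locked"):
            raise ValueError(f"unknown status: {new_status}")
        # Locking is one-way: locked artifacts cannot return to draft.
        if self.status == "locked" and new_status == "draft":
            raise ValueError("locked artifacts cannot return to draft")
        self.status = new_status

    def can_delete(self) -> bool:
        # Deletion is blocked for locked artifacts and for any artifact
        # that still has running Workloads.
        return self.status != "locked" and self.running_workloads == 0
```

In this model, iterating on a locked artifact means constructing a new ArtifactModel in draft status, which matches the guidance to create a new draft artifact.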
Artifact vs. Workload: what lives where
The artifact defines what runs; the Workload runtime defines how it runs. The runtime does not accept per-Workload environment variable overrides—values that need to vary across deployments belong in the artifact's environmentVars.
| Layer | What lives here | Mutability | Examples |
|---|---|---|---|
| Artifact (spec.containerGroups[].containers[]) | Container topology: image URI or build config, port, entrypoint, environment variables, and probes. | Immutable once locked. | imageUri, imageBuildConfig, port, entrypoint, environmentVars, readinessProbe. |
| Workload (runtime.containerGroups[]) | Deployment-time settings: replicas, autoscaling, per-container resource allocation, resource bundles. | Always mutable. On a locked Workload, changes via PATCH /workloads/{id}/settings trigger a rolling replacement rather than taking effect immediately. (It is the artifact spec that becomes immutable at lock time, not the runtime.) | replicaCount, autoscaling, per-container resourceAllocation, resourceBundles. |
Container and group name rules
Each container's name (and the matching name on a runtime override) follows DNS-label syntax—lowercase letters, digits, and hyphens; must start with a lowercase letter and end with a letter or digit; up to 63 characters. The runtime matches its containerGroups[].containers[] entries to the artifact by these names, so they must agree exactly.
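The naming rule can be expressed as a regular expression, and the name-matching requirement as a set comparison. The sketch below is one reading of the rule as stated here (it treats a single lowercase letter as valid); the function names are hypothetical and the platform's own validation may differ in edge cases.

```python
import re

# Starts with a lowercase letter, ends with a lowercase letter or digit,
# allows hyphens in between, and is at most 63 characters long.
_NAME_PATTERN = re.compile(r"[a-z](?:[a-z0-9-]{0,61}[a-z0-9])?")


def is_valid_name(name: str) -> bool:
    """Check a container or group name against the DNS-label rule above."""
    return _NAME_PATTERN.fullmatch(name) is not None


def unmatched_runtime_names(artifact_names: list[str], runtime_names: list[str]) -> list[str]:
    """Return runtime container names with no exact match in the artifact.

    Matching is case-sensitive and exact, as the runtime requires.
    """
    known = set(artifact_names)
    return sorted(n for n in runtime_names if n not in known)
```

For instance, a runtime override named Logger would be reported as unmatched against an artifact container named logger, because the names must agree exactly.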
Consider the following guidance when deciding where to define a field:
- If a field is part of the container's identity (which code runs, what ports it listens on, which env vars it needs), it belongs in the artifact spec.
- If a field is a deployment-time knob (replica count, CPU allocation, scaling policy), it belongs in the Workload runtime.
Artifact repositories
Artifact repositories group artifact versions and provide:
| Capability | What you get |
|---|---|
| Version history | A traceable lineage of artifact revisions in a single location. |
| Shared governance | A sharedRoles grant on the repository that controls who can read or modify the collection. |
| Discoverability | Easier discovery of artifacts that belong to the same product or team. |
The platform creates a repository automatically the first time you create an artifact with artifactRepositoryId set.