Workloads

Manage AI workload deployments and their lifecycle.

List Workloads

Operation path: GET /workloads

List all workloads accessible to the authenticated user.

Parameters

| Name | In | Type | Required | Description |
| --- | --- | --- | --- | --- |
| offset | query | integer | false | Skip the specified number of values. |
| limit | query | integer | false | Retrieve only the specified number of values. |
| createdBy | query | any | false | Filter workloads by the user who created them. |
| ids | query | any | false | Filter by specific IDs. |
| search | query | any | false | Case-insensitive search against name, description, and partial ID. |
| tagKeys | query | any | false | List of tag keys to filter for. If multiple values are specified, results with tags that match any of the values are returned. |
| tagValues | query | any | false | List of tag values to filter for. If multiple values are specified, results with tags that match any of the values are returned. |
| orderBy | query | any | false | Sort order for the results. |
| status | query | any | false | Filter workloads by status. |
| artifactStatus | query | any | false | Filter workloads by the status of their corresponding artifact. |
| importance | query | any | false | Filter workloads by importance. |
| artifactId | query | any | false | Filter workloads by their active artifact ID. |
| repositoryId | query | any | false | Filter workloads by their active artifact's repository ID. |
| type | query | any | false | Filter workloads by artifact type. |
| serviceStats | query | any | false | If true, include recent service statistics in the response. |
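
As a rough illustration of how these query parameters combine, the following Python sketch builds a request URL for GET /workloads. The base URL is a placeholder, and the helper name `build_workloads_url` is hypothetical. The sketch also assumes that list-valued parameters (such as tagKeys) are sent as repeated keys and that booleans are sent as lowercase strings; this page does not specify either encoding, so check your deployment before relying on them.

```python
from urllib.parse import urlencode

# Placeholder API root (an assumption); replace with your deployment's base URL.
BASE_URL = "https://example.invalid/api"


def build_workloads_url(base_url=BASE_URL, **params):
    """Return the GET /workloads URL with the supplied query parameters."""
    # Drop parameters that were not supplied so the server applies its defaults.
    query = {k: v for k, v in params.items() if v is not None}
    # Assumed serialization: booleans become lowercase strings ("true"/"false").
    query = {k: str(v).lower() if isinstance(v, bool) else v for k, v in query.items()}
    # Assumed multi-value encoding: doseq=True repeats the key for each list item.
    qs = urlencode(query, doseq=True)
    return f"{base_url}/workloads" + (f"?{qs}" if qs else "")


# Example: first page of 20 high-importance workloads with two illustrative tag keys.
url = build_workloads_url(
    offset=0,
    limit=20,
    importance="high",
    tagKeys=["team", "env"],
    serviceStats=True,
)
print(url)
```

To fetch subsequent pages, the same helper can be called with offset increased by limit on each request; the count field in the response reports how many records the current page holds.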

Example responses

200 Response

{
  "additionalProperties": false,
  "properties": {
    "count": {
      "description": "The number of records on this page.",
      "title": "Count",
      "type": "integer"
    },
    "data": {
      "description": "The list of records.",
      "items": {
        "additionalProperties": false,
        "description": "API representation of a workload. this is the formatted version returned to clients, excluding internal fields and including computed properties like permissions and statistics.",
        "properties": {
          "artifact": {
            "anyOf": [
              {
                "description": "Artifact basic information.",
                "properties": {
                  "artifactRepositoryId": {
                    "anyOf": [
                      {
                        "type": "string"
                      },
                      {
                        "type": "null"
                      }
                    ],
                    "description": "Id of the artifact repository this artifact belongs to (for versioning).",
                    "title": "Artifactrepositoryid"
                  },
                  "id": {
                    "description": "Unique identifier of the entity.",
                    "title": "Id",
                    "type": "string"
                  },
                  "name": {
                    "anyOf": [
                      {
                        "type": "string"
                      },
                      {
                        "type": "null"
                      }
                    ],
                    "description": "Name of the entity.",
                    "title": "Name"
                  },
                  "status": {
                    "anyOf": [
                      {
                        "enum": [
                          "draft",
                          "locked"
                        ],
                        "title": "ArtifactStatus",
                        "type": "string"
                      },
                      {
                        "type": "null"
                      }
                    ],
                    "description": "Artifact status."
                  },
                  "templateId": {
                    "anyOf": [
                      {
                        "type": "string"
                      },
                      {
                        "type": "null"
                      }
                    ],
                    "description": "Id of the template used to create this artifact.",
                    "title": "Templateid"
                  },
                  "type": {
                    "anyOf": [
                      {
                        "description": "Discriminator for the artifact spec variant. used to label the workload, which may be used to prioritize the best matching operator available in the cluster for scheduling. defaults to ``service`` when omitted. - ``service``: generic service artifact. - ``nim``: nvidia nim model artifact.",
                        "enum": [
                          "service",
                          "nim"
                        ],
                        "title": "ArtifactType",
                        "type": "string"
                      },
                      {
                        "type": "null"
                      }
                    ],
                    "description": "Artifact type."
                  },
                  "version": {
                    "anyOf": [
                      {
                        "type": "integer"
                      },
                      {
                        "type": "null"
                      }
                    ],
                    "description": "Version number of the artifact (set only for locked artifacts).",
                    "title": "Version"
                  }
                },
                "required": [
                  "id"
                ],
                "title": "ArtifactInfoFormatted",
                "type": "object"
              },
              {
                "type": "null"
              }
            ],
            "description": "Basic information about the currently active artifact for this workload.",
            "title": "Artifact"
          },
          "artifactId": {
            "anyOf": [
              {
                "type": "string"
              },
              {
                "type": "null"
              }
            ],
            "description": "Id of the currently active artifact for this workload.",
            "title": "Artifact ID"
          },
          "createdAt": {
            "description": "Timestamp of when the entity was created.",
            "format": "date-time",
            "title": "Created At",
            "type": "string"
          },
          "creator": {
            "anyOf": [
              {
                "additionalProperties": false,
                "description": "User information embedded in API responses.",
                "properties": {
                  "email": {
                    "anyOf": [
                      {
                        "type": "string"
                      },
                      {
                        "type": "null"
                      }
                    ],
                    "description": "User email address.",
                    "title": "Email"
                  },
                  "fullName": {
                    "anyOf": [
                      {
                        "type": "string"
                      },
                      {
                        "type": "null"
                      }
                    ],
                    "description": "User's full name.",
                    "title": "Full Name"
                  },
                  "id": {
                    "description": "User id associated with this resource.",
                    "title": "User ID",
                    "type": "string"
                  },
                  "userhash": {
                    "anyOf": [
                      {
                        "type": "string"
                      },
                      {
                        "type": "null"
                      }
                    ],
                    "description": "User's gravatar hash.",
                    "title": "Userhash"
                  },
                  "username": {
                    "anyOf": [
                      {
                        "type": "string"
                      },
                      {
                        "type": "null"
                      }
                    ],
                    "description": "Username.",
                    "title": "Username"
                  }
                },
                "required": [
                  "id"
                ],
                "title": "UserData",
                "type": "object"
              },
              {
                "type": "null"
              }
            ],
            "description": "Owner user details including id, username and email.",
            "title": "Creator"
          },
          "description": {
            "anyOf": [
              {
                "type": "string"
              },
              {
                "type": "null"
              }
            ],
            "default": "",
            "description": "Workload description.",
            "title": "Description"
          },
          "endpoint": {
            "anyOf": [
              {
                "type": "string"
              },
              {
                "type": "null"
              }
            ],
            "description": "Workload endpoint url.",
            "title": "Endpoint"
          },
          "id": {
            "description": "Unique identifier of the entity.",
            "title": "ID",
            "type": "string"
          },
          "importance": {
            "description": "Importance level for workloads.",
            "enum": [
              "critical",
              "high",
              "moderate",
              "low"
            ],
            "title": "WorkloadImportance",
            "type": "string"
          },
          "lastResponse": {
            "anyOf": [
              {
                "format": "date-time",
                "type": "string"
              },
              {
                "type": "null"
              }
            ],
            "description": "Timestamp of the last response received from this workload.",
            "title": "Last Response Time"
          },
          "name": {
            "description": "Name of the entity.",
            "title": "Name",
            "type": "string"
          },
          "owners": {
            "description": "List of workload owners.",
            "items": {
              "additionalProperties": false,
              "description": "User information embedded in API responses.",
              "properties": {
                "email": {
                  "anyOf": [
                    {
                      "type": "string"
                    },
                    {
                      "type": "null"
                    }
                  ],
                  "description": "User email address.",
                  "title": "Email"
                },
                "fullName": {
                  "anyOf": [
                    {
                      "type": "string"
                    },
                    {
                      "type": "null"
                    }
                  ],
                  "description": "User's full name.",
                  "title": "Full Name"
                },
                "id": {
                  "description": "User id associated with this resource.",
                  "title": "User ID",
                  "type": "string"
                },
                "userhash": {
                  "anyOf": [
                    {
                      "type": "string"
                    },
                    {
                      "type": "null"
                    }
                  ],
                  "description": "User's gravatar hash.",
                  "title": "Userhash"
                },
                "username": {
                  "anyOf": [
                    {
                      "type": "string"
                    },
                    {
                      "type": "null"
                    }
                  ],
                  "description": "Username.",
                  "title": "Username"
                }
              },
              "required": [
                "id"
              ],
              "title": "UserData",
              "type": "object"
            },
            "title": "Owners",
            "type": "array"
          },
          "permissions": {
            "anyOf": [
              {
                "items": {
                  "description": "Represents the particular role a user, group or organization holds on an entity.",
                  "enum": [
                    "CAN_VIEW",
                    "CAN_UPDATE",
                    "CAN_DELETE",
                    "CAN_SHARE",
                    "CAN_MAKE_PREDICTIONS",
                    "CAN_SHARE_ROLE_OWNER",
                    "CAN_SHARE_ROLE_READ_WRITE",
                    "CAN_SHARE_ROLE_READ_ONLY"
                  ],
                  "title": "ResourcePermission",
                  "type": "string"
                },
                "type": "array"
              },
              {
                "items": {
                  "const": "*",
                  "type": "string"
                },
                "type": "array"
              }
            ]
          },
          "protonId": {
            "anyOf": [
              {
                "type": "string"
              },
              {
                "type": "null"
              }
            ],
            "description": "Id of the currently active proton for this workload.",
            "title": "Proton ID"
          },
          "replacement": {
            "anyOf": [
              {
                "description": "Formatted replacement information for API responses.",
                "properties": {
                  "candidateProtonIds": {
                    "description": "Ids of protons pending promotion during artifact replacement.",
                    "items": {
                      "type": "string"
                    },
                    "title": "Candidateprotonids",
                    "type": "array"
                  },
                  "status": {
                    "description": "Statuses for workload replacement process.",
                    "enum": [
                      "unknown",
                      "submitted",
                      "initializing",
                      "awaiting_promotion",
                      "switching",
                      "deleting",
                      "completed",
                      "errored",
                      "cleaning_up"
                    ],
                    "title": "ReplacementStatus",
                    "type": "string"
                  },
                  "strategy": {
                    "description": "Types of replacement strategies. `rolling` - the new proton is deployed alongside the old one, and trafic is switched to the new proton once it is ready. the old proton is then decommissioned.",
                    "enum": [
                      "rolling"
                    ],
                    "title": "ReplacementStrategy",
                    "type": "string"
                  }
                },
                "title": "WorkloadReplacementFormatted",
                "type": "object"
              },
              {
                "type": "null"
              }
            ],
            "description": "Information about an active replacement process for this workload, if any.",
            "title": "Replacement"
          },
          "requestStats": {
            "anyOf": [
              {
                "additionalProperties": false,
                "description": "Request statistics summary.",
                "properties": {
                  "concurrentRequests": {
                    "default": 0,
                    "description": "Number of concurrent requests.",
                    "title": "Concurrentrequests",
                    "type": "integer"
                  },
                  "errorRate": {
                    "default": 0,
                    "description": "Error rate percentage.",
                    "title": "Errorrate",
                    "type": "number"
                  },
                  "errorRates": {
                    "description": "Error rates over the last 7 time periods.",
                    "items": {
                      "type": "integer"
                    },
                    "title": "Errorrates",
                    "type": "array"
                  },
                  "lastRequestAt": {
                    "anyOf": [
                      {
                        "format": "date-time",
                        "type": "string"
                      },
                      {
                        "type": "null"
                      }
                    ],
                    "description": "Timestamp of the last request.",
                    "title": "Lastrequestat"
                  },
                  "requestRates": {
                    "description": "Request rates over the last 7 time periods.",
                    "items": {
                      "type": "integer"
                    },
                    "title": "Requestrates",
                    "type": "array"
                  },
                  "responseTime": {
                    "default": 0,
                    "description": "Average response time in milliseconds.",
                    "title": "Responsetime",
                    "type": "integer"
                  },
                  "totalRequests": {
                    "default": 0,
                    "description": "Total number of requests.",
                    "title": "Totalrequests",
                    "type": "integer"
                  }
                },
                "title": "RequestStats",
                "type": "object"
              },
              {
                "type": "null"
              }
            ],
            "description": "Request statistics for this workload.",
            "title": "Request Stats"
          },
          "runtime": {
            "additionalProperties": false,
            "description": "Runtime configuration for a workload. for service and nim artifacts, all configuration is scoped inside ``container_groups``, each identified by name matching the artifact topology.",
            "properties": {
              "containerGroups": {
                "description": "Per-group runtime configuration. each entry's name must match a group in the artifact.",
                "items": {
                  "additionalProperties": false,
                  "description": "Runtime configuration for a single container group.",
                  "properties": {
                    "autoscaling": {
                      "anyOf": [
                        {
                          "additionalProperties": false,
                          "description": "Autoscaling configuration for a proton.",
                          "properties": {
                            "enabled": {
                              "default": true,
                              "description": "Whether autoscaling is enabled.",
                              "title": "Enabled",
                              "type": "boolean"
                            },
                            "policies": {
                              "items": {
                                "additionalProperties": false,
                                "description": "Base class for autoscaling policies.",
                                "properties": {
                                  "maxCount": {
                                    "description": "Maximum number of replicas.",
                                    "minimum": 0,
                                    "title": "Max Count",
                                    "type": "integer"
                                  },
                                  "minCount": {
                                    "description": "Minimum number of replicas.",
                                    "minimum": 0,
                                    "title": "Min Count",
                                    "type": "integer"
                                  },
                                  "priority": {
                                    "anyOf": [
                                      {
                                        "type": "integer"
                                      },
                                      {
                                        "type": "null"
                                      }
                                    ],
                                    "description": "Policy priority when multiple policies are defined.",
                                    "title": "Priority"
                                  },
                                  "scalingMetric": {
                                    "anyOf": [
                                      {
                                        "oneOf": [
                                          {
                                            "const": "cpuAverageUtilization",
                                            "description": "Scale replicas to maintain a target average CPU utilization across pods.",
                                            "title": "CPU Average Utilization"
                                          },
                                          {
                                            "const": "httpRequestsConcurrency",
                                            "description": "Scale replicas based on HTTP request concurrency using an external HTTP-aware autoscaler. The platform manages the underlying autoscaling resources on your behalf. This scaling option will scale to zero replicas when the proton is idle.",
                                            "title": "HTTP Requests Concurrency"
                                          },
                                          {
                                            "const": "gpuCacheUtilization",
                                            "description": "Scales replicas based on model-specific GPU memory cache utilization. This signal reflects how the model's KV cache is used during inference, when such metrics are exposed by the serving runtime. High cache utilization may indicate memory pressure and can be used to trigger scale-out to maintain throughput. Applicable to NIM Artifacts only.",
                                            "title": "GPU Cache Utilization"
                                          },
                                          {
                                            "const": "gpuRequestQueueDepth",
                                            "description": "Scales replicas based on the depth of the inference request queue. This metric represents the number of incoming requests waiting to be processed by the inference service. Increasing queue depth may indicate insufficient capacity and can be used to trigger additional replicas to reduce latency. Applicable to NIM Artifacts only.",
                                            "title": "GPU Request Queue Depth"
                                          }
                                        ],
                                        "title": "ScalingMetricType",
                                        "type": "string"
                                      },
                                      {
                                        "type": "string"
                                      }
                                    ],
                                    "description": "Metric used for scaling decisions. use one of the predefined values for standard autoscaling, or provide a custom metric name for nim 2.0 workloads (e.g. 'vllm:kv_cache_usage_perc'). custom metric names are only supported for nim artifacts.",
                                    "title": "Scaling Metric"
                                  },
                                  "target": {
                                    "description": "Target value for the scaling metric.",
                                    "minimum": 0,
                                    "title": "Target",
                                    "type": "number"
                                  }
                                },
                                "required": [
                                  "scalingMetric",
                                  "target",
                                  "minCount",
                                  "maxCount"
                                ],
                                "title": "AutoscalingPolicy",
                                "type": "object"
                              },
                              "title": "Policies",
                              "type": "array"
                            }
                          },
                          "required": [
                            "policies"
                          ],
                          "title": "AutoscalingProperties",
                          "type": "object"
                        },
                        {
                          "type": "null"
                        }
                      ],
                      "description": "Autoscaling configuration for this group. takes precedence over replicacount."
                    },
                    "bundleSelectionPolicy": {
                      "enum": [
                        "availability"
                      ],
                      "title": "BundleSelectionPolicy",
                      "type": "string"
                    },
                    "containers": {
                      "description": "Per-container overrides for this group.",
                      "items": {
                        "additionalProperties": false,
                        "description": "Runtime diff targeting a single named container within a group.",
                        "properties": {
                          "name": {
                            "description": "Container name. must match a container declared in the artifact group.",
                            "title": "Name",
                            "type": "string"
                          },
                          "resourceAllocation": {
                            "anyOf": [
                              {
                                "additionalProperties": false,
                                "description": "Per-container resource allocation declared at runtime.",
                                "properties": {
                                  "cpu": {
                                    "anyOf": [
                                      {
                                        "minimum": 0.1,
                                        "type": "number"
                                      },
                                      {
                                        "type": "null"
                                      }
                                    ],
                                    "description": "Cpu cores allocated to this container.",
                                    "title": "Cpu"
                                  },
                                  "gpu": {
                                    "anyOf": [
                                      {
                                        "minimum": 0,
                                        "type": "number"
                                      },
                                      {
                                        "type": "null"
                                      }
                                    ],
                                    "description": "Gpus allocated to this container.",
                                    "title": "Gpu"
                                  },
                                  "memory": {
                                    "anyOf": [
                                      {
                                        "pattern": "^\\s*(\\d*\\.?\\d+)\\s*(\\w+)?",
                                        "type": "string"
                                      },
                                      {
                                        "minimum": 0,
                                        "type": "integer"
                                      },
                                      {
                                        "type": "null"
                                      }
                                    ],
                                    "description": "Ram allocated to this container. accepts a human-readable string with one of: b, kb, mb, gb (1000-based) — e.g. '8gb', '512mb'. also accepts raw byte integers.",
                                    "examples": [
                                      "8GB",
                                      "512MB"
                                    ],
                                    "title": "Memory"
                                  }
                                },
                                "title": "ResourceAllocation",
                                "type": "object"
                              },
                              {
                                "type": "null"
                              }
                            ],
                            "description": "Resource allocation for this container. required for multi-container groups."
                          }
                        },
                        "required": [
                          "name"
                        ],
                        "title": "ContainerOverride",
                        "type": "object"
                      },
                      "title": "Containers",
                      "type": "array"
                    },
                    "name": {
                      "default": "default",
                      "description": "Group name. must match a container group name declared in the artifact.",
                      "title": "Name",
                      "type": "string"
                    },
                    "replicaCount": {
                      "anyOf": [
                        {
                          "minimum": 1,
                          "type": "integer"
                        },
                        {
                          "type": "null"
                        }
                      ],
                      "default": 1,
                      "description": "Number of replicas. cannot be set alongside autoscaling.enabled=true.",
                      "title": "Replicacount"
                    },
                    "resolvedBundle": {
                      "anyOf": [
                        {
                          "description": "Bundle details returned in the runtime response after scheduling.",
                          "properties": {
                            "cpuCount": {
                              "description": "Number of cpu cores.",
                              "title": "CPU Count",
                              "type": "number"
                            },
                            "gpuCount": {
                              "default": 0,
                              "description": "Number of gpu units.",
                              "title": "GPU Count",
                              "type": "integer"
                            },
                            "gpuMaker": {
                              "anyOf": [
                                {
                                  "type": "string"
                                },
                                {
                                  "type": "null"
                                }
                              ],
                              "description": "Gpu manufacturer.",
                              "title": "GPU Maker"
                            },
                            "gpuTypeLabel": {
                              "anyOf": [
                                {
                                  "type": "string"
                                },
                                {
                                  "type": "null"
                                }
                              ],
                              "description": "Gpu type label.",
                              "title": "GPU Type Label"
                            },
                            "id": {
                              "description": "Bundle identifier that was selected.",
                              "title": "Id",
                              "type": "string"
                            },
                            "memoryBytes": {
                              "description": "Memory size in bytes.",
                              "title": "Memory Bytes",
                              "type": "integer"
                            }
                          },
                          "required": [
                            "id",
                            "cpuCount",
                            "memoryBytes"
                          ],
                          "title": "ResolvedBundle",
                          "type": "object"
                        },
                        {
                          "type": "null"
                        }
                      ],
                      "description": "Full details of the bundle selected at scheduling time. read-only.",
                      "readOnly": true
                    },
                    "resourceBundles": {
                      "description": "Ordered list of bundle ids. one is selected at scheduling time.",
                      "items": {
                        "type": "string"
                      },
                      "title": "Resourcebundles",
                      "type": "array"
                    }
                  },
                  "title": "GroupRuntime",
                  "type": "object"
                },
                "title": "Containergroups",
                "type": "array"
              }
            },
            "title": "WorkloadRuntime",
            "type": "object"
          },
          "status": {
            "description": "User-facing workload status. a subset of :class:`protonstatus` — excludes internal proton-lifecycle states (warming, draining, restarting) that should never be surfaced as a workload status.",
            "enum": [
              "unknown",
              "submitted",
              "provisioning",
              "launching",
              "running",
              "suspended",
              "interrupted",
              "stopping",
              "stopped",
              "errored",
              "terminated"
            ],
            "title": "WorkloadStatus",
            "type": "string"
          },
          "tags": {
            "items": {
              "additionalProperties": false,
              "properties": {
                "id": {
                  "description": "Unique identifier of the tag.",
                  "title": "Id",
                  "type": "string"
                },
                "name": {
                  "description": "Name of the tag.",
                  "title": "Name",
                  "type": "string"
                },
                "value": {
                  "description": "Value of the tag.",
                  "title": "Value",
                  "type": "string"
                }
              },
              "required": [
                "id",
                "name",
                "value"
              ],
              "title": "TagInfo",
              "type": "object"
            },
            "type": "array"
          },
          "type": {
            "description": "Discriminator for the artifact spec variant. used to label the workload, which may be used to prioritize the best matching operator available in the cluster for scheduling. defaults to ``service`` when omitted. - ``service``: generic service artifact. - ``nim``: nvidia nim model artifact.",
            "enum": [
              "service",
              "nim"
            ],
            "title": "ArtifactType",
            "type": "string"
          },
          "updatedAt": {
            "description": "Timestamp of when the entity was last updated.",
            "format": "date-time",
            "title": "Updated At",
            "type": "string"
          }
        },
        "required": [
          "id",
          "name",
          "createdAt",
          "updatedAt"
        ],
        "title": "WorkloadFormatted",
        "type": "object"
      },
      "title": "Data",
      "type": "array"
    },
    "next": {
      "anyOf": [
        {
          "type": "string"
        },
        {
          "type": "null"
        }
      ],
      "description": "The url to the next page, or `null` if there is no such page.",
      "title": "Next"
    },
    "previous": {
      "anyOf": [
        {
          "type": "string"
        },
        {
          "type": "null"
        }
      ],
      "description": "The url to the previous page, or `null` if there is no such page.",
      "title": "Previous"
    },
    "totalCount": {
      "description": "The total number of records.",
      "title": "Totalcount",
      "type": "integer"
    }
  },
  "required": [
    "totalCount",
    "count",
    "next",
    "previous",
    "data"
  ],
  "title": "WorkloadListResponse",
  "type": "object"
}
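
Because every page carries a `next` field (a URL, or `null` on the last page), a client can collect the full result set by following that field until it is `null`. The sketch below only illustrates that loop and is not an official client: `iter_workloads` and the `fetch_page` callable are hypothetical names, and `fetch_page` is assumed to perform the authenticated GET and return the parsed JSON body as a dict.

```python
from typing import Any, Callable, Dict, Iterator, Optional

Page = Dict[str, Any]


def iter_workloads(
    fetch_page: Callable[[Optional[str]], Page],
) -> Iterator[Dict[str, Any]]:
    """Yield every workload record, following `next` until it is null.

    `fetch_page` is a hypothetical callable supplied by the caller:
    given None it should request the first page (GET /workloads),
    given a URL it should request that URL. It must return the
    decoded WorkloadListResponse body.
    """
    url: Optional[str] = None  # None means "start at the first page"
    while True:
        page = fetch_page(url)
        # `data` holds the records on this page.
        yield from page["data"]
        # `next` is a URL string, or None when no further page exists.
        url = page["next"]
        if url is None:
            return
```

Passing the transport in as a callable keeps the paging logic independent of the HTTP library and of how authentication is handled.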

Responses

| Status | Meaning | Description | Schema |
|---|---|---|---|
| 200 | OK | Successful Response | WorkloadListResponse |
| 400 | Bad Request | Bad request | None |
| 401 | Unauthorized | Unauthenticated | None |
| 403 | Forbidden | Insufficient permissions | None |
| 422 | Unprocessable Entity | Validation Error | HTTPValidationError |
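
A caller will usually want to turn the non-200 rows of this table into errors. The following is a minimal sketch under stated assumptions: `WorkloadApiError`, `ERROR_DESCRIPTIONS`, and `unwrap_list_response` are hypothetical names introduced for illustration, and the status code and decoded body are assumed to come from whatever HTTP client is in use.

```python
from typing import Any, Dict

# Descriptions copied from the response table for GET /workloads.
ERROR_DESCRIPTIONS: Dict[int, str] = {
    400: "Bad request",
    401: "Unauthenticated",
    403: "Insufficient permissions",
    422: "Validation Error",
}


class WorkloadApiError(Exception):
    """Hypothetical exception carrying the status code and response body."""

    def __init__(self, status_code: int, description: str, body: Any = None) -> None:
        super().__init__(f"{status_code}: {description}")
        self.status_code = status_code
        self.description = description
        self.body = body


def unwrap_list_response(status_code: int, body: Any) -> Any:
    """Return the body on 200, otherwise raise WorkloadApiError."""
    if status_code == 200:
        return body
    # Codes not listed in the table fall back to a generic description.
    description = ERROR_DESCRIPTIONS.get(status_code, "Unexpected status")
    raise WorkloadApiError(status_code, description, body)
```

Only the 422 response has a documented schema (HTTPValidationError), so the sketch keeps the raw body on the exception instead of assuming a shape for it.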

Create Workload

Operation path: POST /workloads

Create a workload from the specified artifact and schedule its initial runtime.
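
The body schema below states several per-container constraints in its descriptions (name format, port range, the primary/port relationship, and the `imageUri` requirement). Checking them client-side before sending the request can avoid a 422 round trip. This is a rough sketch, not the server's validation logic: `container_problems` is a hypothetical helper, and the regular expression is one interpretation of the documented name rule.

```python
import re
from typing import Any, Dict, List

# One reading of the documented rule: lowercase letters, digits, and hyphens;
# starts with a lowercase letter; ends with a letter or digit; max 63 chars.
_NAME_RE = re.compile(r"[a-z](?:[a-z0-9-]{0,61}[a-z0-9])?")


def container_problems(container: Dict[str, Any]) -> List[str]:
    """Return human-readable problems for one container spec (empty if none)."""
    problems: List[str] = []

    name = container.get("name")
    if name is not None and not _NAME_RE.fullmatch(name):
        problems.append("name: invalid format or longer than 63 characters")

    port = container.get("port")
    primary = bool(container.get("primary"))  # schema default is false
    if port is not None and not 1024 <= port <= 65535:
        problems.append("port: must be between 1024 and 65535")
    if primary and port is None:
        problems.append("port: primary containers must define a port")
    if not primary and port is not None:
        problems.append("port: non-primary containers must omit it")

    # imageUri is required when imageBuildConfig is not set.
    if container.get("imageUri") is None and container.get("imageBuildConfig") is None:
        problems.append("imageUri: required when imageBuildConfig is not set")

    return problems
```

For example, `container_problems({"name": "web", "primary": True, "port": 8080, "imageUri": "registry.example.com/app:1"})` returns an empty list; the image URI here is a placeholder.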

Body parameter

{
  "additionalProperties": false,
  "description": "Request to create a new workload.",
  "properties": {
    "artifact": {
      "anyOf": [
        {
          "additionalProperties": false,
          "description": "Request to create an artifact. an artifact is always created as a draft.",
          "properties": {
            "artifactRepositoryId": {
              "anyOf": [
                {
                  "type": "string"
                },
                {
                  "type": "null"
                }
              ],
              "description": "Id of the artifact repository this artifact belongs to (for versioning support).",
              "title": "Artifactrepositoryid"
            },
            "description": {
              "default": "",
              "description": "Description of the artifact.",
              "title": "Description",
              "type": "string"
            },
            "name": {
              "description": "Name of the artifact.",
              "maxLength": 5000,
              "minLength": 1,
              "title": "Name",
              "type": "string"
            },
            "spec": {
              "description": "Artifact specification.",
              "discriminator": {
                "mapping": {
                  "nim": "#/components/schemas/NimArtifactSpec",
                  "service": "#/components/schemas/ServiceArtifactSpec"
                },
                "propertyName": "type"
              },
              "oneOf": [
                {
                  "additionalProperties": false,
                  "properties": {
                    "containerGroups": {
                      "default": [],
                      "description": "List of container groups.",
                      "items": {
                        "additionalProperties": false,
                        "properties": {
                          "containers": {
                            "default": [],
                            "description": "List of containers making this container group.",
                            "items": {
                              "additionalProperties": false,
                              "properties": {
                                "build": {
                                  "anyOf": [
                                    {
                                      "additionalProperties": false,
                                      "description": "Build reference embedded in a container spec when an image build is triggered.",
                                      "properties": {
                                        "artifactImageBuildId": {
                                          "description": "Artifact image build id.",
                                          "title": "Artifactimagebuildid",
                                          "type": "string"
                                        },
                                        "createdAt": {
                                          "description": "Build creation timestamp (utc).",
                                          "format": "date-time",
                                          "title": "Createdat",
                                          "type": "string"
                                        },
                                        "status": {
                                          "description": "Image build reported status at submit time.",
                                          "title": "Status",
                                          "type": "string"
                                        }
                                      },
                                      "required": [
                                        "artifactImageBuildId",
                                        "status",
                                        "createdAt"
                                      ],
                                      "title": "ContainerBuildInfo",
                                      "type": "object"
                                    },
                                    {
                                      "type": "null"
                                    }
                                  ],
                                  "description": "Server-set image build metadata (e.g. after lock or draft build trigger). workload API clears this on artifact create/update before persistence; clients must not rely on sending it."
                                },
                                "description": {
                                  "default": "",
                                  "description": "Description of the container.",
                                  "title": "Description",
                                  "type": "string"
                                },
                                "entrypoint": {
                                  "anyOf": [
                                    {
                                      "items": {
                                        "type": "string"
                                      },
                                      "type": "array"
                                    },
                                    {
                                      "type": "null"
                                    }
                                  ],
                                  "description": "Runtime entrypoint override for the container command. independent of build entrypoint.",
                                  "title": "Entrypoint"
                                },
                                "environmentVars": {
                                  "default": [],
                                  "description": "Environment variables.",
                                  "items": {
                                    "anyOf": [
                                      {
                                        "properties": {
                                          "name": {
                                            "description": "Name of the environment variable.",
                                            "title": "Name",
                                            "type": "string"
                                          },
                                          "source": {
                                            "const": "string",
                                            "default": "string",
                                            "title": "Source",
                                            "type": "string"
                                          },
                                          "value": {
                                            "description": "Value of the environment variable.",
                                            "title": "Value",
                                            "type": "string"
                                          }
                                        },
                                        "required": [
                                          "name",
                                          "value"
                                        ],
                                        "title": "StringEnvironmentVariable",
                                        "type": "object"
                                      },
                                      {
                                        "properties": {
                                          "drCredentialId": {
                                            "description": "Id of the datarobot credential to use.",
                                            "title": "DR Credential ID",
                                            "type": "string"
                                          },
                                          "key": {
                                            "description": "Key within the credential.",
                                            "title": "Key",
                                            "type": "string"
                                          },
                                          "name": {
                                            "description": "Name of the environment variable.",
                                            "title": "Name",
                                            "type": "string"
                                          },
                                          "source": {
                                            "const": "dr-credential",
                                            "title": "Source",
                                            "type": "string"
                                          }
                                        },
                                        "required": [
                                          "source",
                                          "name",
                                          "drCredentialId",
                                          "key"
                                        ],
                                        "title": "CredentialEnvironmentVariable",
                                        "type": "object"
                                      },
                                      {
                                        "description": "A platform-managed datarobot API token injected as an environment variable. the token value is resolved at proton creation (find-or-create a per-workload ``workload <workloadid>`` API key scoped to the invoking user); no value or id is supplied by the user.",
                                        "properties": {
                                          "name": {
                                            "description": "Name of the environment variable.",
                                            "title": "Name",
                                            "type": "string"
                                          },
                                          "source": {
                                            "const": "dr-api-token",
                                            "title": "Source",
                                            "type": "string"
                                          }
                                        },
                                        "required": [
                                          "source",
                                          "name"
                                        ],
                                        "title": "DrApiTokenEnvironmentVariable",
                                        "type": "object"
                                      }
                                    ]
                                  },
                                  "title": "Environmentvars",
                                  "type": "array"
                                },
                                "imageBuildConfig": {
                                  "anyOf": [
                                    {
                                      "additionalProperties": false,
                                      "description": "User-provided configuration for server-side image builds from source code.",
                                      "properties": {
                                        "codeRef": {
                                          "anyOf": [
                                            {
                                              "additionalProperties": false,
                                              "properties": {
                                                "datarobot": {
                                                  "additionalProperties": false,
                                                  "properties": {
                                                    "catalogId": {
                                                      "title": "Catalogid",
                                                      "type": "string"
                                                    },
                                                    "catalogVersionId": {
                                                      "title": "Catalogversionid",
                                                      "type": "string"
                                                    }
                                                  },
                                                  "required": [
                                                    "catalogId",
                                                    "catalogVersionId"
                                                  ],
                                                  "title": "DataRobotCodeRef",
                                                  "type": "object"
                                                },
                                                "provider": {
                                                  "const": "datarobot",
                                                  "default": "datarobot",
                                                  "title": "Provider",
                                                  "type": "string"
                                                },
                                                "type": {
                                                  "const": "datarobot",
                                                  "default": "datarobot",
                                                  "title": "Type",
                                                  "type": "string"
                                                }
                                              },
                                              "required": [
                                                "datarobot"
                                              ],
                                              "title": "CodeRef",
                                              "type": "object"
                                            },
                                            {
                                              "type": "null"
                                            }
                                          ],
                                          "description": "Reference to source code (e.g. files API catalog). optional at create time; required before build or lock."
                                        },
                                        "dockerfile": {
                                          "description": "How the dockerfile is obtained. defaults to using ./dockerfile from the source code.",
                                          "discriminator": {
                                            "mapping": {
                                              "generated": "#/components/schemas/GeneratedDockerfile",
                                              "provided": "#/components/schemas/ProvidedDockerfile"
                                            },
                                            "propertyName": "source"
                                          },
                                          "oneOf": [
                                            {
                                              "additionalProperties": false,
                                              "description": "User supplies a dockerfile in the uploaded source code.",
                                              "properties": {
                                                "path": {
                                                  "default": "./Dockerfile",
                                                  "description": "Relative path to the dockerfile in the source code. defaults to ./dockerfile.",
                                                  "title": "Path",
                                                  "type": "string"
                                                },
                                                "source": {
                                                  "const": "provided",
                                                  "default": "provided",
                                                  "title": "Source",
                                                  "type": "string"
                                                }
                                              },
                                              "title": "ProvidedDockerfile",
                                              "type": "object"
                                            },
                                            {
                                              "additionalProperties": false,
                                              "description": "System generates a dockerfile from execution environment metadata.",
                                              "properties": {
                                                "entrypoint": {
                                                  "description": "Entrypoint baked into the generated dockerfile cmd (e.g. [\"python\", \"app.py\"]).",
                                                  "items": {
                                                    "type": "string"
                                                  },
                                                  "minItems": 1,
                                                  "title": "Entrypoint",
                                                  "type": "array"
                                                },
                                                "executionEnvironmentId": {
                                                  "description": "Execution environment id used to resolve the base Docker image.",
                                                  "title": "Execution Environment ID",
                                                  "type": "string"
                                                },
                                                "executionEnvironmentVersionId": {
                                                  "description": "Execution environment version id that pins the exact base image tag.",
                                                  "title": "Execution Environment Version ID",
                                                  "type": "string"
                                                },
                                                "source": {
                                                  "const": "generated",
                                                  "default": "generated",
                                                  "title": "Source",
                                                  "type": "string"
                                                }
                                              },
                                              "required": [
                                                "executionEnvironmentId",
                                                "executionEnvironmentVersionId",
                                                "entrypoint"
                                              ],
                                              "title": "GeneratedDockerfile",
                                              "type": "object"
                                            }
                                          ],
                                          "title": "Dockerfile"
                                        }
                                      },
                                      "title": "ImageBuildConfig",
                                      "type": "object"
                                    },
                                    {
                                      "type": "null"
                                    }
                                  ],
                                  "description": "Configuration for server-side image builds from source code."
                                },
                                "imageUri": {
                                  "anyOf": [
                                    {
                                      "type": "string"
                                    },
                                    {
                                      "type": "null"
                                    }
                                  ],
                                  "description": "Docker image uri. required when imagebuildconfig is not set; server-populated after a successful image build.",
                                  "title": "Imageuri"
                                },
                                "livenessProbe": {
                                  "anyOf": [
                                    {
                                      "additionalProperties": false,
                                      "properties": {
                                        "failureThreshold": {
                                          "default": 3,
                                          "description": "Minimum consecutive failures for the probe to be considered failed.",
                                          "title": "Failurethreshold",
                                          "type": "integer"
                                        },
                                        "host": {
                                          "anyOf": [
                                            {
                                              "minLength": 0,
                                              "type": "string"
                                            },
                                            {
                                              "type": "null"
                                            }
                                          ],
                                          "description": "Host name to connect to, defaults to the pod ip.",
                                          "title": "Host"
                                        },
                                        "httpHeaders": {
                                          "additionalProperties": {
                                            "type": "string"
                                          },
                                          "description": "HTTP headers for probe.",
                                          "title": "Httpheaders",
                                          "type": "object"
                                        },
                                        "initialDelaySeconds": {
                                          "default": 30,
                                          "description": "Number of seconds to wait before the first probe is executed.",
                                          "title": "Initialdelayseconds",
                                          "type": "integer"
                                        },
                                        "path": {
                                          "description": "Url path to query for health check.",
                                          "title": "Path",
                                          "type": "string"
                                        },
                                        "periodSeconds": {
                                          "default": 30,
                                          "description": "How often (in seconds) to perform the probe.",
                                          "title": "Periodseconds",
                                          "type": "integer"
                                        },
                                        "port": {
                                          "default": 8080,
                                          "description": "Port number to access on the container.",
                                          "maximum": 65535,
                                          "minimum": 1,
                                          "title": "Port",
                                          "type": "integer"
                                        },
                                        "scheme": {
                                          "default": "HTTP",
                                          "description": "Scheme to use for connecting to the host.",
                                          "enum": [
                                            "HTTP",
                                            "HTTPS"
                                          ],
                                          "title": "Scheme",
                                          "type": "string"
                                        },
                                        "timeoutSeconds": {
                                          "default": 30,
                                          "description": "Number of seconds after which the probe times out.",
                                          "title": "Timeoutseconds",
                                          "type": "integer"
                                        }
                                      },
                                      "required": [
                                        "path"
                                      ],
                                      "title": "ProbeConfig",
                                      "type": "object"
                                    },
                                    {
                                      "type": "null"
                                    }
                                  ],
                                  "description": "Container liveness check configuration."
                                },
                                "name": {
                                  "anyOf": [
                                    {
                                      "type": "string"
                                    },
                                    {
                                      "type": "null"
                                    }
                                  ],
                                  "description": "Name of the container. lowercase letters, digits, and hyphens only; must start with a lowercase letter and end with a letter or digit; max 63 characters.",
                                  "title": "Name"
                                },
                                "port": {
                                  "anyOf": [
                                    {
                                      "maximum": 65535,
                                      "minimum": 1024,
                                      "type": "integer"
                                    },
                                    {
                                      "type": "null"
                                    }
                                  ],
                                  "description": "Container access port. when set, must be >= 1024 for security and platform compatibility reasons. primary containers must define a port; non-primary containers must omit it.",
                                  "title": "Port"
                                },
                                "primary": {
                                  "anyOf": [
                                    {
                                      "type": "boolean"
                                    },
                                    {
                                      "type": "null"
                                    }
                                  ],
                                  "default": false,
                                  "description": "Whether this is the primary container.",
                                  "title": "Primary"
                                },
                                "readinessProbe": {
                                  "anyOf": [
                                    {
                                      "additionalProperties": false,
                                      "properties": {
                                        "failureThreshold": {
                                          "default": 3,
                                          "description": "Minimum consecutive failures for the probe to be considered failed.",
                                          "title": "Failurethreshold",
                                          "type": "integer"
                                        },
                                        "host": {
                                          "anyOf": [
                                            {
                                              "minLength": 0,
                                              "type": "string"
                                            },
                                            {
                                              "type": "null"
                                            }
                                          ],
                                          "description": "Host name to connect to, defaults to the pod ip.",
                                          "title": "Host"
                                        },
                                        "httpHeaders": {
                                          "additionalProperties": {
                                            "type": "string"
                                          },
                                          "description": "HTTP headers for probe.",
                                          "title": "Httpheaders",
                                          "type": "object"
                                        },
                                        "initialDelaySeconds": {
                                          "default": 30,
                                          "description": "Number of seconds to wait before the first probe is executed.",
                                          "title": "Initialdelayseconds",
                                          "type": "integer"
                                        },
                                        "path": {
                                          "description": "Url path to query for health check.",
                                          "title": "Path",
                                          "type": "string"
                                        },
                                        "periodSeconds": {
                                          "default": 30,
                                          "description": "How often (in seconds) to perform the probe.",
                                          "title": "Periodseconds",
                                          "type": "integer"
                                        },
                                        "port": {
                                          "default": 8080,
                                          "description": "Port number to access on the container.",
                                          "maximum": 65535,
                                          "minimum": 1,
                                          "title": "Port",
                                          "type": "integer"
                                        },
                                        "scheme": {
                                          "default": "HTTP",
                                          "description": "Scheme to use for connecting to the host.",
                                          "enum": [
                                            "HTTP",
                                            "HTTPS"
                                          ],
                                          "title": "Scheme",
                                          "type": "string"
                                        },
                                        "timeoutSeconds": {
                                          "default": 30,
                                          "description": "Number of seconds after which the probe times out.",
                                          "title": "Timeoutseconds",
                                          "type": "integer"
                                        }
                                      },
                                      "required": [
                                        "path"
                                      ],
                                      "title": "ProbeConfig",
                                      "type": "object"
                                    },
                                    {
                                      "type": "null"
                                    }
                                  ],
                                  "description": "Container readiness check configuration."
                                },
                                "securityContext": {
                                  "anyOf": [
                                    {
                                      "additionalProperties": false,
                                      "description": "Container-level security context. lets workload creators tighten security constraints beyond the platform defaults. runasnonroot and runasuser are enforced by the platform and are not user-settable. elevated fields (capabilities.add, allowprivilegeescalation=true, seccompprofile.type=unconfined) require the mlops admin role; regular users may only tighten defaults — drop capabilities, enable read-only rootfs, or set a runtimedefault/localhost seccomp profile.",
                                      "properties": {
                                        "allowPrivilegeEscalation": {
                                          "anyOf": [
                                            {
                                              "type": "boolean"
                                            },
                                            {
                                              "type": "null"
                                            }
                                          ],
                                          "description": "Whether a process can gain more privileges than its parent. requires the mlops admin role to set to true.",
                                          "title": "Allowprivilegeescalation"
                                        },
                                        "capabilities": {
                                          "anyOf": [
                                            {
                                              "additionalProperties": false,
                                              "description": "Linux capabilities to add or drop from the container.",
                                              "properties": {
                                                "add": {
                                                  "anyOf": [
                                                    {
                                                      "items": {
                                                        "type": "string"
                                                      },
                                                      "type": "array"
                                                    },
                                                    {
                                                      "type": "null"
                                                    }
                                                  ],
                                                  "description": "Capabilities to add.",
                                                  "title": "Add"
                                                },
                                                "drop": {
                                                  "anyOf": [
                                                    {
                                                      "items": {
                                                        "type": "string"
                                                      },
                                                      "type": "array"
                                                    },
                                                    {
                                                      "type": "null"
                                                    }
                                                  ],
                                                  "description": "Capabilities to drop.",
                                                  "title": "Drop"
                                                }
                                              },
                                              "title": "Capabilities",
                                              "type": "object"
                                            },
                                            {
                                              "type": "null"
                                            }
                                          ],
                                          "description": "Linux capabilities to add or drop."
                                        },
                                        "readOnlyRootFilesystem": {
                                          "anyOf": [
                                            {
                                              "type": "boolean"
                                            },
                                            {
                                              "type": "null"
                                            }
                                          ],
                                          "description": "Whether the root filesystem is read-only.",
                                          "title": "Readonlyrootfilesystem"
                                        },
                                        "seccompProfile": {
                                          "anyOf": [
                                            {
                                              "additionalProperties": false,
                                              "description": "Seccomp profile configuration.",
                                              "properties": {
                                                "localhostProfile": {
                                                  "anyOf": [
                                                    {
                                                      "type": "string"
                                                    },
                                                    {
                                                      "type": "null"
                                                    }
                                                  ],
                                                  "description": "Path to a seccomp profile on the node. only valid when type is localhost.",
                                                  "title": "Localhostprofile"
                                                },
                                                "type": {
                                                  "description": "Allowed seccomp profile types.",
                                                  "enum": [
                                                    "RuntimeDefault",
                                                    "Unconfined",
                                                    "Localhost"
                                                  ],
                                                  "title": "SeccompProfileType",
                                                  "type": "string"
                                                }
                                              },
                                              "required": [
                                                "type"
                                              ],
                                              "title": "SeccompProfile",
                                              "type": "object"
                                            },
                                            {
                                              "type": "null"
                                            }
                                          ],
                                          "description": "Seccomp profile for the container."
                                        }
                                      },
                                      "title": "SecurityContext",
                                      "type": "object"
                                    },
                                    {
                                      "type": "null"
                                    }
                                  ],
                                  "description": "Container security context."
                                },
                                "startupProbe": {
                                  "anyOf": [
                                    {
                                      "additionalProperties": false,
                                      "properties": {
                                        "failureThreshold": {
                                          "default": 3,
                                          "description": "Minimum consecutive failures for the probe to be considered failed.",
                                          "title": "Failurethreshold",
                                          "type": "integer"
                                        },
                                        "host": {
                                          "anyOf": [
                                            {
                                              "minLength": 0,
                                              "type": "string"
                                            },
                                            {
                                              "type": "null"
                                            }
                                          ],
                                          "description": "Host name to connect to, defaults to the pod ip.",
                                          "title": "Host"
                                        },
                                        "httpHeaders": {
                                          "additionalProperties": {
                                            "type": "string"
                                          },
                                          "description": "HTTP headers for probe.",
                                          "title": "Httpheaders",
                                          "type": "object"
                                        },
                                        "initialDelaySeconds": {
                                          "default": 30,
                                          "description": "Number of seconds to wait before the first probe is executed.",
                                          "title": "Initialdelayseconds",
                                          "type": "integer"
                                        },
                                        "path": {
                                          "description": "Url path to query for health check.",
                                          "title": "Path",
                                          "type": "string"
                                        },
                                        "periodSeconds": {
                                          "default": 30,
                                          "description": "How often (in seconds) to perform the probe.",
                                          "title": "Periodseconds",
                                          "type": "integer"
                                        },
                                        "port": {
                                          "default": 8080,
                                          "description": "Port number to access on the container.",
                                          "maximum": 65535,
                                          "minimum": 1,
                                          "title": "Port",
                                          "type": "integer"
                                        },
                                        "scheme": {
                                          "default": "HTTP",
                                          "description": "Scheme to use for connecting to the host.",
                                          "enum": [
                                            "HTTP",
                                            "HTTPS"
                                          ],
                                          "title": "Scheme",
                                          "type": "string"
                                        },
                                        "timeoutSeconds": {
                                          "default": 30,
                                          "description": "Number of seconds after which the probe times out.",
                                          "title": "Timeoutseconds",
                                          "type": "integer"
                                        }
                                      },
                                      "required": [
                                        "path"
                                      ],
                                      "title": "ProbeConfig",
                                      "type": "object"
                                    },
                                    {
                                      "type": "null"
                                    }
                                  ],
                                  "description": "Container startup check configuration."
                                }
                              },
                              "title": "Container",
                              "type": "object"
                            },
                            "title": "Containers",
                            "type": "array"
                          },
                          "name": {
                            "default": "default",
                            "description": "Name of the container group. used as the lookup key for runtime overrides. lowercase letters, digits, and hyphens only; must start with a lowercase letter and end with a letter or digit; max 63 characters.",
                            "title": "Name",
                            "type": "string"
                          }
                        },
                        "title": "ContainerGroup",
                        "type": "object"
                      },
                      "title": "Containergroups",
                      "type": "array"
                    },
                    "type": {
                      "const": "service",
                      "default": "service",
                      "description": "Artifact type discriminator. injected automatically from the top-level `type` field — do not set this directly.",
                      "title": "Type",
                      "type": "string"
                    }
                  },
                  "title": "ServiceArtifactSpec",
                  "type": "object"
                },
                {
                  "additionalProperties": false,
                  "properties": {
                    "containerGroups": {
                      "default": [],
                      "description": "List of container groups.",
                      "items": {
                        "additionalProperties": false,
                        "properties": {
                          "containers": {
                            "default": [],
                            "description": "List of containers making this container group.",
                            "items": {
                              "additionalProperties": false,
                              "properties": {
                                "build": {
                                  "anyOf": [
                                    {
                                      "additionalProperties": false,
                                      "description": "Build reference embedded in a container spec when an image build is triggered.",
                                      "properties": {
                                        "artifactImageBuildId": {
                                          "description": "Artifact image build id.",
                                          "title": "Artifactimagebuildid",
                                          "type": "string"
                                        },
                                        "createdAt": {
                                          "description": "Build creation timestamp (utc).",
                                          "format": "date-time",
                                          "title": "Createdat",
                                          "type": "string"
                                        },
                                        "status": {
                                          "description": "Image build reported status at submit time.",
                                          "title": "Status",
                                          "type": "string"
                                        }
                                      },
                                      "required": [
                                        "artifactImageBuildId",
                                        "status",
                                        "createdAt"
                                      ],
                                      "title": "ContainerBuildInfo",
                                      "type": "object"
                                    },
                                    {
                                      "type": "null"
                                    }
                                  ],
                                  "description": "Server-set image build metadata (e.g. after lock or draft build trigger). workload API clears this on artifact create/update before persistence; clients must not rely on sending it."
                                },
                                "description": {
                                  "default": "",
                                  "description": "Description of the container.",
                                  "title": "Description",
                                  "type": "string"
                                },
                                "entrypoint": {
                                  "anyOf": [
                                    {
                                      "items": {
                                        "type": "string"
                                      },
                                      "type": "array"
                                    },
                                    {
                                      "type": "null"
                                    }
                                  ],
                                  "description": "Runtime entrypoint override for the container command. independent of build entrypoint.",
                                  "title": "Entrypoint"
                                },
                                "environmentVars": {
                                  "default": [],
                                  "description": "Environment variables.",
                                  "items": {
                                    "anyOf": [
                                      {
                                        "properties": {
                                          "name": {
                                            "description": "Name of the environment variable.",
                                            "title": "Name",
                                            "type": "string"
                                          },
                                          "source": {
                                            "const": "string",
                                            "default": "string",
                                            "title": "Source",
                                            "type": "string"
                                          },
                                          "value": {
                                            "description": "Value of the environment variable.",
                                            "title": "Value",
                                            "type": "string"
                                          }
                                        },
                                        "required": [
                                          "name",
                                          "value"
                                        ],
                                        "title": "StringEnvironmentVariable",
                                        "type": "object"
                                      },
                                      {
                                        "properties": {
                                          "drCredentialId": {
                                            "description": "Id of the datarobot credential to use.",
                                            "title": "DR Credential ID",
                                            "type": "string"
                                          },
                                          "key": {
                                            "description": "Key within the credential.",
                                            "title": "Key",
                                            "type": "string"
                                          },
                                          "name": {
                                            "description": "Name of the environment variable.",
                                            "title": "Name",
                                            "type": "string"
                                          },
                                          "source": {
                                            "const": "dr-credential",
                                            "title": "Source",
                                            "type": "string"
                                          }
                                        },
                                        "required": [
                                          "source",
                                          "name",
                                          "drCredentialId",
                                          "key"
                                        ],
                                        "title": "CredentialEnvironmentVariable",
                                        "type": "object"
                                      },
                                      {
                                        "description": "A platform-managed datarobot API token injected as an environment variable. the token value is resolved at proton creation (find-or-create a per-workload ``workload <workloadid>`` API key scoped to the invoking user); no value or id is supplied by the user.",
                                        "properties": {
                                          "name": {
                                            "description": "Name of the environment variable.",
                                            "title": "Name",
                                            "type": "string"
                                          },
                                          "source": {
                                            "const": "dr-api-token",
                                            "title": "Source",
                                            "type": "string"
                                          }
                                        },
                                        "required": [
                                          "source",
                                          "name"
                                        ],
                                        "title": "DrApiTokenEnvironmentVariable",
                                        "type": "object"
                                      }
                                    ]
                                  },
                                  "title": "Environmentvars",
                                  "type": "array"
                                },
                                "imageBuildConfig": {
                                  "anyOf": [
                                    {
                                      "additionalProperties": false,
                                      "description": "User-provided configuration for server-side image builds from source code.",
                                      "properties": {
                                        "codeRef": {
                                          "anyOf": [
                                            {
                                              "additionalProperties": false,
                                              "properties": {
                                                "datarobot": {
                                                  "additionalProperties": false,
                                                  "properties": {
                                                    "catalogId": {
                                                      "title": "Catalogid",
                                                      "type": "string"
                                                    },
                                                    "catalogVersionId": {
                                                      "title": "Catalogversionid",
                                                      "type": "string"
                                                    }
                                                  },
                                                  "required": [
                                                    "catalogId",
                                                    "catalogVersionId"
                                                  ],
                                                  "title": "DataRobotCodeRef",
                                                  "type": "object"
                                                },
                                                "provider": {
                                                  "const": "datarobot",
                                                  "default": "datarobot",
                                                  "title": "Provider",
                                                  "type": "string"
                                                },
                                                "type": {
                                                  "const": "datarobot",
                                                  "default": "datarobot",
                                                  "title": "Type",
                                                  "type": "string"
                                                }
                                              },
                                              "required": [
                                                "datarobot"
                                              ],
                                              "title": "CodeRef",
                                              "type": "object"
                                            },
                                            {
                                              "type": "null"
                                            }
                                          ],
                                          "description": "Reference to source code (e.g. files API catalog). optional at create time; required before build or lock."
                                        },
                                        "dockerfile": {
                                          "description": "How the dockerfile is obtained. defaults to using ./dockerfile from the source code.",
                                          "discriminator": {
                                            "mapping": {
                                              "generated": "#/components/schemas/GeneratedDockerfile",
                                              "provided": "#/components/schemas/ProvidedDockerfile"
                                            },
                                            "propertyName": "source"
                                          },
                                          "oneOf": [
                                            {
                                              "additionalProperties": false,
                                              "description": "User supplies a dockerfile in the uploaded source code.",
                                              "properties": {
                                                "path": {
                                                  "default": "./Dockerfile",
                                                  "description": "Relative path to the dockerfile in the source code. defaults to ./dockerfile.",
                                                  "title": "Path",
                                                  "type": "string"
                                                },
                                                "source": {
                                                  "const": "provided",
                                                  "default": "provided",
                                                  "title": "Source",
                                                  "type": "string"
                                                }
                                              },
                                              "title": "ProvidedDockerfile",
                                              "type": "object"
                                            },
                                            {
                                              "additionalProperties": false,
                                              "description": "System generates a dockerfile from execution environment metadata.",
                                              "properties": {
                                                "entrypoint": {
                                                  "description": "Entrypoint baked into the generated dockerfile cmd (e.g. [\"python\", \"app.py\"]).",
                                                  "items": {
                                                    "type": "string"
                                                  },
                                                  "minItems": 1,
                                                  "title": "Entrypoint",
                                                  "type": "array"
                                                },
                                                "executionEnvironmentId": {
                                                  "description": "Execution environment id used to resolve the base Docker image.",
                                                  "title": "Execution Environment ID",
                                                  "type": "string"
                                                },
                                                "executionEnvironmentVersionId": {
                                                  "description": "Execution environment version id that pins the exact base image tag.",
                                                  "title": "Execution Environment Version ID",
                                                  "type": "string"
                                                },
                                                "source": {
                                                  "const": "generated",
                                                  "default": "generated",
                                                  "title": "Source",
                                                  "type": "string"
                                                }
                                              },
                                              "required": [
                                                "executionEnvironmentId",
                                                "executionEnvironmentVersionId",
                                                "entrypoint"
                                              ],
                                              "title": "GeneratedDockerfile",
                                              "type": "object"
                                            }
                                          ],
                                          "title": "Dockerfile"
                                        }
                                      },
                                      "title": "ImageBuildConfig",
                                      "type": "object"
                                    },
                                    {
                                      "type": "null"
                                    }
                                  ],
                                  "description": "Configuration for server-side image builds from source code."
                                },
                                "imageUri": {
                                  "anyOf": [
                                    {
                                      "type": "string"
                                    },
                                    {
                                      "type": "null"
                                    }
                                  ],
                                  "description": "Docker image uri. required when imagebuildconfig is not set; server-populated after a successful image build.",
                                  "title": "Imageuri"
                                },
                                "livenessProbe": {
                                  "anyOf": [
                                    {
                                      "additionalProperties": false,
                                      "properties": {
                                        "failureThreshold": {
                                          "default": 3,
                                          "description": "Minimum consecutive failures for the probe to be considered failed.",
                                          "title": "Failurethreshold",
                                          "type": "integer"
                                        },
                                        "host": {
                                          "anyOf": [
                                            {
                                              "minLength": 0,
                                              "type": "string"
                                            },
                                            {
                                              "type": "null"
                                            }
                                          ],
                                          "description": "Host name to connect to, defaults to the pod ip.",
                                          "title": "Host"
                                        },
                                        "httpHeaders": {
                                          "additionalProperties": {
                                            "type": "string"
                                          },
                                          "description": "HTTP headers for probe.",
                                          "title": "Httpheaders",
                                          "type": "object"
                                        },
                                        "initialDelaySeconds": {
                                          "default": 30,
                                          "description": "Number of seconds to wait before the first probe is executed.",
                                          "title": "Initialdelayseconds",
                                          "type": "integer"
                                        },
                                        "path": {
                                          "description": "Url path to query for health check.",
                                          "title": "Path",
                                          "type": "string"
                                        },
                                        "periodSeconds": {
                                          "default": 30,
                                          "description": "How often (in seconds) to perform the probe.",
                                          "title": "Periodseconds",
                                          "type": "integer"
                                        },
                                        "port": {
                                          "default": 8080,
                                          "description": "Port number to access on the container.",
                                          "maximum": 65535,
                                          "minimum": 1,
                                          "title": "Port",
                                          "type": "integer"
                                        },
                                        "scheme": {
                                          "default": "HTTP",
                                          "description": "Scheme to use for connecting to the host.",
                                          "enum": [
                                            "HTTP",
                                            "HTTPS"
                                          ],
                                          "title": "Scheme",
                                          "type": "string"
                                        },
                                        "timeoutSeconds": {
                                          "default": 30,
                                          "description": "Number of seconds after which the probe times out.",
                                          "title": "Timeoutseconds",
                                          "type": "integer"
                                        }
                                      },
                                      "required": [
                                        "path"
                                      ],
                                      "title": "ProbeConfig",
                                      "type": "object"
                                    },
                                    {
                                      "type": "null"
                                    }
                                  ],
                                  "description": "Container liveness check configuration."
                                },
                                "name": {
                                  "anyOf": [
                                    {
                                      "type": "string"
                                    },
                                    {
                                      "type": "null"
                                    }
                                  ],
                                  "description": "Name of the container. lowercase letters, digits, and hyphens only; must start with a lowercase letter and end with a letter or digit; max 63 characters.",
                                  "title": "Name"
                                },
                                "port": {
                                  "anyOf": [
                                    {
                                      "maximum": 65535,
                                      "minimum": 1024,
                                      "type": "integer"
                                    },
                                    {
                                      "type": "null"
                                    }
                                  ],
                                  "description": "Container access port. when set, must be >= 1024 for security and platform compatibility reasons. primary containers must define a port; non-primary containers must omit it.",
                                  "title": "Port"
                                },
                                "primary": {
                                  "anyOf": [
                                    {
                                      "type": "boolean"
                                    },
                                    {
                                      "type": "null"
                                    }
                                  ],
                                  "default": false,
                                  "description": "Whether this is the primary container.",
                                  "title": "Primary"
                                },
                                "readinessProbe": {
                                  "anyOf": [
                                    {
                                      "additionalProperties": false,
                                      "properties": {
                                        "failureThreshold": {
                                          "default": 3,
                                          "description": "Minimum consecutive failures for the probe to be considered failed.",
                                          "title": "Failurethreshold",
                                          "type": "integer"
                                        },
                                        "host": {
                                          "anyOf": [
                                            {
                                              "minLength": 0,
                                              "type": "string"
                                            },
                                            {
                                              "type": "null"
                                            }
                                          ],
                                          "description": "Host name to connect to, defaults to the pod ip.",
                                          "title": "Host"
                                        },
                                        "httpHeaders": {
                                          "additionalProperties": {
                                            "type": "string"
                                          },
                                          "description": "HTTP headers for probe.",
                                          "title": "Httpheaders",
                                          "type": "object"
                                        },
                                        "initialDelaySeconds": {
                                          "default": 30,
                                          "description": "Number of seconds to wait before the first probe is executed.",
                                          "title": "Initialdelayseconds",
                                          "type": "integer"
                                        },
                                        "path": {
                                          "description": "Url path to query for health check.",
                                          "title": "Path",
                                          "type": "string"
                                        },
                                        "periodSeconds": {
                                          "default": 30,
                                          "description": "How often (in seconds) to perform the probe.",
                                          "title": "Periodseconds",
                                          "type": "integer"
                                        },
                                        "port": {
                                          "default": 8080,
                                          "description": "Port number to access on the container.",
                                          "maximum": 65535,
                                          "minimum": 1,
                                          "title": "Port",
                                          "type": "integer"
                                        },
                                        "scheme": {
                                          "default": "HTTP",
                                          "description": "Scheme to use for connecting to the host.",
                                          "enum": [
                                            "HTTP",
                                            "HTTPS"
                                          ],
                                          "title": "Scheme",
                                          "type": "string"
                                        },
                                        "timeoutSeconds": {
                                          "default": 30,
                                          "description": "Number of seconds after which the probe times out.",
                                          "title": "Timeoutseconds",
                                          "type": "integer"
                                        }
                                      },
                                      "required": [
                                        "path"
                                      ],
                                      "title": "ProbeConfig",
                                      "type": "object"
                                    },
                                    {
                                      "type": "null"
                                    }
                                  ],
                                  "description": "Container readiness check configuration."
                                },
                                "securityContext": {
                                  "anyOf": [
                                    {
                                      "additionalProperties": false,
                                      "description": "Container-level security context. lets workload creators tighten security constraints beyond the platform defaults. runasnonroot and runasuser are enforced by the platform and are not user-settable. elevated fields (capabilities.add, allowprivilegeescalation=true, seccompprofile.type=unconfined) require the mlops admin role; regular users may only tighten defaults — drop capabilities, enable read-only rootfs, or set a runtimedefault/localhost seccomp profile.",
                                      "properties": {
                                        "allowPrivilegeEscalation": {
                                          "anyOf": [
                                            {
                                              "type": "boolean"
                                            },
                                            {
                                              "type": "null"
                                            }
                                          ],
                                          "description": "Whether a process can gain more privileges than its parent. requires the mlops admin role to set to true.",
                                          "title": "Allowprivilegeescalation"
                                        },
                                        "capabilities": {
                                          "anyOf": [
                                            {
                                              "additionalProperties": false,
                                              "description": "Linux capabilities to add or drop from the container.",
                                              "properties": {
                                                "add": {
                                                  "anyOf": [
                                                    {
                                                      "items": {
                                                        "type": "string"
                                                      },
                                                      "type": "array"
                                                    },
                                                    {
                                                      "type": "null"
                                                    }
                                                  ],
                                                  "description": "Capabilities to add.",
                                                  "title": "Add"
                                                },
                                                "drop": {
                                                  "anyOf": [
                                                    {
                                                      "items": {
                                                        "type": "string"
                                                      },
                                                      "type": "array"
                                                    },
                                                    {
                                                      "type": "null"
                                                    }
                                                  ],
                                                  "description": "Capabilities to drop.",
                                                  "title": "Drop"
                                                }
                                              },
                                              "title": "Capabilities",
                                              "type": "object"
                                            },
                                            {
                                              "type": "null"
                                            }
                                          ],
                                          "description": "Linux capabilities to add or drop."
                                        },
                                        "readOnlyRootFilesystem": {
                                          "anyOf": [
                                            {
                                              "type": "boolean"
                                            },
                                            {
                                              "type": "null"
                                            }
                                          ],
                                          "description": "Whether the root filesystem is read-only.",
                                          "title": "Readonlyrootfilesystem"
                                        },
                                        "seccompProfile": {
                                          "anyOf": [
                                            {
                                              "additionalProperties": false,
                                              "description": "Seccomp profile configuration.",
                                              "properties": {
                                                "localhostProfile": {
                                                  "anyOf": [
                                                    {
                                                      "type": "string"
                                                    },
                                                    {
                                                      "type": "null"
                                                    }
                                                  ],
                                                  "description": "Path to a seccomp profile on the node. only valid when type is localhost.",
                                                  "title": "Localhostprofile"
                                                },
                                                "type": {
                                                  "description": "Allowed seccomp profile types.",
                                                  "enum": [
                                                    "RuntimeDefault",
                                                    "Unconfined",
                                                    "Localhost"
                                                  ],
                                                  "title": "SeccompProfileType",
                                                  "type": "string"
                                                }
                                              },
                                              "required": [
                                                "type"
                                              ],
                                              "title": "SeccompProfile",
                                              "type": "object"
                                            },
                                            {
                                              "type": "null"
                                            }
                                          ],
                                          "description": "Seccomp profile for the container."
                                        }
                                      },
                                      "title": "SecurityContext",
                                      "type": "object"
                                    },
                                    {
                                      "type": "null"
                                    }
                                  ],
                                  "description": "Container security context."
                                },
                                "startupProbe": {
                                  "anyOf": [
                                    {
                                      "additionalProperties": false,
                                      "properties": {
                                        "failureThreshold": {
                                          "default": 3,
                                          "description": "Minimum consecutive failures for the probe to be considered failed.",
                                          "title": "Failurethreshold",
                                          "type": "integer"
                                        },
                                        "host": {
                                          "anyOf": [
                                            {
                                              "minLength": 0,
                                              "type": "string"
                                            },
                                            {
                                              "type": "null"
                                            }
                                          ],
                                          "description": "Host name to connect to, defaults to the pod ip.",
                                          "title": "Host"
                                        },
                                        "httpHeaders": {
                                          "additionalProperties": {
                                            "type": "string"
                                          },
                                          "description": "HTTP headers for probe.",
                                          "title": "Httpheaders",
                                          "type": "object"
                                        },
                                        "initialDelaySeconds": {
                                          "default": 30,
                                          "description": "Number of seconds to wait before the first probe is executed.",
                                          "title": "Initialdelayseconds",
                                          "type": "integer"
                                        },
                                        "path": {
                                          "description": "Url path to query for health check.",
                                          "title": "Path",
                                          "type": "string"
                                        },
                                        "periodSeconds": {
                                          "default": 30,
                                          "description": "How often (in seconds) to perform the probe.",
                                          "title": "Periodseconds",
                                          "type": "integer"
                                        },
                                        "port": {
                                          "default": 8080,
                                          "description": "Port number to access on the container.",
                                          "maximum": 65535,
                                          "minimum": 1,
                                          "title": "Port",
                                          "type": "integer"
                                        },
                                        "scheme": {
                                          "default": "HTTP",
                                          "description": "Scheme to use for connecting to the host.",
                                          "enum": [
                                            "HTTP",
                                            "HTTPS"
                                          ],
                                          "title": "Scheme",
                                          "type": "string"
                                        },
                                        "timeoutSeconds": {
                                          "default": 30,
                                          "description": "Number of seconds after which the probe times out.",
                                          "title": "Timeoutseconds",
                                          "type": "integer"
                                        }
                                      },
                                      "required": [
                                        "path"
                                      ],
                                      "title": "ProbeConfig",
                                      "type": "object"
                                    },
                                    {
                                      "type": "null"
                                    }
                                  ],
                                  "description": "Container startup check configuration."
                                }
                              },
                              "title": "Container",
                              "type": "object"
                            },
                            "title": "Containers",
                            "type": "array"
                          },
                          "name": {
                            "default": "default",
                            "description": "Name of the container group. Used as the lookup key for runtime overrides. Lowercase letters, digits, and hyphens only; must start with a lowercase letter and end with a letter or digit; max 63 characters.",
                            "title": "Name",
                            "type": "string"
                          }
                        },
                        "title": "ContainerGroup",
                        "type": "object"
                      },
                      "title": "Containergroups",
                      "type": "array"
                    },
                    "storage": {
                      "anyOf": [
                        {
                          "additionalProperties": false,
                          "description": "Model weight storage configuration for NIM artifacts.",
                          "properties": {
                            "mode": {
                              "default": "dedicatedPvc",
                              "description": "Storage mode for model weights. `dedicatedPvc` (default) provisions a separate PVC owned exclusively by this workload. `nimCache` reuses a single cluster-wide PVC per model image, shared across all workloads using the same model.",
                              "enum": [
                                "dedicatedPvc",
                                "nimCache"
                              ],
                              "title": "Mode",
                              "type": "string"
                            },
                            "pvcSize": {
                              "anyOf": [
                                {
                                  "pattern": "^\\d+(\\.\\d+)?(Gi|Mi|Ti)$",
                                  "type": "string"
                                },
                                {
                                  "type": "null"
                                }
                              ],
                              "description": "PVC size for dedicated storage (e.g. '150Gi'). Only applies when mode is `dedicatedPvc`. When omitted, the platform-configured default is used.",
                              "title": "Pvcsize"
                            }
                          },
                          "title": "NimStorageConfig",
                          "type": "object"
                        },
                        {
                          "type": "null"
                        }
                      ],
                      "description": "Model weight storage configuration. When omitted, defaults to a dedicated PVC provisioned exclusively for this workload."
                    },
                    "templateId": {
                      "anyOf": [
                        {
                          "type": "string"
                        },
                        {
                          "type": "null"
                        }
                      ],
                      "description": "ID of the template used to create this NIM artifact.",
                      "title": "Templateid"
                    },
                    "type": {
                      "const": "nim",
                      "default": "nim",
                      "description": "Artifact type discriminator. Injected automatically from the top-level `type` field; do not set this directly.",
                      "title": "Type",
                      "type": "string"
                    }
                  },
                  "title": "NimArtifactSpec",
                  "type": "object"
                }
              ],
              "title": "Spec"
            },
            "status": {
              "enum": [
                "draft",
                "locked"
              ],
              "title": "ArtifactStatus",
              "type": "string"
            },
            "type": {
              "description": "Discriminator for the artifact spec variant. Used to label the workload, which may be used to prioritize the best matching operator available in the cluster for scheduling. Defaults to ``service`` when omitted. - ``service``: generic service artifact. - ``nim``: NVIDIA NIM model artifact.",
              "enum": [
                "service",
                "nim"
              ],
              "title": "ArtifactType",
              "type": "string"
            }
          },
          "required": [
            "name",
            "spec"
          ],
          "title": "InputArtifact",
          "type": "object"
        },
        {
          "type": "null"
        }
      ],
      "description": "Inline artifact spec to create and deploy in one step.",
      "title": "Artifact"
    },
    "artifactId": {
      "anyOf": [
        {
          "type": "string"
        },
        {
          "type": "null"
        }
      ],
      "description": "ID of an existing artifact to deploy.",
      "title": "Artifact ID"
    },
    "description": {
      "anyOf": [
        {
          "type": "string"
        },
        {
          "type": "null"
        }
      ],
      "description": "Workload description.",
      "title": "Description"
    },
    "importance": {
      "description": "Importance level for workloads.",
      "enum": [
        "critical",
        "high",
        "moderate",
        "low"
      ],
      "title": "WorkloadImportance",
      "type": "string"
    },
    "name": {
      "description": "Workload name.",
      "maxLength": 5000,
      "minLength": 1,
      "title": "Name",
      "type": "string"
    },
    "runtime": {
      "additionalProperties": false,
      "description": "Runtime configuration for a workload. For service and NIM artifacts, all configuration is scoped inside ``containerGroups``; each entry is identified by a name that matches a group in the artifact.",
      "properties": {
        "containerGroups": {
          "description": "Per-group runtime configuration. Each entry's name must match a group in the artifact.",
          "items": {
            "additionalProperties": false,
            "description": "Runtime configuration for a single container group.",
            "properties": {
              "autoscaling": {
                "anyOf": [
                  {
                    "additionalProperties": false,
                    "description": "Autoscaling configuration for a proton.",
                    "properties": {
                      "enabled": {
                        "default": true,
                        "description": "Whether autoscaling is enabled.",
                        "title": "Enabled",
                        "type": "boolean"
                      },
                      "policies": {
                        "items": {
                          "additionalProperties": false,
                          "description": "Base class for autoscaling policies.",
                          "properties": {
                            "maxCount": {
                              "description": "Maximum number of replicas.",
                              "minimum": 0,
                              "title": "Max Count",
                              "type": "integer"
                            },
                            "minCount": {
                              "description": "Minimum number of replicas.",
                              "minimum": 0,
                              "title": "Min Count",
                              "type": "integer"
                            },
                            "priority": {
                              "anyOf": [
                                {
                                  "type": "integer"
                                },
                                {
                                  "type": "null"
                                }
                              ],
                              "description": "Policy priority when multiple policies are defined.",
                              "title": "Priority"
                            },
                            "scalingMetric": {
                              "anyOf": [
                                {
                                  "oneOf": [
                                    {
                                      "const": "cpuAverageUtilization",
                                      "description": "Scale replicas to maintain a target average CPU utilization across pods.",
                                      "title": "CPU Average Utilization"
                                    },
                                    {
                                      "const": "httpRequestsConcurrency",
                                      "description": "Scale replicas based on HTTP request concurrency using an external HTTP-aware autoscaler. The platform manages the underlying autoscaling resources on your behalf. This scaling option will scale to zero replicas when the proton is idle.",
                                      "title": "HTTP Requests Concurrency"
                                    },
                                    {
                                      "const": "gpuCacheUtilization",
                                      "description": "Scales replicas based on model-specific GPU memory cache utilization. This signal reflects how the model's KV cache is used during inference, when such metrics are exposed by the serving runtime. High cache utilization may indicate memory pressure and can be used to trigger scale-out to maintain throughput. Applicable to NIM Artifacts only.",
                                      "title": "GPU Cache Utilization"
                                    },
                                    {
                                      "const": "gpuRequestQueueDepth",
                                      "description": "Scales replicas based on the depth of the inference request queue. This metric represents the number of incoming requests waiting to be processed by the inference service. Increasing queue depth may indicate insufficient capacity and can be used to trigger additional replicas to reduce latency. Applicable to NIM Artifacts only.",
                                      "title": "GPU Request Queue Depth"
                                    }
                                  ],
                                  "title": "ScalingMetricType",
                                  "type": "string"
                                },
                                {
                                  "type": "string"
                                }
                              ],
                              "description": "Metric used for scaling decisions. Use one of the predefined values for standard autoscaling, or provide a custom metric name for NIM 2.0 workloads (e.g. 'vllm:kv_cache_usage_perc'). Custom metric names are only supported for NIM artifacts.",
                              "title": "Scaling Metric"
                            },
                            "target": {
                              "description": "Target value for the scaling metric.",
                              "minimum": 0,
                              "title": "Target",
                              "type": "number"
                            }
                          },
                          "required": [
                            "scalingMetric",
                            "target",
                            "minCount",
                            "maxCount"
                          ],
                          "title": "AutoscalingPolicy",
                          "type": "object"
                        },
                        "title": "Policies",
                        "type": "array"
                      }
                    },
                    "required": [
                      "policies"
                    ],
                    "title": "AutoscalingProperties",
                    "type": "object"
                  },
                  {
                    "type": "null"
                  }
                ],
                "description": "Autoscaling configuration for this group. Takes precedence over `replicaCount`."
              },
              "bundleSelectionPolicy": {
                "enum": [
                  "availability"
                ],
                "title": "BundleSelectionPolicy",
                "type": "string"
              },
              "containers": {
                "description": "Per-container overrides for this group.",
                "items": {
                  "additionalProperties": false,
                  "description": "Runtime diff targeting a single named container within a group.",
                  "properties": {
                    "name": {
                      "description": "Container name. Must match a container declared in the artifact group.",
                      "title": "Name",
                      "type": "string"
                    },
                    "resourceAllocation": {
                      "anyOf": [
                        {
                          "additionalProperties": false,
                          "description": "Per-container resource allocation declared at runtime.",
                          "properties": {
                            "cpu": {
                              "anyOf": [
                                {
                                  "minimum": 0.1,
                                  "type": "number"
                                },
                                {
                                  "type": "null"
                                }
                              ],
                              "description": "CPU cores allocated to this container.",
                              "title": "Cpu"
                            },
                            "gpu": {
                              "anyOf": [
                                {
                                  "minimum": 0,
                                  "type": "number"
                                },
                                {
                                  "type": "null"
                                }
                              ],
                              "description": "GPUs allocated to this container.",
                              "title": "Gpu"
                            },
                            "memory": {
                              "anyOf": [
                                {
                                  "pattern": "^\\s*(\\d*\\.?\\d+)\\s*(\\w+)?",
                                  "type": "string"
                                },
                                {
                                  "minimum": 0,
                                  "type": "integer"
                                },
                                {
                                  "type": "null"
                                }
                              ],
                              "description": "RAM allocated to this container. Accepts a human-readable string with one of the units B, KB, MB, GB (1000-based), e.g. '8GB', '512MB'. Also accepts raw byte integers.",
                              "examples": [
                                "8GB",
                                "512MB"
                              ],
                              "title": "Memory"
                            }
                          },
                          "title": "ResourceAllocation",
                          "type": "object"
                        },
                        {
                          "type": "null"
                        }
                      ],
                      "description": "Resource allocation for this container. Required for multi-container groups."
                    }
                  },
                  "required": [
                    "name"
                  ],
                  "title": "ContainerOverride",
                  "type": "object"
                },
                "title": "Containers",
                "type": "array"
              },
              "name": {
                "default": "default",
                "description": "Group name. Must match a container group name declared in the artifact.",
                "title": "Name",
                "type": "string"
              },
              "replicaCount": {
                "anyOf": [
                  {
                    "minimum": 1,
                    "type": "integer"
                  },
                  {
                    "type": "null"
                  }
                ],
                "default": 1,
                "description": "Number of replicas. Cannot be set alongside `autoscaling.enabled=true`.",
                "title": "Replicacount"
              },
              "resolvedBundle": {
                "anyOf": [
                  {
                    "description": "Bundle details returned in the runtime response after scheduling.",
                    "properties": {
                      "cpuCount": {
                        "description": "Number of CPU cores.",
                        "title": "CPU Count",
                        "type": "number"
                      },
                      "gpuCount": {
                        "default": 0,
                        "description": "Number of GPU units.",
                        "title": "GPU Count",
                        "type": "integer"
                      },
                      "gpuMaker": {
                        "anyOf": [
                          {
                            "type": "string"
                          },
                          {
                            "type": "null"
                          }
                        ],
                        "description": "GPU manufacturer.",
                        "title": "GPU Maker"
                      },
                      "gpuTypeLabel": {
                        "anyOf": [
                          {
                            "type": "string"
                          },
                          {
                            "type": "null"
                          }
                        ],
                        "description": "GPU type label.",
                        "title": "GPU Type Label"
                      },
                      "id": {
                        "description": "Bundle identifier that was selected.",
                        "title": "Id",
                        "type": "string"
                      },
                      "memoryBytes": {
                        "description": "Memory size in bytes.",
                        "title": "Memory Bytes",
                        "type": "integer"
                      }
                    },
                    "required": [
                      "id",
                      "cpuCount",
                      "memoryBytes"
                    ],
                    "title": "ResolvedBundle",
                    "type": "object"
                  },
                  {
                    "type": "null"
                  }
                ],
                "description": "Full details of the bundle selected at scheduling time. Read-only.",
                "readOnly": true
              },
              "resourceBundles": {
                "description": "Ordered list of bundle IDs. One is selected at scheduling time.",
                "items": {
                  "type": "string"
                },
                "title": "Resourcebundles",
                "type": "array"
              }
            },
            "title": "GroupRuntime",
            "type": "object"
          },
          "title": "Containergroups",
          "type": "array"
        }
      },
      "title": "WorkloadRuntime",
      "type": "object"
    }
  },
  "required": [
    "name"
  ],
  "title": "CreateWorkloadRequest",
  "type": "object"
}
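
The schema above states several constraints that are easy to check on the client before sending a request. The following Python sketch covers four of them: the `name` length bounds, the `importance` enum, the `pvcSize` pattern, and the rule that `replicaCount` cannot be combined with enabled autoscaling. The function name `check_create_workload_request`, the sample artifact ID, and the sample target value are hypothetical and used only for illustration; the server remains the authority on validation.

```python
import re

# Constraints copied from the CreateWorkloadRequest schema above.
IMPORTANCE_LEVELS = {"critical", "high", "moderate", "low"}
PVC_SIZE_PATTERN = re.compile(r"^\d+(\.\d+)?(Gi|Mi|Ti)$")


def check_create_workload_request(body: dict) -> list[str]:
    """Return a list of problems found in a CreateWorkloadRequest-shaped dict."""
    problems = []

    # name is the only required field: a string of 1 to 5000 characters.
    name = body.get("name")
    if not isinstance(name, str) or not 1 <= len(name) <= 5000:
        problems.append("name is required and must be 1 to 5000 characters")

    importance = body.get("importance")
    if importance is not None and importance not in IMPORTANCE_LEVELS:
        problems.append(f"importance must be one of {sorted(IMPORTANCE_LEVELS)}")

    # For NIM artifacts, storage.pvcSize sits under artifact.spec.
    spec = (body.get("artifact") or {}).get("spec") or {}
    pvc_size = (spec.get("storage") or {}).get("pvcSize")
    if pvc_size is not None and not PVC_SIZE_PATTERN.match(pvc_size):
        problems.append(f"pvcSize {pvc_size!r} does not match the schema pattern")

    for group in (body.get("runtime") or {}).get("containerGroups", []):
        autoscaling = group.get("autoscaling")
        # enabled defaults to true when an autoscaling object is present.
        enabled = autoscaling is not None and autoscaling.get("enabled", True)
        if enabled and group.get("replicaCount") is not None:
            group_name = group.get("name", "default")
            problems.append(
                f"group {group_name!r}: replicaCount cannot be set "
                "alongside autoscaling.enabled=true"
            )
    return problems


request = {
    "name": "demo-workload",
    "importance": "moderate",
    "artifactId": "artifact-123",  # hypothetical ID
    "runtime": {
        "containerGroups": [
            {
                "name": "default",
                "autoscaling": {
                    "policies": [
                        {
                            "scalingMetric": "cpuAverageUtilization",
                            "target": 70,  # illustrative value only
                            "minCount": 1,
                            "maxCount": 4,
                        }
                    ]
                },
            }
        ]
    },
}
print(check_create_workload_request(request))  # prints []
```

An empty list means none of the four checks failed; it does not guarantee that the server accepts the request.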

Parameters

Name In Type Required Description
body body CreateWorkloadRequest true none
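
As a rough illustration of passing the `body` parameter, the Python sketch below serializes a `CreateWorkloadRequest` as JSON and wraps it in a request object using only the standard library. Several things here are assumptions that this section does not confirm: the base URL, the `POST /workloads` path (inferred from the list operation's `GET /workloads`), bearer-token authentication, and the sample artifact ID. The helper name `build_create_workload_request` is hypothetical.

```python
import json
import urllib.request

# Assumptions, not confirmed by this section: the base URL, the POST /workloads
# path, and bearer-token authentication.
BASE_URL = "https://example.invalid/api"  # hypothetical placeholder
API_TOKEN = "YOUR_TOKEN"  # hypothetical placeholder


def build_create_workload_request(body: dict) -> urllib.request.Request:
    """Serialize the body as JSON and wrap it in a POST request object."""
    return urllib.request.Request(
        url=f"{BASE_URL}/workloads",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {API_TOKEN}",
        },
        method="POST",
    )


req = build_create_workload_request(
    {"name": "demo-workload", "artifactId": "artifact-123"}  # hypothetical ID
)

# Sending is left commented out because the endpoint above is a placeholder.
# A successful creation is documented as a 201 response:
# with urllib.request.urlopen(req) as resp:
#     workload = json.load(resp)
```

Building the request separately from sending it keeps the payload easy to inspect or log before any network call is made.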

Example responses

201 Response

{
  "additionalProperties": false,
  "description": "API representation of a workload. This is the formatted version returned to clients, excluding internal fields and including computed properties like permissions and statistics.",
  "properties": {
    "artifact": {
      "anyOf": [
        {
          "description": "Artifact basic information.",
          "properties": {
            "artifactRepositoryId": {
              "anyOf": [
                {
                  "type": "string"
                },
                {
                  "type": "null"
                }
              ],
              "description": "ID of the artifact repository this artifact belongs to (for versioning).",
              "title": "Artifactrepositoryid"
            },
            "id": {
              "description": "Unique identifier of the entity.",
              "title": "Id",
              "type": "string"
            },
            "name": {
              "anyOf": [
                {
                  "type": "string"
                },
                {
                  "type": "null"
                }
              ],
              "description": "Name of the entity.",
              "title": "Name"
            },
            "status": {
              "anyOf": [
                {
                  "enum": [
                    "draft",
                    "locked"
                  ],
                  "title": "ArtifactStatus",
                  "type": "string"
                },
                {
                  "type": "null"
                }
              ],
              "description": "Artifact status."
            },
            "templateId": {
              "anyOf": [
                {
                  "type": "string"
                },
                {
                  "type": "null"
                }
              ],
              "description": "ID of the template used to create this artifact.",
              "title": "Templateid"
            },
            "type": {
              "anyOf": [
                {
                  "description": "Discriminator for the artifact spec variant. Used to label the workload, which may be used to prioritize the best matching operator available in the cluster for scheduling. Defaults to ``service`` when omitted. - ``service``: generic service artifact. - ``nim``: NVIDIA NIM model artifact.",
                  "enum": [
                    "service",
                    "nim"
                  ],
                  "title": "ArtifactType",
                  "type": "string"
                },
                {
                  "type": "null"
                }
              ],
              "description": "Artifact type."
            },
            "version": {
              "anyOf": [
                {
                  "type": "integer"
                },
                {
                  "type": "null"
                }
              ],
              "description": "Version number of the artifact (set only for locked artifacts).",
              "title": "Version"
            }
          },
          "required": [
            "id"
          ],
          "title": "ArtifactInfoFormatted",
          "type": "object"
        },
        {
          "type": "null"
        }
      ],
      "description": "Basic information about the currently active artifact for this workload.",
      "title": "Artifact"
    },
    "artifactId": {
      "anyOf": [
        {
          "type": "string"
        },
        {
          "type": "null"
        }
      ],
      "description": "ID of the currently active artifact for this workload.",
      "title": "Artifact ID"
    },
    "createdAt": {
      "description": "Timestamp of when the entity was created.",
      "format": "date-time",
      "title": "Created At",
      "type": "string"
    },
    "creator": {
      "anyOf": [
        {
          "additionalProperties": false,
          "description": "User information embedded in API responses.",
          "properties": {
            "email": {
              "anyOf": [
                {
                  "type": "string"
                },
                {
                  "type": "null"
                }
              ],
              "description": "User email address.",
              "title": "Email"
            },
            "fullName": {
              "anyOf": [
                {
                  "type": "string"
                },
                {
                  "type": "null"
                }
              ],
              "description": "User's full name.",
              "title": "Full Name"
            },
            "id": {
              "description": "User ID associated with this resource.",
              "title": "User ID",
              "type": "string"
            },
            "userhash": {
              "anyOf": [
                {
                  "type": "string"
                },
                {
                  "type": "null"
                }
              ],
              "description": "User's Gravatar hash.",
              "title": "Userhash"
            },
            "username": {
              "anyOf": [
                {
                  "type": "string"
                },
                {
                  "type": "null"
                }
              ],
              "description": "Username.",
              "title": "Username"
            }
          },
          "required": [
            "id"
          ],
          "title": "UserData",
          "type": "object"
        },
        {
          "type": "null"
        }
      ],
      "description": "Owner user details, including ID, username, and email.",
      "title": "Creator"
    },
    "description": {
      "anyOf": [
        {
          "type": "string"
        },
        {
          "type": "null"
        }
      ],
      "default": "",
      "description": "Workload description.",
      "title": "Description"
    },
    "endpoint": {
      "anyOf": [
        {
          "type": "string"
        },
        {
          "type": "null"
        }
      ],
      "description": "Workload endpoint URL.",
      "title": "Endpoint"
    },
    "id": {
      "description": "Unique identifier of the entity.",
      "title": "ID",
      "type": "string"
    },
    "importance": {
      "description": "Importance level for workloads.",
      "enum": [
        "critical",
        "high",
        "moderate",
        "low"
      ],
      "title": "WorkloadImportance",
      "type": "string"
    },
    "lastResponse": {
      "anyOf": [
        {
          "format": "date-time",
          "type": "string"
        },
        {
          "type": "null"
        }
      ],
      "description": "Timestamp of the last response received from this workload.",
      "title": "Last Response Time"
    },
    "name": {
      "description": "Name of the entity.",
      "title": "Name",
      "type": "string"
    },
    "owners": {
      "description": "List of workload owners.",
      "items": {
        "additionalProperties": false,
        "description": "User information embedded in API responses.",
        "properties": {
          "email": {
            "anyOf": [
              {
                "type": "string"
              },
              {
                "type": "null"
              }
            ],
            "description": "User email address.",
            "title": "Email"
          },
          "fullName": {
            "anyOf": [
              {
                "type": "string"
              },
              {
                "type": "null"
              }
            ],
            "description": "User's full name.",
            "title": "Full Name"
          },
          "id": {
            "description": "User ID associated with this resource.",
            "title": "User ID",
            "type": "string"
          },
          "userhash": {
            "anyOf": [
              {
                "type": "string"
              },
              {
                "type": "null"
              }
            ],
            "description": "User's Gravatar hash.",
            "title": "Userhash"
          },
          "username": {
            "anyOf": [
              {
                "type": "string"
              },
              {
                "type": "null"
              }
            ],
            "description": "Username.",
            "title": "Username"
          }
        },
        "required": [
          "id"
        ],
        "title": "UserData",
        "type": "object"
      },
      "title": "Owners",
      "type": "array"
    },
    "permissions": {
      "anyOf": [
        {
          "items": {
            "description": "Represents the particular role a user, group or organization holds on an entity.",
            "enum": [
              "CAN_VIEW",
              "CAN_UPDATE",
              "CAN_DELETE",
              "CAN_SHARE",
              "CAN_MAKE_PREDICTIONS",
              "CAN_SHARE_ROLE_OWNER",
              "CAN_SHARE_ROLE_READ_WRITE",
              "CAN_SHARE_ROLE_READ_ONLY"
            ],
            "title": "ResourcePermission",
            "type": "string"
          },
          "type": "array"
        },
        {
          "items": {
            "const": "*",
            "type": "string"
          },
          "type": "array"
        }
      ]
    },
    "protonId": {
      "anyOf": [
        {
          "type": "string"
        },
        {
          "type": "null"
        }
      ],
      "description": "ID of the currently active proton for this workload.",
      "title": "Proton ID"
    },
    "replacement": {
      "anyOf": [
        {
          "description": "Formatted replacement information for API responses.",
          "properties": {
            "candidateProtonIds": {
              "description": "IDs of protons pending promotion during artifact replacement.",
              "items": {
                "type": "string"
              },
              "title": "Candidateprotonids",
              "type": "array"
            },
            "status": {
              "description": "Statuses for workload replacement process.",
              "enum": [
                "unknown",
                "submitted",
                "initializing",
                "awaiting_promotion",
                "switching",
                "deleting",
                "completed",
                "errored",
                "cleaning_up"
              ],
              "title": "ReplacementStatus",
              "type": "string"
            },
            "strategy": {
              "description": "Types of replacement strategies. `rolling` - the new proton is deployed alongside the old one, and traffic is switched to the new proton once it is ready. The old proton is then decommissioned.",
              "enum": [
                "rolling"
              ],
              "title": "ReplacementStrategy",
              "type": "string"
            }
          },
          "title": "WorkloadReplacementFormatted",
          "type": "object"
        },
        {
          "type": "null"
        }
      ],
      "description": "Information about an active replacement process for this workload, if any.",
      "title": "Replacement"
    },
    "requestStats": {
      "anyOf": [
        {
          "additionalProperties": false,
          "description": "Request statistics summary.",
          "properties": {
            "concurrentRequests": {
              "default": 0,
              "description": "Number of concurrent requests.",
              "title": "Concurrentrequests",
              "type": "integer"
            },
            "errorRate": {
              "default": 0,
              "description": "Error rate percentage.",
              "title": "Errorrate",
              "type": "number"
            },
            "errorRates": {
              "description": "Error rates over the last 7 time periods.",
              "items": {
                "type": "integer"
              },
              "title": "Errorrates",
              "type": "array"
            },
            "lastRequestAt": {
              "anyOf": [
                {
                  "format": "date-time",
                  "type": "string"
                },
                {
                  "type": "null"
                }
              ],
              "description": "Timestamp of the last request.",
              "title": "Lastrequestat"
            },
            "requestRates": {
              "description": "Request rates over the last 7 time periods.",
              "items": {
                "type": "integer"
              },
              "title": "Requestrates",
              "type": "array"
            },
            "responseTime": {
              "default": 0,
              "description": "Average response time in milliseconds.",
              "title": "Responsetime",
              "type": "integer"
            },
            "totalRequests": {
              "default": 0,
              "description": "Total number of requests.",
              "title": "Totalrequests",
              "type": "integer"
            }
          },
          "title": "RequestStats",
          "type": "object"
        },
        {
          "type": "null"
        }
      ],
      "description": "Request statistics for this workload.",
      "title": "Request Stats"
    },
    "runtime": {
      "additionalProperties": false,
      "description": "Runtime configuration for a workload. For service and NIM artifacts, all configuration is scoped inside ``container_groups``, each identified by a name matching the artifact topology.",
      "properties": {
        "containerGroups": {
          "description": "Per-group runtime configuration. Each entry's name must match a group in the artifact.",
          "items": {
            "additionalProperties": false,
            "description": "Runtime configuration for a single container group.",
            "properties": {
              "autoscaling": {
                "anyOf": [
                  {
                    "additionalProperties": false,
                    "description": "Autoscaling configuration for a proton.",
                    "properties": {
                      "enabled": {
                        "default": true,
                        "description": "Whether autoscaling is enabled.",
                        "title": "Enabled",
                        "type": "boolean"
                      },
                      "policies": {
                        "items": {
                          "additionalProperties": false,
                          "description": "Base class for autoscaling policies.",
                          "properties": {
                            "maxCount": {
                              "description": "Maximum number of replicas.",
                              "minimum": 0,
                              "title": "Max Count",
                              "type": "integer"
                            },
                            "minCount": {
                              "description": "Minimum number of replicas.",
                              "minimum": 0,
                              "title": "Min Count",
                              "type": "integer"
                            },
                            "priority": {
                              "anyOf": [
                                {
                                  "type": "integer"
                                },
                                {
                                  "type": "null"
                                }
                              ],
                              "description": "Policy priority when multiple policies are defined.",
                              "title": "Priority"
                            },
                            "scalingMetric": {
                              "anyOf": [
                                {
                                  "oneOf": [
                                    {
                                      "const": "cpuAverageUtilization",
                                      "description": "Scale replicas to maintain a target average CPU utilization across pods.",
                                      "title": "CPU Average Utilization"
                                    },
                                    {
                                      "const": "httpRequestsConcurrency",
                                      "description": "Scale replicas based on HTTP request concurrency using an external HTTP-aware autoscaler. The platform manages the underlying autoscaling resources on your behalf. This scaling option will scale to zero replicas when the proton is idle.",
                                      "title": "HTTP Requests Concurrency"
                                    },
                                    {
                                      "const": "gpuCacheUtilization",
                                      "description": "Scales replicas based on model-specific GPU memory cache utilization. This signal reflects how the model's KV cache is used during inference, when such metrics are exposed by the serving runtime. High cache utilization may indicate memory pressure and can be used to trigger scale-out to maintain throughput. Applicable to NIM Artifacts only.",
                                      "title": "GPU Cache Utilization"
                                    },
                                    {
                                      "const": "gpuRequestQueueDepth",
                                      "description": "Scales replicas based on the depth of the inference request queue. This metric represents the number of incoming requests waiting to be processed by the inference service. Increasing queue depth may indicate insufficient capacity and can be used to trigger additional replicas to reduce latency. Applicable to NIM Artifacts only.",
                                      "title": "GPU Request Queue Depth"
                                    }
                                  ],
                                  "title": "ScalingMetricType",
                                  "type": "string"
                                },
                                {
                                  "type": "string"
                                }
                              ],
                              "description": "Metric used for scaling decisions. Use one of the predefined values for standard autoscaling, or provide a custom metric name for NIM 2.0 workloads (e.g. 'vllm:kv_cache_usage_perc'). Custom metric names are only supported for NIM artifacts.",
                              "title": "Scaling Metric"
                            },
                            "target": {
                              "description": "Target value for the scaling metric.",
                              "minimum": 0,
                              "title": "Target",
                              "type": "number"
                            }
                          },
                          "required": [
                            "scalingMetric",
                            "target",
                            "minCount",
                            "maxCount"
                          ],
                          "title": "AutoscalingPolicy",
                          "type": "object"
                        },
                        "title": "Policies",
                        "type": "array"
                      }
                    },
                    "required": [
                      "policies"
                    ],
                    "title": "AutoscalingProperties",
                    "type": "object"
                  },
                  {
                    "type": "null"
                  }
                ],
                "description": "Autoscaling configuration for this group. Takes precedence over replicaCount."
              },
              "bundleSelectionPolicy": {
                "enum": [
                  "availability"
                ],
                "title": "BundleSelectionPolicy",
                "type": "string"
              },
              "containers": {
                "description": "Per-container overrides for this group.",
                "items": {
                  "additionalProperties": false,
                  "description": "Runtime diff targeting a single named container within a group.",
                  "properties": {
                    "name": {
                      "description": "Container name. Must match a container declared in the artifact group.",
                      "title": "Name",
                      "type": "string"
                    },
                    "resourceAllocation": {
                      "anyOf": [
                        {
                          "additionalProperties": false,
                          "description": "Per-container resource allocation declared at runtime.",
                          "properties": {
                            "cpu": {
                              "anyOf": [
                                {
                                  "minimum": 0.1,
                                  "type": "number"
                                },
                                {
                                  "type": "null"
                                }
                              ],
                              "description": "CPU cores allocated to this container.",
                              "title": "Cpu"
                            },
                            "gpu": {
                              "anyOf": [
                                {
                                  "minimum": 0,
                                  "type": "number"
                                },
                                {
                                  "type": "null"
                                }
                              ],
                              "description": "GPUs allocated to this container.",
                              "title": "Gpu"
                            },
                            "memory": {
                              "anyOf": [
                                {
                                  "pattern": "^\\s*(\\d*\\.?\\d+)\\s*(\\w+)?",
                                  "type": "string"
                                },
                                {
                                  "minimum": 0,
                                  "type": "integer"
                                },
                                {
                                  "type": "null"
                                }
                              ],
                              "description": "RAM allocated to this container. Accepts a human-readable string with one of: B, KB, MB, GB (1000-based), e.g. '8GB', '512MB'. Also accepts raw byte integers.",
                              "examples": [
                                "8GB",
                                "512MB"
                              ],
                              "title": "Memory"
                            }
                          },
                          "title": "ResourceAllocation",
                          "type": "object"
                        },
                        {
                          "type": "null"
                        }
                      ],
                      "description": "Resource allocation for this container. Required for multi-container groups."
                    }
                  },
                  "required": [
                    "name"
                  ],
                  "title": "ContainerOverride",
                  "type": "object"
                },
                "title": "Containers",
                "type": "array"
              },
              "name": {
                "default": "default",
                "description": "Group name. Must match a container group name declared in the artifact.",
                "title": "Name",
                "type": "string"
              },
              "replicaCount": {
                "anyOf": [
                  {
                    "minimum": 1,
                    "type": "integer"
                  },
                  {
                    "type": "null"
                  }
                ],
                "default": 1,
                "description": "Number of replicas. Cannot be set alongside autoscaling.enabled=true.",
                "title": "Replicacount"
              },
              "resolvedBundle": {
                "anyOf": [
                  {
                    "description": "Bundle details returned in the runtime response after scheduling.",
                    "properties": {
                      "cpuCount": {
                        "description": "Number of CPU cores.",
                        "title": "CPU Count",
                        "type": "number"
                      },
                      "gpuCount": {
                        "default": 0,
                        "description": "Number of GPU units.",
                        "title": "GPU Count",
                        "type": "integer"
                      },
                      "gpuMaker": {
                        "anyOf": [
                          {
                            "type": "string"
                          },
                          {
                            "type": "null"
                          }
                        ],
                        "description": "GPU manufacturer.",
                        "title": "GPU Maker"
                      },
                      "gpuTypeLabel": {
                        "anyOf": [
                          {
                            "type": "string"
                          },
                          {
                            "type": "null"
                          }
                        ],
                        "description": "GPU type label.",
                        "title": "GPU Type Label"
                      },
                      "id": {
                        "description": "Bundle identifier that was selected.",
                        "title": "Id",
                        "type": "string"
                      },
                      "memoryBytes": {
                        "description": "Memory size in bytes.",
                        "title": "Memory Bytes",
                        "type": "integer"
                      }
                    },
                    "required": [
                      "id",
                      "cpuCount",
                      "memoryBytes"
                    ],
                    "title": "ResolvedBundle",
                    "type": "object"
                  },
                  {
                    "type": "null"
                  }
                ],
                "description": "Full details of the bundle selected at scheduling time. Read-only.",
                "readOnly": true
              },
              "resourceBundles": {
                "description": "Ordered list of bundle IDs. One is selected at scheduling time.",
                "items": {
                  "type": "string"
                },
                "title": "Resourcebundles",
                "type": "array"
              }
            },
            "title": "GroupRuntime",
            "type": "object"
          },
          "title": "Containergroups",
          "type": "array"
        }
      },
      "title": "WorkloadRuntime",
      "type": "object"
    },
    "status": {
      "description": "User-facing workload status. A subset of :class:`ProtonStatus` that excludes internal proton-lifecycle states (warming, draining, restarting), which should never be surfaced as a workload status.",
      "enum": [
        "unknown",
        "submitted",
        "provisioning",
        "launching",
        "running",
        "suspended",
        "interrupted",
        "stopping",
        "stopped",
        "errored",
        "terminated"
      ],
      "title": "WorkloadStatus",
      "type": "string"
    },
    "tags": {
      "items": {
        "additionalProperties": false,
        "properties": {
          "id": {
            "description": "Unique identifier of the tag.",
            "title": "Id",
            "type": "string"
          },
          "name": {
            "description": "Name of the tag.",
            "title": "Name",
            "type": "string"
          },
          "value": {
            "description": "Value of the tag.",
            "title": "Value",
            "type": "string"
          }
        },
        "required": [
          "id",
          "name",
          "value"
        ],
        "title": "TagInfo",
        "type": "object"
      },
      "type": "array"
    },
    "type": {
      "description": "Discriminator for the artifact spec variant. Used to label the workload, which may be used to prioritize the best matching operator available in the cluster for scheduling. Defaults to ``service`` when omitted. - ``service``: generic service artifact. - ``nim``: NVIDIA NIM model artifact.",
      "enum": [
        "service",
        "nim"
      ],
      "title": "ArtifactType",
      "type": "string"
    },
    "updatedAt": {
      "description": "Timestamp of when the entity was last updated.",
      "format": "date-time",
      "title": "Updated At",
      "type": "string"
    }
  },
  "required": [
    "id",
    "name",
    "createdAt",
    "updatedAt"
  ],
  "title": "WorkloadFormatted",
  "type": "object"
}
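
The `AutoscalingPolicy` schema above marks `scalingMetric`, `target`, `minCount` and `maxCount` as required and sets a minimum of 0 on `target`, `minCount` and `maxCount`. A client-side pre-check of a policy before sending it could look like the sketch below. The function name `policy_problems` is hypothetical, and the rule that `minCount` should not exceed `maxCount` is an assumption that the schema does not state.

```python
# Required fields and minima are taken from the AutoscalingPolicy schema.
REQUIRED_POLICY_FIELDS = ("scalingMetric", "target", "minCount", "maxCount")
NON_NEGATIVE_FIELDS = ("target", "minCount", "maxCount")


def policy_problems(policy):
    """Return a list of problems found in one autoscaling policy dict."""
    problems = [
        f"missing {name}" for name in REQUIRED_POLICY_FIELDS if name not in policy
    ]
    for name in NON_NEGATIVE_FIELDS:
        value = policy.get(name)
        if isinstance(value, (int, float)) and value < 0:
            problems.append(f"{name} must be >= 0")
    # Assumption (not in the schema): minCount should not exceed maxCount.
    low, high = policy.get("minCount"), policy.get("maxCount")
    if isinstance(low, int) and isinstance(high, int) and low > high:
        problems.append("minCount exceeds maxCount")
    return problems
```

An empty list means the policy passed these checks; the server may still reject it for reasons this sketch does not cover.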

Responses

Status Meaning Description Schema
201 Created Successful Response WorkloadFormatted
400 Bad Request Bad request None
401 Unauthorized Unauthenticated None
403 Forbidden Insufficient permissions None
404 Not Found Workload not found None
422 Unprocessable Entity Validation Error HTTPValidationError
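
The `memory` field of `ResourceAllocation` in the schema above accepts either a raw byte integer or a string such as `8GB` with 1000-based units. The sketch below shows one way a client might normalize such values to bytes using the pattern from the schema. The function name `memory_to_bytes` is hypothetical, and treating a missing unit as bytes and matching units case-insensitively are assumptions rather than documented behavior.

```python
import re

# Pattern copied from the `memory` schema; units are 1000-based per its description.
MEMORY_PATTERN = re.compile(r"^\s*(\d*\.?\d+)\s*(\w+)?")
UNIT_FACTORS = {"b": 1, "kb": 1000, "mb": 1000**2, "gb": 1000**3}


def memory_to_bytes(value):
    """Convert a memory value (string like '8GB' or a raw byte integer) to bytes."""
    if isinstance(value, int):
        if value < 0:
            raise ValueError("memory must be non-negative")
        return value
    match = MEMORY_PATTERN.match(value)
    if match is None:
        raise ValueError(f"unrecognized memory value: {value!r}")
    number, unit = match.groups()
    # Assumptions: no unit means bytes; unit matching ignores case.
    factor = UNIT_FACTORS.get((unit or "b").lower())
    if factor is None:
        raise ValueError(f"unsupported unit: {unit!r}")
    return int(float(number) * factor)
```

For example, `memory_to_bytes("512MB")` gives 512000000, while a unit outside the four listed ones raises `ValueError`.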

Get All Workloads Stats

Operation path: GET /workloads/stats

Get aggregated statistics across all workloads accessible to the authenticated user.

Parameters

Name In Type Required Description
createdBy query any false Filters by those created by the given user.
ids query any false Filter by specific IDs
search query any false Case insensitive search against name, description and partial ID.
tagKeys query any false List of tag keys to filter for. If multiple values are specified, results with tags that match any of the values will be returned.
tagValues query any false List of tag values to filter for. If multiple values are specified, results with tags that match any of the values will be returned.
orderBy query any false The order to sort the results.
status query any false Filters workloads by status.
artifactStatus query any false Filters workloads by their corresponding artifact status.
importance query any false Filters workloads by their importance.
artifactId query any false Filter workloads by their active artifact ID.
repositoryId query any false Filter workloads by their active artifact's repository ID.
type query any false Filters workloads by artifact type.

Example responses

200 Response

{
  "additionalProperties": false,
  "description": "Response containing workload statistics.",
  "properties": {
    "concurrentRequests": {
      "additionalProperties": false,
      "description": "Workload concurrent requests statistics.",
      "properties": {
        "count": {
          "default": 0,
          "description": "Number of concurrent requests.",
          "title": "Count",
          "type": "integer"
        },
        "trend": {
          "default": 0,
          "description": "Trend indicator (positive = increasing).",
          "title": "Trend",
          "type": "number"
        }
      },
      "title": "WorkloadsConcurrentRequestsStats",
      "type": "object"
    },
    "errorRate": {
      "additionalProperties": false,
      "description": "Workload error rate statistics.",
      "properties": {
        "rate": {
          "default": 0,
          "description": "Error rate percentage.",
          "title": "Rate",
          "type": "number"
        },
        "trend": {
          "default": 0,
          "description": "Trend indicator (positive = increasing).",
          "title": "Trend",
          "type": "number"
        }
      },
      "title": "WorkloadsErrorRateStats",
      "type": "object"
    },
    "requests": {
      "additionalProperties": false,
      "description": "Workload request statistics.",
      "properties": {
        "failed": {
          "default": 0,
          "description": "Number of failed requests.",
          "title": "Failed",
          "type": "integer"
        },
        "succeeded": {
          "default": 0,
          "description": "Number of successful requests.",
          "title": "Succeeded",
          "type": "integer"
        },
        "total": {
          "default": 0,
          "description": "Total number of requests.",
          "title": "Total",
          "type": "integer"
        },
        "trend": {
          "default": 0,
          "description": "Trend indicator (positive = increasing).",
          "title": "Trend",
          "type": "number"
        }
      },
      "title": "WorkloadsRequestsStats",
      "type": "object"
    },
    "responseTime": {
      "additionalProperties": false,
      "description": "Workload response time statistics.",
      "properties": {
        "millis": {
          "default": 0,
          "description": "Response time in milliseconds.",
          "title": "Millis",
          "type": "integer"
        },
        "trend": {
          "default": 0,
          "description": "Trend indicator (positive = increasing).",
          "title": "Trend",
          "type": "number"
        }
      },
      "title": "WorkloadsResponseTimeStats",
      "type": "object"
    },
    "workloads": {
      "additionalProperties": false,
      "description": "Workload count statistics.",
      "properties": {
        "active": {
          "default": 0,
          "description": "Number of active workloads.",
          "title": "Active",
          "type": "integer"
        },
        "total": {
          "default": 0,
          "description": "Total number of workloads.",
          "title": "Total",
          "type": "integer"
        }
      },
      "title": "WorkloadsCountStats",
      "type": "object"
    }
  },
  "title": "WorkloadStatsResponse",
  "type": "object"
}

Responses

Status Meaning Description Schema
200 OK Successful Response WorkloadStatsResponse
400 Bad Request Bad request None
401 Unauthorized Unauthenticated None
403 Forbidden Insufficient permissions None
422 Unprocessable Entity Validation Error HTTPValidationError
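
As a rough illustration of calling this operation, the sketch below builds a `GET /workloads/stats` request with optional filters from the parameter table. The base URL, the bearer-token header and the helper name `build_stats_request` are placeholders, not values documented here. Encoding list filters as repeated query keys is also an assumption.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Placeholder base URL: replace with the address of your deployment.
BASE_URL = "https://example.com/api"


def build_stats_request(token, **filters):
    """Build a GET /workloads/stats request, skipping filters set to None."""
    params = {key: value for key, value in filters.items() if value is not None}
    # doseq=True repeats the key for list values, e.g. tagKeys=a&tagKeys=b
    # (assumed encoding for multi-value filters).
    query = urlencode(params, doseq=True)
    url = f"{BASE_URL}/workloads/stats" + (f"?{query}" if query else "")
    # Assumed auth scheme: bearer token in the Authorization header.
    return Request(url, method="GET", headers={"Authorization": f"Bearer {token}"})
```

The returned object can be passed to `urllib.request.urlopen`, and a 200 response body should decode to the `WorkloadStatsResponse` shape shown above.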

Delete Workload By Workload ID

Operation path: DELETE /workloads/{workload_id}

Delete a workload permanently.

Returns 204 on success. Proton stop failures are logged internally but do not change the response status.

Parameters

Name In Type Required Description
workload_id path string true Workload ID

Example responses

422 Response

{
  "properties": {
    "detail": {
      "items": {
        "properties": {
          "ctx": {
            "title": "Context",
            "type": "object"
          },
          "input": {
            "title": "Input"
          },
          "loc": {
            "items": {
              "anyOf": [
                {
                  "type": "string"
                },
                {
                  "type": "integer"
                }
              ]
            },
            "title": "Location",
            "type": "array"
          },
          "msg": {
            "title": "Message",
            "type": "string"
          },
          "type": {
            "title": "Error Type",
            "type": "string"
          }
        },
        "required": [
          "loc",
          "msg",
          "type"
        ],
        "title": "ValidationError",
        "type": "object"
      },
      "title": "Detail",
      "type": "array"
    }
  },
  "title": "HTTPValidationError",
  "type": "object"
}

Responses

Status Meaning Description Schema
204 No Content Successful Response None
400 Bad Request Bad request None
401 Unauthorized Unauthenticated None
403 Forbidden Insufficient permissions None
404 Not Found Workload not found None
422 Unprocessable Entity Validation Error HTTPValidationError
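
A minimal sketch of issuing the delete and interpreting the status codes from the table above follows. The base URL, the bearer-token header and both helper names are hypothetical. The status descriptions are copied from the Responses table.

```python
from urllib.parse import quote
from urllib.request import Request

# Placeholder base URL: replace with the address of your deployment.
BASE_URL = "https://example.com/api"

# Status meanings taken from the Responses table for DELETE /workloads/{workload_id}.
DELETE_OUTCOMES = {
    204: "deleted",
    400: "bad request",
    401: "unauthenticated",
    403: "insufficient permissions",
    404: "workload not found",
    422: "validation error",
}


def build_delete_request(workload_id, token):
    """Build a DELETE request for one workload, escaping the ID for use in a path."""
    url = f"{BASE_URL}/workloads/{quote(workload_id, safe='')}"
    # Assumed auth scheme: bearer token in the Authorization header.
    return Request(url, method="DELETE", headers={"Authorization": f"Bearer {token}"})


def describe_delete_status(status):
    """Map an HTTP status code to the meaning documented for this operation."""
    return DELETE_OUTCOMES.get(status, f"unexpected status {status}")
```

Because a 204 response has no body, the caller only needs the status code. Note that `urllib.request.urlopen` raises `HTTPError` for 4xx codes, so the code would be read from the exception's `code` attribute in that case.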

Get Workload By Workload ID

Operation path: GET /workloads/{workload_id}

Retrieve a workload by ID.

Parameters

Name In Type Required Description
workload_id path string true Workload ID
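
The sketch below shows one possible way to fetch a workload and sanity-check the decoded body. The base URL, the bearer-token header and the helper names are hypothetical. The list of required fields mirrors the `required` array of the `WorkloadFormatted` schema.

```python
import json
from urllib.parse import quote
from urllib.request import Request

# Placeholder base URL: replace with the address of your deployment.
BASE_URL = "https://example.com/api"

# Fields listed under "required" in the WorkloadFormatted schema.
REQUIRED_FIELDS = ("id", "name", "createdAt", "updatedAt")


def build_get_request(workload_id, token):
    """Build a GET /workloads/{workload_id} request."""
    url = f"{BASE_URL}/workloads/{quote(workload_id, safe='')}"
    # Assumed auth scheme: bearer token in the Authorization header.
    return Request(url, method="GET", headers={"Authorization": f"Bearer {token}"})


def parse_workload(body):
    """Decode a JSON response body and check the schema's required fields."""
    workload = json.loads(body)
    missing = [name for name in REQUIRED_FIELDS if name not in workload]
    if missing:
        raise ValueError(f"response is missing required fields: {missing}")
    return workload
```

Optional fields such as `endpoint` or `description` may be `null`, so callers should read them with `workload.get(...)` rather than assume a string.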

Example responses

200 Response

{
  "additionalProperties": false,
  "description": "API representation of a workload. This is the formatted version returned to clients, excluding internal fields and including computed properties like permissions and statistics.",
  "properties": {
    "artifact": {
      "anyOf": [
        {
          "description": "Artifact basic information.",
          "properties": {
            "artifactRepositoryId": {
              "anyOf": [
                {
                  "type": "string"
                },
                {
                  "type": "null"
                }
              ],
              "description": "ID of the artifact repository this artifact belongs to (for versioning).",
              "title": "Artifactrepositoryid"
            },
            "id": {
              "description": "Unique identifier of the entity.",
              "title": "Id",
              "type": "string"
            },
            "name": {
              "anyOf": [
                {
                  "type": "string"
                },
                {
                  "type": "null"
                }
              ],
              "description": "Name of the entity.",
              "title": "Name"
            },
            "status": {
              "anyOf": [
                {
                  "enum": [
                    "draft",
                    "locked"
                  ],
                  "title": "ArtifactStatus",
                  "type": "string"
                },
                {
                  "type": "null"
                }
              ],
              "description": "Artifact status."
            },
            "templateId": {
              "anyOf": [
                {
                  "type": "string"
                },
                {
                  "type": "null"
                }
              ],
              "description": "ID of the template used to create this artifact.",
              "title": "Templateid"
            },
            "type": {
              "anyOf": [
                {
                  "description": "Discriminator for the artifact spec variant. Used to label the workload, which may be used to prioritize the best matching operator available in the cluster for scheduling. Defaults to ``service`` when omitted. - ``service``: generic service artifact. - ``nim``: NVIDIA NIM model artifact.",
                  "enum": [
                    "service",
                    "nim"
                  ],
                  "title": "ArtifactType",
                  "type": "string"
                },
                {
                  "type": "null"
                }
              ],
              "description": "Artifact type."
            },
            "version": {
              "anyOf": [
                {
                  "type": "integer"
                },
                {
                  "type": "null"
                }
              ],
              "description": "Version number of the artifact (set only for locked artifacts).",
              "title": "Version"
            }
          },
          "required": [
            "id"
          ],
          "title": "ArtifactInfoFormatted",
          "type": "object"
        },
        {
          "type": "null"
        }
      ],
      "description": "Basic information about the currently active artifact for this workload.",
      "title": "Artifact"
    },
    "artifactId": {
      "anyOf": [
        {
          "type": "string"
        },
        {
          "type": "null"
        }
      ],
      "description": "ID of the currently active artifact for this workload.",
      "title": "Artifact ID"
    },
    "createdAt": {
      "description": "Timestamp of when the entity was created.",
      "format": "date-time",
      "title": "Created At",
      "type": "string"
    },
    "creator": {
      "anyOf": [
        {
          "additionalProperties": false,
          "description": "User information embedded in API responses.",
          "properties": {
            "email": {
              "anyOf": [
                {
                  "type": "string"
                },
                {
                  "type": "null"
                }
              ],
              "description": "User email address.",
              "title": "Email"
            },
            "fullName": {
              "anyOf": [
                {
                  "type": "string"
                },
                {
                  "type": "null"
                }
              ],
              "description": "User's full name.",
              "title": "Full Name"
            },
            "id": {
              "description": "User id associated with this resource.",
              "title": "User ID",
              "type": "string"
            },
            "userhash": {
              "anyOf": [
                {
                  "type": "string"
                },
                {
                  "type": "null"
                }
              ],
              "description": "User's gravatar hash.",
              "title": "Userhash"
            },
            "username": {
              "anyOf": [
                {
                  "type": "string"
                },
                {
                  "type": "null"
                }
              ],
              "description": "Username.",
              "title": "Username"
            }
          },
          "required": [
            "id"
          ],
          "title": "UserData",
          "type": "object"
        },
        {
          "type": "null"
        }
      ],
      "description": "Owner user details including id, username and email.",
      "title": "Creator"
    },
    "description": {
      "anyOf": [
        {
          "type": "string"
        },
        {
          "type": "null"
        }
      ],
      "default": "",
      "description": "Workload description.",
      "title": "Description"
    },
    "endpoint": {
      "anyOf": [
        {
          "type": "string"
        },
        {
          "type": "null"
        }
      ],
      "description": "Workload endpoint url.",
      "title": "Endpoint"
    },
    "id": {
      "description": "Unique identifier of the entity.",
      "title": "ID",
      "type": "string"
    },
    "importance": {
      "description": "Importance level for workloads.",
      "enum": [
        "critical",
        "high",
        "moderate",
        "low"
      ],
      "title": "WorkloadImportance",
      "type": "string"
    },
    "lastResponse": {
      "anyOf": [
        {
          "format": "date-time",
          "type": "string"
        },
        {
          "type": "null"
        }
      ],
      "description": "Timestamp of the last response received from this workload.",
      "title": "Last Response Time"
    },
    "name": {
      "description": "Name of the entity.",
      "title": "Name",
      "type": "string"
    },
    "owners": {
      "description": "List of workload owners.",
      "items": {
        "additionalProperties": false,
        "description": "User information embedded in API responses.",
        "properties": {
          "email": {
            "anyOf": [
              {
                "type": "string"
              },
              {
                "type": "null"
              }
            ],
            "description": "User email address.",
            "title": "Email"
          },
          "fullName": {
            "anyOf": [
              {
                "type": "string"
              },
              {
                "type": "null"
              }
            ],
            "description": "User's full name.",
            "title": "Full Name"
          },
          "id": {
            "description": "User id associated with this resource.",
            "title": "User ID",
            "type": "string"
          },
          "userhash": {
            "anyOf": [
              {
                "type": "string"
              },
              {
                "type": "null"
              }
            ],
            "description": "User's gravatar hash.",
            "title": "Userhash"
          },
          "username": {
            "anyOf": [
              {
                "type": "string"
              },
              {
                "type": "null"
              }
            ],
            "description": "Username.",
            "title": "Username"
          }
        },
        "required": [
          "id"
        ],
        "title": "UserData",
        "type": "object"
      },
      "title": "Owners",
      "type": "array"
    },
    "permissions": {
      "anyOf": [
        {
          "items": {
            "description": "Represents the particular role a user, group or organization holds on an entity.",
            "enum": [
              "CAN_VIEW",
              "CAN_UPDATE",
              "CAN_DELETE",
              "CAN_SHARE",
              "CAN_MAKE_PREDICTIONS",
              "CAN_SHARE_ROLE_OWNER",
              "CAN_SHARE_ROLE_READ_WRITE",
              "CAN_SHARE_ROLE_READ_ONLY"
            ],
            "title": "ResourcePermission",
            "type": "string"
          },
          "type": "array"
        },
        {
          "items": {
            "const": "*",
            "type": "string"
          },
          "type": "array"
        }
      ]
    },
    "protonId": {
      "anyOf": [
        {
          "type": "string"
        },
        {
          "type": "null"
        }
      ],
      "description": "Id of the currently active proton for this workload.",
      "title": "Proton ID"
    },
    "replacement": {
      "anyOf": [
        {
          "description": "Formatted replacement information for API responses.",
          "properties": {
            "candidateProtonIds": {
              "description": "Ids of protons pending promotion during artifact replacement.",
              "items": {
                "type": "string"
              },
              "title": "Candidateprotonids",
              "type": "array"
            },
            "status": {
              "description": "Statuses for workload replacement process.",
              "enum": [
                "unknown",
                "submitted",
                "initializing",
                "awaiting_promotion",
                "switching",
                "deleting",
                "completed",
                "errored",
                "cleaning_up"
              ],
              "title": "ReplacementStatus",
              "type": "string"
            },
            "strategy": {
              "description": "Types of replacement strategies. `rolling` - the new proton is deployed alongside the old one, and trafic is switched to the new proton once it is ready. the old proton is then decommissioned.",
              "enum": [
                "rolling"
              ],
              "title": "ReplacementStrategy",
              "type": "string"
            }
          },
          "title": "WorkloadReplacementFormatted",
          "type": "object"
        },
        {
          "type": "null"
        }
      ],
      "description": "Information about an active replacement process for this workload, if any.",
      "title": "Replacement"
    },
    "requestStats": {
      "anyOf": [
        {
          "additionalProperties": false,
          "description": "Request statistics summary.",
          "properties": {
            "concurrentRequests": {
              "default": 0,
              "description": "Number of concurrent requests.",
              "title": "Concurrentrequests",
              "type": "integer"
            },
            "errorRate": {
              "default": 0,
              "description": "Error rate percentage.",
              "title": "Errorrate",
              "type": "number"
            },
            "errorRates": {
              "description": "Error rates over the last 7 time periods.",
              "items": {
                "type": "integer"
              },
              "title": "Errorrates",
              "type": "array"
            },
            "lastRequestAt": {
              "anyOf": [
                {
                  "format": "date-time",
                  "type": "string"
                },
                {
                  "type": "null"
                }
              ],
              "description": "Timestamp of the last request.",
              "title": "Lastrequestat"
            },
            "requestRates": {
              "description": "Request rates over the last 7 time periods.",
              "items": {
                "type": "integer"
              },
              "title": "Requestrates",
              "type": "array"
            },
            "responseTime": {
              "default": 0,
              "description": "Average response time in milliseconds.",
              "title": "Responsetime",
              "type": "integer"
            },
            "totalRequests": {
              "default": 0,
              "description": "Total number of requests.",
              "title": "Totalrequests",
              "type": "integer"
            }
          },
          "title": "RequestStats",
          "type": "object"
        },
        {
          "type": "null"
        }
      ],
      "description": "Request statistics for this workload.",
      "title": "Request Stats"
    },
    "runtime": {
      "additionalProperties": false,
      "description": "Runtime configuration for a workload. for service and nim artifacts, all configuration is scoped inside ``container_groups``, each identified by name matching the artifact topology.",
      "properties": {
        "containerGroups": {
          "description": "Per-group runtime configuration. each entry's name must match a group in the artifact.",
          "items": {
            "additionalProperties": false,
            "description": "Runtime configuration for a single container group.",
            "properties": {
              "autoscaling": {
                "anyOf": [
                  {
                    "additionalProperties": false,
                    "description": "Autoscaling configuration for a proton.",
                    "properties": {
                      "enabled": {
                        "default": true,
                        "description": "Whether autoscaling is enabled.",
                        "title": "Enabled",
                        "type": "boolean"
                      },
                      "policies": {
                        "items": {
                          "additionalProperties": false,
                          "description": "Base class for autoscaling policies.",
                          "properties": {
                            "maxCount": {
                              "description": "Maximum number of replicas.",
                              "minimum": 0,
                              "title": "Max Count",
                              "type": "integer"
                            },
                            "minCount": {
                              "description": "Minimum number of replicas.",
                              "minimum": 0,
                              "title": "Min Count",
                              "type": "integer"
                            },
                            "priority": {
                              "anyOf": [
                                {
                                  "type": "integer"
                                },
                                {
                                  "type": "null"
                                }
                              ],
                              "description": "Policy priority when multiple policies are defined.",
                              "title": "Priority"
                            },
                            "scalingMetric": {
                              "anyOf": [
                                {
                                  "oneOf": [
                                    {
                                      "const": "cpuAverageUtilization",
                                      "description": "Scale replicas to maintain a target average CPU utilization across pods.",
                                      "title": "CPU Average Utilization"
                                    },
                                    {
                                      "const": "httpRequestsConcurrency",
                                      "description": "Scale replicas based on HTTP request concurrency using an external HTTP-aware autoscaler. The platform manages the underlying autoscaling resources on your behalf. This scaling option will scale to zero replicas when the proton is idle.",
                                      "title": "HTTP Requests Concurrency"
                                    },
                                    {
                                      "const": "gpuCacheUtilization",
                                      "description": "Scales replicas based on model-specific GPU memory cache utilization. This signal reflects how the model's KV cache is used during inference, when such metrics are exposed by the serving runtime. High cache utilization may indicate memory pressure and can be used to trigger scale-out to maintain throughput. Applicable to NIM Artifacts only.",
                                      "title": "GPU Cache Utilization"
                                    },
                                    {
                                      "const": "gpuRequestQueueDepth",
                                      "description": "Scales replicas based on the depth of the inference request queue. This metric represents the number of incoming requests waiting to be processed by the inference service. Increasing queue depth may indicate insufficient capacity and can be used to trigger additional replicas to reduce latency. Applicable to NIM Artifacts only.",
                                      "title": "GPU Request Queue Depth"
                                    }
                                  ],
                                  "title": "ScalingMetricType",
                                  "type": "string"
                                },
                                {
                                  "type": "string"
                                }
                              ],
                              "description": "Metric used for scaling decisions. use one of the predefined values for standard autoscaling, or provide a custom metric name for nim 2.0 workloads (e.g. 'vllm:kv_cache_usage_perc'). custom metric names are only supported for nim artifacts.",
                              "title": "Scaling Metric"
                            },
                            "target": {
                              "description": "Target value for the scaling metric.",
                              "minimum": 0,
                              "title": "Target",
                              "type": "number"
                            }
                          },
                          "required": [
                            "scalingMetric",
                            "target",
                            "minCount",
                            "maxCount"
                          ],
                          "title": "AutoscalingPolicy",
                          "type": "object"
                        },
                        "title": "Policies",
                        "type": "array"
                      }
                    },
                    "required": [
                      "policies"
                    ],
                    "title": "AutoscalingProperties",
                    "type": "object"
                  },
                  {
                    "type": "null"
                  }
                ],
                "description": "Autoscaling configuration for this group. takes precedence over replicacount."
              },
              "bundleSelectionPolicy": {
                "enum": [
                  "availability"
                ],
                "title": "BundleSelectionPolicy",
                "type": "string"
              },
              "containers": {
                "description": "Per-container overrides for this group.",
                "items": {
                  "additionalProperties": false,
                  "description": "Runtime diff targeting a single named container within a group.",
                  "properties": {
                    "name": {
                      "description": "Container name. must match a container declared in the artifact group.",
                      "title": "Name",
                      "type": "string"
                    },
                    "resourceAllocation": {
                      "anyOf": [
                        {
                          "additionalProperties": false,
                          "description": "Per-container resource allocation declared at runtime.",
                          "properties": {
                            "cpu": {
                              "anyOf": [
                                {
                                  "minimum": 0.1,
                                  "type": "number"
                                },
                                {
                                  "type": "null"
                                }
                              ],
                              "description": "Cpu cores allocated to this container.",
                              "title": "Cpu"
                            },
                            "gpu": {
                              "anyOf": [
                                {
                                  "minimum": 0,
                                  "type": "number"
                                },
                                {
                                  "type": "null"
                                }
                              ],
                              "description": "Gpus allocated to this container.",
                              "title": "Gpu"
                            },
                            "memory": {
                              "anyOf": [
                                {
                                  "pattern": "^\\s*(\\d*\\.?\\d+)\\s*(\\w+)?",
                                  "type": "string"
                                },
                                {
                                  "minimum": 0,
                                  "type": "integer"
                                },
                                {
                                  "type": "null"
                                }
                              ],
                              "description": "Ram allocated to this container. accepts a human-readable string with one of: b, kb, mb, gb (1000-based) — e.g. '8gb', '512mb'. also accepts raw byte integers.",
                              "examples": [
                                "8GB",
                                "512MB"
                              ],
                              "title": "Memory"
                            }
                          },
                          "title": "ResourceAllocation",
                          "type": "object"
                        },
                        {
                          "type": "null"
                        }
                      ],
                      "description": "Resource allocation for this container. required for multi-container groups."
                    }
                  },
                  "required": [
                    "name"
                  ],
                  "title": "ContainerOverride",
                  "type": "object"
                },
                "title": "Containers",
                "type": "array"
              },
              "name": {
                "default": "default",
                "description": "Group name. must match a container group name declared in the artifact.",
                "title": "Name",
                "type": "string"
              },
              "replicaCount": {
                "anyOf": [
                  {
                    "minimum": 1,
                    "type": "integer"
                  },
                  {
                    "type": "null"
                  }
                ],
                "default": 1,
                "description": "Number of replicas. cannot be set alongside autoscaling.enabled=true.",
                "title": "Replicacount"
              },
              "resolvedBundle": {
                "anyOf": [
                  {
                    "description": "Bundle details returned in the runtime response after scheduling.",
                    "properties": {
                      "cpuCount": {
                        "description": "Number of cpu cores.",
                        "title": "CPU Count",
                        "type": "number"
                      },
                      "gpuCount": {
                        "default": 0,
                        "description": "Number of gpu units.",
                        "title": "GPU Count",
                        "type": "integer"
                      },
                      "gpuMaker": {
                        "anyOf": [
                          {
                            "type": "string"
                          },
                          {
                            "type": "null"
                          }
                        ],
                        "description": "Gpu manufacturer.",
                        "title": "GPU Maker"
                      },
                      "gpuTypeLabel": {
                        "anyOf": [
                          {
                            "type": "string"
                          },
                          {
                            "type": "null"
                          }
                        ],
                        "description": "Gpu type label.",
                        "title": "GPU Type Label"
                      },
                      "id": {
                        "description": "Bundle identifier that was selected.",
                        "title": "Id",
                        "type": "string"
                      },
                      "memoryBytes": {
                        "description": "Memory size in bytes.",
                        "title": "Memory Bytes",
                        "type": "integer"
                      }
                    },
                    "required": [
                      "id",
                      "cpuCount",
                      "memoryBytes"
                    ],
                    "title": "ResolvedBundle",
                    "type": "object"
                  },
                  {
                    "type": "null"
                  }
                ],
                "description": "Full details of the bundle selected at scheduling time. read-only.",
                "readOnly": true
              },
              "resourceBundles": {
                "description": "Ordered list of bundle ids. one is selected at scheduling time.",
                "items": {
                  "type": "string"
                },
                "title": "Resourcebundles",
                "type": "array"
              }
            },
            "title": "GroupRuntime",
            "type": "object"
          },
          "title": "Containergroups",
          "type": "array"
        }
      },
      "title": "WorkloadRuntime",
      "type": "object"
    },
    "status": {
      "description": "User-facing workload status. a subset of :class:`protonstatus` — excludes internal proton-lifecycle states (warming, draining, restarting) that should never be surfaced as a workload status.",
      "enum": [
        "unknown",
        "submitted",
        "provisioning",
        "launching",
        "running",
        "suspended",
        "interrupted",
        "stopping",
        "stopped",
        "errored",
        "terminated"
      ],
      "title": "WorkloadStatus",
      "type": "string"
    },
    "tags": {
      "items": {
        "additionalProperties": false,
        "properties": {
          "id": {
            "description": "Unique identifier of the tag.",
            "title": "Id",
            "type": "string"
          },
          "name": {
            "description": "Name of the tag.",
            "title": "Name",
            "type": "string"
          },
          "value": {
            "description": "Value of the tag.",
            "title": "Value",
            "type": "string"
          }
        },
        "required": [
          "id",
          "name",
          "value"
        ],
        "title": "TagInfo",
        "type": "object"
      },
      "type": "array"
    },
    "type": {
      "description": "Discriminator for the artifact spec variant. used to label the workload, which may be used to prioritize the best matching operator available in the cluster for scheduling. defaults to ``service`` when omitted. - ``service``: generic service artifact. - ``nim``: nvidia nim model artifact.",
      "enum": [
        "service",
        "nim"
      ],
      "title": "ArtifactType",
      "type": "string"
    },
    "updatedAt": {
      "description": "Timestamp of when the entity was last updated.",
      "format": "date-time",
      "title": "Updated At",
      "type": "string"
    }
  },
  "required": [
    "id",
    "name",
    "createdAt",
    "updatedAt"
  ],
  "title": "WorkloadFormatted",
  "type": "object"
}
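
The `memory` field of `ResourceAllocation` accepts either a raw byte integer or a human-readable string with 1000-based units. As a minimal sketch of how a client might normalize such values, for example before comparing them with `memoryBytes` in `resolvedBundle`, the helper below applies the pattern and unit list from the schema. The name `parse_memory` and the choice to treat a missing unit as bytes are assumptions made for illustration; they are not part of the API.

```python
import re

# Pattern copied from the "memory" field in the schema above.
_MEMORY_PATTERN = re.compile(r"^\s*(\d*\.?\d+)\s*(\w+)?")

# 1000-based units listed in the field description.
_UNIT_FACTORS = {"b": 1, "kb": 1000, "mb": 1000**2, "gb": 1000**3}


def parse_memory(value):
    """Return the number of bytes described by a schema-style memory value."""
    if isinstance(value, bool) or not isinstance(value, (int, str)):
        raise TypeError("memory must be a string or an integer")
    if isinstance(value, int):
        # Raw byte integers are accepted as-is; the schema sets minimum 0.
        if value < 0:
            raise ValueError("memory must be non-negative")
        return value
    match = _MEMORY_PATTERN.match(value)
    if match is None:
        raise ValueError(f"unrecognized memory value: {value!r}")
    number, unit = match.groups()
    # Assumption: a string without a unit is interpreted as bytes.
    factor = _UNIT_FACTORS.get((unit or "b").lower())
    if factor is None:
        raise ValueError(f"unsupported memory unit: {unit!r}")
    return round(float(number) * factor)
```

Called with the schema's own examples, `parse_memory("8GB")` and `parse_memory("512MB")` both resolve to plain byte counts, and an integer input passes through unchanged.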

Responses

| Status | Meaning | Description | Schema |
|---|---|---|---|
| 200 | OK | Successful Response | WorkloadFormatted |
| 400 | Bad Request | Bad request | None |
| 401 | Unauthorized | Unauthenticated | None |
| 403 | Forbidden | Insufficient permissions | None |
| 404 | Not Found | Workload not found | None |
| 422 | Unprocessable Entity | Validation Error | HTTPValidationError |
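
A client usually needs to turn the status codes listed above into something it can act on. The sketch below is one possible way to do that; the `WorkloadApiError` class and the `check_status` function are hypothetical names introduced here, and only the codes and descriptions come from the response table.

```python
# Status codes and descriptions copied from the response table above.
RESPONSE_DESCRIPTIONS = {
    200: "Successful Response",
    400: "Bad request",
    401: "Unauthenticated",
    403: "Insufficient permissions",
    404: "Workload not found",
    422: "Validation Error",
}


class WorkloadApiError(Exception):
    """Hypothetical error type that carries the HTTP status code."""

    def __init__(self, status_code, message):
        super().__init__(f"{status_code}: {message}")
        self.status_code = status_code


def check_status(status_code):
    """Return None for 200; raise WorkloadApiError for any other code."""
    if status_code == 200:
        return None
    # Codes outside the documented table get a generic message.
    message = RESPONSE_DESCRIPTIONS.get(status_code, "Unexpected status")
    raise WorkloadApiError(status_code, message)
```

Keeping the table in a dictionary makes it easy to extend if an endpoint documents additional codes.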

Update Workload by Workload ID

Operation path: PATCH /workloads/{workload_id}

Partially update a workload's properties.

Body parameter

{
  "additionalProperties": false,
  "description": "Request to update an existing workload.",
  "properties": {
    "description": {
      "anyOf": [
        {
          "type": "string"
        },
        {
          "type": "null"
        }
      ],
      "description": "Updated workload description.",
      "title": "Description"
    },
    "importance": {
      "anyOf": [
        {
          "description": "Importance level for workloads.",
          "enum": [
            "critical",
            "high",
            "moderate",
            "low"
          ],
          "title": "WorkloadImportance",
          "type": "string"
        },
        {
          "type": "null"
        }
      ],
      "description": "Updated workload importance level."
    },
    "name": {
      "anyOf": [
        {
          "maxLength": 5000,
          "minLength": 1,
          "type": "string"
        },
        {
          "type": "null"
        }
      ],
      "description": "Updated workload name.",
      "title": "Name"
    }
  },
  "title": "UpdateWorkloadRequest",
  "type": "object"
}

Parameters

| Name | In | Type | Required | Description |
|---|---|---|---|---|
| workload_id | path | string | true | Workload ID |
| body | body | UpdateWorkloadRequest | true | none |
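
To make the request shape concrete, here is a minimal sketch that assembles a `PATCH /workloads/{workload_id}` request using only the Python standard library. It checks the two constraints stated in `UpdateWorkloadRequest` (the `name` length of 1 to 5000 and the `importance` enum) and omits fields that are not being changed. The function name `build_update_request`, the base URL, and the bearer-token `Authorization` header are assumptions; this section does not specify how the API is hosted or authenticated.

```python
import json
import urllib.request

# Enum values copied from WorkloadImportance in the body schema.
IMPORTANCE_LEVELS = {"critical", "high", "moderate", "low"}


def build_update_request(base_url, token, workload_id, *,
                         name=None, description=None, importance=None):
    """Build (but do not send) a PATCH request for a partial workload update."""
    body = {}
    if name is not None:
        # The schema sets minLength 1 and maxLength 5000 for "name".
        if not 1 <= len(name) <= 5000:
            raise ValueError("name must be 1 to 5000 characters long")
        body["name"] = name
    if description is not None:
        body["description"] = description
    if importance is not None:
        if importance not in IMPORTANCE_LEVELS:
            raise ValueError(f"unknown importance: {importance!r}")
        body["importance"] = importance
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/workloads/{workload_id}",
        data=json.dumps(body).encode("utf-8"),
        method="PATCH",
        headers={
            "Content-Type": "application/json",
            # Assumption: bearer-token auth; the real scheme is not documented here.
            "Authorization": f"Bearer {token}",
        },
    )
```

The returned object can be passed to `urllib.request.urlopen` to send it. Building the request separately from sending it keeps the validation easy to test without network access.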

Example responses

200 Response

{
  "additionalProperties": false,
  "description": "API representation of a workload. this is the formatted version returned to clients, excluding internal fields and including computed properties like permissions and statistics.",
  "properties": {
    "artifact": {
      "anyOf": [
        {
          "description": "Artifact basic information.",
          "properties": {
            "artifactRepositoryId": {
              "anyOf": [
                {
                  "type": "string"
                },
                {
                  "type": "null"
                }
              ],
              "description": "Id of the artifact repository this artifact belongs to (for versioning).",
              "title": "Artifactrepositoryid"
            },
            "id": {
              "description": "Unique identifier of the entity.",
              "title": "Id",
              "type": "string"
            },
            "name": {
              "anyOf": [
                {
                  "type": "string"
                },
                {
                  "type": "null"
                }
              ],
              "description": "Name of the entity.",
              "title": "Name"
            },
            "status": {
              "anyOf": [
                {
                  "enum": [
                    "draft",
                    "locked"
                  ],
                  "title": "ArtifactStatus",
                  "type": "string"
                },
                {
                  "type": "null"
                }
              ],
              "description": "Artifact status."
            },
            "templateId": {
              "anyOf": [
                {
                  "type": "string"
                },
                {
                  "type": "null"
                }
              ],
              "description": "Id of the template used to create this artifact.",
              "title": "Templateid"
            },
            "type": {
              "anyOf": [
                {
                  "description": "Discriminator for the artifact spec variant. used to label the workload, which may be used to prioritize the best matching operator available in the cluster for scheduling. defaults to ``service`` when omitted. - ``service``: generic service artifact. - ``nim``: nvidia nim model artifact.",
                  "enum": [
                    "service",
                    "nim"
                  ],
                  "title": "ArtifactType",
                  "type": "string"
                },
                {
                  "type": "null"
                }
              ],
              "description": "Artifact type."
            },
            "version": {
              "anyOf": [
                {
                  "type": "integer"
                },
                {
                  "type": "null"
                }
              ],
              "description": "Version number of the artifact (set only for locked artifacts).",
              "title": "Version"
            }
          },
          "required": [
            "id"
          ],
          "title": "ArtifactInfoFormatted",
          "type": "object"
        },
        {
          "type": "null"
        }
      ],
      "description": "Basic information about the currently active artifact for this workload.",
      "title": "Artifact"
    },
    "artifactId": {
      "anyOf": [
        {
          "type": "string"
        },
        {
          "type": "null"
        }
      ],
      "description": "Id of the currently active artifact for this workload.",
      "title": "Artifact ID"
    },
    "createdAt": {
      "description": "Timestamp of when the entity was created.",
      "format": "date-time",
      "title": "Created At",
      "type": "string"
    },
    "creator": {
      "anyOf": [
        {
          "additionalProperties": false,
          "description": "User information embedded in API responses.",
          "properties": {
            "email": {
              "anyOf": [
                {
                  "type": "string"
                },
                {
                  "type": "null"
                }
              ],
              "description": "User email address.",
              "title": "Email"
            },
            "fullName": {
              "anyOf": [
                {
                  "type": "string"
                },
                {
                  "type": "null"
                }
              ],
              "description": "User's full name.",
              "title": "Full Name"
            },
            "id": {
              "description": "User id associated with this resource.",
              "title": "User ID",
              "type": "string"
            },
            "userhash": {
              "anyOf": [
                {
                  "type": "string"
                },
                {
                  "type": "null"
                }
              ],
              "description": "User's gravatar hash.",
              "title": "Userhash"
            },
            "username": {
              "anyOf": [
                {
                  "type": "string"
                },
                {
                  "type": "null"
                }
              ],
              "description": "Username.",
              "title": "Username"
            }
          },
          "required": [
            "id"
          ],
          "title": "UserData",
          "type": "object"
        },
        {
          "type": "null"
        }
      ],
      "description": "Owner user details including id, username and email.",
      "title": "Creator"
    },
    "description": {
      "anyOf": [
        {
          "type": "string"
        },
        {
          "type": "null"
        }
      ],
      "default": "",
      "description": "Workload description.",
      "title": "Description"
    },
    "endpoint": {
      "anyOf": [
        {
          "type": "string"
        },
        {
          "type": "null"
        }
      ],
      "description": "Workload endpoint url.",
      "title": "Endpoint"
    },
    "id": {
      "description": "Unique identifier of the entity.",
      "title": "ID",
      "type": "string"
    },
    "importance": {
      "description": "Importance level for workloads.",
      "enum": [
        "critical",
        "high",
        "moderate",
        "low"
      ],
      "title": "WorkloadImportance",
      "type": "string"
    },
    "lastResponse": {
      "anyOf": [
        {
          "format": "date-time",
          "type": "string"
        },
        {
          "type": "null"
        }
      ],
      "description": "Timestamp of the last response received from this workload.",
      "title": "Last Response Time"
    },
    "name": {
      "description": "Name of the entity.",
      "title": "Name",
      "type": "string"
    },
    "owners": {
      "description": "List of workload owners.",
      "items": {
        "additionalProperties": false,
        "description": "User information embedded in API responses.",
        "properties": {
          "email": {
            "anyOf": [
              {
                "type": "string"
              },
              {
                "type": "null"
              }
            ],
            "description": "User email address.",
            "title": "Email"
          },
          "fullName": {
            "anyOf": [
              {
                "type": "string"
              },
              {
                "type": "null"
              }
            ],
            "description": "User's full name.",
            "title": "Full Name"
          },
          "id": {
            "description": "User id associated with this resource.",
            "title": "User ID",
            "type": "string"
          },
          "userhash": {
            "anyOf": [
              {
                "type": "string"
              },
              {
                "type": "null"
              }
            ],
            "description": "User's gravatar hash.",
            "title": "Userhash"
          },
          "username": {
            "anyOf": [
              {
                "type": "string"
              },
              {
                "type": "null"
              }
            ],
            "description": "Username.",
            "title": "Username"
          }
        },
        "required": [
          "id"
        ],
        "title": "UserData",
        "type": "object"
      },
      "title": "Owners",
      "type": "array"
    },
    "permissions": {
      "anyOf": [
        {
          "items": {
            "description": "Represents the particular role a user, group or organization holds on an entity.",
            "enum": [
              "CAN_VIEW",
              "CAN_UPDATE",
              "CAN_DELETE",
              "CAN_SHARE",
              "CAN_MAKE_PREDICTIONS",
              "CAN_SHARE_ROLE_OWNER",
              "CAN_SHARE_ROLE_READ_WRITE",
              "CAN_SHARE_ROLE_READ_ONLY"
            ],
            "title": "ResourcePermission",
            "type": "string"
          },
          "type": "array"
        },
        {
          "items": {
            "const": "*",
            "type": "string"
          },
          "type": "array"
        }
      ]
    },
    "protonId": {
      "anyOf": [
        {
          "type": "string"
        },
        {
          "type": "null"
        }
      ],
      "description": "Id of the currently active proton for this workload.",
      "title": "Proton ID"
    },
    "replacement": {
      "anyOf": [
        {
          "description": "Formatted replacement information for API responses.",
          "properties": {
            "candidateProtonIds": {
              "description": "Ids of protons pending promotion during artifact replacement.",
              "items": {
                "type": "string"
              },
              "title": "Candidateprotonids",
              "type": "array"
            },
            "status": {
              "description": "Statuses for workload replacement process.",
              "enum": [
                "unknown",
                "submitted",
                "initializing",
                "awaiting_promotion",
                "switching",
                "deleting",
                "completed",
                "errored",
                "cleaning_up"
              ],
              "title": "ReplacementStatus",
              "type": "string"
            },
            "strategy": {
              "description": "Types of replacement strategies. `rolling` - the new proton is deployed alongside the old one, and trafic is switched to the new proton once it is ready. the old proton is then decommissioned.",
              "enum": [
                "rolling"
              ],
              "title": "ReplacementStrategy",
              "type": "string"
            }
          },
          "title": "WorkloadReplacementFormatted",
          "type": "object"
        },
        {
          "type": "null"
        }
      ],
      "description": "Information about an active replacement process for this workload, if any.",
      "title": "Replacement"
    },
    "requestStats": {
      "anyOf": [
        {
          "additionalProperties": false,
          "description": "Request statistics summary.",
          "properties": {
            "concurrentRequests": {
              "default": 0,
              "description": "Number of concurrent requests.",
              "title": "Concurrentrequests",
              "type": "integer"
            },
            "errorRate": {
              "default": 0,
              "description": "Error rate percentage.",
              "title": "Errorrate",
              "type": "number"
            },
            "errorRates": {
              "description": "Error rates over the last 7 time periods.",
              "items": {
                "type": "integer"
              },
              "title": "Errorrates",
              "type": "array"
            },
            "lastRequestAt": {
              "anyOf": [
                {
                  "format": "date-time",
                  "type": "string"
                },
                {
                  "type": "null"
                }
              ],
              "description": "Timestamp of the last request.",
              "title": "Lastrequestat"
            },
            "requestRates": {
              "description": "Request rates over the last 7 time periods.",
              "items": {
                "type": "integer"
              },
              "title": "Requestrates",
              "type": "array"
            },
            "responseTime": {
              "default": 0,
              "description": "Average response time in milliseconds.",
              "title": "Responsetime",
              "type": "integer"
            },
            "totalRequests": {
              "default": 0,
              "description": "Total number of requests.",
              "title": "Totalrequests",
              "type": "integer"
            }
          },
          "title": "RequestStats",
          "type": "object"
        },
        {
          "type": "null"
        }
      ],
      "description": "Request statistics for this workload.",
      "title": "Request Stats"
    },
    "runtime": {
      "additionalProperties": false,
      "description": "Runtime configuration for a workload. for service and nim artifacts, all configuration is scoped inside ``container_groups``, each identified by name matching the artifact topology.",
      "properties": {
        "containerGroups": {
          "description": "Per-group runtime configuration. each entry's name must match a group in the artifact.",
          "items": {
            "additionalProperties": false,
            "description": "Runtime configuration for a single container group.",
            "properties": {
              "autoscaling": {
                "anyOf": [
                  {
                    "additionalProperties": false,
                    "description": "Autoscaling configuration for a proton.",
                    "properties": {
                      "enabled": {
                        "default": true,
                        "description": "Whether autoscaling is enabled.",
                        "title": "Enabled",
                        "type": "boolean"
                      },
                      "policies": {
                        "items": {
                          "additionalProperties": false,
                          "description": "Base class for autoscaling policies.",
                          "properties": {
                            "maxCount": {
                              "description": "Maximum number of replicas.",
                              "minimum": 0,
                              "title": "Max Count",
                              "type": "integer"
                            },
                            "minCount": {
                              "description": "Minimum number of replicas.",
                              "minimum": 0,
                              "title": "Min Count",
                              "type": "integer"
                            },
                            "priority": {
                              "anyOf": [
                                {
                                  "type": "integer"
                                },
                                {
                                  "type": "null"
                                }
                              ],
                              "description": "Policy priority when multiple policies are defined.",
                              "title": "Priority"
                            },
                            "scalingMetric": {
                              "anyOf": [
                                {
                                  "oneOf": [
                                    {
                                      "const": "cpuAverageUtilization",
                                      "description": "Scale replicas to maintain a target average CPU utilization across pods.",
                                      "title": "CPU Average Utilization"
                                    },
                                    {
                                      "const": "httpRequestsConcurrency",
                                      "description": "Scale replicas based on HTTP request concurrency using an external HTTP-aware autoscaler. The platform manages the underlying autoscaling resources on your behalf. This scaling option will scale to zero replicas when the proton is idle.",
                                      "title": "HTTP Requests Concurrency"
                                    },
                                    {
                                      "const": "gpuCacheUtilization",
                                      "description": "Scales replicas based on model-specific GPU memory cache utilization. This signal reflects how the model's KV cache is used during inference, when such metrics are exposed by the serving runtime. High cache utilization may indicate memory pressure and can be used to trigger scale-out to maintain throughput. Applicable to NIM Artifacts only.",
                                      "title": "GPU Cache Utilization"
                                    },
                                    {
                                      "const": "gpuRequestQueueDepth",
                                      "description": "Scales replicas based on the depth of the inference request queue. This metric represents the number of incoming requests waiting to be processed by the inference service. Increasing queue depth may indicate insufficient capacity and can be used to trigger additional replicas to reduce latency. Applicable to NIM Artifacts only.",
                                      "title": "GPU Request Queue Depth"
                                    }
                                  ],
                                  "title": "ScalingMetricType",
                                  "type": "string"
                                },
                                {
                                  "type": "string"
                                }
                              ],
                              "description": "Metric used for scaling decisions. use one of the predefined values for standard autoscaling, or provide a custom metric name for nim 2.0 workloads (e.g. 'vllm:kv_cache_usage_perc'). custom metric names are only supported for nim artifacts.",
                              "title": "Scaling Metric"
                            },
                            "target": {
                              "description": "Target value for the scaling metric.",
                              "minimum": 0,
                              "title": "Target",
                              "type": "number"
                            }
                          },
                          "required": [
                            "scalingMetric",
                            "target",
                            "minCount",
                            "maxCount"
                          ],
                          "title": "AutoscalingPolicy",
                          "type": "object"
                        },
                        "title": "Policies",
                        "type": "array"
                      }
                    },
                    "required": [
                      "policies"
                    ],
                    "title": "AutoscalingProperties",
                    "type": "object"
                  },
                  {
                    "type": "null"
                  }
                ],
                "description": "Autoscaling configuration for this group. takes precedence over replicacount."
              },
              "bundleSelectionPolicy": {
                "enum": [
                  "availability"
                ],
                "title": "BundleSelectionPolicy",
                "type": "string"
              },
              "containers": {
                "description": "Per-container overrides for this group.",
                "items": {
                  "additionalProperties": false,
                  "description": "Runtime diff targeting a single named container within a group.",
                  "properties": {
                    "name": {
                      "description": "Container name. must match a container declared in the artifact group.",
                      "title": "Name",
                      "type": "string"
                    },
                    "resourceAllocation": {
                      "anyOf": [
                        {
                          "additionalProperties": false,
                          "description": "Per-container resource allocation declared at runtime.",
                          "properties": {
                            "cpu": {
                              "anyOf": [
                                {
                                  "minimum": 0.1,
                                  "type": "number"
                                },
                                {
                                  "type": "null"
                                }
                              ],
                              "description": "Cpu cores allocated to this container.",
                              "title": "Cpu"
                            },
                            "gpu": {
                              "anyOf": [
                                {
                                  "minimum": 0,
                                  "type": "number"
                                },
                                {
                                  "type": "null"
                                }
                              ],
                              "description": "Gpus allocated to this container.",
                              "title": "Gpu"
                            },
                            "memory": {
                              "anyOf": [
                                {
                                  "pattern": "^\\s*(\\d*\\.?\\d+)\\s*(\\w+)?",
                                  "type": "string"
                                },
                                {
                                  "minimum": 0,
                                  "type": "integer"
                                },
                                {
                                  "type": "null"
                                }
                              ],
                              "description": "Ram allocated to this container. accepts a human-readable string with one of: b, kb, mb, gb (1000-based) — e.g. '8gb', '512mb'. also accepts raw byte integers.",
                              "examples": [
                                "8GB",
                                "512MB"
                              ],
                              "title": "Memory"
                            }
                          },
                          "title": "ResourceAllocation",
                          "type": "object"
                        },
                        {
                          "type": "null"
                        }
                      ],
                      "description": "Resource allocation for this container. required for multi-container groups."
                    }
                  },
                  "required": [
                    "name"
                  ],
                  "title": "ContainerOverride",
                  "type": "object"
                },
                "title": "Containers",
                "type": "array"
              },
              "name": {
                "default": "default",
                "description": "Group name. must match a container group name declared in the artifact.",
                "title": "Name",
                "type": "string"
              },
              "replicaCount": {
                "anyOf": [
                  {
                    "minimum": 1,
                    "type": "integer"
                  },
                  {
                    "type": "null"
                  }
                ],
                "default": 1,
                "description": "Number of replicas. cannot be set alongside autoscaling.enabled=true.",
                "title": "Replicacount"
              },
              "resolvedBundle": {
                "anyOf": [
                  {
                    "description": "Bundle details returned in the runtime response after scheduling.",
                    "properties": {
                      "cpuCount": {
                        "description": "Number of cpu cores.",
                        "title": "CPU Count",
                        "type": "number"
                      },
                      "gpuCount": {
                        "default": 0,
                        "description": "Number of gpu units.",
                        "title": "GPU Count",
                        "type": "integer"
                      },
                      "gpuMaker": {
                        "anyOf": [
                          {
                            "type": "string"
                          },
                          {
                            "type": "null"
                          }
                        ],
                        "description": "Gpu manufacturer.",
                        "title": "GPU Maker"
                      },
                      "gpuTypeLabel": {
                        "anyOf": [
                          {
                            "type": "string"
                          },
                          {
                            "type": "null"
                          }
                        ],
                        "description": "Gpu type label.",
                        "title": "GPU Type Label"
                      },
                      "id": {
                        "description": "Bundle identifier that was selected.",
                        "title": "Id",
                        "type": "string"
                      },
                      "memoryBytes": {
                        "description": "Memory size in bytes.",
                        "title": "Memory Bytes",
                        "type": "integer"
                      }
                    },
                    "required": [
                      "id",
                      "cpuCount",
                      "memoryBytes"
                    ],
                    "title": "ResolvedBundle",
                    "type": "object"
                  },
                  {
                    "type": "null"
                  }
                ],
                "description": "Full details of the bundle selected at scheduling time. read-only.",
                "readOnly": true
              },
              "resourceBundles": {
                "description": "Ordered list of bundle ids. one is selected at scheduling time.",
                "items": {
                  "type": "string"
                },
                "title": "Resourcebundles",
                "type": "array"
              }
            },
            "title": "GroupRuntime",
            "type": "object"
          },
          "title": "Containergroups",
          "type": "array"
        }
      },
      "title": "WorkloadRuntime",
      "type": "object"
    },
    "status": {
      "description": "User-facing workload status. a subset of :class:`protonstatus` — excludes internal proton-lifecycle states (warming, draining, restarting) that should never be surfaced as a workload status.",
      "enum": [
        "unknown",
        "submitted",
        "provisioning",
        "launching",
        "running",
        "suspended",
        "interrupted",
        "stopping",
        "stopped",
        "errored",
        "terminated"
      ],
      "title": "WorkloadStatus",
      "type": "string"
    },
    "tags": {
      "items": {
        "additionalProperties": false,
        "properties": {
          "id": {
            "description": "Unique identifier of the tag.",
            "title": "Id",
            "type": "string"
          },
          "name": {
            "description": "Name of the tag.",
            "title": "Name",
            "type": "string"
          },
          "value": {
            "description": "Value of the tag.",
            "title": "Value",
            "type": "string"
          }
        },
        "required": [
          "id",
          "name",
          "value"
        ],
        "title": "TagInfo",
        "type": "object"
      },
      "type": "array"
    },
    "type": {
      "description": "Discriminator for the artifact spec variant. used to label the workload, which may be used to prioritize the best matching operator available in the cluster for scheduling. defaults to ``service`` when omitted. - ``service``: generic service artifact. - ``nim``: nvidia nim model artifact.",
      "enum": [
        "service",
        "nim"
      ],
      "title": "ArtifactType",
      "type": "string"
    },
    "updatedAt": {
      "description": "Timestamp of when the entity was last updated.",
      "format": "date-time",
      "title": "Updated At",
      "type": "string"
    }
  },
  "required": [
    "id",
    "name",
    "createdAt",
    "updatedAt"
  ],
  "title": "WorkloadFormatted",
  "type": "object"
}
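
The `memory` field of `resourceAllocation` accepts either a raw byte integer or a human-readable string with a 1000-based unit. The following is a minimal client-side sketch of how such a value could be normalized to bytes. It reuses the pattern from the schema; the function name `memory_to_bytes`, the case-insensitive unit handling, and the treatment of a missing unit as bytes are assumptions for illustration, not documented server behavior.

```python
import re
from decimal import Decimal

# Pattern copied from the schema's string variant of `memory`.
# It is not anchored at the end, so trailing text after the unit is ignored.
MEMORY_PATTERN = re.compile(r"^\s*(\d*\.?\d+)\s*(\w+)?")

# 1000-based multipliers, as stated in the field description.
UNIT_MULTIPLIERS = {"b": 1, "kb": 1000, "mb": 1000**2, "gb": 1000**3}


def memory_to_bytes(value):
    """Convert a `memory` value (byte integer or string such as '8GB') to bytes."""
    if isinstance(value, int) and not isinstance(value, bool):
        if value < 0:
            raise ValueError("memory must be >= 0")
        return value
    if not isinstance(value, str):
        raise TypeError("memory must be an integer or a string")

    match = MEMORY_PATTERN.match(value)
    if match is None:
        raise ValueError(f"unrecognized memory value: {value!r}")

    number, unit = match.groups()
    # Assumption: a missing unit means bytes, and units are case-insensitive.
    unit = (unit or "b").lower()
    if unit not in UNIT_MULTIPLIERS:
        raise ValueError(f"unsupported memory unit: {unit!r}")

    # Decimal avoids float rounding surprises for values like '1.5kb'.
    return int(Decimal(number) * UNIT_MULTIPLIERS[unit])
```

For example, `memory_to_bytes("512MB")` and `memory_to_bytes(512_000_000)` yield the same number of bytes under these assumptions.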

Responses

Status Meaning Description Schema
200 OK Successful Response WorkloadFormatted
400 Bad Request Bad request None
401 Unauthorized Unauthenticated None
403 Forbidden Insufficient permissions None
404 Not Found Workload not found None
422 Unprocessable Entity Validation Error HTTPValidationError
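
The `runtime.containerGroups` schema above carries a few constraints that are easy to miss: `replicaCount` cannot be set alongside `autoscaling.enabled=true`, `enabled` defaults to true, and each autoscaling policy requires `scalingMetric`, `target`, `minCount`, and `maxCount`. The sketch below checks one group entry against those stated rules. The helper name `check_group_runtime` and the message wording are hypothetical, and the server's own validation may differ.

```python
REQUIRED_POLICY_FIELDS = ("scalingMetric", "target", "minCount", "maxCount")
NON_NEGATIVE_POLICY_FIELDS = ("target", "minCount", "maxCount")


def check_group_runtime(group):
    """Return a list of problems found in one `containerGroups` entry (a dict)."""
    problems = []
    autoscaling = group.get("autoscaling")
    replica_count = group.get("replicaCount")

    # The schema gives replicaCount a minimum of 1 when it is not null.
    if replica_count is not None and replica_count < 1:
        problems.append("replicaCount must be at least 1")

    if autoscaling is not None:
        policies = autoscaling.get("policies")
        if policies is None:
            problems.append("autoscaling.policies is required")
            policies = []

        for index, policy in enumerate(policies):
            for field in REQUIRED_POLICY_FIELDS:
                if field not in policy:
                    problems.append(f"policies[{index}] is missing {field}")
            for field in NON_NEGATIVE_POLICY_FIELDS:
                if field in policy and policy[field] < 0:
                    problems.append(f"policies[{index}].{field} must be >= 0")

        # `enabled` defaults to true, so an absent key counts as enabled.
        if autoscaling.get("enabled", True) and replica_count is not None:
            problems.append(
                "replicaCount cannot be set alongside autoscaling.enabled=true"
            )

    return problems
```

An empty list means the entry passed these particular checks; it does not guarantee that the API accepts the configuration.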

Get Workload Events by Workload ID

Operation path: GET /workloads/{workload_id}/events

List events for a workload, such as status changes and errors.

Parameters

Name In Type Required Description
workload_id path string true Workload ID
offset query integer false Skip the specified number of values.
limit query integer false Retrieve only the specified number of values.

Example responses

200 Response

{
  "additionalProperties": false,
  "description": "Response containing workload events.",
  "properties": {
    "count": {
      "description": "The number of records on this page.",
      "title": "Count",
      "type": "integer"
    },
    "data": {
      "description": "The list of records.",
      "items": {
        "additionalProperties": false,
        "description": "A single workload event record. note: full event schema will be defined once event storage is implemented (p7). this placeholder documents the known shape.",
        "properties": {
          "actorId": {
            "anyOf": [
              {
                "type": "string"
              },
              {
                "type": "null"
              }
            ],
            "description": "Id of the user or system that triggered the event.",
            "title": "Actor ID"
          },
          "details": {
            "anyOf": [
              {
                "additionalProperties": true,
                "type": "object"
              },
              {
                "type": "null"
              }
            ],
            "description": "Additional event-specific details.",
            "title": "Details"
          },
          "eventType": {
            "description": "Type of event.",
            "title": "Event Type",
            "type": "string"
          },
          "id": {
            "description": "Event id.",
            "title": "ID",
            "type": "string"
          },
          "timestamp": {
            "description": "When the event occurred.",
            "format": "date-time",
            "title": "Timestamp",
            "type": "string"
          },
          "workloadId": {
            "description": "Id of the workload this event belongs to.",
            "title": "Workload ID",
            "type": "string"
          }
        },
        "required": [
          "id",
          "workloadId",
          "timestamp",
          "eventType"
        ],
        "title": "WorkloadEvent",
        "type": "object"
      },
      "title": "Data",
      "type": "array"
    },
    "next": {
      "anyOf": [
        {
          "type": "string"
        },
        {
          "type": "null"
        }
      ],
      "description": "The url to the next page, or `null` if there is no such page.",
      "title": "Next"
    },
    "previous": {
      "anyOf": [
        {
          "type": "string"
        },
        {
          "type": "null"
        }
      ],
      "description": "The url to the previous page, or `null` if there is no such page.",
      "title": "Previous"
    },
    "totalCount": {
      "description": "The total number of records.",
      "title": "Totalcount",
      "type": "integer"
    }
  },
  "required": [
    "totalCount",
    "count",
    "next",
    "previous",
    "data"
  ],
  "title": "WorkloadEventsResponse",
  "type": "object"
}

Responses

Status Meaning Description Schema
200 OK Successful Response WorkloadEventsResponse
400 Bad Request Bad request None
401 Unauthorized Unauthenticated None
403 Forbidden Insufficient permissions None
404 Not Found Workload not found None
422 Unprocessable Entity Validation Error HTTPValidationError
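
The `WorkloadEventsResponse` envelope exposes `data`, `next`, and `previous`, so a client can walk all pages by following `next` until it is `null`. The sketch below shows that loop. The `fetch_page` callable is a hypothetical stand-in for whatever authenticated HTTP client you use; it is assumed to take a URL and return the decoded JSON body as a dict.

```python
def iter_records(first_url, fetch_page):
    """Yield every record across pages by following `next` until it is null.

    `fetch_page` is a caller-supplied function (hypothetical here) that
    performs the GET request and returns the decoded response body.
    """
    url = first_url
    while url is not None:
        page = fetch_page(url)
        # `data` holds the records on this page.
        yield from page["data"]
        # `next` is the URL of the following page, or None on the last page.
        url = page["next"]
```

Because the function is a generator, callers can stop early (for instance after finding the first event of a given `eventType`) without fetching the remaining pages.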

Get Artifact Histories by Workload ID

Operation path: GET /workloads/{workload_id}/history

List the artifact deployment history for a workload.

Parameters

Name In Type Required Description
workload_id path string true Workload ID
offset query integer false Skip the specified number of values.
limit query integer false Retrieve only the specified number of values.
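
As an illustration of how the path and query parameters above combine, here is a small sketch that builds the request URL using only the Python standard library. The function name `history_url` and the base URL in the usage note are placeholders, not values from this API; authentication is out of scope.

```python
from urllib.parse import quote, urlencode


def history_url(base_url, workload_id, offset=None, limit=None):
    """Build the GET /workloads/{workload_id}/history URL with optional paging."""
    # Percent-encode the ID so that characters such as '/' stay inside one path segment.
    path = f"{base_url.rstrip('/')}/workloads/{quote(workload_id, safe='')}/history"

    # Both query parameters are optional; send only the ones that were provided.
    params = {}
    if offset is not None:
        params["offset"] = offset
    if limit is not None:
        params["limit"] = limit

    return f"{path}?{urlencode(params)}" if params else path
```

Calling `history_url("https://example.invalid/api", "abc123", offset=20, limit=10)` produces a URL that skips the first 20 records and retrieves at most 10.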

Example responses

200 Response

{
  "additionalProperties": false,
  "description": "Response model for listing replacement history of a workload.",
  "properties": {
    "count": {
      "description": "The number of records on this page.",
      "title": "Count",
      "type": "integer"
    },
    "data": {
      "description": "The list of records.",
      "items": {
        "additionalProperties": false,
        "description": "Store replacement information for workloads.",
        "properties": {
          "candidateArtifactId": {
            "description": "Candidate artifact id.",
            "title": "Candidateartifactid",
            "type": "string"
          },
          "candidateProtonIds": {
            "description": "Ids of protons pending promotion during artifact replacement.",
            "items": {
              "type": "string"
            },
            "title": "Candidateprotonids",
            "type": "array"
          },
          "config": {
            "additionalProperties": false,
            "description": "Configuration for workload replacement.",
            "properties": {
              "keepOldVersionMinutes": {
                "default": 0,
                "description": "Duration in minutes to keep the old version during replacement.",
                "title": "Keepoldversionminutes",
                "type": "integer"
              },
              "warmupDurationMinutes": {
                "default": 0,
                "description": "Duration in minutes for the warmup phase during replacement.",
                "title": "Warmupdurationminutes",
                "type": "integer"
              }
            },
            "title": "ReplacementConfig",
            "type": "object"
          },
          "createdAt": {
            "description": "Timestamp of when the entity was created.",
            "format": "date-time",
            "title": "Createdat",
            "type": "string"
          },
          "deletedAt": {
            "anyOf": [
              {
                "format": "date-time",
                "type": "string"
              },
              {
                "type": "null"
              }
            ],
            "description": "Timestamp of when the entity was deleted.",
            "title": "Deletedat"
          },
          "id": {
            "description": "Unique identifier of the entity.",
            "title": "Id",
            "type": "string"
          },
          "isDeleted": {
            "default": false,
            "description": "Whether this entity has been deleted.",
            "title": "Isdeleted",
            "type": "boolean"
          },
          "message": {
            "anyOf": [
              {
                "type": "string"
              },
              {
                "type": "null"
              }
            ],
            "description": "Additional information about the replacement status, such as validation errors or reasons for failure.",
            "title": "Message"
          },
          "name": {
            "description": "Name of the entity.",
            "title": "Name",
            "type": "string"
          },
          "previousProtonIds": {
            "anyOf": [
              {
                "items": {
                  "type": "string"
                },
                "type": "array"
              },
              {
                "type": "null"
              }
            ],
            "description": "Ids of protons pending decommissioning during artifact replacement.",
            "title": "Previousprotonids"
          },
          "protonStatuses": {
            "anyOf": [
              {
                "additionalProperties": {
                  "additionalProperties": false,
                  "properties": {
                    "overallStatus": {
                      "additionalProperties": false,
                      "description": "Overall status as reported by the workload-monitor service.",
                      "properties": {
                        "lastUpdated": {
                          "description": "Rfc3339 timestamp of the last state transition.",
                          "title": "Lastupdated",
                          "type": "string"
                        },
                        "state": {
                          "enum": [
                            "unknown",
                            "submitted",
                            "initializing",
                            "provisioning",
                            "launching",
                            "running",
                            "suspended",
                            "warming",
                            "draining",
                            "interrupted",
                            "restarting",
                            "stopping",
                            "stopped",
                            "errored",
                            "terminated"
                          ],
                          "title": "ProtonStatus",
                          "type": "string"
                        },
                        "summary": {
                          "description": "Human-readable description of the current state.",
                          "title": "Summary",
                          "type": "string"
                        }
                      },
                      "required": [
                        "state",
                        "summary",
                        "lastUpdated"
                      ],
                      "title": "WorkloadMonitorOverallStatus",
                      "type": "object"
                    },
                    "replicas": {
                      "items": {
                        "additionalProperties": false,
                        "properties": {
                          "address": {
                            "title": "Address",
                            "type": "string"
                          },
                          "conditions": {
                            "items": {
                              "additionalProperties": false,
                              "properties": {
                                "lastTransitionTime": {
                                  "title": "Lasttransitiontime",
                                  "type": "string"
                                },
                                "message": {
                                  "default": "",
                                  "title": "Message",
                                  "type": "string"
                                },
                                "reason": {
                                  "default": "",
                                  "title": "Reason",
                                  "type": "string"
                                },
                                "type": {
                                  "title": "Type",
                                  "type": "string"
                                },
                                "value": {
                                  "anyOf": [
                                    {
                                      "type": "boolean"
                                    },
                                    {
                                      "type": "null"
                                    }
                                  ],
                                  "title": "Value"
                                }
                              },
                              "required": [
                                "type",
                                "value",
                                "lastTransitionTime"
                              ],
                              "title": "ReplicaConditionDetail",
                              "type": "object"
                            },
                            "title": "Conditions",
                            "type": "array"
                          },
                          "containers": {
                            "items": {
                              "additionalProperties": false,
                              "properties": {
                                "image": {
                                  "title": "Image",
                                  "type": "string"
                                },
                                "name": {
                                  "title": "Name",
                                  "type": "string"
                                },
                                "ready": {
                                  "title": "Ready",
                                  "type": "boolean"
                                },
                                "restartCount": {
                                  "title": "Restartcount",
                                  "type": "integer"
                                },
                                "startedAt": {
                                  "anyOf": [
                                    {
                                      "type": "string"
                                    },
                                    {
                                      "type": "null"
                                    }
                                  ],
                                  "title": "Startedat"
                                },
                                "status": {
                                  "description": "Lifecycle state of a container within a deployment replica.",
                                  "enum": [
                                    "running",
                                    "waiting",
                                    "terminated",
                                    "unknown"
                                  ],
                                  "title": "ContainerStatus",
                                  "type": "string"
                                }
                              },
                              "required": [
                                "name",
                                "status",
                                "startedAt",
                                "ready",
                                "restartCount",
                                "image"
                              ],
                              "title": "ContainerStatusDetail",
                              "type": "object"
                            },
                            "title": "Containers",
                            "type": "array"
                          },
                          "name": {
                            "title": "Name",
                            "type": "string"
                          },
                          "nodeAddress": {
                            "title": "Nodeaddress",
                            "type": "string"
                          },
                          "startedAt": {
                            "anyOf": [
                              {
                                "type": "string"
                              },
                              {
                                "type": "null"
                              }
                            ],
                            "title": "Startedat"
                          },
                          "status": {
                            "description": "Lifecycle phase of a deployment replica.",
                            "enum": [
                              "pending",
                              "running",
                              "succeeded",
                              "failed",
                              "unknown"
                            ],
                            "title": "ReplicaPhase",
                            "type": "string"
                          }
                        },
                        "required": [
                          "name",
                          "status",
                          "address",
                          "nodeAddress",
                          "startedAt",
                          "conditions",
                          "containers"
                        ],
                        "title": "ReplicaDetail",
                        "type": "object"
                      },
                      "title": "Replicas",
                      "type": "array"
                    }
                  },
                  "required": [
                    "overallStatus",
                    "replicas"
                  ],
                  "title": "ReplicaStatusesSnapshot",
                  "type": "object"
                },
                "type": "object"
              },
              {
                "type": "null"
              }
            ],
            "description": "Latest known status of candidate protons, used to determine replacement status transitions.",
            "title": "Protonstatuses"
          },
          "runtime": {
            "additionalProperties": false,
            "description": "Runtime configuration for a workload. for service and nim artifacts, all configuration is scoped inside ``container_groups``, each identified by name matching the artifact topology.",
            "properties": {
              "containerGroups": {
                "description": "Per-group runtime configuration. each entry's name must match a group in the artifact.",
                "items": {
                  "additionalProperties": false,
                  "description": "Runtime configuration for a single container group.",
                  "properties": {
                    "autoscaling": {
                      "anyOf": [
                        {
                          "additionalProperties": false,
                          "description": "Autoscaling configuration for a proton.",
                          "properties": {
                            "enabled": {
                              "default": true,
                              "description": "Whether autoscaling is enabled.",
                              "title": "Enabled",
                              "type": "boolean"
                            },
                            "policies": {
                              "items": {
                                "additionalProperties": false,
                                "description": "Base class for autoscaling policies.",
                                "properties": {
                                  "maxCount": {
                                    "description": "Maximum number of replicas.",
                                    "minimum": 0,
                                    "title": "Max Count",
                                    "type": "integer"
                                  },
                                  "minCount": {
                                    "description": "Minimum number of replicas.",
                                    "minimum": 0,
                                    "title": "Min Count",
                                    "type": "integer"
                                  },
                                  "priority": {
                                    "anyOf": [
                                      {
                                        "type": "integer"
                                      },
                                      {
                                        "type": "null"
                                      }
                                    ],
                                    "description": "Policy priority when multiple policies are defined.",
                                    "title": "Priority"
                                  },
                                  "scalingMetric": {
                                    "anyOf": [
                                      {
                                        "oneOf": [
                                          {
                                            "const": "cpuAverageUtilization",
                                            "description": "Scale replicas to maintain a target average CPU utilization across pods.",
                                            "title": "CPU Average Utilization"
                                          },
                                          {
                                            "const": "httpRequestsConcurrency",
                                            "description": "Scale replicas based on HTTP request concurrency using an external HTTP-aware autoscaler. The platform manages the underlying autoscaling resources on your behalf. This scaling option will scale to zero replicas when the proton is idle.",
                                            "title": "HTTP Requests Concurrency"
                                          },
                                          {
                                            "const": "gpuCacheUtilization",
                                            "description": "Scales replicas based on model-specific GPU memory cache utilization. This signal reflects how the model's KV cache is used during inference, when such metrics are exposed by the serving runtime. High cache utilization may indicate memory pressure and can be used to trigger scale-out to maintain throughput. Applicable to NIM Artifacts only.",
                                            "title": "GPU Cache Utilization"
                                          },
                                          {
                                            "const": "gpuRequestQueueDepth",
                                            "description": "Scales replicas based on the depth of the inference request queue. This metric represents the number of incoming requests waiting to be processed by the inference service. Increasing queue depth may indicate insufficient capacity and can be used to trigger additional replicas to reduce latency. Applicable to NIM Artifacts only.",
                                            "title": "GPU Request Queue Depth"
                                          }
                                        ],
                                        "title": "ScalingMetricType",
                                        "type": "string"
                                      },
                                      {
                                        "type": "string"
                                      }
                                    ],
                                    "description": "Metric used for scaling decisions. use one of the predefined values for standard autoscaling, or provide a custom metric name for nim 2.0 workloads (e.g. 'vllm:kv_cache_usage_perc'). custom metric names are only supported for nim artifacts.",
                                    "title": "Scaling Metric"
                                  },
                                  "target": {
                                    "description": "Target value for the scaling metric.",
                                    "minimum": 0,
                                    "title": "Target",
                                    "type": "number"
                                  }
                                },
                                "required": [
                                  "scalingMetric",
                                  "target",
                                  "minCount",
                                  "maxCount"
                                ],
                                "title": "AutoscalingPolicy",
                                "type": "object"
                              },
                              "title": "Policies",
                              "type": "array"
                            }
                          },
                          "required": [
                            "policies"
                          ],
                          "title": "AutoscalingProperties",
                          "type": "object"
                        },
                        {
                          "type": "null"
                        }
                      ],
                      "description": "Autoscaling configuration for this group. takes precedence over replicacount."
                    },
                    "bundleSelectionPolicy": {
                      "enum": [
                        "availability"
                      ],
                      "title": "BundleSelectionPolicy",
                      "type": "string"
                    },
                    "containers": {
                      "description": "Per-container overrides for this group.",
                      "items": {
                        "additionalProperties": false,
                        "description": "Runtime diff targeting a single named container within a group.",
                        "properties": {
                          "name": {
                            "description": "Container name. must match a container declared in the artifact group.",
                            "title": "Name",
                            "type": "string"
                          },
                          "resourceAllocation": {
                            "anyOf": [
                              {
                                "additionalProperties": false,
                                "description": "Per-container resource allocation declared at runtime.",
                                "properties": {
                                  "cpu": {
                                    "anyOf": [
                                      {
                                        "minimum": 0.1,
                                        "type": "number"
                                      },
                                      {
                                        "type": "null"
                                      }
                                    ],
                                    "description": "Cpu cores allocated to this container.",
                                    "title": "Cpu"
                                  },
                                  "gpu": {
                                    "anyOf": [
                                      {
                                        "minimum": 0,
                                        "type": "number"
                                      },
                                      {
                                        "type": "null"
                                      }
                                    ],
                                    "description": "Gpus allocated to this container.",
                                    "title": "Gpu"
                                  },
                                  "memory": {
                                    "anyOf": [
                                      {
                                        "pattern": "^\\s*(\\d*\\.?\\d+)\\s*(\\w+)?",
                                        "type": "string"
                                      },
                                      {
                                        "minimum": 0,
                                        "type": "integer"
                                      },
                                      {
                                        "type": "null"
                                      }
                                    ],
                                    "description": "Ram allocated to this container. accepts a human-readable string with one of: b, kb, mb, gb (1000-based) — e.g. '8gb', '512mb'. also accepts raw byte integers.",
                                    "examples": [
                                      "8GB",
                                      "512MB"
                                    ],
                                    "title": "Memory"
                                  }
                                },
                                "title": "ResourceAllocation",
                                "type": "object"
                              },
                              {
                                "type": "null"
                              }
                            ],
                            "description": "Resource allocation for this container. required for multi-container groups."
                          }
                        },
                        "required": [
                          "name"
                        ],
                        "title": "ContainerOverride",
                        "type": "object"
                      },
                      "title": "Containers",
                      "type": "array"
                    },
                    "name": {
                      "default": "default",
                      "description": "Group name. must match a container group name declared in the artifact.",
                      "title": "Name",
                      "type": "string"
                    },
                    "replicaCount": {
                      "anyOf": [
                        {
                          "minimum": 1,
                          "type": "integer"
                        },
                        {
                          "type": "null"
                        }
                      ],
                      "default": 1,
                      "description": "Number of replicas. cannot be set alongside autoscaling.enabled=true.",
                      "title": "Replicacount"
                    },
                    "resolvedBundle": {
                      "anyOf": [
                        {
                          "description": "Bundle details returned in the runtime response after scheduling.",
                          "properties": {
                            "cpuCount": {
                              "description": "Number of cpu cores.",
                              "title": "CPU Count",
                              "type": "number"
                            },
                            "gpuCount": {
                              "default": 0,
                              "description": "Number of gpu units.",
                              "title": "GPU Count",
                              "type": "integer"
                            },
                            "gpuMaker": {
                              "anyOf": [
                                {
                                  "type": "string"
                                },
                                {
                                  "type": "null"
                                }
                              ],
                              "description": "Gpu manufacturer.",
                              "title": "GPU Maker"
                            },
                            "gpuTypeLabel": {
                              "anyOf": [
                                {
                                  "type": "string"
                                },
                                {
                                  "type": "null"
                                }
                              ],
                              "description": "Gpu type label.",
                              "title": "GPU Type Label"
                            },
                            "id": {
                              "description": "Bundle identifier that was selected.",
                              "title": "Id",
                              "type": "string"
                            },
                            "memoryBytes": {
                              "description": "Memory size in bytes.",
                              "title": "Memory Bytes",
                              "type": "integer"
                            }
                          },
                          "required": [
                            "id",
                            "cpuCount",
                            "memoryBytes"
                          ],
                          "title": "ResolvedBundle",
                          "type": "object"
                        },
                        {
                          "type": "null"
                        }
                      ],
                      "description": "Full details of the bundle selected at scheduling time. read-only.",
                      "readOnly": true
                    },
                    "resourceBundles": {
                      "description": "Ordered list of bundle ids. one is selected at scheduling time.",
                      "items": {
                        "type": "string"
                      },
                      "title": "Resourcebundles",
                      "type": "array"
                    }
                  },
                  "title": "GroupRuntime",
                  "type": "object"
                },
                "title": "Containergroups",
                "type": "array"
              }
            },
            "title": "WorkloadRuntime",
            "type": "object"
          },
          "status": {
            "description": "Statuses for workload replacement process.",
            "enum": [
              "unknown",
              "submitted",
              "initializing",
              "awaiting_promotion",
              "switching",
              "deleting",
              "completed",
              "errored",
              "cleaning_up"
            ],
            "title": "ReplacementStatus",
            "type": "string"
          },
          "strategy": {
            "description": "Types of replacement strategies. `rolling` - the new proton is deployed alongside the old one, and trafic is switched to the new proton once it is ready. the old proton is then decommissioned.",
            "enum": [
              "rolling"
            ],
            "title": "ReplacementStrategy",
            "type": "string"
          },
          "switchedAt": {
            "anyOf": [
              {
                "format": "date-time",
                "type": "string"
              },
              {
                "type": "null"
              }
            ],
            "description": "Timestamp of when the replacement take action.",
            "title": "Switchedat"
          },
          "taskiqLastHeartbeat": {
            "anyOf": [
              {
                "format": "date-time",
                "type": "string"
              },
              {
                "type": "null"
              }
            ],
            "description": "Timestamp of the last taskiq poll for this replacement; used by the cron to detect abandoned taskiq-managed replacements.",
            "title": "Taskiqlastheartbeat"
          },
          "taskiqManaged": {
            "default": false,
            "description": "When true, this replacement is managed by the taskiq worker and should be skipped by the batch cronjob.",
            "title": "Taskiqmanaged",
            "type": "boolean"
          },
          "tenantId": {
            "description": "Id of the tenant this entity belongs to.",
            "format": "uuid4",
            "title": "Tenantid",
            "type": "string"
          },
          "updatedAt": {
            "description": "Timestamp of when the entity was last updated.",
            "format": "date-time",
            "title": "Updatedat",
            "type": "string"
          },
          "userId": {
            "description": "Id of the user who owns this entity.",
            "title": "Userid",
            "type": "string"
          },
          "workloadId": {
            "description": "Workload id.",
            "title": "Workloadid",
            "type": "string"
          }
        },
        "required": [
          "id",
          "name",
          "createdAt",
          "updatedAt",
          "userId",
          "tenantId",
          "workloadId",
          "candidateArtifactId"
        ],
        "title": "Replacement",
        "type": "object"
      },
      "title": "Data",
      "type": "array"
    },
    "next": {
      "anyOf": [
        {
          "type": "string"
        },
        {
          "type": "null"
        }
      ],
      "description": "The url to the next page, or `null` if there is no such page.",
      "title": "Next"
    },
    "previous": {
      "anyOf": [
        {
          "type": "string"
        },
        {
          "type": "null"
        }
      ],
      "description": "The url to the previous page, or `null` if there is no such page.",
      "title": "Previous"
    },
    "totalCount": {
      "description": "The total number of records.",
      "title": "Totalcount",
      "type": "integer"
    }
  },
  "required": [
    "totalCount",
    "count",
    "next",
    "previous",
    "data"
  ],
  "title": "ReplacementHistoryListResponse",
  "type": "object"
}
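
The `memory` field of `ResourceAllocation` in the schema above accepts either a raw byte integer or a human-readable string with a 1000-based unit. The following Python sketch shows one way a client could normalize such values to bytes before comparing or displaying them. The function name `parse_memory` is hypothetical, and two behaviors are assumptions rather than documented rules: units are matched case-insensitively, and a string without a unit is treated as bytes.

```python
import re
from typing import Union

# Same pattern as the schema's "pattern" for string memory values.
_MEMORY_RE = re.compile(r"^\s*(\d*\.?\d+)\s*(\w+)?")

# 1000-based multipliers, as stated in the field description.
_UNITS = {"b": 1, "kb": 1000, "mb": 1000**2, "gb": 1000**3}


def parse_memory(value: Union[str, int]) -> int:
    """Convert a memory value such as '8GB' or 512 into a byte count."""
    if isinstance(value, int):
        if value < 0:
            raise ValueError("memory must be non-negative")
        return value
    match = _MEMORY_RE.match(value)
    if match is None:
        raise ValueError(f"unrecognized memory value: {value!r}")
    number, unit = match.groups()
    # Assumption: a missing unit means bytes; unit lookup ignores case.
    factor = _UNITS.get((unit or "b").lower())
    if factor is None:
        raise ValueError(f"unsupported memory unit: {unit!r}")
    return int(float(number) * factor)
```

For example, `parse_memory("512MB")` yields `512000000` under these assumptions; the server's own parsing rules may differ.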

Responses

Status Meaning Description Schema
200 OK Successful Response ReplacementHistoryListResponse
400 Bad Request Bad request None
401 Unauthorized Unauthenticated None
403 Forbidden Insufficient permissions None
404 Not Found Workload not found None
422 Unprocessable Entity Validation Error HTTPValidationError
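
The `next` field of `ReplacementHistoryListResponse` makes it possible to walk the full result set page by page. Below is a minimal Python sketch of that loop. It assumes the caller supplies a `fetch_page` function (hypothetical, not part of this API) that performs an authenticated GET on a URL and returns the decoded JSON body.

```python
from typing import Any, Callable, Dict, Iterator, Optional

Page = Dict[str, Any]


def iter_records(
    first_url: str, fetch_page: Callable[[str], Page]
) -> Iterator[Dict[str, Any]]:
    """Yield every record, following `next` until it is null."""
    url: Optional[str] = first_url
    while url is not None:
        page = fetch_page(url)
        # `data` holds the records on this page; `next` is a URL or None.
        yield from page["data"]
        url = page["next"]
```

Passing the fetcher in as a parameter keeps the loop independent of any particular HTTP client or authentication scheme.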

Promote Workload Artifact by Workload ID

Operation path: POST /workloads/{workload_id}/promote

Lock the draft artifact currently running on a workload.

The workload continues running the same artifact, which is promoted from draft to locked and assigned a version number. Workload stats are reset and the event is recorded in the replacement history.

Parameters

Name In Type Required Description
workload_id path string true Workload ID
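
A minimal sketch of calling this operation with Python's standard library follows. The base URL, the bearer-token `Authorization` header, and the helper names `build_promote_request` and `promote_workload` are assumptions for illustration; only the method, the path, and the JSON response body come from this reference.

```python
import json
import urllib.parse
import urllib.request
from typing import Any, Dict


def build_promote_request(
    base_url: str, workload_id: str, token: str
) -> urllib.request.Request:
    """Build the POST /workloads/{workload_id}/promote request without sending it."""
    path = f"/workloads/{urllib.parse.quote(workload_id, safe='')}/promote"
    return urllib.request.Request(
        base_url.rstrip("/") + path,
        method="POST",
        headers={
            "Authorization": f"Bearer {token}",  # assumed auth scheme
            "Accept": "application/json",
        },
    )


def promote_workload(base_url: str, workload_id: str, token: str) -> Dict[str, Any]:
    """Send the request and return the decoded JSON response body."""
    request = build_promote_request(base_url, workload_id, token)
    # urlopen raises urllib.error.HTTPError for 4xx/5xx responses.
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

Separating request construction from sending makes the URL and headers easy to inspect or test without network access.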

Example responses

202 Response

{
  "additionalProperties": false,
  "description": "API representation of a workload. this is the formatted version returned to clients, excluding internal fields and including computed properties like permissions and statistics.",
  "properties": {
    "artifact": {
      "anyOf": [
        {
          "description": "Artifact basic information.",
          "properties": {
            "artifactRepositoryId": {
              "anyOf": [
                {
                  "type": "string"
                },
                {
                  "type": "null"
                }
              ],
              "description": "Id of the artifact repository this artifact belongs to (for versioning).",
              "title": "Artifactrepositoryid"
            },
            "id": {
              "description": "Unique identifier of the entity.",
              "title": "Id",
              "type": "string"
            },
            "name": {
              "anyOf": [
                {
                  "type": "string"
                },
                {
                  "type": "null"
                }
              ],
              "description": "Name of the entity.",
              "title": "Name"
            },
            "status": {
              "anyOf": [
                {
                  "enum": [
                    "draft",
                    "locked"
                  ],
                  "title": "ArtifactStatus",
                  "type": "string"
                },
                {
                  "type": "null"
                }
              ],
              "description": "Artifact status."
            },
            "templateId": {
              "anyOf": [
                {
                  "type": "string"
                },
                {
                  "type": "null"
                }
              ],
              "description": "Id of the template used to create this artifact.",
              "title": "Templateid"
            },
            "type": {
              "anyOf": [
                {
                  "description": "Discriminator for the artifact spec variant. Used to label the workload, which may be used to prioritize the best matching operator available in the cluster for scheduling. Defaults to `service` when omitted. Values: `service` (generic service artifact), `nim` (NVIDIA NIM model artifact).",
                  "enum": [
                    "service",
                    "nim"
                  ],
                  "title": "ArtifactType",
                  "type": "string"
                },
                {
                  "type": "null"
                }
              ],
              "description": "Artifact type."
            },
            "version": {
              "anyOf": [
                {
                  "type": "integer"
                },
                {
                  "type": "null"
                }
              ],
              "description": "Version number of the artifact (set only for locked artifacts).",
              "title": "Version"
            }
          },
          "required": [
            "id"
          ],
          "title": "ArtifactInfoFormatted",
          "type": "object"
        },
        {
          "type": "null"
        }
      ],
      "description": "Basic information about the currently active artifact for this workload.",
      "title": "Artifact"
    },
    "artifactId": {
      "anyOf": [
        {
          "type": "string"
        },
        {
          "type": "null"
        }
      ],
      "description": "Id of the currently active artifact for this workload.",
      "title": "Artifact ID"
    },
    "createdAt": {
      "description": "Timestamp of when the entity was created.",
      "format": "date-time",
      "title": "Created At",
      "type": "string"
    },
    "creator": {
      "anyOf": [
        {
          "additionalProperties": false,
          "description": "User information embedded in API responses.",
          "properties": {
            "email": {
              "anyOf": [
                {
                  "type": "string"
                },
                {
                  "type": "null"
                }
              ],
              "description": "User email address.",
              "title": "Email"
            },
            "fullName": {
              "anyOf": [
                {
                  "type": "string"
                },
                {
                  "type": "null"
                }
              ],
              "description": "User's full name.",
              "title": "Full Name"
            },
            "id": {
              "description": "User id associated with this resource.",
              "title": "User ID",
              "type": "string"
            },
            "userhash": {
              "anyOf": [
                {
                  "type": "string"
                },
                {
                  "type": "null"
                }
              ],
              "description": "User's gravatar hash.",
              "title": "Userhash"
            },
            "username": {
              "anyOf": [
                {
                  "type": "string"
                },
                {
                  "type": "null"
                }
              ],
              "description": "Username.",
              "title": "Username"
            }
          },
          "required": [
            "id"
          ],
          "title": "UserData",
          "type": "object"
        },
        {
          "type": "null"
        }
      ],
      "description": "Owner user details including id, username and email.",
      "title": "Creator"
    },
    "description": {
      "anyOf": [
        {
          "type": "string"
        },
        {
          "type": "null"
        }
      ],
      "default": "",
      "description": "Workload description.",
      "title": "Description"
    },
    "endpoint": {
      "anyOf": [
        {
          "type": "string"
        },
        {
          "type": "null"
        }
      ],
      "description": "Workload endpoint url.",
      "title": "Endpoint"
    },
    "id": {
      "description": "Unique identifier of the entity.",
      "title": "ID",
      "type": "string"
    },
    "importance": {
      "description": "Importance level for workloads.",
      "enum": [
        "critical",
        "high",
        "moderate",
        "low"
      ],
      "title": "WorkloadImportance",
      "type": "string"
    },
    "lastResponse": {
      "anyOf": [
        {
          "format": "date-time",
          "type": "string"
        },
        {
          "type": "null"
        }
      ],
      "description": "Timestamp of the last response received from this workload.",
      "title": "Last Response Time"
    },
    "name": {
      "description": "Name of the entity.",
      "title": "Name",
      "type": "string"
    },
    "owners": {
      "description": "List of workload owners.",
      "items": {
        "additionalProperties": false,
        "description": "User information embedded in API responses.",
        "properties": {
          "email": {
            "anyOf": [
              {
                "type": "string"
              },
              {
                "type": "null"
              }
            ],
            "description": "User email address.",
            "title": "Email"
          },
          "fullName": {
            "anyOf": [
              {
                "type": "string"
              },
              {
                "type": "null"
              }
            ],
            "description": "User's full name.",
            "title": "Full Name"
          },
          "id": {
            "description": "User id associated with this resource.",
            "title": "User ID",
            "type": "string"
          },
          "userhash": {
            "anyOf": [
              {
                "type": "string"
              },
              {
                "type": "null"
              }
            ],
            "description": "User's gravatar hash.",
            "title": "Userhash"
          },
          "username": {
            "anyOf": [
              {
                "type": "string"
              },
              {
                "type": "null"
              }
            ],
            "description": "Username.",
            "title": "Username"
          }
        },
        "required": [
          "id"
        ],
        "title": "UserData",
        "type": "object"
      },
      "title": "Owners",
      "type": "array"
    },
    "permissions": {
      "anyOf": [
        {
          "items": {
            "description": "Represents the particular role a user, group or organization holds on an entity.",
            "enum": [
              "CAN_VIEW",
              "CAN_UPDATE",
              "CAN_DELETE",
              "CAN_SHARE",
              "CAN_MAKE_PREDICTIONS",
              "CAN_SHARE_ROLE_OWNER",
              "CAN_SHARE_ROLE_READ_WRITE",
              "CAN_SHARE_ROLE_READ_ONLY"
            ],
            "title": "ResourcePermission",
            "type": "string"
          },
          "type": "array"
        },
        {
          "items": {
            "const": "*",
            "type": "string"
          },
          "type": "array"
        }
      ]
    },
    "protonId": {
      "anyOf": [
        {
          "type": "string"
        },
        {
          "type": "null"
        }
      ],
      "description": "Id of the currently active proton for this workload.",
      "title": "Proton ID"
    },
    "replacement": {
      "anyOf": [
        {
          "description": "Formatted replacement information for API responses.",
          "properties": {
            "candidateProtonIds": {
              "description": "Ids of protons pending promotion during artifact replacement.",
              "items": {
                "type": "string"
              },
              "title": "Candidateprotonids",
              "type": "array"
            },
            "status": {
              "description": "Statuses for workload replacement process.",
              "enum": [
                "unknown",
                "submitted",
                "initializing",
                "awaiting_promotion",
                "switching",
                "deleting",
                "completed",
                "errored",
                "cleaning_up"
              ],
              "title": "ReplacementStatus",
              "type": "string"
            },
            "strategy": {
              "description": "Types of replacement strategies. `rolling`: the new proton is deployed alongside the old one, and traffic is switched to the new proton once it is ready. The old proton is then decommissioned.",
              "enum": [
                "rolling"
              ],
              "title": "ReplacementStrategy",
              "type": "string"
            }
          },
          "title": "WorkloadReplacementFormatted",
          "type": "object"
        },
        {
          "type": "null"
        }
      ],
      "description": "Information about an active replacement process for this workload, if any.",
      "title": "Replacement"
    },
    "requestStats": {
      "anyOf": [
        {
          "additionalProperties": false,
          "description": "Request statistics summary.",
          "properties": {
            "concurrentRequests": {
              "default": 0,
              "description": "Number of concurrent requests.",
              "title": "Concurrentrequests",
              "type": "integer"
            },
            "errorRate": {
              "default": 0,
              "description": "Error rate percentage.",
              "title": "Errorrate",
              "type": "number"
            },
            "errorRates": {
              "description": "Error rates over the last 7 time periods.",
              "items": {
                "type": "integer"
              },
              "title": "Errorrates",
              "type": "array"
            },
            "lastRequestAt": {
              "anyOf": [
                {
                  "format": "date-time",
                  "type": "string"
                },
                {
                  "type": "null"
                }
              ],
              "description": "Timestamp of the last request.",
              "title": "Lastrequestat"
            },
            "requestRates": {
              "description": "Request rates over the last 7 time periods.",
              "items": {
                "type": "integer"
              },
              "title": "Requestrates",
              "type": "array"
            },
            "responseTime": {
              "default": 0,
              "description": "Average response time in milliseconds.",
              "title": "Responsetime",
              "type": "integer"
            },
            "totalRequests": {
              "default": 0,
              "description": "Total number of requests.",
              "title": "Totalrequests",
              "type": "integer"
            }
          },
          "title": "RequestStats",
          "type": "object"
        },
        {
          "type": "null"
        }
      ],
      "description": "Request statistics for this workload.",
      "title": "Request Stats"
    },
    "runtime": {
      "additionalProperties": false,
      "description": "Runtime configuration for a workload. For service and NIM artifacts, all configuration is scoped inside `container_groups`, each identified by a name matching the artifact topology.",
      "properties": {
        "containerGroups": {
          "description": "Per-group runtime configuration. Each entry's name must match a group in the artifact.",
          "items": {
            "additionalProperties": false,
            "description": "Runtime configuration for a single container group.",
            "properties": {
              "autoscaling": {
                "anyOf": [
                  {
                    "additionalProperties": false,
                    "description": "Autoscaling configuration for a proton.",
                    "properties": {
                      "enabled": {
                        "default": true,
                        "description": "Whether autoscaling is enabled.",
                        "title": "Enabled",
                        "type": "boolean"
                      },
                      "policies": {
                        "items": {
                          "additionalProperties": false,
                          "description": "Base class for autoscaling policies.",
                          "properties": {
                            "maxCount": {
                              "description": "Maximum number of replicas.",
                              "minimum": 0,
                              "title": "Max Count",
                              "type": "integer"
                            },
                            "minCount": {
                              "description": "Minimum number of replicas.",
                              "minimum": 0,
                              "title": "Min Count",
                              "type": "integer"
                            },
                            "priority": {
                              "anyOf": [
                                {
                                  "type": "integer"
                                },
                                {
                                  "type": "null"
                                }
                              ],
                              "description": "Policy priority when multiple policies are defined.",
                              "title": "Priority"
                            },
                            "scalingMetric": {
                              "anyOf": [
                                {
                                  "oneOf": [
                                    {
                                      "const": "cpuAverageUtilization",
                                      "description": "Scale replicas to maintain a target average CPU utilization across pods.",
                                      "title": "CPU Average Utilization"
                                    },
                                    {
                                      "const": "httpRequestsConcurrency",
                                      "description": "Scale replicas based on HTTP request concurrency using an external HTTP-aware autoscaler. The platform manages the underlying autoscaling resources on your behalf. This scaling option will scale to zero replicas when the proton is idle.",
                                      "title": "HTTP Requests Concurrency"
                                    },
                                    {
                                      "const": "gpuCacheUtilization",
                                      "description": "Scales replicas based on model-specific GPU memory cache utilization. This signal reflects how the model's KV cache is used during inference, when such metrics are exposed by the serving runtime. High cache utilization may indicate memory pressure and can be used to trigger scale-out to maintain throughput. Applicable to NIM Artifacts only.",
                                      "title": "GPU Cache Utilization"
                                    },
                                    {
                                      "const": "gpuRequestQueueDepth",
                                      "description": "Scales replicas based on the depth of the inference request queue. This metric represents the number of incoming requests waiting to be processed by the inference service. Increasing queue depth may indicate insufficient capacity and can be used to trigger additional replicas to reduce latency. Applicable to NIM Artifacts only.",
                                      "title": "GPU Request Queue Depth"
                                    }
                                  ],
                                  "title": "ScalingMetricType",
                                  "type": "string"
                                },
                                {
                                  "type": "string"
                                }
                              ],
                              "description": "Metric used for scaling decisions. Use one of the predefined values for standard autoscaling, or provide a custom metric name for NIM 2.0 workloads (e.g. 'vllm:kv_cache_usage_perc'). Custom metric names are only supported for NIM artifacts.",
                              "title": "Scaling Metric"
                            },
                            "target": {
                              "description": "Target value for the scaling metric.",
                              "minimum": 0,
                              "title": "Target",
                              "type": "number"
                            }
                          },
                          "required": [
                            "scalingMetric",
                            "target",
                            "minCount",
                            "maxCount"
                          ],
                          "title": "AutoscalingPolicy",
                          "type": "object"
                        },
                        "title": "Policies",
                        "type": "array"
                      }
                    },
                    "required": [
                      "policies"
                    ],
                    "title": "AutoscalingProperties",
                    "type": "object"
                  },
                  {
                    "type": "null"
                  }
                ],
                "description": "Autoscaling configuration for this group. Takes precedence over `replicaCount`."
              },
              "bundleSelectionPolicy": {
                "enum": [
                  "availability"
                ],
                "title": "BundleSelectionPolicy",
                "type": "string"
              },
              "containers": {
                "description": "Per-container overrides for this group.",
                "items": {
                  "additionalProperties": false,
                  "description": "Runtime diff targeting a single named container within a group.",
                  "properties": {
                    "name": {
                      "description": "Container name. Must match a container declared in the artifact group.",
                      "title": "Name",
                      "type": "string"
                    },
                    "resourceAllocation": {
                      "anyOf": [
                        {
                          "additionalProperties": false,
                          "description": "Per-container resource allocation declared at runtime.",
                          "properties": {
                            "cpu": {
                              "anyOf": [
                                {
                                  "minimum": 0.1,
                                  "type": "number"
                                },
                                {
                                  "type": "null"
                                }
                              ],
                              "description": "CPU cores allocated to this container.",
                              "title": "Cpu"
                            },
                            "gpu": {
                              "anyOf": [
                                {
                                  "minimum": 0,
                                  "type": "number"
                                },
                                {
                                  "type": "null"
                                }
                              ],
                              "description": "GPUs allocated to this container.",
                              "title": "Gpu"
                            },
                            "memory": {
                              "anyOf": [
                                {
                                  "pattern": "^\\s*(\\d*\\.?\\d+)\\s*(\\w+)?",
                                  "type": "string"
                                },
                                {
                                  "minimum": 0,
                                  "type": "integer"
                                },
                                {
                                  "type": "null"
                                }
                              ],
                              "description": "RAM allocated to this container. Accepts a human-readable string with one of the units B, KB, MB, GB (1000-based), e.g. '8GB', '512MB'. Also accepts raw byte integers.",
                              "examples": [
                                "8GB",
                                "512MB"
                              ],
                              "title": "Memory"
                            }
                          },
                          "title": "ResourceAllocation",
                          "type": "object"
                        },
                        {
                          "type": "null"
                        }
                      ],
                      "description": "Resource allocation for this container. Required for multi-container groups."
                    }
                  },
                  "required": [
                    "name"
                  ],
                  "title": "ContainerOverride",
                  "type": "object"
                },
                "title": "Containers",
                "type": "array"
              },
              "name": {
                "default": "default",
                "description": "Group name. Must match a container group name declared in the artifact.",
                "title": "Name",
                "type": "string"
              },
              "replicaCount": {
                "anyOf": [
                  {
                    "minimum": 1,
                    "type": "integer"
                  },
                  {
                    "type": "null"
                  }
                ],
                "default": 1,
                "description": "Number of replicas. Cannot be set alongside autoscaling.enabled=true.",
                "title": "Replicacount"
              },
              "resolvedBundle": {
                "anyOf": [
                  {
                    "description": "Bundle details returned in the runtime response after scheduling.",
                    "properties": {
                      "cpuCount": {
                        "description": "Number of CPU cores.",
                        "title": "CPU Count",
                        "type": "number"
                      },
                      "gpuCount": {
                        "default": 0,
                        "description": "Number of GPU units.",
                        "title": "GPU Count",
                        "type": "integer"
                      },
                      "gpuMaker": {
                        "anyOf": [
                          {
                            "type": "string"
                          },
                          {
                            "type": "null"
                          }
                        ],
                        "description": "GPU manufacturer.",
                        "title": "GPU Maker"
                      },
                      "gpuTypeLabel": {
                        "anyOf": [
                          {
                            "type": "string"
                          },
                          {
                            "type": "null"
                          }
                        ],
                        "description": "GPU type label.",
                        "title": "GPU Type Label"
                      },
                      "id": {
                        "description": "Bundle identifier that was selected.",
                        "title": "Id",
                        "type": "string"
                      },
                      "memoryBytes": {
                        "description": "Memory size in bytes.",
                        "title": "Memory Bytes",
                        "type": "integer"
                      }
                    },
                    "required": [
                      "id",
                      "cpuCount",
                      "memoryBytes"
                    ],
                    "title": "ResolvedBundle",
                    "type": "object"
                  },
                  {
                    "type": "null"
                  }
                ],
                "description": "Full details of the bundle selected at scheduling time. Read-only.",
                "readOnly": true
              },
              "resourceBundles": {
                "description": "Ordered list of bundle IDs. One is selected at scheduling time.",
                "items": {
                  "type": "string"
                },
                "title": "Resourcebundles",
                "type": "array"
              }
            },
            "title": "GroupRuntime",
            "type": "object"
          },
          "title": "Containergroups",
          "type": "array"
        }
      },
      "title": "WorkloadRuntime",
      "type": "object"
    },
    "status": {
      "description": "User-facing workload status. A subset of the proton status values that excludes internal proton-lifecycle states (warming, draining, restarting), which should never be surfaced as a workload status.",
      "enum": [
        "unknown",
        "submitted",
        "provisioning",
        "launching",
        "running",
        "suspended",
        "interrupted",
        "stopping",
        "stopped",
        "errored",
        "terminated"
      ],
      "title": "WorkloadStatus",
      "type": "string"
    },
    "tags": {
      "items": {
        "additionalProperties": false,
        "properties": {
          "id": {
            "description": "Unique identifier of the tag.",
            "title": "Id",
            "type": "string"
          },
          "name": {
            "description": "Name of the tag.",
            "title": "Name",
            "type": "string"
          },
          "value": {
            "description": "Value of the tag.",
            "title": "Value",
            "type": "string"
          }
        },
        "required": [
          "id",
          "name",
          "value"
        ],
        "title": "TagInfo",
        "type": "object"
      },
      "type": "array"
    },
    "type": {
      "description": "Discriminator for the artifact spec variant. Used to label the workload, which may be used to prioritize the best matching operator available in the cluster for scheduling. Defaults to `service` when omitted. Values: `service` (generic service artifact), `nim` (NVIDIA NIM model artifact).",
      "enum": [
        "service",
        "nim"
      ],
      "title": "ArtifactType",
      "type": "string"
    },
    "updatedAt": {
      "description": "Timestamp of when the entity was last updated.",
      "format": "date-time",
      "title": "Updated At",
      "type": "string"
    }
  },
  "required": [
    "id",
    "name",
    "createdAt",
    "updatedAt"
  ],
  "title": "WorkloadFormatted",
  "type": "object"
}

Responses

Status Meaning Description Schema
202 Accepted Successful Response WorkloadFormatted
400 Bad Request Bad request None
401 Unauthorized Unauthenticated None
403 Forbidden Insufficient permissions None
404 Not Found Workload not found None
422 Unprocessable Entity Artifact is not a draft, workload has no active proton, or a replacement is in progress None
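
The 422 row lists three conditions that block promotion. The Python sketch below checks for them client-side against a `WorkloadFormatted` body before the promote call is made. It is a convenience only and does not replace the server's validation. `promotion_blockers` is a hypothetical helper, and treating any non-null `replacement` as "in progress" is an assumption based on that field's description.

```python
from typing import Any, Dict, List


def promotion_blockers(workload: Dict[str, Any]) -> List[str]:
    """Return the reasons a promote call would likely be rejected with 422."""
    blockers: List[str] = []

    # `artifact` may be null; only a draft artifact can be promoted to locked.
    artifact = workload.get("artifact") or {}
    if artifact.get("status") != "draft":
        blockers.append("artifact is not a draft")

    # `protonId` identifies the currently active proton, if any.
    if not workload.get("protonId"):
        blockers.append("workload has no active proton")

    # Assumption: a non-null `replacement` means a replacement is still active.
    if workload.get("replacement") is not None:
        blockers.append("a replacement is in progress")

    return blockers


# Hand-written sample bodies for demonstration; the IDs are placeholders.
ready = {"artifact": {"id": "a1", "status": "draft"}, "protonId": "p1", "replacement": None}
locked = {
    "artifact": {"id": "a1", "status": "locked", "version": 3},
    "protonId": None,
    "replacement": {"status": "switching"},
}

print(promotion_blockers(ready))
print(promotion_blockers(locked))
```

An empty list means none of the documented 422 conditions were detected locally. The server may still reject the request for other reasons, such as 403 or 404.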

List Workload Protons by Workload ID

Operation path: GET /workloads/{workload_id}/protons

List all protons associated with a workload.

Parameters

Name In Type Required Description
workload_id path string true Workload ID
offset query integer false Skip the specified number of values.
limit query integer false Retrieve only the specified number of values.
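
To show how the path parameter and the optional `offset` and `limit` query parameters combine, here is a small Python sketch that builds the request URL. The base URL and the helper name `protons_url` are assumptions for illustration. Parameters left as `None` are omitted so the server's defaults apply.

```python
from typing import Optional
from urllib.parse import quote, urlencode


def protons_url(
    base_url: str,
    workload_id: str,
    offset: Optional[int] = None,
    limit: Optional[int] = None,
) -> str:
    """Build the URL for GET /workloads/{workload_id}/protons."""
    url = f"{base_url.rstrip('/')}/workloads/{quote(workload_id, safe='')}/protons"
    # Include only the query parameters the caller actually set.
    query = {name: value for name, value in (("offset", offset), ("limit", limit)) if value is not None}
    return f"{url}?{urlencode(query)}" if query else url


# Placeholder host and workload ID for demonstration only.
print(protons_url("https://api.example.invalid", "wl-123"))
print(protons_url("https://api.example.invalid", "wl-123", offset=20, limit=10))
```

With `limit=10`, a value of `offset=20` skips the first twenty protons and so requests the third page.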

Example responses

200 Response

{
  "additionalProperties": false,
  "properties": {
    "count": {
      "description": "The number of records on this page.",
      "title": "Count",
      "type": "integer"
    },
    "data": {
      "description": "The list of records.",
      "items": {
        "additionalProperties": false,
        "properties": {
          "artifactId": {
            "description": "Id of the artifact deployed by this proton.",
            "title": "Artifactid",
            "type": "string"
          },
          "createdAt": {
            "description": "Timestamp of when the entity was created.",
            "format": "date-time",
            "title": "Created At",
            "type": "string"
          },
          "creator": {
            "anyOf": [
              {
                "additionalProperties": false,
                "description": "User information embedded in API responses.",
                "properties": {
                  "email": {
                    "anyOf": [
                      {
                        "type": "string"
                      },
                      {
                        "type": "null"
                      }
                    ],
                    "description": "User email address.",
                    "title": "Email"
                  },
                  "fullName": {
                    "anyOf": [
                      {
                        "type": "string"
                      },
                      {
                        "type": "null"
                      }
                    ],
                    "description": "User's full name.",
                    "title": "Full Name"
                  },
                  "id": {
                    "description": "User id associated with this resource.",
                    "title": "User ID",
                    "type": "string"
                  },
                  "userhash": {
                    "anyOf": [
                      {
                        "type": "string"
                      },
                      {
                        "type": "null"
                      }
                    ],
                    "description": "User's gravatar hash.",
                    "title": "Userhash"
                  },
                  "username": {
                    "anyOf": [
                      {
                        "type": "string"
                      },
                      {
                        "type": "null"
                      }
                    ],
                    "description": "Username.",
                    "title": "Username"
                  }
                },
                "required": [
                  "id"
                ],
                "title": "UserData",
                "type": "object"
              },
              {
                "type": "null"
              }
            ],
            "description": "Owner user details including id, username and email."
          },
          "endpoint": {
            "anyOf": [
              {
                "type": "string"
              },
              {
                "type": "null"
              }
            ],
            "description": "API endpoint to use to send service requests.",
            "title": "Endpoint"
          },
          "id": {
            "description": "Unique identifier of the entity.",
            "title": "ID",
            "type": "string"
          },
          "name": {
            "description": "Name of the entity.",
            "title": "Name",
            "type": "string"
          },
          "role": {
            "anyOf": [
              {
                "enum": [
                  "active",
                  "candidate"
                ],
                "title": "ProtonRole",
                "type": "string"
              },
              {
                "type": "null"
              }
            ],
            "description": "Role of the proton within its workload, either 'active' or 'candidate'."
          },
          "runningSince": {
            "anyOf": [
              {
                "format": "date-time",
                "type": "string"
              },
              {
                "type": "null"
              }
            ],
            "description": "Timestamp of when the proton entered running status.",
            "title": "Runningsince"
          },
          "runtime": {
            "additionalProperties": false,
            "description": "Runtime configuration for a workload. For service and NIM artifacts, all configuration is scoped inside ``container_groups``, each identified by name matching the artifact topology.",
            "properties": {
              "containerGroups": {
                "description": "Per-group runtime configuration. Each entry's name must match a group in the artifact.",
                "items": {
                  "additionalProperties": false,
                  "description": "Runtime configuration for a single container group.",
                  "properties": {
                    "autoscaling": {
                      "anyOf": [
                        {
                          "additionalProperties": false,
                          "description": "Autoscaling configuration for a proton.",
                          "properties": {
                            "enabled": {
                              "default": true,
                              "description": "Whether autoscaling is enabled.",
                              "title": "Enabled",
                              "type": "boolean"
                            },
                            "policies": {
                              "items": {
                                "additionalProperties": false,
                                "description": "Base class for autoscaling policies.",
                                "properties": {
                                  "maxCount": {
                                    "description": "Maximum number of replicas.",
                                    "minimum": 0,
                                    "title": "Max Count",
                                    "type": "integer"
                                  },
                                  "minCount": {
                                    "description": "Minimum number of replicas.",
                                    "minimum": 0,
                                    "title": "Min Count",
                                    "type": "integer"
                                  },
                                  "priority": {
                                    "anyOf": [
                                      {
                                        "type": "integer"
                                      },
                                      {
                                        "type": "null"
                                      }
                                    ],
                                    "description": "Policy priority when multiple policies are defined.",
                                    "title": "Priority"
                                  },
                                  "scalingMetric": {
                                    "anyOf": [
                                      {
                                        "oneOf": [
                                          {
                                            "const": "cpuAverageUtilization",
                                            "description": "Scale replicas to maintain a target average CPU utilization across pods.",
                                            "title": "CPU Average Utilization"
                                          },
                                          {
                                            "const": "httpRequestsConcurrency",
                                            "description": "Scale replicas based on HTTP request concurrency using an external HTTP-aware autoscaler. The platform manages the underlying autoscaling resources on your behalf. This scaling option will scale to zero replicas when the proton is idle.",
                                            "title": "HTTP Requests Concurrency"
                                          },
                                          {
                                            "const": "gpuCacheUtilization",
                                            "description": "Scales replicas based on model-specific GPU memory cache utilization. This signal reflects how the model's KV cache is used during inference, when such metrics are exposed by the serving runtime. High cache utilization may indicate memory pressure and can be used to trigger scale-out to maintain throughput. Applicable to NIM Artifacts only.",
                                            "title": "GPU Cache Utilization"
                                          },
                                          {
                                            "const": "gpuRequestQueueDepth",
                                            "description": "Scales replicas based on the depth of the inference request queue. This metric represents the number of incoming requests waiting to be processed by the inference service. Increasing queue depth may indicate insufficient capacity and can be used to trigger additional replicas to reduce latency. Applicable to NIM Artifacts only.",
                                            "title": "GPU Request Queue Depth"
                                          }
                                        ],
                                        "title": "ScalingMetricType",
                                        "type": "string"
                                      },
                                      {
                                        "type": "string"
                                      }
                                    ],
                                    "description": "Metric used for scaling decisions. Use one of the predefined values for standard autoscaling, or provide a custom metric name for NIM 2.0 workloads (e.g. 'vllm:kv_cache_usage_perc'). Custom metric names are only supported for NIM artifacts.",
                                    "title": "Scaling Metric"
                                  },
                                  "target": {
                                    "description": "Target value for the scaling metric.",
                                    "minimum": 0,
                                    "title": "Target",
                                    "type": "number"
                                  }
                                },
                                "required": [
                                  "scalingMetric",
                                  "target",
                                  "minCount",
                                  "maxCount"
                                ],
                                "title": "AutoscalingPolicy",
                                "type": "object"
                              },
                              "title": "Policies",
                              "type": "array"
                            }
                          },
                          "required": [
                            "policies"
                          ],
                          "title": "AutoscalingProperties",
                          "type": "object"
                        },
                        {
                          "type": "null"
                        }
                      ],
                      "description": "Autoscaling configuration for this group. Takes precedence over replicaCount."
                    },
                    "bundleSelectionPolicy": {
                      "enum": [
                        "availability"
                      ],
                      "title": "BundleSelectionPolicy",
                      "type": "string"
                    },
                    "containers": {
                      "description": "Per-container overrides for this group.",
                      "items": {
                        "additionalProperties": false,
                        "description": "Runtime diff targeting a single named container within a group.",
                        "properties": {
                          "name": {
                            "description": "Container name. Must match a container declared in the artifact group.",
                            "title": "Name",
                            "type": "string"
                          },
                          "resourceAllocation": {
                            "anyOf": [
                              {
                                "additionalProperties": false,
                                "description": "Per-container resource allocation declared at runtime.",
                                "properties": {
                                  "cpu": {
                                    "anyOf": [
                                      {
                                        "minimum": 0.1,
                                        "type": "number"
                                      },
                                      {
                                        "type": "null"
                                      }
                                    ],
                                    "description": "CPU cores allocated to this container.",
                                    "title": "Cpu"
                                  },
                                  "gpu": {
                                    "anyOf": [
                                      {
                                        "minimum": 0,
                                        "type": "number"
                                      },
                                      {
                                        "type": "null"
                                      }
                                    ],
                                    "description": "GPUs allocated to this container.",
                                    "title": "Gpu"
                                  },
                                  "memory": {
                                    "anyOf": [
                                      {
                                        "pattern": "^\\s*(\\d*\\.?\\d+)\\s*(\\w+)?",
                                        "type": "string"
                                      },
                                      {
                                        "minimum": 0,
                                        "type": "integer"
                                      },
                                      {
                                        "type": "null"
                                      }
                                    ],
                                    "description": "RAM allocated to this container. Accepts a human-readable string with one of: B, KB, MB, GB (1000-based), e.g. '8GB', '512MB'. Also accepts raw byte integers.",
                                    "examples": [
                                      "8GB",
                                      "512MB"
                                    ],
                                    "title": "Memory"
                                  }
                                },
                                "title": "ResourceAllocation",
                                "type": "object"
                              },
                              {
                                "type": "null"
                              }
                            ],
                            "description": "Resource allocation for this container. Required for multi-container groups."
                          }
                        },
                        "required": [
                          "name"
                        ],
                        "title": "ContainerOverride",
                        "type": "object"
                      },
                      "title": "Containers",
                      "type": "array"
                    },
                    "name": {
                      "default": "default",
                      "description": "Group name. Must match a container group name declared in the artifact.",
                      "title": "Name",
                      "type": "string"
                    },
                    "replicaCount": {
                      "anyOf": [
                        {
                          "minimum": 1,
                          "type": "integer"
                        },
                        {
                          "type": "null"
                        }
                      ],
                      "default": 1,
                      "description": "Number of replicas. Cannot be set alongside autoscaling.enabled=true.",
                      "title": "Replicacount"
                    },
                    "resolvedBundle": {
                      "anyOf": [
                        {
                          "description": "Bundle details returned in the runtime response after scheduling.",
                          "properties": {
                            "cpuCount": {
                              "description": "Number of CPU cores.",
                              "title": "CPU Count",
                              "type": "number"
                            },
                            "gpuCount": {
                              "default": 0,
                              "description": "Number of GPU units.",
                              "title": "GPU Count",
                              "type": "integer"
                            },
                            "gpuMaker": {
                              "anyOf": [
                                {
                                  "type": "string"
                                },
                                {
                                  "type": "null"
                                }
                              ],
                              "description": "GPU manufacturer.",
                              "title": "GPU Maker"
                            },
                            "gpuTypeLabel": {
                              "anyOf": [
                                {
                                  "type": "string"
                                },
                                {
                                  "type": "null"
                                }
                              ],
                              "description": "GPU type label.",
                              "title": "GPU Type Label"
                            },
                            "id": {
                              "description": "Bundle identifier that was selected.",
                              "title": "Id",
                              "type": "string"
                            },
                            "memoryBytes": {
                              "description": "Memory size in bytes.",
                              "title": "Memory Bytes",
                              "type": "integer"
                            }
                          },
                          "required": [
                            "id",
                            "cpuCount",
                            "memoryBytes"
                          ],
                          "title": "ResolvedBundle",
                          "type": "object"
                        },
                        {
                          "type": "null"
                        }
                      ],
                      "description": "Full details of the bundle selected at scheduling time. Read-only.",
                      "readOnly": true
                    },
                    "resourceBundles": {
                      "description": "Ordered list of bundle IDs. One is selected at scheduling time.",
                      "items": {
                        "type": "string"
                      },
                      "title": "Resourcebundles",
                      "type": "array"
                    }
                  },
                  "title": "GroupRuntime",
                  "type": "object"
                },
                "title": "Containergroups",
                "type": "array"
              }
            },
            "title": "WorkloadRuntime",
            "type": "object"
          },
          "status": {
            "enum": [
              "unknown",
              "submitted",
              "initializing",
              "provisioning",
              "launching",
              "running",
              "suspended",
              "warming",
              "draining",
              "interrupted",
              "restarting",
              "stopping",
              "stopped",
              "errored",
              "terminated"
            ],
            "title": "ProtonStatus",
            "type": "string"
          },
          "updatedAt": {
            "description": "Timestamp of when the entity was last updated.",
            "format": "date-time",
            "title": "Updated At",
            "type": "string"
          },
          "workloadId": {
            "anyOf": [
              {
                "type": "string"
              },
              {
                "type": "null"
              }
            ],
            "description": "ID of the workload this proton belongs to.",
            "title": "Workloadid"
          }
        },
        "required": [
          "id",
          "name",
          "createdAt",
          "updatedAt",
          "status",
          "artifactId"
        ],
        "title": "ProtonFormatted",
        "type": "object"
      },
      "title": "Data",
      "type": "array"
    },
    "next": {
      "anyOf": [
        {
          "type": "string"
        },
        {
          "type": "null"
        }
      ],
      "description": "The URL of the next page, or `null` if there is no such page.",
      "title": "Next"
    },
    "previous": {
      "anyOf": [
        {
          "type": "string"
        },
        {
          "type": "null"
        }
      ],
      "description": "The URL of the previous page, or `null` if there is no such page.",
      "title": "Previous"
    },
    "totalCount": {
      "description": "The total number of records.",
      "title": "Totalcount",
      "type": "integer"
    }
  },
  "required": [
    "totalCount",
    "count",
    "next",
    "previous",
    "data"
  ],
  "title": "ProtonListResponse",
  "type": "object"
}
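
The `memory` field in the schema above accepts either a raw byte integer or a string such as '8GB' or '512MB' with 1000-based units. As a rough illustration of what that implies for a client, the sketch below normalizes such a value to bytes. The function name `memory_to_bytes` is hypothetical, and two behaviors are assumptions rather than documented rules: unit matching is case-insensitive, and a string without a unit is read as bytes.

```python
import re

# Same regular expression as the string variant of the `memory` field.
MEMORY_PATTERN = re.compile(r"^\s*(\d*\.?\d+)\s*(\w+)?")

# 1000-based multipliers, as stated in the field description.
UNIT_MULTIPLIERS = {"b": 1, "kb": 1000, "mb": 1000**2, "gb": 1000**3}


def memory_to_bytes(value):
    """Convert a `memory` value (string or raw byte integer) to bytes."""
    if isinstance(value, bool):
        raise TypeError("memory must be a string or an integer")
    if isinstance(value, int):
        if value < 0:
            raise ValueError("memory must not be negative")
        return value
    match = MEMORY_PATTERN.match(value)
    if match is None:
        raise ValueError("unrecognized memory value: {!r}".format(value))
    number, unit = match.groups()
    # Assumption: a missing unit means bytes, and units ignore case.
    unit = (unit or "b").lower()
    if unit not in UNIT_MULTIPLIERS:
        raise ValueError("unsupported unit: {!r}".format(unit))
    return round(float(number) * UNIT_MULTIPLIERS[unit])


print(memory_to_bytes("8GB"))    # 8000000000
print(memory_to_bytes("512MB"))  # 512000000
```

The server remains the authority on validation; a helper like this is only useful for comparing requested memory against `memoryBytes` in a resolved bundle.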

Responses

Status Meaning Description Schema
200 OK Successful Response ProtonListResponse
400 Bad Request Bad request None
401 Unauthorized Unauthenticated None
403 Forbidden Insufficient permissions None
404 Not Found Workload not found None
422 Unprocessable Entity Validation Error HTTPValidationError
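
Because `ProtonListResponse` carries `next` and `previous` URLs alongside `data`, a client can walk every page by following `next` until it is `null`. The sketch below shows that loop under stated assumptions: `iter_records` and `fetch_page` are hypothetical names, and `fetch_page` stands in for whatever authenticated HTTP call returns the decoded JSON body for a URL. The in-memory pages exist only so the example runs without a network.

```python
from typing import Any, Callable, Dict, Iterator, Optional


def iter_records(
    fetch_page: Callable[[str], Dict[str, Any]], first_url: str
) -> Iterator[Dict[str, Any]]:
    """Yield every record, following `next` until it is null."""
    url: Optional[str] = first_url
    while url is not None:
        page = fetch_page(url)
        yield from page["data"]
        url = page["next"]  # None on the last page


# Hypothetical in-memory pages shaped like ProtonListResponse.
FAKE_PAGES = {
    "page-1": {
        "totalCount": 3,
        "count": 2,
        "next": "page-2",
        "previous": None,
        "data": [{"id": "a"}, {"id": "b"}],
    },
    "page-2": {
        "totalCount": 3,
        "count": 1,
        "next": None,
        "previous": "page-1",
        "data": [{"id": "c"}],
    },
}

ids = [record["id"] for record in iter_records(FAKE_PAGES.__getitem__, "page-1")]
print(ids)  # ['a', 'b', 'c']
```

Using a generator keeps only one page in memory at a time, which matters when `totalCount` is large.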

Get Workload Proton by ID

Operation path: GET /workloads/{workload_id}/protons/{proton_id}

Get a specific proton for a workload.

Parameters

Name In Type Required Description
workload_id path string true Workload ID
proton_id path string true Proton ID
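
As a minimal sketch of how the two path parameters fit into the operation path, the snippet below builds the request with the Python standard library. The base URL, the bearer-token `Authorization` header, and the function names are assumptions for illustration; this section does not document the host or the authentication scheme.

```python
from urllib.parse import quote
from urllib.request import Request


def proton_url(base_url: str, workload_id: str, proton_id: str) -> str:
    """Fill in GET /workloads/{workload_id}/protons/{proton_id}."""
    return "{}/workloads/{}/protons/{}".format(
        base_url.rstrip("/"),
        quote(workload_id, safe=""),  # percent-encode each path segment
        quote(proton_id, safe=""),
    )


def build_request(base_url: str, token: str, workload_id: str, proton_id: str) -> Request:
    """Prepare (but do not send) the GET request."""
    return Request(
        proton_url(base_url, workload_id, proton_id),
        # Assumption: bearer-token auth; replace with the scheme your deployment uses.
        headers={"Authorization": "Bearer " + token, "Accept": "application/json"},
        method="GET",
    )


# Hypothetical host and IDs, for illustration only.
request = build_request("https://api.example.com/", "TOKEN", "wl-123", "pr-456")
print(request.full_url)
```

Passing the prepared request to `urllib.request.urlopen` would send it; a 404 response then indicates an unknown workload, per the responses listed for the list operation.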

Example responses

200 Response

{
  "additionalProperties": false,
  "properties": {
    "artifactId": {
      "description": "ID of the artifact deployed by this proton.",
      "title": "Artifactid",
      "type": "string"
    },
    "createdAt": {
      "description": "Timestamp of when the entity was created.",
      "format": "date-time",
      "title": "Created At",
      "type": "string"
    },
    "creator": {
      "anyOf": [
        {
          "additionalProperties": false,
          "description": "User information embedded in API responses.",
          "properties": {
            "email": {
              "anyOf": [
                {
                  "type": "string"
                },
                {
                  "type": "null"
                }
              ],
              "description": "User email address.",
              "title": "Email"
            },
            "fullName": {
              "anyOf": [
                {
                  "type": "string"
                },
                {
                  "type": "null"
                }
              ],
              "description": "User's full name.",
              "title": "Full Name"
            },
            "id": {
              "description": "User ID associated with this resource.",
              "title": "User ID",
              "type": "string"
            },
            "userhash": {
              "anyOf": [
                {
                  "type": "string"
                },
                {
                  "type": "null"
                }
              ],
              "description": "User's gravatar hash.",
              "title": "Userhash"
            },
            "username": {
              "anyOf": [
                {
                  "type": "string"
                },
                {
                  "type": "null"
                }
              ],
              "description": "Username.",
              "title": "Username"
            }
          },
          "required": [
            "id"
          ],
          "title": "UserData",
          "type": "object"
        },
        {
          "type": "null"
        }
      ],
      "description": "Owner user details including id, username and email."
    },
    "endpoint": {
      "anyOf": [
        {
          "type": "string"
        },
        {
          "type": "null"
        }
      ],
      "description": "API endpoint to use to send service requests.",
      "title": "Endpoint"
    },
    "id": {
      "description": "Unique identifier of the entity.",
      "title": "ID",
      "type": "string"
    },
    "name": {
      "description": "Name of the entity.",
      "title": "Name",
      "type": "string"
    },
    "role": {
      "anyOf": [
        {
          "enum": [
            "active",
            "candidate"
          ],
          "title": "ProtonRole",
          "type": "string"
        },
        {
          "type": "null"
        }
      ],
      "description": "Role of the proton within its workload, either 'active' or 'candidate'."
    },
    "runningSince": {
      "anyOf": [
        {
          "format": "date-time",
          "type": "string"
        },
        {
          "type": "null"
        }
      ],
      "description": "Timestamp of when the proton entered running status.",
      "title": "Runningsince"
    },
    "runtime": {
      "additionalProperties": false,
      "description": "Runtime configuration for a workload. For service and NIM artifacts, all configuration is scoped inside ``container_groups``, each identified by name matching the artifact topology.",
      "properties": {
        "containerGroups": {
          "description": "Per-group runtime configuration. Each entry's name must match a group in the artifact.",
          "items": {
            "additionalProperties": false,
            "description": "Runtime configuration for a single container group.",
            "properties": {
              "autoscaling": {
                "anyOf": [
                  {
                    "additionalProperties": false,
                    "description": "Autoscaling configuration for a proton.",
                    "properties": {
                      "enabled": {
                        "default": true,
                        "description": "Whether autoscaling is enabled.",
                        "title": "Enabled",
                        "type": "boolean"
                      },
                      "policies": {
                        "items": {
                          "additionalProperties": false,
                          "description": "Base class for autoscaling policies.",
                          "properties": {
                            "maxCount": {
                              "description": "Maximum number of replicas.",
                              "minimum": 0,
                              "title": "Max Count",
                              "type": "integer"
                            },
                            "minCount": {
                              "description": "Minimum number of replicas.",
                              "minimum": 0,
                              "title": "Min Count",
                              "type": "integer"
                            },
                            "priority": {
                              "anyOf": [
                                {
                                  "type": "integer"
                                },
                                {
                                  "type": "null"
                                }
                              ],
                              "description": "Policy priority when multiple policies are defined.",
                              "title": "Priority"
                            },
                            "scalingMetric": {
                              "anyOf": [
                                {
                                  "oneOf": [
                                    {
                                      "const": "cpuAverageUtilization",
                                      "description": "Scale replicas to maintain a target average CPU utilization across pods.",
                                      "title": "CPU Average Utilization"
                                    },
                                    {
                                      "const": "httpRequestsConcurrency",
                                      "description": "Scale replicas based on HTTP request concurrency using an external HTTP-aware autoscaler. The platform manages the underlying autoscaling resources on your behalf. This scaling option will scale to zero replicas when the proton is idle.",
                                      "title": "HTTP Requests Concurrency"
                                    },
                                    {
                                      "const": "gpuCacheUtilization",
                                      "description": "Scales replicas based on model-specific GPU memory cache utilization. This signal reflects how the model's KV cache is used during inference, when such metrics are exposed by the serving runtime. High cache utilization may indicate memory pressure and can be used to trigger scale-out to maintain throughput. Applicable to NIM Artifacts only.",
                                      "title": "GPU Cache Utilization"
                                    },
                                    {
                                      "const": "gpuRequestQueueDepth",
                                      "description": "Scales replicas based on the depth of the inference request queue. This metric represents the number of incoming requests waiting to be processed by the inference service. Increasing queue depth may indicate insufficient capacity and can be used to trigger additional replicas to reduce latency. Applicable to NIM Artifacts only.",
                                      "title": "GPU Request Queue Depth"
                                    }
                                  ],
                                  "title": "ScalingMetricType",
                                  "type": "string"
                                },
                                {
                                  "type": "string"
                                }
                              ],
                              "description": "Metric used for scaling decisions. Use one of the predefined values for standard autoscaling, or provide a custom metric name for NIM 2.0 workloads (e.g. 'vllm:kv_cache_usage_perc'). Custom metric names are only supported for NIM artifacts.",
                              "title": "Scaling Metric"
                            },
                            "target": {
                              "description": "Target value for the scaling metric.",
                              "minimum": 0,
                              "title": "Target",
                              "type": "number"
                            }
                          },
                          "required": [
                            "scalingMetric",
                            "target",
                            "minCount",
                            "maxCount"
                          ],
                          "title": "AutoscalingPolicy",
                          "type": "object"
                        },
                        "title": "Policies",
                        "type": "array"
                      }
                    },
                    "required": [
                      "policies"
                    ],
                    "title": "AutoscalingProperties",
                    "type": "object"
                  },
                  {
                    "type": "null"
                  }
                ],
                "description": "Autoscaling configuration for this group. Takes precedence over replicaCount."
              },
              "bundleSelectionPolicy": {
                "enum": [
                  "availability"
                ],
                "title": "BundleSelectionPolicy",
                "type": "string"
              },
              "containers": {
                "description": "Per-container overrides for this group.",
                "items": {
                  "additionalProperties": false,
                  "description": "Runtime diff targeting a single named container within a group.",
                  "properties": {
                    "name": {
                      "description": "Container name. Must match a container declared in the artifact group.",
                      "title": "Name",
                      "type": "string"
                    },
                    "resourceAllocation": {
                      "anyOf": [
                        {
                          "additionalProperties": false,
                          "description": "Per-container resource allocation declared at runtime.",
                          "properties": {
                            "cpu": {
                              "anyOf": [
                                {
                                  "minimum": 0.1,
                                  "type": "number"
                                },
                                {
                                  "type": "null"
                                }
                              ],
                              "description": "Cpu cores allocated to this container.",
                              "title": "Cpu"
                            },
                            "gpu": {
                              "anyOf": [
                                {
                                  "minimum": 0,
                                  "type": "number"
                                },
                                {
                                  "type": "null"
                                }
                              ],
                              "description": "Gpus allocated to this container.",
                              "title": "Gpu"
                            },
                            "memory": {
                              "anyOf": [
                                {
                                  "pattern": "^\\s*(\\d*\\.?\\d+)\\s*(\\w+)?",
                                  "type": "string"
                                },
                                {
                                  "minimum": 0,
                                  "type": "integer"
                                },
                                {
                                  "type": "null"
                                }
                              ],
                              "description": "Ram allocated to this container. accepts a human-readable string with one of: b, kb, mb, gb (1000-based) — e.g. '8gb', '512mb'. also accepts raw byte integers.",
                              "examples": [
                                "8GB",
                                "512MB"
                              ],
                              "title": "Memory"
                            }
                          },
                          "title": "ResourceAllocation",
                          "type": "object"
                        },
                        {
                          "type": "null"
                        }
                      ],
                      "description": "Resource allocation for this container. required for multi-container groups."
                    }
                  },
                  "required": [
                    "name"
                  ],
                  "title": "ContainerOverride",
                  "type": "object"
                },
                "title": "Containers",
                "type": "array"
              },
              "name": {
                "default": "default",
                "description": "Group name. must match a container group name declared in the artifact.",
                "title": "Name",
                "type": "string"
              },
              "replicaCount": {
                "anyOf": [
                  {
                    "minimum": 1,
                    "type": "integer"
                  },
                  {
                    "type": "null"
                  }
                ],
                "default": 1,
                "description": "Number of replicas. cannot be set alongside autoscaling.enabled=true.",
                "title": "Replicacount"
              },
              "resolvedBundle": {
                "anyOf": [
                  {
                    "description": "Bundle details returned in the runtime response after scheduling.",
                    "properties": {
                      "cpuCount": {
                        "description": "Number of cpu cores.",
                        "title": "CPU Count",
                        "type": "number"
                      },
                      "gpuCount": {
                        "default": 0,
                        "description": "Number of gpu units.",
                        "title": "GPU Count",
                        "type": "integer"
                      },
                      "gpuMaker": {
                        "anyOf": [
                          {
                            "type": "string"
                          },
                          {
                            "type": "null"
                          }
                        ],
                        "description": "Gpu manufacturer.",
                        "title": "GPU Maker"
                      },
                      "gpuTypeLabel": {
                        "anyOf": [
                          {
                            "type": "string"
                          },
                          {
                            "type": "null"
                          }
                        ],
                        "description": "Gpu type label.",
                        "title": "GPU Type Label"
                      },
                      "id": {
                        "description": "Bundle identifier that was selected.",
                        "title": "Id",
                        "type": "string"
                      },
                      "memoryBytes": {
                        "description": "Memory size in bytes.",
                        "title": "Memory Bytes",
                        "type": "integer"
                      }
                    },
                    "required": [
                      "id",
                      "cpuCount",
                      "memoryBytes"
                    ],
                    "title": "ResolvedBundle",
                    "type": "object"
                  },
                  {
                    "type": "null"
                  }
                ],
                "description": "Full details of the bundle selected at scheduling time. read-only.",
                "readOnly": true
              },
              "resourceBundles": {
                "description": "Ordered list of bundle ids. one is selected at scheduling time.",
                "items": {
                  "type": "string"
                },
                "title": "Resourcebundles",
                "type": "array"
              }
            },
            "title": "GroupRuntime",
            "type": "object"
          },
          "title": "Containergroups",
          "type": "array"
        }
      },
      "title": "WorkloadRuntime",
      "type": "object"
    },
    "status": {
      "enum": [
        "unknown",
        "submitted",
        "initializing",
        "provisioning",
        "launching",
        "running",
        "suspended",
        "warming",
        "draining",
        "interrupted",
        "restarting",
        "stopping",
        "stopped",
        "errored",
        "terminated"
      ],
      "title": "ProtonStatus",
      "type": "string"
    },
    "updatedAt": {
      "description": "Timestamp of when the entity was last updated.",
      "format": "date-time",
      "title": "Updated At",
      "type": "string"
    },
    "workloadId": {
      "anyOf": [
        {
          "type": "string"
        },
        {
          "type": "null"
        }
      ],
      "description": "Id of the workload this proton belongs to.",
      "title": "Workloadid"
    }
  },
  "required": [
    "id",
    "name",
    "createdAt",
    "updatedAt",
    "status",
    "artifactId"
  ],
  "title": "ProtonFormatted",
  "type": "object"
}

Responses

Status Meaning Description Schema
200 OK Successful Response ProtonFormatted
400 Bad Request Bad request None
401 Unauthorized Unauthenticated None
403 Forbidden Insufficient permissions None
404 Not Found Workload not found None
422 Unprocessable Entity Validation Error HTTPValidationError
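
The `memory` field in the ResourceAllocation schema above accepts either a raw byte integer or a string such as '8GB' or '512MB' with 1000-based units. The following is a minimal client-side sketch of how such a value might be normalized to bytes. The function name `memory_to_bytes` is hypothetical, and two behaviors are assumptions rather than documented facts: units are treated case-insensitively, and a string without a unit is read as bytes.

```python
import re

# Pattern copied from the string variant of the `memory` field in the schema.
MEMORY_PATTERN = re.compile(r"^\s*(\d*\.?\d+)\s*(\w+)?")

# 1000-based multipliers for the units named in the field description.
UNIT_BYTES = {"b": 1, "kb": 1000, "mb": 1000**2, "gb": 1000**3}


def memory_to_bytes(value):
    """Convert a `memory` value (string, integer, or None) to a byte count."""
    if value is None:
        return None
    if isinstance(value, int):
        if value < 0:
            raise ValueError("memory must be >= 0")
        return value
    match = MEMORY_PATTERN.match(value)
    if match is None:
        raise ValueError(f"unparseable memory value: {value!r}")
    number, unit = match.groups()
    # Assumption: a missing unit means bytes, and units are case-insensitive.
    unit = (unit or "b").lower()
    if unit not in UNIT_BYTES:
        raise ValueError(f"unsupported unit: {unit!r}")
    return round(float(number) * UNIT_BYTES[unit])
```

For example, `memory_to_bytes("8GB")` yields 8000000000, while an integer input is passed through unchanged. The server's own validation may differ, so this is only useful for pre-checking values before a request is sent.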

Get Workload Proton Status Details by Workload ID

Operation path: GET /workloads/{workload_id}/protons/{proton_id}/statusDetails

Get per-replica status details for a proton.

Returns 204 if no status update has been received yet. Returns 200 with the latest replica snapshot when available.

Parameters

Name In Type Required Description
workload_id path string true Workload ID
proton_id path string true Proton ID

Example responses

200 Response

{
  "additionalProperties": false,
  "properties": {
    "overallStatus": {
      "additionalProperties": false,
      "description": "Overall status as reported by the workload-monitor service.",
      "properties": {
        "lastUpdated": {
          "description": "Rfc3339 timestamp of the last state transition.",
          "title": "Lastupdated",
          "type": "string"
        },
        "state": {
          "enum": [
            "unknown",
            "submitted",
            "initializing",
            "provisioning",
            "launching",
            "running",
            "suspended",
            "warming",
            "draining",
            "interrupted",
            "restarting",
            "stopping",
            "stopped",
            "errored",
            "terminated"
          ],
          "title": "ProtonStatus",
          "type": "string"
        },
        "summary": {
          "description": "Human-readable description of the current state.",
          "title": "Summary",
          "type": "string"
        }
      },
      "required": [
        "state",
        "summary",
        "lastUpdated"
      ],
      "title": "WorkloadMonitorOverallStatus",
      "type": "object"
    },
    "replicas": {
      "items": {
        "additionalProperties": false,
        "properties": {
          "address": {
            "title": "Address",
            "type": "string"
          },
          "conditions": {
            "items": {
              "additionalProperties": false,
              "properties": {
                "lastTransitionTime": {
                  "title": "Lasttransitiontime",
                  "type": "string"
                },
                "message": {
                  "default": "",
                  "title": "Message",
                  "type": "string"
                },
                "reason": {
                  "default": "",
                  "title": "Reason",
                  "type": "string"
                },
                "type": {
                  "title": "Type",
                  "type": "string"
                },
                "value": {
                  "anyOf": [
                    {
                      "type": "boolean"
                    },
                    {
                      "type": "null"
                    }
                  ],
                  "title": "Value"
                }
              },
              "required": [
                "type",
                "value",
                "lastTransitionTime"
              ],
              "title": "ReplicaConditionDetail",
              "type": "object"
            },
            "title": "Conditions",
            "type": "array"
          },
          "containers": {
            "items": {
              "additionalProperties": false,
              "properties": {
                "image": {
                  "title": "Image",
                  "type": "string"
                },
                "name": {
                  "title": "Name",
                  "type": "string"
                },
                "ready": {
                  "title": "Ready",
                  "type": "boolean"
                },
                "restartCount": {
                  "title": "Restartcount",
                  "type": "integer"
                },
                "startedAt": {
                  "anyOf": [
                    {
                      "type": "string"
                    },
                    {
                      "type": "null"
                    }
                  ],
                  "title": "Startedat"
                },
                "status": {
                  "description": "Lifecycle state of a container within a deployment replica.",
                  "enum": [
                    "running",
                    "waiting",
                    "terminated",
                    "unknown"
                  ],
                  "title": "ContainerStatus",
                  "type": "string"
                }
              },
              "required": [
                "name",
                "status",
                "startedAt",
                "ready",
                "restartCount",
                "image"
              ],
              "title": "ContainerStatusDetail",
              "type": "object"
            },
            "title": "Containers",
            "type": "array"
          },
          "name": {
            "title": "Name",
            "type": "string"
          },
          "nodeAddress": {
            "title": "Nodeaddress",
            "type": "string"
          },
          "startedAt": {
            "anyOf": [
              {
                "type": "string"
              },
              {
                "type": "null"
              }
            ],
            "title": "Startedat"
          },
          "status": {
            "description": "Lifecycle phase of a deployment replica.",
            "enum": [
              "pending",
              "running",
              "succeeded",
              "failed",
              "unknown"
            ],
            "title": "ReplicaPhase",
            "type": "string"
          }
        },
        "required": [
          "name",
          "status",
          "address",
          "nodeAddress",
          "startedAt",
          "conditions",
          "containers"
        ],
        "title": "ReplicaDetail",
        "type": "object"
      },
      "title": "Replicas",
      "type": "array"
    }
  },
  "required": [
    "overallStatus",
    "replicas"
  ],
  "title": "ReplicaStatusesSnapshot",
  "type": "object"
}

Responses

Status Meaning Description Schema
200 OK Successful Response ReplicaStatusesSnapshot
204 No Content No status update has been received for this proton yet None
400 Bad Request Bad request None
401 Unauthorized Unauthenticated None
403 Forbidden Insufficient permissions None
404 Not Found Workload not found None
422 Unprocessable Entity Validation Error HTTPValidationError
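
Because this operation returns 204 until the first status update arrives, a client has to treat "no content" as a normal outcome rather than an error. The sketch below shows one way to reduce a response to a short summary. It takes the status code and body as arguments so that any HTTP client can be used; the function name `summarize_status_details` and the shape of the returned dictionary are illustrative choices, not part of the API.

```python
import json


def summarize_status_details(status_code, body):
    """Summarize a statusDetails response; return None when no snapshot exists yet."""
    if status_code == 204:
        # No status update has been received for this proton yet.
        return None
    if status_code != 200:
        raise RuntimeError(f"unexpected status code: {status_code}")

    snapshot = json.loads(body)
    replicas = snapshot["replicas"]

    # Collect "replica/container" names for containers that are not ready.
    not_ready = [
        f'{replica["name"]}/{container["name"]}'
        for replica in replicas
        for container in replica["containers"]
        if not container["ready"]
    ]
    return {
        "state": snapshot["overallStatus"]["state"],
        "summary": snapshot["overallStatus"]["summary"],
        "running_replicas": sum(1 for r in replicas if r["status"] == "running"),
        "total_replicas": len(replicas),
        "containers_not_ready": not_ready,
    }
```

A caller would typically retry after a delay when the function returns None; the retry interval is left to the caller because the source does not specify one.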

Operation path: GET /workloads/{workload_id}/related

List entities related to a workload, such as linked artifacts.

Parameters

Name In Type Required Description
workload_id path string true Workload ID

Example responses

200 Response

{
  "additionalProperties": false,
  "description": "Response containing related entities.",
  "properties": {
    "count": {
      "default": 0,
      "description": "Total number of related entities.",
      "title": "Count",
      "type": "integer"
    },
    "data": {
      "description": "List of related entities.",
      "items": {
        "anyOf": [
          {
            "additionalProperties": false,
            "description": "Related entity item.",
            "properties": {
              "createdAt": {
                "description": "Timestamp of when the entity was created.",
                "format": "date-time",
                "title": "Created At",
                "type": "string"
              },
              "creator": {
                "anyOf": [
                  {
                    "additionalProperties": false,
                    "description": "User information embedded in API responses.",
                    "properties": {
                      "email": {
                        "anyOf": [
                          {
                            "type": "string"
                          },
                          {
                            "type": "null"
                          }
                        ],
                        "description": "User email address.",
                        "title": "Email"
                      },
                      "fullName": {
                        "anyOf": [
                          {
                            "type": "string"
                          },
                          {
                            "type": "null"
                          }
                        ],
                        "description": "User's full name.",
                        "title": "Full Name"
                      },
                      "id": {
                        "description": "User id associated with this resource.",
                        "title": "User ID",
                        "type": "string"
                      },
                      "userhash": {
                        "anyOf": [
                          {
                            "type": "string"
                          },
                          {
                            "type": "null"
                          }
                        ],
                        "description": "User's gravatar hash.",
                        "title": "Userhash"
                      },
                      "username": {
                        "anyOf": [
                          {
                            "type": "string"
                          },
                          {
                            "type": "null"
                          }
                        ],
                        "description": "Username.",
                        "title": "Username"
                      }
                    },
                    "required": [
                      "id"
                    ],
                    "title": "UserData",
                    "type": "object"
                  },
                  {
                    "type": "null"
                  }
                ],
                "description": "Owner user details including id, username and email.",
                "title": "Creator"
              },
              "id": {
                "description": "Unique identifier of the entity.",
                "title": "ID",
                "type": "string"
              },
              "name": {
                "description": "Name of the entity.",
                "title": "Name",
                "type": "string"
              },
              "permissions": {
                "anyOf": [
                  {
                    "items": {
                      "description": "Represents the particular role a user, group or organization holds on an entity.",
                      "enum": [
                        "CAN_VIEW",
                        "CAN_UPDATE",
                        "CAN_DELETE",
                        "CAN_SHARE",
                        "CAN_MAKE_PREDICTIONS",
                        "CAN_SHARE_ROLE_OWNER",
                        "CAN_SHARE_ROLE_READ_WRITE",
                        "CAN_SHARE_ROLE_READ_ONLY"
                      ],
                      "title": "ResourcePermission",
                      "type": "string"
                    },
                    "type": "array"
                  },
                  {
                    "items": {
                      "const": "*",
                      "type": "string"
                    },
                    "type": "array"
                  }
                ]
              },
              "type": {
                "enum": [
                  "artifact",
                  "artifact_repository",
                  "proton",
                  "workload",
                  "custom_model"
                ],
                "title": "ResourceTypes",
                "type": "string"
              },
              "updatedAt": {
                "description": "Timestamp of when the entity was last updated.",
                "format": "date-time",
                "title": "Updated At",
                "type": "string"
              }
            },
            "required": [
              "id",
              "name",
              "createdAt",
              "updatedAt",
              "type"
            ],
            "title": "RelatedItem",
            "type": "object"
          },
          {
            "additionalProperties": false,
            "description": "Basic information about a related entity, identified by its id and type.",
            "properties": {
              "id": {
                "description": "Unique identifier of the entity.",
                "title": "Id",
                "type": "string"
              },
              "type": {
                "enum": [
                  "artifact",
                  "artifact_repository",
                  "proton",
                  "workload",
                  "custom_model"
                ],
                "title": "ResourceTypes",
                "type": "string"
              }
            },
            "required": [
              "id",
              "type"
            ],
            "title": "RelatedItemID",
            "type": "object"
          }
        ]
      },
      "title": "Data",
      "type": "array"
    }
  },
  "title": "RelatedEntitiesResponse",
  "type": "object"
}

Responses

Status Meaning Description Schema
200 OK Successful Response RelatedEntitiesResponse
400 Bad Request Bad request None
401 Unauthorized Unauthenticated None
403 Forbidden Insufficient permissions None
404 Not Found Workload not found None
422 Unprocessable Entity Validation Error HTTPValidationError
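
Each entry in `data` is either a full RelatedItem or a minimal RelatedItemID, so client code should rely only on `id` and `type` being present. A minimal sketch of grouping the entries by type under that constraint follows; the helper name `group_related_by_type` and its output shape are hypothetical.

```python
from collections import defaultdict


def group_related_by_type(response):
    """Group related entities by `type`, tolerating both item variants."""
    grouped = defaultdict(list)
    for item in response.get("data", []):
        # `id` and `type` are required in both variants; `name` exists
        # only on the full RelatedItem, so it falls back to None.
        grouped[item["type"]].append({"id": item["id"], "name": item.get("name")})
    return dict(grouped)
```

Entries that arrive as RelatedItemID appear in the result with `name` set to None, which lets a caller decide whether to fetch the full record separately.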

Delete Replacement by Workload ID

Operation path: DELETE /workloads/{workload_id}/replacement

Parameters

Name In Type Required Description
workload_id path string true Workload ID

Example responses

202 Response

{}

Responses

Status Meaning Description Schema
202 Accepted Successful Response Inline
400 Bad Request Bad request None
401 Unauthorized Unauthenticated None
403 Forbidden Insufficient permissions None
404 Not Found Workload not found None
422 Unprocessable Entity Validation Error HTTPValidationError
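
As a rough illustration of how this request might be formed, the sketch below builds the DELETE request with Python's standard library without sending it. The base URL, the bearer-token `Authorization` header, and the function name `build_delete_replacement_request` are assumptions; the source does not state the host or the authentication scheme.

```python
import urllib.parse
import urllib.request


def build_delete_replacement_request(base_url, workload_id, token):
    """Build (but do not send) a DELETE /workloads/{workload_id}/replacement request."""
    # Percent-encode the ID so that unusual characters cannot alter the path.
    path = f"/workloads/{urllib.parse.quote(workload_id, safe='')}/replacement"
    return urllib.request.Request(
        base_url.rstrip("/") + path,
        method="DELETE",
        # Assumption: bearer-token authentication.
        headers={"Authorization": f"Bearer {token}"},
    )
```

The request can then be sent with `urllib.request.urlopen`. A 202 status means the request was accepted, which in HTTP terms does not guarantee that processing has finished.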

Get Replacement by Workload ID

Operation path: GET /workloads/{workload_id}/replacement

Parameters

Name In Type Required Description
workload_id path string true Workload ID
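
The response for this operation lists `candidateProtonIds` and, in `protonStatuses`, maps proton IDs to their latest status snapshot. The sketch below shows one way a client might read the state of each candidate from a parsed response. The helper names are hypothetical, and treating "running" as the signal that a candidate is ready is an assumption, not documented behavior.

```python
def candidate_states(replacement):
    """Map each candidate proton ID to its overall state, or None if unknown."""
    # `protonStatuses` may be null, so fall back to an empty mapping.
    statuses = replacement.get("protonStatuses") or {}
    states = {}
    for proton_id in replacement.get("candidateProtonIds", []):
        snapshot = statuses.get(proton_id)
        states[proton_id] = snapshot["overallStatus"]["state"] if snapshot else None
    return states


def all_candidates_running(replacement):
    """Return True only if there are candidates and every one reports "running"."""
    states = candidate_states(replacement)
    # Assumption: "running" is the state that indicates a usable candidate.
    return bool(states) and all(state == "running" for state in states.values())
```

A candidate with no entry in `protonStatuses` is reported as None rather than raising, since the status map is nullable and may lag behind the candidate list.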

Example responses

200 Response

{
  "additionalProperties": false,
  "description": "Store replacement information for workloads.",
  "properties": {
    "candidateArtifactId": {
      "description": "Candidate artifact id.",
      "title": "Candidateartifactid",
      "type": "string"
    },
    "candidateProtonIds": {
      "description": "Ids of protons pending promotion during artifact replacement.",
      "items": {
        "type": "string"
      },
      "title": "Candidateprotonids",
      "type": "array"
    },
    "config": {
      "additionalProperties": false,
      "description": "Configuration for workload replacement.",
      "properties": {
        "keepOldVersionMinutes": {
          "default": 0,
          "description": "Duration in minutes to keep the old version during replacement.",
          "title": "Keepoldversionminutes",
          "type": "integer"
        },
        "warmupDurationMinutes": {
          "default": 0,
          "description": "Duration in minutes for the warmup phase during replacement.",
          "title": "Warmupdurationminutes",
          "type": "integer"
        }
      },
      "title": "ReplacementConfig",
      "type": "object"
    },
    "createdAt": {
      "description": "Timestamp of when the entity was created.",
      "format": "date-time",
      "title": "Createdat",
      "type": "string"
    },
    "deletedAt": {
      "anyOf": [
        {
          "format": "date-time",
          "type": "string"
        },
        {
          "type": "null"
        }
      ],
      "description": "Timestamp of when the entity was deleted.",
      "title": "Deletedat"
    },
    "id": {
      "description": "Unique identifier of the entity.",
      "title": "Id",
      "type": "string"
    },
    "isDeleted": {
      "default": false,
      "description": "Whether this entity has been deleted.",
      "title": "Isdeleted",
      "type": "boolean"
    },
    "message": {
      "anyOf": [
        {
          "type": "string"
        },
        {
          "type": "null"
        }
      ],
      "description": "Additional information about the replacement status, such as validation errors or reasons for failure.",
      "title": "Message"
    },
    "name": {
      "description": "Name of the entity.",
      "title": "Name",
      "type": "string"
    },
    "previousProtonIds": {
      "anyOf": [
        {
          "items": {
            "type": "string"
          },
          "type": "array"
        },
        {
          "type": "null"
        }
      ],
      "description": "Ids of protons pending decommissioning during artifact replacement.",
      "title": "Previousprotonids"
    },
    "protonStatuses": {
      "anyOf": [
        {
          "additionalProperties": {
            "additionalProperties": false,
            "properties": {
              "overallStatus": {
                "additionalProperties": false,
                "description": "Overall status as reported by the workload-monitor service.",
                "properties": {
                  "lastUpdated": {
                    "description": "Rfc3339 timestamp of the last state transition.",
                    "title": "Lastupdated",
                    "type": "string"
                  },
                  "state": {
                    "enum": [
                      "unknown",
                      "submitted",
                      "initializing",
                      "provisioning",
                      "launching",
                      "running",
                      "suspended",
                      "warming",
                      "draining",
                      "interrupted",
                      "restarting",
                      "stopping",
                      "stopped",
                      "errored",
                      "terminated"
                    ],
                    "title": "ProtonStatus",
                    "type": "string"
                  },
                  "summary": {
                    "description": "Human-readable description of the current state.",
                    "title": "Summary",
                    "type": "string"
                  }
                },
                "required": [
                  "state",
                  "summary",
                  "lastUpdated"
                ],
                "title": "WorkloadMonitorOverallStatus",
                "type": "object"
              },
              "replicas": {
                "items": {
                  "additionalProperties": false,
                  "properties": {
                    "address": {
                      "title": "Address",
                      "type": "string"
                    },
                    "conditions": {
                      "items": {
                        "additionalProperties": false,
                        "properties": {
                          "lastTransitionTime": {
                            "title": "Lasttransitiontime",
                            "type": "string"
                          },
                          "message": {
                            "default": "",
                            "title": "Message",
                            "type": "string"
                          },
                          "reason": {
                            "default": "",
                            "title": "Reason",
                            "type": "string"
                          },
                          "type": {
                            "title": "Type",
                            "type": "string"
                          },
                          "value": {
                            "anyOf": [
                              {
                                "type": "boolean"
                              },
                              {
                                "type": "null"
                              }
                            ],
                            "title": "Value"
                          }
                        },
                        "required": [
                          "type",
                          "value",
                          "lastTransitionTime"
                        ],
                        "title": "ReplicaConditionDetail",
                        "type": "object"
                      },
                      "title": "Conditions",
                      "type": "array"
                    },
                    "containers": {
                      "items": {
                        "additionalProperties": false,
                        "properties": {
                          "image": {
                            "title": "Image",
                            "type": "string"
                          },
                          "name": {
                            "title": "Name",
                            "type": "string"
                          },
                          "ready": {
                            "title": "Ready",
                            "type": "boolean"
                          },
                          "restartCount": {
                            "title": "Restartcount",
                            "type": "integer"
                          },
                          "startedAt": {
                            "anyOf": [
                              {
                                "type": "string"
                              },
                              {
                                "type": "null"
                              }
                            ],
                            "title": "Startedat"
                          },
                          "status": {
                            "description": "Lifecycle state of a container within a deployment replica.",
                            "enum": [
                              "running",
                              "waiting",
                              "terminated",
                              "unknown"
                            ],
                            "title": "ContainerStatus",
                            "type": "string"
                          }
                        },
                        "required": [
                          "name",
                          "status",
                          "startedAt",
                          "ready",
                          "restartCount",
                          "image"
                        ],
                        "title": "ContainerStatusDetail",
                        "type": "object"
                      },
                      "title": "Containers",
                      "type": "array"
                    },
                    "name": {
                      "title": "Name",
                      "type": "string"
                    },
                    "nodeAddress": {
                      "title": "Nodeaddress",
                      "type": "string"
                    },
                    "startedAt": {
                      "anyOf": [
                        {
                          "type": "string"
                        },
                        {
                          "type": "null"
                        }
                      ],
                      "title": "Startedat"
                    },
                    "status": {
                      "description": "Lifecycle phase of a deployment replica.",
                      "enum": [
                        "pending",
                        "running",
                        "succeeded",
                        "failed",
                        "unknown"
                      ],
                      "title": "ReplicaPhase",
                      "type": "string"
                    }
                  },
                  "required": [
                    "name",
                    "status",
                    "address",
                    "nodeAddress",
                    "startedAt",
                    "conditions",
                    "containers"
                  ],
                  "title": "ReplicaDetail",
                  "type": "object"
                },
                "title": "Replicas",
                "type": "array"
              }
            },
            "required": [
              "overallStatus",
              "replicas"
            ],
            "title": "ReplicaStatusesSnapshot",
            "type": "object"
          },
          "type": "object"
        },
        {
          "type": "null"
        }
      ],
      "description": "Latest known status of candidate protons, used to determine replacement status transitions.",
      "title": "Protonstatuses"
    },
    "runtime": {
      "additionalProperties": false,
      "description": "Runtime configuration for a workload. For service and NIM artifacts, all configuration is scoped inside `container_groups`, each identified by a name matching the artifact topology.",
      "properties": {
        "containerGroups": {
          "description": "Per-group runtime configuration. Each entry's name must match a group in the artifact.",
          "items": {
            "additionalProperties": false,
            "description": "Runtime configuration for a single container group.",
            "properties": {
              "autoscaling": {
                "anyOf": [
                  {
                    "additionalProperties": false,
                    "description": "Autoscaling configuration for a proton.",
                    "properties": {
                      "enabled": {
                        "default": true,
                        "description": "Whether autoscaling is enabled.",
                        "title": "Enabled",
                        "type": "boolean"
                      },
                      "policies": {
                        "items": {
                          "additionalProperties": false,
                          "description": "Base class for autoscaling policies.",
                          "properties": {
                            "maxCount": {
                              "description": "Maximum number of replicas.",
                              "minimum": 0,
                              "title": "Max Count",
                              "type": "integer"
                            },
                            "minCount": {
                              "description": "Minimum number of replicas.",
                              "minimum": 0,
                              "title": "Min Count",
                              "type": "integer"
                            },
                            "priority": {
                              "anyOf": [
                                {
                                  "type": "integer"
                                },
                                {
                                  "type": "null"
                                }
                              ],
                              "description": "Policy priority when multiple policies are defined.",
                              "title": "Priority"
                            },
                            "scalingMetric": {
                              "anyOf": [
                                {
                                  "oneOf": [
                                    {
                                      "const": "cpuAverageUtilization",
                                      "description": "Scale replicas to maintain a target average CPU utilization across pods.",
                                      "title": "CPU Average Utilization"
                                    },
                                    {
                                      "const": "httpRequestsConcurrency",
                                      "description": "Scale replicas based on HTTP request concurrency using an external HTTP-aware autoscaler. The platform manages the underlying autoscaling resources on your behalf. This scaling option will scale to zero replicas when the proton is idle.",
                                      "title": "HTTP Requests Concurrency"
                                    },
                                    {
                                      "const": "gpuCacheUtilization",
                                      "description": "Scales replicas based on model-specific GPU memory cache utilization. This signal reflects how the model's KV cache is used during inference, when such metrics are exposed by the serving runtime. High cache utilization may indicate memory pressure and can be used to trigger scale-out to maintain throughput. Applicable to NIM Artifacts only.",
                                      "title": "GPU Cache Utilization"
                                    },
                                    {
                                      "const": "gpuRequestQueueDepth",
                                      "description": "Scales replicas based on the depth of the inference request queue. This metric represents the number of incoming requests waiting to be processed by the inference service. Increasing queue depth may indicate insufficient capacity and can be used to trigger additional replicas to reduce latency. Applicable to NIM Artifacts only.",
                                      "title": "GPU Request Queue Depth"
                                    }
                                  ],
                                  "title": "ScalingMetricType",
                                  "type": "string"
                                },
                                {
                                  "type": "string"
                                }
                              ],
                              "description": "Metric used for scaling decisions. Use one of the predefined values for standard autoscaling, or provide a custom metric name for NIM 2.0 workloads (e.g. 'vllm:kv_cache_usage_perc'). Custom metric names are only supported for NIM artifacts.",
                              "title": "Scaling Metric"
                            },
                            "target": {
                              "description": "Target value for the scaling metric.",
                              "minimum": 0,
                              "title": "Target",
                              "type": "number"
                            }
                          },
                          "required": [
                            "scalingMetric",
                            "target",
                            "minCount",
                            "maxCount"
                          ],
                          "title": "AutoscalingPolicy",
                          "type": "object"
                        },
                        "title": "Policies",
                        "type": "array"
                      }
                    },
                    "required": [
                      "policies"
                    ],
                    "title": "AutoscalingProperties",
                    "type": "object"
                  },
                  {
                    "type": "null"
                  }
                ],
                "description": "Autoscaling configuration for this group. Takes precedence over `replicaCount`."
              },
              "bundleSelectionPolicy": {
                "enum": [
                  "availability"
                ],
                "title": "BundleSelectionPolicy",
                "type": "string"
              },
              "containers": {
                "description": "Per-container overrides for this group.",
                "items": {
                  "additionalProperties": false,
                  "description": "Runtime diff targeting a single named container within a group.",
                  "properties": {
                    "name": {
                      "description": "Container name. Must match a container declared in the artifact group.",
                      "title": "Name",
                      "type": "string"
                    },
                    "resourceAllocation": {
                      "anyOf": [
                        {
                          "additionalProperties": false,
                          "description": "Per-container resource allocation declared at runtime.",
                          "properties": {
                            "cpu": {
                              "anyOf": [
                                {
                                  "minimum": 0.1,
                                  "type": "number"
                                },
                                {
                                  "type": "null"
                                }
                              ],
                              "description": "CPU cores allocated to this container.",
                              "title": "Cpu"
                            },
                            "gpu": {
                              "anyOf": [
                                {
                                  "minimum": 0,
                                  "type": "number"
                                },
                                {
                                  "type": "null"
                                }
                              ],
                              "description": "GPUs allocated to this container.",
                              "title": "Gpu"
                            },
                            "memory": {
                              "anyOf": [
                                {
                                  "pattern": "^\\s*(\\d*\\.?\\d+)\\s*(\\w+)?",
                                  "type": "string"
                                },
                                {
                                  "minimum": 0,
                                  "type": "integer"
                                },
                                {
                                  "type": "null"
                                }
                              ],
                              "description": "RAM allocated to this container. Accepts a human-readable string with one of: B, KB, MB, GB (1000-based), e.g. '8GB', '512MB'. Also accepts raw byte integers.",
                              "examples": [
                                "8GB",
                                "512MB"
                              ],
                              "title": "Memory"
                            }
                          },
                          "title": "ResourceAllocation",
                          "type": "object"
                        },
                        {
                          "type": "null"
                        }
                      ],
                      "description": "Resource allocation for this container. Required for multi-container groups."
                    }
                  },
                  "required": [
                    "name"
                  ],
                  "title": "ContainerOverride",
                  "type": "object"
                },
                "title": "Containers",
                "type": "array"
              },
              "name": {
                "default": "default",
                "description": "Group name. Must match a container group name declared in the artifact.",
                "title": "Name",
                "type": "string"
              },
              "replicaCount": {
                "anyOf": [
                  {
                    "minimum": 1,
                    "type": "integer"
                  },
                  {
                    "type": "null"
                  }
                ],
                "default": 1,
                "description": "Number of replicas. Cannot be set alongside `autoscaling.enabled=true`.",
                "title": "Replicacount"
              },
              "resolvedBundle": {
                "anyOf": [
                  {
                    "description": "Bundle details returned in the runtime response after scheduling.",
                    "properties": {
                      "cpuCount": {
                        "description": "Number of CPU cores.",
                        "title": "CPU Count",
                        "type": "number"
                      },
                      "gpuCount": {
                        "default": 0,
                        "description": "Number of GPU units.",
                        "title": "GPU Count",
                        "type": "integer"
                      },
                      "gpuMaker": {
                        "anyOf": [
                          {
                            "type": "string"
                          },
                          {
                            "type": "null"
                          }
                        ],
                        "description": "GPU manufacturer.",
                        "title": "GPU Maker"
                      },
                      "gpuTypeLabel": {
                        "anyOf": [
                          {
                            "type": "string"
                          },
                          {
                            "type": "null"
                          }
                        ],
                        "description": "GPU type label.",
                        "title": "GPU Type Label"
                      },
                      "id": {
                        "description": "Bundle identifier that was selected.",
                        "title": "Id",
                        "type": "string"
                      },
                      "memoryBytes": {
                        "description": "Memory size in bytes.",
                        "title": "Memory Bytes",
                        "type": "integer"
                      }
                    },
                    "required": [
                      "id",
                      "cpuCount",
                      "memoryBytes"
                    ],
                    "title": "ResolvedBundle",
                    "type": "object"
                  },
                  {
                    "type": "null"
                  }
                ],
                "description": "Full details of the bundle selected at scheduling time. Read-only.",
                "readOnly": true
              },
              "resourceBundles": {
                "description": "Ordered list of bundle IDs. One is selected at scheduling time.",
                "items": {
                  "type": "string"
                },
                "title": "Resourcebundles",
                "type": "array"
              }
            },
            "title": "GroupRuntime",
            "type": "object"
          },
          "title": "Containergroups",
          "type": "array"
        }
      },
      "title": "WorkloadRuntime",
      "type": "object"
    },
    "status": {
      "description": "Statuses for workload replacement process.",
      "enum": [
        "unknown",
        "submitted",
        "initializing",
        "awaiting_promotion",
        "switching",
        "deleting",
        "completed",
        "errored",
        "cleaning_up"
      ],
      "title": "ReplacementStatus",
      "type": "string"
    },
    "strategy": {
      "description": "Types of replacement strategies. `rolling` - The new proton is deployed alongside the old one, and traffic is switched to the new proton once it is ready. The old proton is then decommissioned.",
      "enum": [
        "rolling"
      ],
      "title": "ReplacementStrategy",
      "type": "string"
    },
    "switchedAt": {
      "anyOf": [
        {
          "format": "date-time",
          "type": "string"
        },
        {
          "type": "null"
        }
      ],
      "description": "Timestamp of when the replacement took effect.",
      "title": "Switchedat"
    },
    "taskiqLastHeartbeat": {
      "anyOf": [
        {
          "format": "date-time",
          "type": "string"
        },
        {
          "type": "null"
        }
      ],
      "description": "Timestamp of the last taskiq poll for this replacement; used by the cron to detect abandoned taskiq-managed replacements.",
      "title": "Taskiqlastheartbeat"
    },
    "taskiqManaged": {
      "default": false,
      "description": "When true, this replacement is managed by the taskiq worker and should be skipped by the batch cronjob.",
      "title": "Taskiqmanaged",
      "type": "boolean"
    },
    "tenantId": {
      "description": "ID of the tenant this entity belongs to.",
      "format": "uuid4",
      "title": "Tenantid",
      "type": "string"
    },
    "updatedAt": {
      "description": "Timestamp of when the entity was last updated.",
      "format": "date-time",
      "title": "Updatedat",
      "type": "string"
    },
    "userId": {
      "description": "ID of the user who owns this entity.",
      "title": "Userid",
      "type": "string"
    },
    "workloadId": {
      "description": "Workload ID.",
      "title": "Workloadid",
      "type": "string"
    }
  },
  "required": [
    "id",
    "name",
    "createdAt",
    "updatedAt",
    "userId",
    "tenantId",
    "workloadId",
    "candidateArtifactId"
  ],
  "title": "Replacement",
  "type": "object"
}
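
The `memory` field in the schema above accepts either a raw byte integer or a string such as '8GB'. As a minimal sketch of how a client might normalize such values before sending them, the helper below reuses the pattern from the schema and the 1000-based units from the field description. The name `memory_to_bytes`, the choice to treat a missing unit as bytes, and the rejection of units outside the four listed ones are assumptions made for illustration, not documented platform behavior.

```python
import re

# Pattern copied from the `memory` field of ResourceAllocation in the schema.
MEMORY_PATTERN = re.compile(r"^\s*(\d*\.?\d+)\s*(\w+)?")

# 1000-based multipliers, as stated in the field description.
UNIT_MULTIPLIERS = {"b": 1, "kb": 1000, "mb": 1000**2, "gb": 1000**3}


def memory_to_bytes(value):
    """Convert a `memory` value (string, integer, or None) to a byte count."""
    if value is None:
        return None
    if isinstance(value, bool):
        # bool is a subclass of int in Python; reject it explicitly.
        raise TypeError("memory must be a string, an integer, or None")
    if isinstance(value, int):
        if value < 0:
            raise ValueError("memory in bytes must be >= 0")
        return value
    match = MEMORY_PATTERN.match(value)
    if match is None:
        raise ValueError(f"unrecognized memory value: {value!r}")
    number, unit = match.groups()
    # Assumption: a bare number without a unit is interpreted as bytes.
    unit = (unit or "b").lower()
    if unit not in UNIT_MULTIPLIERS:
        # Assumption: only the four units named in the description are valid.
        raise ValueError(f"unsupported memory unit: {unit!r}")
    return round(float(number) * UNIT_MULTIPLIERS[unit])


print(memory_to_bytes("8GB"))    # 8000000000
print(memory_to_bytes("512MB"))  # 512000000
```

Note that the schema's pattern is not anchored at the end of the string, so this sketch, like the pattern itself, ignores any trailing text after the unit.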

Responses

| Status | Meaning | Description | Schema |
|---|---|---|---|
| 200 | OK | Successful Response | Replacement |
| 400 | Bad Request | Bad request | None |
| 401 | Unauthorized | Unauthenticated | None |
| 403 | Forbidden | Insufficient permissions | None |
| 404 | Not Found | Workload not found | None |
| 422 | Unprocessable Entity | Validation Error | HTTPValidationError |
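
The status codes listed above can be turned into client-side errors in a few lines. The following is a sketch only: the `ReplacementApiError` class and the `check_response` helper are hypothetical names introduced here, and the messages are copied from the Description column of the table rather than from any actual response body.

```python
# Descriptions copied from the Responses table for this operation.
RESPONSE_DESCRIPTIONS = {
    200: "Successful Response",
    400: "Bad request",
    401: "Unauthenticated",
    403: "Insufficient permissions",
    404: "Workload not found",
    422: "Validation Error",
}


class ReplacementApiError(Exception):
    """Hypothetical error type carrying the HTTP status code."""

    def __init__(self, status_code, message):
        super().__init__(f"{status_code}: {message}")
        self.status_code = status_code


def check_response(status_code, body):
    """Return the parsed body on 200, otherwise raise ReplacementApiError."""
    if status_code == 200:
        return body
    message = RESPONSE_DESCRIPTIONS.get(status_code, "Unexpected status")
    raise ReplacementApiError(status_code, message)
```

A caller would pass the status code and decoded JSON body from whichever HTTP client it uses; a 404, for example, surfaces as `ReplacementApiError` with `status_code == 404`.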

Create Replacement by Workload ID

Operation path: POST /workloads/{workload_id}/replacement
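
As an illustration of how a request to this operation might be assembled, the sketch below builds the path and a JSON body using field names from the body schema in this section (`artifactId`, `config`, `runtime.containerGroups`). The function name, its parameters, and the sample IDs are hypothetical, and the sketch does not assume which body fields the server treats as required.

```python
import json
from urllib.parse import quote


def build_replacement_request(
    workload_id,
    artifact_id,
    keep_old_version_minutes=0,
    warmup_duration_minutes=0,
    replica_count=None,
    group_name="default",
):
    """Return (method, path, body) for POST /workloads/{workload_id}/replacement."""
    path = f"/workloads/{quote(workload_id, safe='')}/replacement"
    body = {
        "artifactId": artifact_id,
        "config": {
            "keepOldVersionMinutes": keep_old_version_minutes,
            "warmupDurationMinutes": warmup_duration_minutes,
        },
    }
    if replica_count is not None:
        # The schema declares a minimum of 1 for replicaCount.
        if replica_count < 1:
            raise ValueError("replica_count must be >= 1")
        # This sketch sets a fixed replica count and omits autoscaling,
        # since the schema says the two cannot be combined.
        body["runtime"] = {
            "containerGroups": [
                {"name": group_name, "replicaCount": replica_count}
            ]
        }
    return "POST", path, body


# Sample IDs below are placeholders, not real identifiers.
method, path, body = build_replacement_request(
    "wl-123", "art-456", warmup_duration_minutes=5, replica_count=2
)
print(method, path)
print(json.dumps(body, indent=2))
```

The returned tuple can be handed to any HTTP client; authentication and the base URL are deliberately left out because this section does not specify them.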

Body parameter

{
  "additionalProperties": false,
  "description": "Request to start a replacement for a workload.",
  "properties": {
    "artifactId": {
      "description": "Existing artifact ID to deploy.",
      "title": "Artifactid",
      "type": "string"
    },
    "config": {
      "additionalProperties": false,
      "description": "Configuration for workload replacement.",
      "properties": {
        "keepOldVersionMinutes": {
          "default": 0,
          "description": "Duration in minutes to keep the old version during replacement.",
          "title": "Keepoldversionminutes",
          "type": "integer"
        },
        "warmupDurationMinutes": {
          "default": 0,
          "description": "Duration in minutes for the warmup phase during replacement.",
          "title": "Warmupdurationminutes",
          "type": "integer"
        }
      },
      "title": "ReplacementConfig",
      "type": "object"
    },
    "runtime": {
      "anyOf": [
        {
          "additionalProperties": false,
          "description": "Runtime configuration for a workload. For service and NIM artifacts, all configuration is scoped inside `container_groups`, each identified by a name matching the artifact topology.",
          "properties": {
            "containerGroups": {
              "description": "Per-group runtime configuration. Each entry's name must match a group in the artifact.",
              "items": {
                "additionalProperties": false,
                "description": "Runtime configuration for a single container group.",
                "properties": {
                  "autoscaling": {
                    "anyOf": [
                      {
                        "additionalProperties": false,
                        "description": "Autoscaling configuration for a proton.",
                        "properties": {
                          "enabled": {
                            "default": true,
                            "description": "Whether autoscaling is enabled.",
                            "title": "Enabled",
                            "type": "boolean"
                          },
                          "policies": {
                            "items": {
                              "additionalProperties": false,
                              "description": "Base class for autoscaling policies.",
                              "properties": {
                                "maxCount": {
                                  "description": "Maximum number of replicas.",
                                  "minimum": 0,
                                  "title": "Max Count",
                                  "type": "integer"
                                },
                                "minCount": {
                                  "description": "Minimum number of replicas.",
                                  "minimum": 0,
                                  "title": "Min Count",
                                  "type": "integer"
                                },
                                "priority": {
                                  "anyOf": [
                                    {
                                      "type": "integer"
                                    },
                                    {
                                      "type": "null"
                                    }
                                  ],
                                  "description": "Policy priority when multiple policies are defined.",
                                  "title": "Priority"
                                },
                                "scalingMetric": {
                                  "anyOf": [
                                    {
                                      "oneOf": [
                                        {
                                          "const": "cpuAverageUtilization",
                                          "description": "Scale replicas to maintain a target average CPU utilization across pods.",
                                          "title": "CPU Average Utilization"
                                        },
                                        {
                                          "const": "httpRequestsConcurrency",
                                          "description": "Scale replicas based on HTTP request concurrency using an external HTTP-aware autoscaler. The platform manages the underlying autoscaling resources on your behalf. This scaling option will scale to zero replicas when the proton is idle.",
                                          "title": "HTTP Requests Concurrency"
                                        },
                                        {
                                          "const": "gpuCacheUtilization",
                                          "description": "Scales replicas based on model-specific GPU memory cache utilization. This signal reflects how the model's KV cache is used during inference, when such metrics are exposed by the serving runtime. High cache utilization may indicate memory pressure and can be used to trigger scale-out to maintain throughput. Applicable to NIM Artifacts only.",
                                          "title": "GPU Cache Utilization"
                                        },
                                        {
                                          "const": "gpuRequestQueueDepth",
                                          "description": "Scales replicas based on the depth of the inference request queue. This metric represents the number of incoming requests waiting to be processed by the inference service. Increasing queue depth may indicate insufficient capacity and can be used to trigger additional replicas to reduce latency. Applicable to NIM Artifacts only.",
                                          "title": "GPU Request Queue Depth"
                                        }
                                      ],
                                      "title": "ScalingMetricType",
                                      "type": "string"
                                    },
                                    {
                                      "type": "string"
                                    }
                                  ],
                                  "description": "Metric used for scaling decisions. Use one of the predefined values for standard autoscaling, or provide a custom metric name for NIM 2.0 workloads (e.g. 'vllm:kv_cache_usage_perc'). Custom metric names are only supported for NIM artifacts.",
                                  "title": "Scaling Metric"
                                },
                                "target": {
                                  "description": "Target value for the scaling metric.",
                                  "minimum": 0,
                                  "title": "Target",
                                  "type": "number"
                                }
                              },
                              "required": [
                                "scalingMetric",
                                "target",
                                "minCount",
                                "maxCount"
                              ],
                              "title": "AutoscalingPolicy",
                              "type": "object"
                            },
                            "title": "Policies",
                            "type": "array"
                          }
                        },
                        "required": [
                          "policies"
                        ],
                        "title": "AutoscalingProperties",
                        "type": "object"
                      },
                      {
                        "type": "null"
                      }
                    ],
                    "description": "Autoscaling configuration for this group. Takes precedence over `replicaCount`."
                  },
                  "bundleSelectionPolicy": {
                    "enum": [
                      "availability"
                    ],
                    "title": "BundleSelectionPolicy",
                    "type": "string"
                  },
                  "containers": {
                    "description": "Per-container overrides for this group.",
                    "items": {
                      "additionalProperties": false,
                      "description": "Runtime diff targeting a single named container within a group.",
                      "properties": {
                        "name": {
                          "description": "Container name. Must match a container declared in the artifact group.",
                          "title": "Name",
                          "type": "string"
                        },
                        "resourceAllocation": {
                          "anyOf": [
                            {
                              "additionalProperties": false,
                              "description": "Per-container resource allocation declared at runtime.",
                              "properties": {
                                "cpu": {
                                  "anyOf": [
                                    {
                                      "minimum": 0.1,
                                      "type": "number"
                                    },
                                    {
                                      "type": "null"
                                    }
                                  ],
                                  "description": "CPU cores allocated to this container.",
                                  "title": "Cpu"
                                },
                                "gpu": {
                                  "anyOf": [
                                    {
                                      "minimum": 0,
                                      "type": "number"
                                    },
                                    {
                                      "type": "null"
                                    }
                                  ],
                                  "description": "GPUs allocated to this container.",
                                  "title": "Gpu"
                                },
                                "memory": {
                                  "anyOf": [
                                    {
                                      "pattern": "^\\s*(\\d*\\.?\\d+)\\s*(\\w+)?",
                                      "type": "string"
                                    },
                                    {
                                      "minimum": 0,
                                      "type": "integer"
                                    },
                                    {
                                      "type": "null"
                                    }
                                  ],
                                  "description": "RAM allocated to this container. Accepts a human-readable string with one of: B, KB, MB, GB (1000-based), e.g. '8GB', '512MB'. Also accepts raw byte integers.",
                                  "examples": [
                                    "8GB",
                                    "512MB"
                                  ],
                                  "title": "Memory"
                                }
                              },
                              "title": "ResourceAllocation",
                              "type": "object"
                            },
                            {
                              "type": "null"
                            }
                          ],
                          "description": "Resource allocation for this container. Required for multi-container groups."
                        }
                      },
                      "required": [
                        "name"
                      ],
                      "title": "ContainerOverride",
                      "type": "object"
                    },
                    "title": "Containers",
                    "type": "array"
                  },
                  "name": {
                    "default": "default",
                    "description": "Group name. Must match a container group name declared in the artifact.",
                    "title": "Name",
                    "type": "string"
                  },
                  "replicaCount": {
                    "anyOf": [
                      {
                        "minimum": 1,
                        "type": "integer"
                      },
                      {
                        "type": "null"
                      }
                    ],
                    "default": 1,
                    "description": "Number of replicas. Cannot be set alongside `autoscaling.enabled=true`.",
                    "title": "Replicacount"
                  },
                  "resolvedBundle": {
                    "anyOf": [
                      {
                        "description": "Bundle details returned in the runtime response after scheduling.",
                        "properties": {
                          "cpuCount": {
                            "description": "Number of cpu cores.",
                            "title": "CPU Count",
                            "type": "number"
                          },
                          "gpuCount": {
                            "default": 0,
                            "description": "Number of gpu units.",
                            "title": "GPU Count",
                            "type": "integer"
                          },
                          "gpuMaker": {
                            "anyOf": [
                              {
                                "type": "string"
                              },
                              {
                                "type": "null"
                              }
                            ],
                            "description": "Gpu manufacturer.",
                            "title": "GPU Maker"
                          },
                          "gpuTypeLabel": {
                            "anyOf": [
                              {
                                "type": "string"
                              },
                              {
                                "type": "null"
                              }
                            ],
                            "description": "Gpu type label.",
                            "title": "GPU Type Label"
                          },
                          "id": {
                            "description": "Bundle identifier that was selected.",
                            "title": "Id",
                            "type": "string"
                          },
                          "memoryBytes": {
                            "description": "Memory size in bytes.",
                            "title": "Memory Bytes",
                            "type": "integer"
                          }
                        },
                        "required": [
                          "id",
                          "cpuCount",
                          "memoryBytes"
                        ],
                        "title": "ResolvedBundle",
                        "type": "object"
                      },
                      {
                        "type": "null"
                      }
                    ],
                    "description": "Full details of the bundle selected at scheduling time. read-only.",
                    "readOnly": true
                  },
                  "resourceBundles": {
                    "description": "Ordered list of bundle ids. one is selected at scheduling time.",
                    "items": {
                      "type": "string"
                    },
                    "title": "Resourcebundles",
                    "type": "array"
                  }
                },
                "title": "GroupRuntime",
                "type": "object"
              },
              "title": "Containergroups",
              "type": "array"
            }
          },
          "title": "WorkloadRuntime",
          "type": "object"
        },
        {
          "type": "null"
        }
      ],
      "description": "Runtime for the workload; if omitted, the current runtime is reused."
    },
    "strategy": {
      "description": "Types of replacement strategies. `rolling` - the new proton is deployed alongside the old one, and trafic is switched to the new proton once it is ready. the old proton is then decommissioned.",
      "enum": [
        "rolling"
      ],
      "title": "ReplacementStrategy",
      "type": "string"
    }
  },
  "required": [
    "artifactId",
    "strategy"
  ],
  "title": "StartReplacementRequest",
  "type": "object"
}
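
The request schema above can be exercised with a small client-side helper. The following is a minimal sketch, not part of the API: `build_replacement_request` is a hypothetical name, `artifactId` is assumed to be a string, and the autoscaling check is one reading of the `replicaCount` description ("cannot be set alongside autoscaling.enabled=true"). Only the fields visible in the schema (`artifactId`, `strategy`, `runtime`) are used.

```python
from typing import Any, Dict, Optional

# The ReplacementStrategy enum in the schema lists a single value.
ALLOWED_STRATEGIES = {"rolling"}


def build_replacement_request(
    artifact_id: str,
    strategy: str = "rolling",
    runtime: Optional[Dict[str, Any]] = None,
) -> Dict[str, Any]:
    """Assemble a StartReplacementRequest body (hypothetical helper)."""
    if not artifact_id:
        raise ValueError("artifactId is required")
    if strategy not in ALLOWED_STRATEGIES:
        raise ValueError(f"unsupported strategy: {strategy!r}")

    body: Dict[str, Any] = {"artifactId": artifact_id, "strategy": strategy}

    if runtime is not None:
        for group in runtime.get("containerGroups", []):
            autoscaling = group.get("autoscaling")
            # `enabled` defaults to true in the schema when autoscaling is given.
            autoscaling_on = bool(autoscaling) and autoscaling.get("enabled", True)
            # Assumption: an explicit replicaCount conflicts with enabled autoscaling.
            if autoscaling_on and group.get("replicaCount") is not None:
                name = group.get("name", "default")
                raise ValueError(
                    f"group {name!r}: replicaCount cannot be combined with "
                    "autoscaling.enabled=true"
                )
        # Omitting `runtime` means the current runtime is reused.
        body["runtime"] = runtime

    return body
```

Calling `build_replacement_request("artifact-123")` yields a body with only the two required fields, which per the schema leaves the current runtime in place.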

Parameters

Name In Type Required Description
workload_id path string true Workload ID
body body StartReplacementRequest true none

Example responses

202 Response

{
  "additionalProperties": false,
  "description": "Store replacement information for workloads.",
  "properties": {
    "candidateArtifactId": {
      "description": "Candidate artifact id.",
      "title": "Candidateartifactid",
      "type": "string"
    },
    "candidateProtonIds": {
      "description": "Ids of protons pending promotion during artifact replacement.",
      "items": {
        "type": "string"
      },
      "title": "Candidateprotonids",
      "type": "array"
    },
    "config": {
      "additionalProperties": false,
      "description": "Configuration for workload replacement.",
      "properties": {
        "keepOldVersionMinutes": {
          "default": 0,
          "description": "Duration in minutes to keep the old version during replacement.",
          "title": "Keepoldversionminutes",
          "type": "integer"
        },
        "warmupDurationMinutes": {
          "default": 0,
          "description": "Duration in minutes for the warmup phase during replacement.",
          "title": "Warmupdurationminutes",
          "type": "integer"
        }
      },
      "title": "ReplacementConfig",
      "type": "object"
    },
    "createdAt": {
      "description": "Timestamp of when the entity was created.",
      "format": "date-time",
      "title": "Createdat",
      "type": "string"
    },
    "deletedAt": {
      "anyOf": [
        {
          "format": "date-time",
          "type": "string"
        },
        {
          "type": "null"
        }
      ],
      "description": "Timestamp of when the entity was deleted.",
      "title": "Deletedat"
    },
    "id": {
      "description": "Unique identifier of the entity.",
      "title": "Id",
      "type": "string"
    },
    "isDeleted": {
      "default": false,
      "description": "Whether this entity has been deleted.",
      "title": "Isdeleted",
      "type": "boolean"
    },
    "message": {
      "anyOf": [
        {
          "type": "string"
        },
        {
          "type": "null"
        }
      ],
      "description": "Additional information about the replacement status, such as validation errors or reasons for failure.",
      "title": "Message"
    },
    "name": {
      "description": "Name of the entity.",
      "title": "Name",
      "type": "string"
    },
    "previousProtonIds": {
      "anyOf": [
        {
          "items": {
            "type": "string"
          },
          "type": "array"
        },
        {
          "type": "null"
        }
      ],
      "description": "Ids of protons pending decommissioning during artifact replacement.",
      "title": "Previousprotonids"
    },
    "protonStatuses": {
      "anyOf": [
        {
          "additionalProperties": {
            "additionalProperties": false,
            "properties": {
              "overallStatus": {
                "additionalProperties": false,
                "description": "Overall status as reported by the workload-monitor service.",
                "properties": {
                  "lastUpdated": {
                    "description": "Rfc3339 timestamp of the last state transition.",
                    "title": "Lastupdated",
                    "type": "string"
                  },
                  "state": {
                    "enum": [
                      "unknown",
                      "submitted",
                      "initializing",
                      "provisioning",
                      "launching",
                      "running",
                      "suspended",
                      "warming",
                      "draining",
                      "interrupted",
                      "restarting",
                      "stopping",
                      "stopped",
                      "errored",
                      "terminated"
                    ],
                    "title": "ProtonStatus",
                    "type": "string"
                  },
                  "summary": {
                    "description": "Human-readable description of the current state.",
                    "title": "Summary",
                    "type": "string"
                  }
                },
                "required": [
                  "state",
                  "summary",
                  "lastUpdated"
                ],
                "title": "WorkloadMonitorOverallStatus",
                "type": "object"
              },
              "replicas": {
                "items": {
                  "additionalProperties": false,
                  "properties": {
                    "address": {
                      "title": "Address",
                      "type": "string"
                    },
                    "conditions": {
                      "items": {
                        "additionalProperties": false,
                        "properties": {
                          "lastTransitionTime": {
                            "title": "Lasttransitiontime",
                            "type": "string"
                          },
                          "message": {
                            "default": "",
                            "title": "Message",
                            "type": "string"
                          },
                          "reason": {
                            "default": "",
                            "title": "Reason",
                            "type": "string"
                          },
                          "type": {
                            "title": "Type",
                            "type": "string"
                          },
                          "value": {
                            "anyOf": [
                              {
                                "type": "boolean"
                              },
                              {
                                "type": "null"
                              }
                            ],
                            "title": "Value"
                          }
                        },
                        "required": [
                          "type",
                          "value",
                          "lastTransitionTime"
                        ],
                        "title": "ReplicaConditionDetail",
                        "type": "object"
                      },
                      "title": "Conditions",
                      "type": "array"
                    },
                    "containers": {
                      "items": {
                        "additionalProperties": false,
                        "properties": {
                          "image": {
                            "title": "Image",
                            "type": "string"
                          },
                          "name": {
                            "title": "Name",
                            "type": "string"
                          },
                          "ready": {
                            "title": "Ready",
                            "type": "boolean"
                          },
                          "restartCount": {
                            "title": "Restartcount",
                            "type": "integer"
                          },
                          "startedAt": {
                            "anyOf": [
                              {
                                "type": "string"
                              },
                              {
                                "type": "null"
                              }
                            ],
                            "title": "Startedat"
                          },
                          "status": {
                            "description": "Lifecycle state of a container within a deployment replica.",
                            "enum": [
                              "running",
                              "waiting",
                              "terminated",
                              "unknown"
                            ],
                            "title": "ContainerStatus",
                            "type": "string"
                          }
                        },
                        "required": [
                          "name",
                          "status",
                          "startedAt",
                          "ready",
                          "restartCount",
                          "image"
                        ],
                        "title": "ContainerStatusDetail",
                        "type": "object"
                      },
                      "title": "Containers",
                      "type": "array"
                    },
                    "name": {
                      "title": "Name",
                      "type": "string"
                    },
                    "nodeAddress": {
                      "title": "Nodeaddress",
                      "type": "string"
                    },
                    "startedAt": {
                      "anyOf": [
                        {
                          "type": "string"
                        },
                        {
                          "type": "null"
                        }
                      ],
                      "title": "Startedat"
                    },
                    "status": {
                      "description": "Lifecycle phase of a deployment replica.",
                      "enum": [
                        "pending",
                        "running",
                        "succeeded",
                        "failed",
                        "unknown"
                      ],
                      "title": "ReplicaPhase",
                      "type": "string"
                    }
                  },
                  "required": [
                    "name",
                    "status",
                    "address",
                    "nodeAddress",
                    "startedAt",
                    "conditions",
                    "containers"
                  ],
                  "title": "ReplicaDetail",
                  "type": "object"
                },
                "title": "Replicas",
                "type": "array"
              }
            },
            "required": [
              "overallStatus",
              "replicas"
            ],
            "title": "ReplicaStatusesSnapshot",
            "type": "object"
          },
          "type": "object"
        },
        {
          "type": "null"
        }
      ],
      "description": "Latest known status of candidate protons, used to determine replacement status transitions.",
      "title": "Protonstatuses"
    },
    "runtime": {
      "additionalProperties": false,
      "description": "Runtime configuration for a workload. for service and nim artifacts, all configuration is scoped inside ``container_groups``, each identified by name matching the artifact topology.",
      "properties": {
        "containerGroups": {
          "description": "Per-group runtime configuration. each entry's name must match a group in the artifact.",
          "items": {
            "additionalProperties": false,
            "description": "Runtime configuration for a single container group.",
            "properties": {
              "autoscaling": {
                "anyOf": [
                  {
                    "additionalProperties": false,
                    "description": "Autoscaling configuration for a proton.",
                    "properties": {
                      "enabled": {
                        "default": true,
                        "description": "Whether autoscaling is enabled.",
                        "title": "Enabled",
                        "type": "boolean"
                      },
                      "policies": {
                        "items": {
                          "additionalProperties": false,
                          "description": "Base class for autoscaling policies.",
                          "properties": {
                            "maxCount": {
                              "description": "Maximum number of replicas.",
                              "minimum": 0,
                              "title": "Max Count",
                              "type": "integer"
                            },
                            "minCount": {
                              "description": "Minimum number of replicas.",
                              "minimum": 0,
                              "title": "Min Count",
                              "type": "integer"
                            },
                            "priority": {
                              "anyOf": [
                                {
                                  "type": "integer"
                                },
                                {
                                  "type": "null"
                                }
                              ],
                              "description": "Policy priority when multiple policies are defined.",
                              "title": "Priority"
                            },
                            "scalingMetric": {
                              "anyOf": [
                                {
                                  "oneOf": [
                                    {
                                      "const": "cpuAverageUtilization",
                                      "description": "Scale replicas to maintain a target average CPU utilization across pods.",
                                      "title": "CPU Average Utilization"
                                    },
                                    {
                                      "const": "httpRequestsConcurrency",
                                      "description": "Scale replicas based on HTTP request concurrency using an external HTTP-aware autoscaler. The platform manages the underlying autoscaling resources on your behalf. This scaling option will scale to zero replicas when the proton is idle.",
                                      "title": "HTTP Requests Concurrency"
                                    },
                                    {
                                      "const": "gpuCacheUtilization",
                                      "description": "Scales replicas based on model-specific GPU memory cache utilization. This signal reflects how the model's KV cache is used during inference, when such metrics are exposed by the serving runtime. High cache utilization may indicate memory pressure and can be used to trigger scale-out to maintain throughput. Applicable to NIM Artifacts only.",
                                      "title": "GPU Cache Utilization"
                                    },
                                    {
                                      "const": "gpuRequestQueueDepth",
                                      "description": "Scales replicas based on the depth of the inference request queue. This metric represents the number of incoming requests waiting to be processed by the inference service. Increasing queue depth may indicate insufficient capacity and can be used to trigger additional replicas to reduce latency. Applicable to NIM Artifacts only.",
                                      "title": "GPU Request Queue Depth"
                                    }
                                  ],
                                  "title": "ScalingMetricType",
                                  "type": "string"
                                },
                                {
                                  "type": "string"
                                }
                              ],
                              "description": "Metric used for scaling decisions. use one of the predefined values for standard autoscaling, or provide a custom metric name for nim 2.0 workloads (e.g. 'vllm:kv_cache_usage_perc'). custom metric names are only supported for nim artifacts.",
                              "title": "Scaling Metric"
                            },
                            "target": {
                              "description": "Target value for the scaling metric.",
                              "minimum": 0,
                              "title": "Target",
                              "type": "number"
                            }
                          },
                          "required": [
                            "scalingMetric",
                            "target",
                            "minCount",
                            "maxCount"
                          ],
                          "title": "AutoscalingPolicy",
                          "type": "object"
                        },
                        "title": "Policies",
                        "type": "array"
                      }
                    },
                    "required": [
                      "policies"
                    ],
                    "title": "AutoscalingProperties",
                    "type": "object"
                  },
                  {
                    "type": "null"
                  }
                ],
                "description": "Autoscaling configuration for this group. takes precedence over replicacount."
              },
              "bundleSelectionPolicy": {
                "enum": [
                  "availability"
                ],
                "title": "BundleSelectionPolicy",
                "type": "string"
              },
              "containers": {
                "description": "Per-container overrides for this group.",
                "items": {
                  "additionalProperties": false,
                  "description": "Runtime diff targeting a single named container within a group.",
                  "properties": {
                    "name": {
                      "description": "Container name. must match a container declared in the artifact group.",
                      "title": "Name",
                      "type": "string"
                    },
                    "resourceAllocation": {
                      "anyOf": [
                        {
                          "additionalProperties": false,
                          "description": "Per-container resource allocation declared at runtime.",
                          "properties": {
                            "cpu": {
                              "anyOf": [
                                {
                                  "minimum": 0.1,
                                  "type": "number"
                                },
                                {
                                  "type": "null"
                                }
                              ],
                              "description": "Cpu cores allocated to this container.",
                              "title": "Cpu"
                            },
                            "gpu": {
                              "anyOf": [
                                {
                                  "minimum": 0,
                                  "type": "number"
                                },
                                {
                                  "type": "null"
                                }
                              ],
                              "description": "Gpus allocated to this container.",
                              "title": "Gpu"
                            },
                            "memory": {
                              "anyOf": [
                                {
                                  "pattern": "^\\s*(\\d*\\.?\\d+)\\s*(\\w+)?",
                                  "type": "string"
                                },
                                {
                                  "minimum": 0,
                                  "type": "integer"
                                },
                                {
                                  "type": "null"
                                }
                              ],
                              "description": "Ram allocated to this container. accepts a human-readable string with one of: b, kb, mb, gb (1000-based) — e.g. '8gb', '512mb'. also accepts raw byte integers.",
                              "examples": [
                                "8GB",
                                "512MB"
                              ],
                              "title": "Memory"
                            }
                          },
                          "title": "ResourceAllocation",
                          "type": "object"
                        },
                        {
                          "type": "null"
                        }
                      ],
                      "description": "Resource allocation for this container. required for multi-container groups."
                    }
                  },
                  "required": [
                    "name"
                  ],
                  "title": "ContainerOverride",
                  "type": "object"
                },
                "title": "Containers",
                "type": "array"
              },
              "name": {
                "default": "default",
                "description": "Group name. must match a container group name declared in the artifact.",
                "title": "Name",
                "type": "string"
              },
              "replicaCount": {
                "anyOf": [
                  {
                    "minimum": 1,
                    "type": "integer"
                  },
                  {
                    "type": "null"
                  }
                ],
                "default": 1,
                "description": "Number of replicas. cannot be set alongside autoscaling.enabled=true.",
                "title": "Replicacount"
              },
              "resolvedBundle": {
                "anyOf": [
                  {
                    "description": "Bundle details returned in the runtime response after scheduling.",
                    "properties": {
                      "cpuCount": {
                        "description": "Number of cpu cores.",
                        "title": "CPU Count",
                        "type": "number"
                      },
                      "gpuCount": {
                        "default": 0,
                        "description": "Number of gpu units.",
                        "title": "GPU Count",
                        "type": "integer"
                      },
                      "gpuMaker": {
                        "anyOf": [
                          {
                            "type": "string"
                          },
                          {
                            "type": "null"
                          }
                        ],
                        "description": "Gpu manufacturer.",
                        "title": "GPU Maker"
                      },
                      "gpuTypeLabel": {
                        "anyOf": [
                          {
                            "type": "string"
                          },
                          {
                            "type": "null"
                          }
                        ],
                        "description": "Gpu type label.",
                        "title": "GPU Type Label"
                      },
                      "id": {
                        "description": "Bundle identifier that was selected.",
                        "title": "Id",
                        "type": "string"
                      },
                      "memoryBytes": {
                        "description": "Memory size in bytes.",
                        "title": "Memory Bytes",
                        "type": "integer"
                      }
                    },
                    "required": [
                      "id",
                      "cpuCount",
                      "memoryBytes"
                    ],
                    "title": "ResolvedBundle",
                    "type": "object"
                  },
                  {
                    "type": "null"
                  }
                ],
                "description": "Full details of the bundle selected at scheduling time. read-only.",
                "readOnly": true
              },
              "resourceBundles": {
                "description": "Ordered list of bundle ids. one is selected at scheduling time.",
                "items": {
                  "type": "string"
                },
                "title": "Resourcebundles",
                "type": "array"
              }
            },
            "title": "GroupRuntime",
            "type": "object"
          },
          "title": "Containergroups",
          "type": "array"
        }
      },
      "title": "WorkloadRuntime",
      "type": "object"
    },
    "status": {
      "description": "Statuses for workload replacement process.",
      "enum": [
        "unknown",
        "submitted",
        "initializing",
        "awaiting_promotion",
        "switching",
        "deleting",
        "completed",
        "errored",
        "cleaning_up"
      ],
      "title": "ReplacementStatus",
      "type": "string"
    },
    "strategy": {
      "description": "Types of replacement strategies. `rolling` - the new proton is deployed alongside the old one, and trafic is switched to the new proton once it is ready. the old proton is then decommissioned.",
      "enum": [
        "rolling"
      ],
      "title": "ReplacementStrategy",
      "type": "string"
    },
    "switchedAt": {
      "anyOf": [
        {
          "format": "date-time",
          "type": "string"
        },
        {
          "type": "null"
        }
      ],
      "description": "Timestamp of when the replacement take action.",
      "title": "Switchedat"
    },
    "taskiqLastHeartbeat": {
      "anyOf": [
        {
          "format": "date-time",
          "type": "string"
        },
        {
          "type": "null"
        }
      ],
      "description": "Timestamp of the last taskiq poll for this replacement; used by the cron to detect abandoned taskiq-managed replacements.",
      "title": "Taskiqlastheartbeat"
    },
    "taskiqManaged": {
      "default": false,
      "description": "When true, this replacement is managed by the taskiq worker and should be skipped by the batch cronjob.",
      "title": "Taskiqmanaged",
      "type": "boolean"
    },
    "tenantId": {
      "description": "Id of the tenant this entity belongs to.",
      "format": "uuid4",
      "title": "Tenantid",
      "type": "string"
    },
    "updatedAt": {
      "description": "Timestamp of when the entity was last updated.",
      "format": "date-time",
      "title": "Updatedat",
      "type": "string"
    },
    "userId": {
      "description": "Id of the user who owns this entity.",
      "title": "Userid",
      "type": "string"
    },
    "workloadId": {
      "description": "Workload id.",
      "title": "Workloadid",
      "type": "string"
    }
  },
  "required": [
    "id",
    "name",
    "createdAt",
    "updatedAt",
    "userId",
    "tenantId",
    "workloadId",
    "candidateArtifactId"
  ],
  "title": "Replacement",
  "type": "object"
}
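
The `memory` field in the schema above accepts either a raw byte integer or a string matching the pattern `^\s*(\d*\.?\d+)\s*(\w+)?` with 1000-based units. As a rough illustration of how a client might normalize such values before comparing them, here is a sketch. `memory_to_bytes` is a hypothetical helper, and two behaviors are assumptions rather than documented API rules: units are matched case-insensitively, and a string without a unit is treated as bytes.

```python
import re
from typing import Optional, Union

# Same pattern as the schema's `memory` string variant.
_MEMORY_PATTERN = re.compile(r"^\s*(\d*\.?\d+)\s*(\w+)?")

# 1000-based units, as stated in the field description.
_UNIT_FACTORS = {"b": 1, "kb": 1000, "mb": 1000**2, "gb": 1000**3}


def memory_to_bytes(value: Union[str, int, None]) -> Optional[int]:
    """Convert a ResourceAllocation memory value to bytes (hypothetical helper)."""
    if value is None:
        return None
    if isinstance(value, int):
        if value < 0:
            raise ValueError("memory must be >= 0")
        return value

    match = _MEMORY_PATTERN.match(value)
    if match is None:
        raise ValueError(f"unparseable memory value: {value!r}")

    number, unit = match.groups()
    # Assumption: a missing unit means bytes; unit case is ignored.
    factor = _UNIT_FACTORS.get((unit or "b").lower())
    if factor is None:
        raise ValueError(f"unknown memory unit: {unit!r}")

    # Fractional results are truncated toward zero.
    return int(float(number) * factor)
```

With these assumptions, `memory_to_bytes("8GB")` gives `8000000000` and `memory_to_bytes("512MB")` gives `512000000`, matching the 1000-based convention in the description.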

Responses

Status Meaning Description Schema
202 Accepted Successful Response Replacement
400 Bad Request Bad request None
401 Unauthorized Unauthenticated None
403 Forbidden Insufficient permissions None
404 Not Found Workload not found None
422 Unprocessable Entity Validation Error HTTPValidationError
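
Because the operation answers with 202 Accepted, the returned Replacement only describes work that has been queued; its `status` then moves through the `ReplacementStatus` values. A client would typically poll until the replacement settles. The sketch below shows one way to do that under stated assumptions: `wait_for_replacement` is a hypothetical helper, `fetch` is a caller-supplied function that returns the latest Replacement as a dict (how it is retrieved is not covered here), and treating `completed` and `errored` as the terminal states is an assumption based on their names.

```python
import time
from typing import Any, Callable, Dict

# Assumption: these two ReplacementStatus values are terminal.
TERMINAL_STATUSES = {"completed", "errored"}


def wait_for_replacement(
    fetch: Callable[[], Dict[str, Any]],
    poll_seconds: float = 5.0,
    max_polls: int = 60,
    sleep: Callable[[float], None] = time.sleep,
) -> Dict[str, Any]:
    """Poll `fetch` until the replacement reaches a terminal status."""
    for _ in range(max_polls):
        replacement = fetch()
        # `status` is not in the schema's required list, so default defensively.
        status = replacement.get("status", "unknown")
        if status in TERMINAL_STATUSES:
            return replacement
        sleep(poll_seconds)
    raise TimeoutError(f"replacement not finished after {max_polls} polls")
```

When the returned status is `errored`, the `message` field is the place to look for validation errors or failure reasons, as its description indicates.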

Get Workload Settings by Workload ID

Operation path: GET /workloads/{workload_id}/settings

Retrieve the configuration settings for a workload.

Parameters

Name In Type Required Description
workload_id path string true Workload ID
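
As an illustration of how the path parameter fits into the operation path, the following sketch prepares (but does not send) the request using only the Python standard library. The base URL and the shape of any authentication headers are assumptions that depend on your deployment; `build_settings_request` is a hypothetical helper name.

```python
from typing import Dict, Optional
from urllib.parse import quote
from urllib.request import Request


def build_settings_request(
    base_url: str,
    workload_id: str,
    headers: Optional[Dict[str, str]] = None,
) -> Request:
    """Prepare GET /workloads/{workload_id}/settings (hypothetical helper)."""
    # Percent-encode the ID so it is always a single path segment.
    encoded_id = quote(workload_id, safe="")
    url = f"{base_url.rstrip('/')}/workloads/{encoded_id}/settings"

    request_headers = {"Accept": "application/json"}
    # Assumption: authentication headers are supplied by the caller.
    request_headers.update(headers or {})

    return Request(url, headers=request_headers, method="GET")
```

The prepared object can be passed to `urllib.request.urlopen`; a 200 response body then follows the settings schema shown under "Example responses".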

Example responses

200 Response

{
  "additionalProperties": false,
  "description": "Response containing workload settings.",
  "properties": {
    "runtime": {
      "additionalProperties": false,
      "description": "Runtime configuration for a workload. for service and nim artifacts, all configuration is scoped inside ``container_groups``, each identified by name matching the artifact topology.",
      "properties": {
        "containerGroups": {
          "description": "Per-group runtime configuration. each entry's name must match a group in the artifact.",
          "items": {
            "additionalProperties": false,
            "description": "Runtime configuration for a single container group.",
            "properties": {
              "autoscaling": {
                "anyOf": [
                  {
                    "additionalProperties": false,
                    "description": "Autoscaling configuration for a proton.",
                    "properties": {
                      "enabled": {
                        "default": true,
                        "description": "Whether autoscaling is enabled.",
                        "title": "Enabled",
                        "type": "boolean"
                      },
                      "policies": {
                        "items": {
                          "additionalProperties": false,
                          "description": "Base class for autoscaling policies.",
                          "properties": {
                            "maxCount": {
                              "description": "Maximum number of replicas.",
                              "minimum": 0,
                              "title": "Max Count",
                              "type": "integer"
                            },
                            "minCount": {
                              "description": "Minimum number of replicas.",
                              "minimum": 0,
                              "title": "Min Count",
                              "type": "integer"
                            },
                            "priority": {
                              "anyOf": [
                                {
                                  "type": "integer"
                                },
                                {
                                  "type": "null"
                                }
                              ],
                              "description": "Policy priority when multiple policies are defined.",
                              "title": "Priority"
                            },
                            "scalingMetric": {
                              "anyOf": [
                                {
                                  "oneOf": [
                                    {
                                      "const": "cpuAverageUtilization",
                                      "description": "Scale replicas to maintain a target average CPU utilization across pods.",
                                      "title": "CPU Average Utilization"
                                    },
                                    {
                                      "const": "httpRequestsConcurrency",
                                      "description": "Scale replicas based on HTTP request concurrency using an external HTTP-aware autoscaler. The platform manages the underlying autoscaling resources on your behalf. This scaling option will scale to zero replicas when the proton is idle.",
                                      "title": "HTTP Requests Concurrency"
                                    },
                                    {
                                      "const": "gpuCacheUtilization",
                                      "description": "Scales replicas based on model-specific GPU memory cache utilization. This signal reflects how the model's KV cache is used during inference, when such metrics are exposed by the serving runtime. High cache utilization may indicate memory pressure and can be used to trigger scale-out to maintain throughput. Applicable to NIM Artifacts only.",
                                      "title": "GPU Cache Utilization"
                                    },
                                    {
                                      "const": "gpuRequestQueueDepth",
                                      "description": "Scales replicas based on the depth of the inference request queue. This metric represents the number of incoming requests waiting to be processed by the inference service. Increasing queue depth may indicate insufficient capacity and can be used to trigger additional replicas to reduce latency. Applicable to NIM Artifacts only.",
                                      "title": "GPU Request Queue Depth"
                                    }
                                  ],
                                  "title": "ScalingMetricType",
                                  "type": "string"
                                },
                                {
                                  "type": "string"
                                }
                              ],
                              "description": "Metric used for scaling decisions. use one of the predefined values for standard autoscaling, or provide a custom metric name for nim 2.0 workloads (e.g. 'vllm:kv_cache_usage_perc'). custom metric names are only supported for nim artifacts.",
                              "title": "Scaling Metric"
                            },
                            "target": {
                              "description": "Target value for the scaling metric.",
                              "minimum": 0,
                              "title": "Target",
                              "type": "number"
                            }
                          },
                          "required": [
                            "scalingMetric",
                            "target",
                            "minCount",
                            "maxCount"
                          ],
                          "title": "AutoscalingPolicy",
                          "type": "object"
                        },
                        "title": "Policies",
                        "type": "array"
                      }
                    },
                    "required": [
                      "policies"
                    ],
                    "title": "AutoscalingProperties",
                    "type": "object"
                  },
                  {
                    "type": "null"
                  }
                ],
                "description": "Autoscaling configuration for this group. takes precedence over replicacount."
              },
              "bundleSelectionPolicy": {
                "enum": [
                  "availability"
                ],
                "title": "BundleSelectionPolicy",
                "type": "string"
              },
              "containers": {
                "description": "Per-container overrides for this group.",
                "items": {
                  "additionalProperties": false,
                  "description": "Runtime diff targeting a single named container within a group.",
                  "properties": {
                    "name": {
                      "description": "Container name. must match a container declared in the artifact group.",
                      "title": "Name",
                      "type": "string"
                    },
                    "resourceAllocation": {
                      "anyOf": [
                        {
                          "additionalProperties": false,
                          "description": "Per-container resource allocation declared at runtime.",
                          "properties": {
                            "cpu": {
                              "anyOf": [
                                {
                                  "minimum": 0.1,
                                  "type": "number"
                                },
                                {
                                  "type": "null"
                                }
                              ],
                              "description": "Cpu cores allocated to this container.",
                              "title": "Cpu"
                            },
                            "gpu": {
                              "anyOf": [
                                {
                                  "minimum": 0,
                                  "type": "number"
                                },
                                {
                                  "type": "null"
                                }
                              ],
                              "description": "Gpus allocated to this container.",
                              "title": "Gpu"
                            },
                            "memory": {
                              "anyOf": [
                                {
                                  "pattern": "^\\s*(\\d*\\.?\\d+)\\s*(\\w+)?",
                                  "type": "string"
                                },
                                {
                                  "minimum": 0,
                                  "type": "integer"
                                },
                                {
                                  "type": "null"
                                }
                              ],
                              "description": "Ram allocated to this container. accepts a human-readable string with one of: b, kb, mb, gb (1000-based) — e.g. '8gb', '512mb'. also accepts raw byte integers.",
                              "examples": [
                                "8GB",
                                "512MB"
                              ],
                              "title": "Memory"
                            }
                          },
                          "title": "ResourceAllocation",
                          "type": "object"
                        },
                        {
                          "type": "null"
                        }
                      ],
                      "description": "Resource allocation for this container. required for multi-container groups."
                    }
                  },
                  "required": [
                    "name"
                  ],
                  "title": "ContainerOverride",
                  "type": "object"
                },
                "title": "Containers",
                "type": "array"
              },
              "name": {
                "default": "default",
                "description": "Group name. must match a container group name declared in the artifact.",
                "title": "Name",
                "type": "string"
              },
              "replicaCount": {
                "anyOf": [
                  {
                    "minimum": 1,
                    "type": "integer"
                  },
                  {
                    "type": "null"
                  }
                ],
                "default": 1,
                "description": "Number of replicas. cannot be set alongside autoscaling.enabled=true.",
                "title": "Replicacount"
              },
              "resolvedBundle": {
                "anyOf": [
                  {
                    "description": "Bundle details returned in the runtime response after scheduling.",
                    "properties": {
                      "cpuCount": {
                        "description": "Number of cpu cores.",
                        "title": "CPU Count",
                        "type": "number"
                      },
                      "gpuCount": {
                        "default": 0,
                        "description": "Number of gpu units.",
                        "title": "GPU Count",
                        "type": "integer"
                      },
                      "gpuMaker": {
                        "anyOf": [
                          {
                            "type": "string"
                          },
                          {
                            "type": "null"
                          }
                        ],
                        "description": "Gpu manufacturer.",
                        "title": "GPU Maker"
                      },
                      "gpuTypeLabel": {
                        "anyOf": [
                          {
                            "type": "string"
                          },
                          {
                            "type": "null"
                          }
                        ],
                        "description": "Gpu type label.",
                        "title": "GPU Type Label"
                      },
                      "id": {
                        "description": "Bundle identifier that was selected.",
                        "title": "Id",
                        "type": "string"
                      },
                      "memoryBytes": {
                        "description": "Memory size in bytes.",
                        "title": "Memory Bytes",
                        "type": "integer"
                      }
                    },
                    "required": [
                      "id",
                      "cpuCount",
                      "memoryBytes"
                    ],
                    "title": "ResolvedBundle",
                    "type": "object"
                  },
                  {
                    "type": "null"
                  }
                ],
                "description": "Full details of the bundle selected at scheduling time. read-only.",
                "readOnly": true
              },
              "resourceBundles": {
                "description": "Ordered list of bundle ids. one is selected at scheduling time.",
                "items": {
                  "type": "string"
                },
                "title": "Resourcebundles",
                "type": "array"
              }
            },
            "title": "GroupRuntime",
            "type": "object"
          },
          "title": "Containergroups",
          "type": "array"
        }
      },
      "title": "WorkloadRuntime",
      "type": "object"
    }
  },
  "title": "WorkloadSettingsResponse",
  "type": "object"
}
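
The `memory` field of `ResourceAllocation` accepts either a raw byte integer or a string with a 1000-based unit. The following is a minimal sketch of how a client might normalize such values to bytes before comparing them. It reuses the pattern from the schema; treating a unit-less string as bytes and rejecting unknown units are assumptions of this sketch, not documented server behavior.

```python
import re

# Pattern copied from the "memory" schema entry.
_MEMORY_RE = re.compile(r"^\s*(\d*\.?\d+)\s*(\w+)?")

# 1000-based units named in the field description.
_UNITS = {"b": 1, "kb": 1000, "mb": 1000**2, "gb": 1000**3}


def memory_to_bytes(value):
    """Convert a memory setting (str, int, or None) to a byte count."""
    if value is None:
        return None
    if isinstance(value, bool):
        raise TypeError("memory must be a string, an integer, or None")
    if isinstance(value, int):
        if value < 0:
            raise ValueError("memory must be non-negative")
        return value
    match = _MEMORY_RE.match(value)
    if match is None:
        raise ValueError(f"unrecognized memory value: {value!r}")
    number, unit = match.groups()
    # Assumption: a missing unit means bytes.
    unit = (unit or "b").lower()
    if unit not in _UNITS:
        raise ValueError(f"unsupported memory unit: {unit!r}")
    return round(float(number) * _UNITS[unit])


print(memory_to_bytes("8GB"))
print(memory_to_bytes("512MB"))
```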

Responses

Status Meaning Description Schema
200 OK Successful Response WorkloadSettingsResponse
400 Bad Request Bad request None
401 Unauthorized Unauthenticated None
403 Forbidden Insufficient permissions None
404 Not Found Workload not found None
422 Unprocessable Entity Validation Error HTTPValidationError
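
Because autoscaling takes precedence over `replicaCount`, a client reading the settings may want to work out which scaling mode applies to each container group. The sketch below assumes the parsed 200 response body is a plain dictionary shaped like `WorkloadSettingsResponse`; the helper name `summarize_scaling` and the sample data are hypothetical.

```python
def summarize_scaling(settings: dict) -> dict:
    """Map each container group name to its effective scaling mode."""
    runtime = settings.get("runtime") or {}
    summary = {}
    for group in runtime.get("containerGroups") or []:
        name = group.get("name", "default")
        autoscaling = group.get("autoscaling")
        # "enabled" defaults to true in the schema, so a missing flag counts as on.
        if autoscaling is not None and autoscaling.get("enabled", True):
            metrics = [p["scalingMetric"] for p in autoscaling.get("policies", [])]
            summary[name] = {"mode": "autoscaling", "metrics": metrics}
        else:
            # The schema gives replicaCount a default of 1.
            summary[name] = {"mode": "fixed", "replicaCount": group.get("replicaCount", 1)}
    return summary


# Hypothetical response instance for illustration only.
example = {
    "runtime": {
        "containerGroups": [
            {
                "name": "api",
                "autoscaling": {
                    "policies": [
                        {
                            "scalingMetric": "cpuAverageUtilization",
                            "target": 70,
                            "minCount": 1,
                            "maxCount": 4,
                        }
                    ]
                },
            },
            {"name": "worker", "replicaCount": 2, "autoscaling": None},
        ]
    }
}
print(summarize_scaling(example))
```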

Update Workload Settings by Workload ID

Operation path: PATCH /workloads/{workload_id}/settings

Update workload runtime settings by triggering a rolling replacement with the current artifact.

Body parameter

{
  "additionalProperties": false,
  "description": "Request to update runtime settings for a workload.",
  "properties": {
    "runtime": {
      "additionalProperties": false,
      "description": "Runtime configuration for a workload. for service and nim artifacts, all configuration is scoped inside ``container_groups``, each identified by name matching the artifact topology.",
      "properties": {
        "containerGroups": {
          "description": "Per-group runtime configuration. each entry's name must match a group in the artifact.",
          "items": {
            "additionalProperties": false,
            "description": "Runtime configuration for a single container group.",
            "properties": {
              "autoscaling": {
                "anyOf": [
                  {
                    "additionalProperties": false,
                    "description": "Autoscaling configuration for a proton.",
                    "properties": {
                      "enabled": {
                        "default": true,
                        "description": "Whether autoscaling is enabled.",
                        "title": "Enabled",
                        "type": "boolean"
                      },
                      "policies": {
                        "items": {
                          "additionalProperties": false,
                          "description": "Base class for autoscaling policies.",
                          "properties": {
                            "maxCount": {
                              "description": "Maximum number of replicas.",
                              "minimum": 0,
                              "title": "Max Count",
                              "type": "integer"
                            },
                            "minCount": {
                              "description": "Minimum number of replicas.",
                              "minimum": 0,
                              "title": "Min Count",
                              "type": "integer"
                            },
                            "priority": {
                              "anyOf": [
                                {
                                  "type": "integer"
                                },
                                {
                                  "type": "null"
                                }
                              ],
                              "description": "Policy priority when multiple policies are defined.",
                              "title": "Priority"
                            },
                            "scalingMetric": {
                              "anyOf": [
                                {
                                  "oneOf": [
                                    {
                                      "const": "cpuAverageUtilization",
                                      "description": "Scale replicas to maintain a target average CPU utilization across pods.",
                                      "title": "CPU Average Utilization"
                                    },
                                    {
                                      "const": "httpRequestsConcurrency",
                                      "description": "Scale replicas based on HTTP request concurrency using an external HTTP-aware autoscaler. The platform manages the underlying autoscaling resources on your behalf. This scaling option will scale to zero replicas when the proton is idle.",
                                      "title": "HTTP Requests Concurrency"
                                    },
                                    {
                                      "const": "gpuCacheUtilization",
                                      "description": "Scales replicas based on model-specific GPU memory cache utilization. This signal reflects how the model's KV cache is used during inference, when such metrics are exposed by the serving runtime. High cache utilization may indicate memory pressure and can be used to trigger scale-out to maintain throughput. Applicable to NIM Artifacts only.",
                                      "title": "GPU Cache Utilization"
                                    },
                                    {
                                      "const": "gpuRequestQueueDepth",
                                      "description": "Scales replicas based on the depth of the inference request queue. This metric represents the number of incoming requests waiting to be processed by the inference service. Increasing queue depth may indicate insufficient capacity and can be used to trigger additional replicas to reduce latency. Applicable to NIM Artifacts only.",
                                      "title": "GPU Request Queue Depth"
                                    }
                                  ],
                                  "title": "ScalingMetricType",
                                  "type": "string"
                                },
                                {
                                  "type": "string"
                                }
                              ],
                              "description": "Metric used for scaling decisions. use one of the predefined values for standard autoscaling, or provide a custom metric name for nim 2.0 workloads (e.g. 'vllm:kv_cache_usage_perc'). custom metric names are only supported for nim artifacts.",
                              "title": "Scaling Metric"
                            },
                            "target": {
                              "description": "Target value for the scaling metric.",
                              "minimum": 0,
                              "title": "Target",
                              "type": "number"
                            }
                          },
                          "required": [
                            "scalingMetric",
                            "target",
                            "minCount",
                            "maxCount"
                          ],
                          "title": "AutoscalingPolicy",
                          "type": "object"
                        },
                        "title": "Policies",
                        "type": "array"
                      }
                    },
                    "required": [
                      "policies"
                    ],
                    "title": "AutoscalingProperties",
                    "type": "object"
                  },
                  {
                    "type": "null"
                  }
                ],
                "description": "Autoscaling configuration for this group. takes precedence over replicacount."
              },
              "bundleSelectionPolicy": {
                "enum": [
                  "availability"
                ],
                "title": "BundleSelectionPolicy",
                "type": "string"
              },
              "containers": {
                "description": "Per-container overrides for this group.",
                "items": {
                  "additionalProperties": false,
                  "description": "Runtime diff targeting a single named container within a group.",
                  "properties": {
                    "name": {
                      "description": "Container name. must match a container declared in the artifact group.",
                      "title": "Name",
                      "type": "string"
                    },
                    "resourceAllocation": {
                      "anyOf": [
                        {
                          "additionalProperties": false,
                          "description": "Per-container resource allocation declared at runtime.",
                          "properties": {
                            "cpu": {
                              "anyOf": [
                                {
                                  "minimum": 0.1,
                                  "type": "number"
                                },
                                {
                                  "type": "null"
                                }
                              ],
                              "description": "Cpu cores allocated to this container.",
                              "title": "Cpu"
                            },
                            "gpu": {
                              "anyOf": [
                                {
                                  "minimum": 0,
                                  "type": "number"
                                },
                                {
                                  "type": "null"
                                }
                              ],
                              "description": "Gpus allocated to this container.",
                              "title": "Gpu"
                            },
                            "memory": {
                              "anyOf": [
                                {
                                  "pattern": "^\\s*(\\d*\\.?\\d+)\\s*(\\w+)?",
                                  "type": "string"
                                },
                                {
                                  "minimum": 0,
                                  "type": "integer"
                                },
                                {
                                  "type": "null"
                                }
                              ],
                              "description": "Ram allocated to this container. accepts a human-readable string with one of: b, kb, mb, gb (1000-based) — e.g. '8gb', '512mb'. also accepts raw byte integers.",
                              "examples": [
                                "8GB",
                                "512MB"
                              ],
                              "title": "Memory"
                            }
                          },
                          "title": "ResourceAllocation",
                          "type": "object"
                        },
                        {
                          "type": "null"
                        }
                      ],
                      "description": "Resource allocation for this container. required for multi-container groups."
                    }
                  },
                  "required": [
                    "name"
                  ],
                  "title": "ContainerOverride",
                  "type": "object"
                },
                "title": "Containers",
                "type": "array"
              },
              "name": {
                "default": "default",
                "description": "Group name. must match a container group name declared in the artifact.",
                "title": "Name",
                "type": "string"
              },
              "replicaCount": {
                "anyOf": [
                  {
                    "minimum": 1,
                    "type": "integer"
                  },
                  {
                    "type": "null"
                  }
                ],
                "default": 1,
                "description": "Number of replicas. cannot be set alongside autoscaling.enabled=true.",
                "title": "Replicacount"
              },
              "resolvedBundle": {
                "anyOf": [
                  {
                    "description": "Bundle details returned in the runtime response after scheduling.",
                    "properties": {
                      "cpuCount": {
                        "description": "Number of cpu cores.",
                        "title": "CPU Count",
                        "type": "number"
                      },
                      "gpuCount": {
                        "default": 0,
                        "description": "Number of gpu units.",
                        "title": "GPU Count",
                        "type": "integer"
                      },
                      "gpuMaker": {
                        "anyOf": [
                          {
                            "type": "string"
                          },
                          {
                            "type": "null"
                          }
                        ],
                        "description": "Gpu manufacturer.",
                        "title": "GPU Maker"
                      },
                      "gpuTypeLabel": {
                        "anyOf": [
                          {
                            "type": "string"
                          },
                          {
                            "type": "null"
                          }
                        ],
                        "description": "Gpu type label.",
                        "title": "GPU Type Label"
                      },
                      "id": {
                        "description": "Bundle identifier that was selected.",
                        "title": "Id",
                        "type": "string"
                      },
                      "memoryBytes": {
                        "description": "Memory size in bytes.",
                        "title": "Memory Bytes",
                        "type": "integer"
                      }
                    },
                    "required": [
                      "id",
                      "cpuCount",
                      "memoryBytes"
                    ],
                    "title": "ResolvedBundle",
                    "type": "object"
                  },
                  {
                    "type": "null"
                  }
                ],
                "description": "Full details of the bundle selected at scheduling time. read-only.",
                "readOnly": true
              },
              "resourceBundles": {
                "description": "Ordered list of bundle ids. one is selected at scheduling time.",
                "items": {
                  "type": "string"
                },
                "title": "Resourcebundles",
                "type": "array"
              }
            },
            "title": "GroupRuntime",
            "type": "object"
          },
          "title": "Containergroups",
          "type": "array"
        }
      },
      "title": "WorkloadRuntime",
      "type": "object"
    }
  },
  "required": [
    "runtime"
  ],
  "title": "UpdateSettingsRequest",
  "type": "object"
}

Parameters

Name In Type Required Description
workload_id path string true Workload ID
body body UpdateSettingsRequest true none
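
A request body for this operation might be assembled as in the sketch below. It applies a few client-side checks drawn from the schema descriptions (`replicaCount` cannot accompany enabled autoscaling, `resolvedBundle` is read-only, and each policy needs its required fields). The helper name `build_update_settings_body` and the bundle ID `bundle-a` are hypothetical, and the server remains the authority on validation.

```python
import json

# Fields the AutoscalingPolicy schema lists as required.
REQUIRED_POLICY_FIELDS = ("scalingMetric", "target", "minCount", "maxCount")


def build_update_settings_body(groups: list) -> str:
    """Return a JSON body for PATCH /workloads/{workload_id}/settings."""
    for group in groups:
        if "resolvedBundle" in group:
            raise ValueError("resolvedBundle is read-only")
        autoscaling = group.get("autoscaling")
        if autoscaling is None:
            continue
        if "policies" not in autoscaling:
            raise ValueError("autoscaling requires 'policies'")
        # "enabled" defaults to true, so an omitted flag still conflicts.
        if autoscaling.get("enabled", True) and group.get("replicaCount") is not None:
            raise ValueError("replicaCount cannot be set alongside enabled autoscaling")
        for policy in autoscaling["policies"]:
            missing = [f for f in REQUIRED_POLICY_FIELDS if f not in policy]
            if missing:
                raise ValueError(f"policy is missing required fields: {missing}")
    # "runtime" is the only required top-level property of UpdateSettingsRequest.
    return json.dumps({"runtime": {"containerGroups": groups}})


body = build_update_settings_body([
    {"name": "default", "replicaCount": 3, "resourceBundles": ["bundle-a"]},
])
print(body)
```

A successful call returns 202 with a `Replacement` object, since the update is applied through a rolling replacement rather than in place.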

Example responses

202 Response

{
  "additionalProperties": false,
  "description": "Store replacement information for workloads.",
  "properties": {
    "candidateArtifactId": {
      "description": "Candidate artifact id.",
      "title": "Candidateartifactid",
      "type": "string"
    },
    "candidateProtonIds": {
      "description": "Ids of protons pending promotion during artifact replacement.",
      "items": {
        "type": "string"
      },
      "title": "Candidateprotonids",
      "type": "array"
    },
    "config": {
      "additionalProperties": false,
      "description": "Configuration for workload replacement.",
      "properties": {
        "keepOldVersionMinutes": {
          "default": 0,
          "description": "Duration in minutes to keep the old version during replacement.",
          "title": "Keepoldversionminutes",
          "type": "integer"
        },
        "warmupDurationMinutes": {
          "default": 0,
          "description": "Duration in minutes for the warmup phase during replacement.",
          "title": "Warmupdurationminutes",
          "type": "integer"
        }
      },
      "title": "ReplacementConfig",
      "type": "object"
    },
    "createdAt": {
      "description": "Timestamp of when the entity was created.",
      "format": "date-time",
      "title": "Createdat",
      "type": "string"
    },
    "deletedAt": {
      "anyOf": [
        {
          "format": "date-time",
          "type": "string"
        },
        {
          "type": "null"
        }
      ],
      "description": "Timestamp of when the entity was deleted.",
      "title": "Deletedat"
    },
    "id": {
      "description": "Unique identifier of the entity.",
      "title": "Id",
      "type": "string"
    },
    "isDeleted": {
      "default": false,
      "description": "Whether this entity has been deleted.",
      "title": "Isdeleted",
      "type": "boolean"
    },
    "message": {
      "anyOf": [
        {
          "type": "string"
        },
        {
          "type": "null"
        }
      ],
      "description": "Additional information about the replacement status, such as validation errors or reasons for failure.",
      "title": "Message"
    },
    "name": {
      "description": "Name of the entity.",
      "title": "Name",
      "type": "string"
    },
    "previousProtonIds": {
      "anyOf": [
        {
          "items": {
            "type": "string"
          },
          "type": "array"
        },
        {
          "type": "null"
        }
      ],
      "description": "IDs of protons pending decommissioning during artifact replacement.",
      "title": "Previousprotonids"
    },
    "protonStatuses": {
      "anyOf": [
        {
          "additionalProperties": {
            "additionalProperties": false,
            "properties": {
              "overallStatus": {
                "additionalProperties": false,
                "description": "Overall status as reported by the workload-monitor service.",
                "properties": {
                  "lastUpdated": {
                    "description": "RFC 3339 timestamp of the last state transition.",
                    "title": "Lastupdated",
                    "type": "string"
                  },
                  "state": {
                    "enum": [
                      "unknown",
                      "submitted",
                      "initializing",
                      "provisioning",
                      "launching",
                      "running",
                      "suspended",
                      "warming",
                      "draining",
                      "interrupted",
                      "restarting",
                      "stopping",
                      "stopped",
                      "errored",
                      "terminated"
                    ],
                    "title": "ProtonStatus",
                    "type": "string"
                  },
                  "summary": {
                    "description": "Human-readable description of the current state.",
                    "title": "Summary",
                    "type": "string"
                  }
                },
                "required": [
                  "state",
                  "summary",
                  "lastUpdated"
                ],
                "title": "WorkloadMonitorOverallStatus",
                "type": "object"
              },
              "replicas": {
                "items": {
                  "additionalProperties": false,
                  "properties": {
                    "address": {
                      "title": "Address",
                      "type": "string"
                    },
                    "conditions": {
                      "items": {
                        "additionalProperties": false,
                        "properties": {
                          "lastTransitionTime": {
                            "title": "Lasttransitiontime",
                            "type": "string"
                          },
                          "message": {
                            "default": "",
                            "title": "Message",
                            "type": "string"
                          },
                          "reason": {
                            "default": "",
                            "title": "Reason",
                            "type": "string"
                          },
                          "type": {
                            "title": "Type",
                            "type": "string"
                          },
                          "value": {
                            "anyOf": [
                              {
                                "type": "boolean"
                              },
                              {
                                "type": "null"
                              }
                            ],
                            "title": "Value"
                          }
                        },
                        "required": [
                          "type",
                          "value",
                          "lastTransitionTime"
                        ],
                        "title": "ReplicaConditionDetail",
                        "type": "object"
                      },
                      "title": "Conditions",
                      "type": "array"
                    },
                    "containers": {
                      "items": {
                        "additionalProperties": false,
                        "properties": {
                          "image": {
                            "title": "Image",
                            "type": "string"
                          },
                          "name": {
                            "title": "Name",
                            "type": "string"
                          },
                          "ready": {
                            "title": "Ready",
                            "type": "boolean"
                          },
                          "restartCount": {
                            "title": "Restartcount",
                            "type": "integer"
                          },
                          "startedAt": {
                            "anyOf": [
                              {
                                "type": "string"
                              },
                              {
                                "type": "null"
                              }
                            ],
                            "title": "Startedat"
                          },
                          "status": {
                            "description": "Lifecycle state of a container within a deployment replica.",
                            "enum": [
                              "running",
                              "waiting",
                              "terminated",
                              "unknown"
                            ],
                            "title": "ContainerStatus",
                            "type": "string"
                          }
                        },
                        "required": [
                          "name",
                          "status",
                          "startedAt",
                          "ready",
                          "restartCount",
                          "image"
                        ],
                        "title": "ContainerStatusDetail",
                        "type": "object"
                      },
                      "title": "Containers",
                      "type": "array"
                    },
                    "name": {
                      "title": "Name",
                      "type": "string"
                    },
                    "nodeAddress": {
                      "title": "Nodeaddress",
                      "type": "string"
                    },
                    "startedAt": {
                      "anyOf": [
                        {
                          "type": "string"
                        },
                        {
                          "type": "null"
                        }
                      ],
                      "title": "Startedat"
                    },
                    "status": {
                      "description": "Lifecycle phase of a deployment replica.",
                      "enum": [
                        "pending",
                        "running",
                        "succeeded",
                        "failed",
                        "unknown"
                      ],
                      "title": "ReplicaPhase",
                      "type": "string"
                    }
                  },
                  "required": [
                    "name",
                    "status",
                    "address",
                    "nodeAddress",
                    "startedAt",
                    "conditions",
                    "containers"
                  ],
                  "title": "ReplicaDetail",
                  "type": "object"
                },
                "title": "Replicas",
                "type": "array"
              }
            },
            "required": [
              "overallStatus",
              "replicas"
            ],
            "title": "ReplicaStatusesSnapshot",
            "type": "object"
          },
          "type": "object"
        },
        {
          "type": "null"
        }
      ],
      "description": "Latest known status of candidate protons, used to determine replacement status transitions.",
      "title": "Protonstatuses"
    },
    "runtime": {
      "additionalProperties": false,
      "description": "Runtime configuration for a workload. For service and NIM artifacts, all configuration is scoped inside `container_groups`, each identified by name matching the artifact topology.",
      "properties": {
        "containerGroups": {
          "description": "Per-group runtime configuration. Each entry's name must match a group in the artifact.",
          "items": {
            "additionalProperties": false,
            "description": "Runtime configuration for a single container group.",
            "properties": {
              "autoscaling": {
                "anyOf": [
                  {
                    "additionalProperties": false,
                    "description": "Autoscaling configuration for a proton.",
                    "properties": {
                      "enabled": {
                        "default": true,
                        "description": "Whether autoscaling is enabled.",
                        "title": "Enabled",
                        "type": "boolean"
                      },
                      "policies": {
                        "items": {
                          "additionalProperties": false,
                          "description": "Base class for autoscaling policies.",
                          "properties": {
                            "maxCount": {
                              "description": "Maximum number of replicas.",
                              "minimum": 0,
                              "title": "Max Count",
                              "type": "integer"
                            },
                            "minCount": {
                              "description": "Minimum number of replicas.",
                              "minimum": 0,
                              "title": "Min Count",
                              "type": "integer"
                            },
                            "priority": {
                              "anyOf": [
                                {
                                  "type": "integer"
                                },
                                {
                                  "type": "null"
                                }
                              ],
                              "description": "Policy priority when multiple policies are defined.",
                              "title": "Priority"
                            },
                            "scalingMetric": {
                              "anyOf": [
                                {
                                  "oneOf": [
                                    {
                                      "const": "cpuAverageUtilization",
                                      "description": "Scale replicas to maintain a target average CPU utilization across pods.",
                                      "title": "CPU Average Utilization"
                                    },
                                    {
                                      "const": "httpRequestsConcurrency",
                                      "description": "Scale replicas based on HTTP request concurrency using an external HTTP-aware autoscaler. The platform manages the underlying autoscaling resources on your behalf. This scaling option will scale to zero replicas when the proton is idle.",
                                      "title": "HTTP Requests Concurrency"
                                    },
                                    {
                                      "const": "gpuCacheUtilization",
                                      "description": "Scales replicas based on model-specific GPU memory cache utilization. This signal reflects how the model's KV cache is used during inference, when such metrics are exposed by the serving runtime. High cache utilization may indicate memory pressure and can be used to trigger scale-out to maintain throughput. Applicable to NIM Artifacts only.",
                                      "title": "GPU Cache Utilization"
                                    },
                                    {
                                      "const": "gpuRequestQueueDepth",
                                      "description": "Scales replicas based on the depth of the inference request queue. This metric represents the number of incoming requests waiting to be processed by the inference service. Increasing queue depth may indicate insufficient capacity and can be used to trigger additional replicas to reduce latency. Applicable to NIM Artifacts only.",
                                      "title": "GPU Request Queue Depth"
                                    }
                                  ],
                                  "title": "ScalingMetricType",
                                  "type": "string"
                                },
                                {
                                  "type": "string"
                                }
                              ],
                              "description": "Metric used for scaling decisions. Use one of the predefined values for standard autoscaling, or provide a custom metric name for NIM 2.0 workloads (e.g. 'vllm:kv_cache_usage_perc'). Custom metric names are only supported for NIM artifacts.",
                              "title": "Scaling Metric"
                            },
                            "target": {
                              "description": "Target value for the scaling metric.",
                              "minimum": 0,
                              "title": "Target",
                              "type": "number"
                            }
                          },
                          "required": [
                            "scalingMetric",
                            "target",
                            "minCount",
                            "maxCount"
                          ],
                          "title": "AutoscalingPolicy",
                          "type": "object"
                        },
                        "title": "Policies",
                        "type": "array"
                      }
                    },
                    "required": [
                      "policies"
                    ],
                    "title": "AutoscalingProperties",
                    "type": "object"
                  },
                  {
                    "type": "null"
                  }
                ],
                "description": "Autoscaling configuration for this group. Takes precedence over replicaCount."
              },
              "bundleSelectionPolicy": {
                "enum": [
                  "availability"
                ],
                "title": "BundleSelectionPolicy",
                "type": "string"
              },
              "containers": {
                "description": "Per-container overrides for this group.",
                "items": {
                  "additionalProperties": false,
                  "description": "Runtime diff targeting a single named container within a group.",
                  "properties": {
                    "name": {
                      "description": "Container name. Must match a container declared in the artifact group.",
                      "title": "Name",
                      "type": "string"
                    },
                    "resourceAllocation": {
                      "anyOf": [
                        {
                          "additionalProperties": false,
                          "description": "Per-container resource allocation declared at runtime.",
                          "properties": {
                            "cpu": {
                              "anyOf": [
                                {
                                  "minimum": 0.1,
                                  "type": "number"
                                },
                                {
                                  "type": "null"
                                }
                              ],
                              "description": "CPU cores allocated to this container.",
                              "title": "Cpu"
                            },
                            "gpu": {
                              "anyOf": [
                                {
                                  "minimum": 0,
                                  "type": "number"
                                },
                                {
                                  "type": "null"
                                }
                              ],
                              "description": "GPUs allocated to this container.",
                              "title": "Gpu"
                            },
                            "memory": {
                              "anyOf": [
                                {
                                  "pattern": "^\\s*(\\d*\\.?\\d+)\\s*(\\w+)?",
                                  "type": "string"
                                },
                                {
                                  "minimum": 0,
                                  "type": "integer"
                                },
                                {
                                  "type": "null"
                                }
                              ],
                              "description": "RAM allocated to this container. Accepts a human-readable string with one of the units B, KB, MB, GB (1000-based), e.g. '8GB', '512MB'. Also accepts raw byte integers.",
                              "examples": [
                                "8GB",
                                "512MB"
                              ],
                              "title": "Memory"
                            }
                          },
                          "title": "ResourceAllocation",
                          "type": "object"
                        },
                        {
                          "type": "null"
                        }
                      ],
                      "description": "Resource allocation for this container. Required for multi-container groups."
                    }
                  },
                  "required": [
                    "name"
                  ],
                  "title": "ContainerOverride",
                  "type": "object"
                },
                "title": "Containers",
                "type": "array"
              },
              "name": {
                "default": "default",
                "description": "Group name. Must match a container group name declared in the artifact.",
                "title": "Name",
                "type": "string"
              },
              "replicaCount": {
                "anyOf": [
                  {
                    "minimum": 1,
                    "type": "integer"
                  },
                  {
                    "type": "null"
                  }
                ],
                "default": 1,
                "description": "Number of replicas. Cannot be set alongside autoscaling.enabled=true.",
                "title": "Replicacount"
              },
              "resolvedBundle": {
                "anyOf": [
                  {
                    "description": "Bundle details returned in the runtime response after scheduling.",
                    "properties": {
                      "cpuCount": {
                        "description": "Number of CPU cores.",
                        "title": "CPU Count",
                        "type": "number"
                      },
                      "gpuCount": {
                        "default": 0,
                        "description": "Number of GPU units.",
                        "title": "GPU Count",
                        "type": "integer"
                      },
                      "gpuMaker": {
                        "anyOf": [
                          {
                            "type": "string"
                          },
                          {
                            "type": "null"
                          }
                        ],
                        "description": "GPU manufacturer.",
                        "title": "GPU Maker"
                      },
                      "gpuTypeLabel": {
                        "anyOf": [
                          {
                            "type": "string"
                          },
                          {
                            "type": "null"
                          }
                        ],
                        "description": "GPU type label.",
                        "title": "GPU Type Label"
                      },
                      "id": {
                        "description": "Bundle identifier that was selected.",
                        "title": "Id",
                        "type": "string"
                      },
                      "memoryBytes": {
                        "description": "Memory size in bytes.",
                        "title": "Memory Bytes",
                        "type": "integer"
                      }
                    },
                    "required": [
                      "id",
                      "cpuCount",
                      "memoryBytes"
                    ],
                    "title": "ResolvedBundle",
                    "type": "object"
                  },
                  {
                    "type": "null"
                  }
                ],
                "description": "Full details of the bundle selected at scheduling time. Read-only.",
                "readOnly": true
              },
              "resourceBundles": {
                "description": "Ordered list of bundle IDs. One is selected at scheduling time.",
                "items": {
                  "type": "string"
                },
                "title": "Resourcebundles",
                "type": "array"
              }
            },
            "title": "GroupRuntime",
            "type": "object"
          },
          "title": "Containergroups",
          "type": "array"
        }
      },
      "title": "WorkloadRuntime",
      "type": "object"
    },
    "status": {
      "description": "Statuses for workload replacement process.",
      "enum": [
        "unknown",
        "submitted",
        "initializing",
        "awaiting_promotion",
        "switching",
        "deleting",
        "completed",
        "errored",
        "cleaning_up"
      ],
      "title": "ReplacementStatus",
      "type": "string"
    },
    "strategy": {
      "description": "Types of replacement strategies. `rolling`: the new proton is deployed alongside the old one, and traffic is switched to the new proton once it is ready. The old proton is then decommissioned.",
      "enum": [
        "rolling"
      ],
      "title": "ReplacementStrategy",
      "type": "string"
    },
    "switchedAt": {
      "anyOf": [
        {
          "format": "date-time",
          "type": "string"
        },
        {
          "type": "null"
        }
      ],
      "description": "Timestamp of when the replacement took effect.",
      "title": "Switchedat"
    },
    "taskiqLastHeartbeat": {
      "anyOf": [
        {
          "format": "date-time",
          "type": "string"
        },
        {
          "type": "null"
        }
      ],
      "description": "Timestamp of the last taskiq poll for this replacement; used by the cron to detect abandoned taskiq-managed replacements.",
      "title": "Taskiqlastheartbeat"
    },
    "taskiqManaged": {
      "default": false,
      "description": "When true, this replacement is managed by the taskiq worker and should be skipped by the batch cronjob.",
      "title": "Taskiqmanaged",
      "type": "boolean"
    },
    "tenantId": {
      "description": "ID of the tenant this entity belongs to.",
      "format": "uuid4",
      "title": "Tenantid",
      "type": "string"
    },
    "updatedAt": {
      "description": "Timestamp of when the entity was last updated.",
      "format": "date-time",
      "title": "Updatedat",
      "type": "string"
    },
    "userId": {
      "description": "ID of the user who owns this entity.",
      "title": "Userid",
      "type": "string"
    },
    "workloadId": {
      "description": "Workload ID.",
      "title": "Workloadid",
      "type": "string"
    }
  },
  "required": [
    "id",
    "name",
    "createdAt",
    "updatedAt",
    "userId",
    "tenantId",
    "workloadId",
    "candidateArtifactId"
  ],
  "title": "Replacement",
  "type": "object"
}
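The `memory` field of `resourceAllocation` accepts either a raw byte integer or a string such as '8GB' with 1000-based units. The following is a minimal client-side sketch of how such values might be normalized to bytes before a request is built. The helper name `memory_to_bytes` is hypothetical, treating a string with no unit as bytes is an assumption, and the server's own parsing may differ (the schema pattern allows any word-character suffix, while this sketch accepts only the four documented units).

```python
import re

# Units named in the field description; 1000-based per the schema text.
_UNITS = {"b": 1, "kb": 1000, "mb": 1000**2, "gb": 1000**3}

# Anchored variant of the schema pattern: a number, then an optional unit.
_MEMORY_RE = re.compile(r"^\s*(\d*\.?\d+)\s*(\w+)?\s*$")


def memory_to_bytes(value):
    """Normalize a `memory` value (int bytes or a string like '8GB') to bytes."""
    if isinstance(value, bool):
        raise TypeError("memory must be an int or str, not bool")
    if isinstance(value, int):
        if value < 0:
            raise ValueError("memory in bytes must be >= 0")
        return value
    if not isinstance(value, str):
        raise TypeError("memory must be an int or str")
    match = _MEMORY_RE.match(value)
    if match is None:
        raise ValueError(f"unrecognized memory value: {value!r}")
    number, unit = match.groups()
    # Assumption: a bare number in a string is interpreted as bytes.
    unit = (unit or "b").lower()
    if unit not in _UNITS:
        raise ValueError(f"unsupported unit: {unit!r}")
    return int(float(number) * _UNITS[unit])
```

Normalizing on the client makes it easier to compare a requested allocation against the `memoryBytes` value reported in `resolvedBundle`.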

Responses

Status Meaning Description Schema
202 Accepted Successful Response Replacement
400 Bad Request Bad request None
401 Unauthorized Unauthenticated None
403 Forbidden Insufficient permissions None
404 Not Found Workload not found None
409 Conflict A replacement is already in progress for this workload None
422 Unprocessable Entity Validation Error HTTPValidationError
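
A client might map the status codes in the table above to short labels before deciding what to do next. The sketch below is illustrative only: the function name and label strings are hypothetical, and treating `completed` and `errored` as the final values of the `status` enum is an assumption that the schema does not state.

```python
from typing import Optional

# Assumption of this sketch: these `status` values mean the replacement is done.
ASSUMED_TERMINAL_STATUSES = {"completed", "errored"}

# Meanings copied from the Responses table.
ERROR_MEANINGS = {
    400: "bad_request",
    401: "unauthenticated",
    403: "insufficient_permissions",
    404: "workload_not_found",
    409: "replacement_already_in_progress",
    422: "validation_error",
}


def interpret_replacement_response(status_code: int, body: Optional[dict] = None) -> str:
    """Turn an HTTP status code and optional Replacement body into a short label."""
    if status_code == 202:
        # `status` is not in the schema's required list, so fall back to "unknown".
        state = (body or {}).get("status", "unknown")
        prefix = "finished" if state in ASSUMED_TERMINAL_STATUSES else "in_progress"
        return f"{prefix}:{state}"
    return ERROR_MEANINGS.get(status_code, f"unexpected:{status_code}")
```

For a 409 response, a caller would typically wait for the existing replacement to finish rather than resubmit immediately.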

Get Workload Shared Roles by workload ID

Operation path: GET /workloads/{workload_id}/sharedRoles

List the shared roles granted on a workload.

Parameters

Name In Type Required Description
workload_id path string true Workload ID
offset query integer false Skip the specified number of values.
limit query integer false Retrieve only the specified number of values.

Example responses

200 Response

{
  "additionalProperties": false,
  "properties": {
    "count": {
      "description": "The number of records on this page.",
      "title": "Count",
      "type": "integer"
    },
    "data": {
      "description": "The list of records.",
      "items": {
        "additionalProperties": false,
        "description": "Represents a recipient (user, group, or organization) with access to an entity. This model is used for listing and managing access control on shared resources, providing information about who has access and what role they have.",
        "properties": {
          "id": {
            "description": "The identifier of the recipient.",
            "title": "Id",
            "type": "string"
          },
          "name": {
            "description": "The name of the recipient.",
            "title": "Name",
            "type": "string"
          },
          "role": {
            "description": "External sharing roles representing the permission level a user, group or organization holds on an entity. These roles map to internal permissions and are used in sharing APIs.",
            "enum": [
              "NO_ROLE",
              "OWNER",
              "READ_WRITE",
              "EDITOR",
              "USER",
              "DATA_SCIENTIST",
              "ADMIN",
              "READ_ONLY",
              "CONSUMER",
              "OBSERVER"
            ],
            "title": "SharingRole",
            "type": "string"
          },
          "shareRecipientType": {
            "description": "Enum of possible subject types.",
            "enum": [
              "user",
              "group",
              "organization",
              "role"
            ],
            "title": "SubjectType",
            "type": "string"
          }
        },
        "required": [
          "id",
          "name",
          "shareRecipientType",
          "role"
        ],
        "title": "SharedRole",
        "type": "object"
      },
      "title": "Data",
      "type": "array"
    },
    "next": {
      "anyOf": [
        {
          "type": "string"
        },
        {
          "type": "null"
        }
      ],
      "description": "The URL to the next page, or `null` if there is no such page.",
      "title": "Next"
    },
    "previous": {
      "anyOf": [
        {
          "type": "string"
        },
        {
          "type": "null"
        }
      ],
      "description": "The URL to the previous page, or `null` if there is no such page.",
      "title": "Previous"
    },
    "totalCount": {
      "description": "The total number of records.",
      "title": "Totalcount",
      "type": "integer"
    }
  },
  "required": [
    "totalCount",
    "count",
    "next",
    "previous",
    "data"
  ],
  "title": "SharedRoleListResponse",
  "type": "object"
}

Responses

Status Meaning Description Schema
200 OK Successful Response SharedRoleListResponse
400 Bad Request Bad request None
401 Unauthorized Unauthenticated None
403 Forbidden Insufficient permissions None
404 Not Found Workload not found None
422 Unprocessable Entity Validation Error HTTPValidationError
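
Because the response is paginated through `offset`, `limit`, `count`, and `next`, a caller usually needs a loop to collect every shared role. The sketch below shows one way to do that. `fetch_page` is a hypothetical callable that the caller supplies; it is assumed to perform `GET /workloads/{workload_id}/sharedRoles` with the given `offset` and `limit` and to return the decoded `SharedRoleListResponse` as a dict.

```python
from typing import Callable, Iterator


def iter_shared_roles(
    fetch_page: Callable[[int, int], dict], limit: int = 50
) -> Iterator[dict]:
    """Yield every SharedRole record, advancing `offset` until `next` is null."""
    offset = 0
    while True:
        page = fetch_page(offset, limit)
        yield from page["data"]
        # Stop on the last page, or if a page is empty, to avoid looping forever.
        if page["next"] is None or page["count"] == 0:
            break
        offset += page["count"]
```

Passing the HTTP call in as a function keeps the loop independent of any particular client library and makes it easy to exercise with canned pages.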

Update Workload Shared Roles by workload ID

Operation path: PATCH /workloads/{workload_id}/sharedRoles

Replace the shared roles for a workload with the provided list.

Body parameter

{
  "additionalProperties": false,
  "description": "Request model for updating shared roles on an entity. Used to grant access, remove access, or update roles for organizations, groups, or users. Up to 100 roles may be set in a single request.",
  "properties": {
    "operation": {
      "const": "updateRoles",
      "description": "Name of the action being taken. The only operation is 'updateRoles'.",
      "title": "Operation",
      "type": "string"
    },
    "roles": {
      "description": "Array of GrantAccessControl objects, up to a maximum of 100.",
      "items": {
        "anyOf": [
          {
            "additionalProperties": false,
            "description": "Grant access control request using username for user identification.",
            "properties": {
              "role": {
                "description": "External sharing roles representing the permission level a user, group or organization holds on an entity. These roles map to internal permissions and are used in sharing APIs.",
                "enum": [
                  "NO_ROLE",
                  "OWNER",
                  "READ_WRITE",
                  "EDITOR",
                  "USER",
                  "DATA_SCIENTIST",
                  "ADMIN",
                  "READ_ONLY",
                  "CONSUMER",
                  "OBSERVER"
                ],
                "title": "SharingRole",
                "type": "string"
              },
              "shareRecipientType": {
                "description": "Enum of possible subject types.",
                "enum": [
                  "user",
                  "group",
                  "organization",
                  "role"
                ],
                "title": "SubjectType",
                "type": "string"
              },
              "username": {
                "description": "Username of the user to update the access role for.",
                "title": "Username",
                "type": "string"
              }
            },
            "required": [
              "shareRecipientType",
              "role",
              "username"
            ],
            "title": "GrantAccessControlWithUsername",
            "type": "object"
          },
          {
            "additionalProperties": false,
            "description": "Grant access control request using ID for recipient identification. Can be used for users, groups, or organizations.",
            "properties": {
              "id": {
                "description": "The ID of the recipient.",
                "title": "Id",
                "type": "string"
              },
              "role": {
                "description": "External sharing roles representing the permission level a user, group or organization holds on an entity. these roles map to internal permissions and are used in sharing apis.",
                "enum": [
                  "NO_ROLE",
                  "OWNER",
                  "READ_WRITE",
                  "EDITOR",
                  "USER",
                  "DATA_SCIENTIST",
                  "ADMIN",
                  "READ_ONLY",
                  "CONSUMER",
                  "OBSERVER"
                ],
                "title": "SharingRole",
                "type": "string"
              },
              "shareRecipientType": {
                "description": "Enum of possible subject types.",
                "enum": [
                  "user",
                  "group",
                  "organization",
                  "role"
                ],
                "title": "SubjectType",
                "type": "string"
              }
            },
            "required": [
              "shareRecipientType",
              "role",
              "id"
            ],
            "title": "GrantAccessControlWithId",
            "type": "object"
          }
        ]
      },
      "maxItems": 100,
      "minItems": 1,
      "title": "Roles",
      "type": "array"
    }
  },
  "required": [
    "operation",
    "roles"
  ],
  "title": "SharedRolesUpdateRequest",
  "type": "object"
}

Parameters

Name In Type Required Description
workload_id path string true Workload ID
body body SharedRolesUpdateRequest true none

Example responses

422 Response

{
  "properties": {
    "detail": {
      "items": {
        "properties": {
          "ctx": {
            "title": "Context",
            "type": "object"
          },
          "input": {
            "title": "Input"
          },
          "loc": {
            "items": {
              "anyOf": [
                {
                  "type": "string"
                },
                {
                  "type": "integer"
                }
              ]
            },
            "title": "Location",
            "type": "array"
          },
          "msg": {
            "title": "Message",
            "type": "string"
          },
          "type": {
            "title": "Error Type",
            "type": "string"
          }
        },
        "required": [
          "loc",
          "msg",
          "type"
        ],
        "title": "ValidationError",
        "type": "object"
      },
      "title": "Detail",
      "type": "array"
    }
  },
  "title": "HTTPValidationError",
  "type": "object"
}

Responses

Status Meaning Description Schema
204 No Content Successful Response None
400 Bad Request Bad request None
401 Unauthorized Unauthenticated None
403 Forbidden Insufficient permissions None
404 Not Found Workload not found None
422 Unprocessable Entity Validation Error HTTPValidationError
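
The constraints on the `roles` array in the request schema above (1 to 100 entries, each identifying the recipient by either `username` or `id`, with no extra properties) can be checked client-side before the request is sent. The following is a minimal sketch under that reading of the schema, not part of the API: the helper name `check_roles` is hypothetical, and the `operation` field is left out because its allowed values are not covered here.

```python
SHARING_ROLES = {
    "NO_ROLE", "OWNER", "READ_WRITE", "EDITOR", "USER",
    "DATA_SCIENTIST", "ADMIN", "READ_ONLY", "CONSUMER", "OBSERVER",
}
SUBJECT_TYPES = {"user", "group", "organization", "role"}
COMMON_KEYS = {"role", "shareRecipientType"}


def check_roles(roles):
    """Return a list of problems found in a `roles` array (empty if none)."""
    problems = []
    # minItems is 1 and maxItems is 100.
    if not 1 <= len(roles) <= 100:
        problems.append("roles must contain between 1 and 100 entries")
    for i, entry in enumerate(roles):
        keys = set(entry)
        # additionalProperties is false, so an entry must carry exactly the
        # common keys plus one identifier: `username` or `id`.
        if keys not in (COMMON_KEYS | {"username"}, COMMON_KEYS | {"id"}):
            problems.append(
                f"roles[{i}]: expected role, shareRecipientType and one of username/id"
            )
            continue
        if entry["role"] not in SHARING_ROLES:
            problems.append(f"roles[{i}]: unknown role {entry['role']!r}")
        if entry["shareRecipientType"] not in SUBJECT_TYPES:
            problems.append(
                f"roles[{i}]: unknown shareRecipientType {entry['shareRecipientType']!r}"
            )
    return problems
```

An empty return value only means the entries match the documented shape; the server remains the authority and may still answer with 400, 403, or 422.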

Start Workload by Workload ID

Operation path: POST /workloads/{workload_id}/start

Start a workload by scheduling its proton.

Parameters

Name In Type Required Description
workload_id path string true Workload ID

Example responses

202 Response

{
  "additionalProperties": false,
  "description": "Acknowledgement returned by asynchronous workload lifecycle operations (start/stop). The operation has been accepted and queued. Poll ``GET /workloads/{workload_id}`` to observe the resulting status transition.",
  "properties": {
    "status": {
      "description": "Human-readable description of the operation outcome.",
      "title": "Status",
      "type": "string"
    },
    "trackVia": {
      "description": "URL to poll in order to observe the status transition.",
      "title": "Track Via",
      "type": "string"
    },
    "workloadId": {
      "description": "ID of the workload on which the operation was requested.",
      "title": "Workload ID",
      "type": "string"
    }
  },
  "required": [
    "status",
    "workloadId",
    "trackVia"
  ],
  "title": "WorkloadOperationResponse",
  "type": "object"
}

Responses

Status Meaning Description Schema
202 Accepted Successful Response WorkloadOperationResponse
400 Bad Request Bad request None
401 Unauthorized Unauthenticated None
403 Forbidden Insufficient permissions None
404 Not Found Workload not found None
409 Conflict Workload proton must be stopped before starting None
422 Unprocessable Entity Validation Error HTTPValidationError
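
Because the start operation is asynchronous, a client typically sends the request and then polls the URL returned in `trackVia`. The sketch below illustrates that loop under stated assumptions: `post` and `get` are hypothetical caller-supplied callables that perform the authenticated HTTP call and return the decoded JSON body, and `is_done` is a caller-supplied predicate, since the status values that mark a finished transition are not listed in this section.

```python
import time


def start_and_wait(post, get, workload_id, is_done,
                   interval=2.0, max_polls=30, sleep=time.sleep):
    """Start a workload, then poll `trackVia` until `is_done(body)` is true.

    `post(path)` and `get(url)` are hypothetical transport callables that
    return the decoded JSON response body.
    """
    # A 202 response carries status, workloadId and trackVia.
    ack = post(f"/workloads/{workload_id}/start")
    for _ in range(max_polls):
        body = get(ack["trackVia"])
        if is_done(body):
            return body
        sleep(interval)
    raise TimeoutError(
        f"workload {ack['workloadId']} did not reach the expected state"
    )
```

Bounding the loop with `max_polls` keeps a stuck transition from blocking the caller indefinitely; a 409 from the start call would surface through whatever error handling `post` implements.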

Reset Workload Stats by Workload ID

Operation path: DELETE /workloads/{workload_id}/stats

Clear collected statistics for a workload within the specified time range.

Parameters

Name In Type Required Description
workload_id path string true Workload ID
protonId query any false Proton ID to clear stats for (optional, defaults to current proton)
startTime query any false Start of the time range to clear
endTime query any false End of the time range to clear

Example responses

422 Response

{
  "properties": {
    "detail": {
      "items": {
        "properties": {
          "ctx": {
            "title": "Context",
            "type": "object"
          },
          "input": {
            "title": "Input"
          },
          "loc": {
            "items": {
              "anyOf": [
                {
                  "type": "string"
                },
                {
                  "type": "integer"
                }
              ]
            },
            "title": "Location",
            "type": "array"
          },
          "msg": {
            "title": "Message",
            "type": "string"
          },
          "type": {
            "title": "Error Type",
            "type": "string"
          }
        },
        "required": [
          "loc",
          "msg",
          "type"
        ],
        "title": "ValidationError",
        "type": "object"
      },
      "title": "Detail",
      "type": "array"
    }
  },
  "title": "HTTPValidationError",
  "type": "object"
}

Responses

Status Meaning Description Schema
204 No Content Successful Response None
400 Bad Request Invalid time range None
401 Unauthorized Unauthenticated None
403 Forbidden Insufficient permissions None
404 Not Found Workload not found None
422 Unprocessable Entity Validation Error HTTPValidationError

Get Workload Stats by Workload ID

Operation path: GET /workloads/{workload_id}/stats

Get aggregated performance statistics for a workload.

Parameters

Name In Type Required Description
workload_id path string true Workload ID
protonId query any false Proton ID to get stats for (optional, defaults to current proton)
startTime query any false Start time for stats
endTime query any false End time for stats
responseTimeQuantile query number false Response time quantile (e.g., 0.95 for p95)
slowRequestsThreshold query integer false Slow requests threshold in milliseconds

Example responses

200 Response

{
  "additionalProperties": false,
  "description": "Proton request statistics with time period.",
  "properties": {
    "metrics": {
      "additionalProperties": false,
      "description": "Detailed request metrics.",
      "properties": {
        "concurrentRequests": {
          "default": 0,
          "description": "Current concurrent requests.",
          "title": "Concurrentrequests",
          "type": "integer"
        },
        "requestsPerMinute": {
          "default": 0,
          "description": "Average requests per minute.",
          "title": "Requestsperminute",
          "type": "integer"
        },
        "responseTime": {
          "default": 0,
          "description": "Average response time in milliseconds.",
          "title": "Responsetime",
          "type": "integer"
        },
        "serverErrorRate": {
          "default": 0,
          "description": "Server error rate.",
          "title": "Servererrorrate",
          "type": "number"
        },
        "serverErrors": {
          "default": 0,
          "description": "Number of server errors (5xx).",
          "title": "Servererrors",
          "type": "integer"
        },
        "slowRequests": {
          "default": 0,
          "description": "Number of slow requests exceeding threshold.",
          "title": "Slowrequests",
          "type": "integer"
        },
        "totalErrorRate": {
          "default": 0,
          "description": "Total error rate.",
          "title": "Totalerrorrate",
          "type": "number"
        },
        "totalRequests": {
          "default": 0,
          "description": "Total number of requests.",
          "title": "Totalrequests",
          "type": "integer"
        },
        "userErrorRate": {
          "default": 0,
          "description": "User error rate.",
          "title": "Usererrorrate",
          "type": "number"
        },
        "userErrors": {
          "default": 0,
          "description": "Number of user errors (4xx).",
          "title": "Usererrors",
          "type": "integer"
        }
      },
      "title": "RequestMetrics",
      "type": "object"
    },
    "period": {
      "anyOf": [
        {
          "additionalProperties": false,
          "description": "Time period definition.",
          "properties": {
            "end": {
              "anyOf": [
                {
                  "format": "date-time",
                  "type": "string"
                },
                {
                  "type": "null"
                }
              ],
              "description": "Period end time.",
              "title": "End"
            },
            "start": {
              "anyOf": [
                {
                  "format": "date-time",
                  "type": "string"
                },
                {
                  "type": "null"
                }
              ],
              "description": "Period start time.",
              "title": "Start"
            }
          },
          "title": "Period",
          "type": "object"
        },
        {
          "type": "null"
        }
      ],
      "description": "Time period."
    }
  },
  "title": "ProtonRequestStats",
  "type": "object"
}

Responses

Status Meaning Description Schema
200 OK Successful Response ProtonRequestStats
400 Bad Request Bad request None
401 Unauthorized Unauthenticated None
403 Forbidden Insufficient permissions None
404 Not Found Workload not found None
422 Unprocessable Entity Validation Error HTTPValidationError
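
As an illustration of how the query parameters of this endpoint combine, the sketch below assembles a request path. It rests on assumptions rather than on anything the reference states: timestamps are sent as ISO 8601 strings (the parameters are typed `any`), the quantile is expected to lie between 0 and 1 (suggested by the `0.95` example), and the helper name `stats_url` is hypothetical.

```python
from datetime import datetime, timezone
from urllib.parse import urlencode


def stats_url(workload_id, start=None, end=None, quantile=None,
              slow_ms=None, proton_id=None):
    """Build the path and query string for GET /workloads/{workload_id}/stats."""
    params = {}
    if proton_id is not None:
        params["protonId"] = proton_id
    if start is not None:
        params["startTime"] = start.isoformat()  # assumed ISO 8601 format
    if end is not None:
        params["endTime"] = end.isoformat()      # assumed ISO 8601 format
    if quantile is not None:
        # Assumed range: 0.95 means p95, so values outside [0, 1] are rejected.
        if not 0 <= quantile <= 1:
            raise ValueError("responseTimeQuantile must be between 0 and 1")
        params["responseTimeQuantile"] = quantile
    if slow_ms is not None:
        params["slowRequestsThreshold"] = int(slow_ms)  # milliseconds
    query = urlencode(params)
    return f"/workloads/{workload_id}/stats" + (f"?{query}" if query else "")


if __name__ == "__main__":
    since = datetime(2024, 1, 1, tzinfo=timezone.utc)
    print(stats_url("w1", start=since, quantile=0.95, slow_ms=500))
```

Omitted parameters are left out of the query entirely, so the server applies its own defaults, such as the current proton for `protonId`.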

Get Workload Stats Per Metric by Workload ID

Operation path: GET /workloads/{workload_id}/stats/{metric_name}

Get time-series data for a specific performance metric of a workload.

Parameters

Name In Type Required Description
metric_name path WorkloadStatsMetricName true Name of the metric to retrieve
workload_id path string true Workload ID
protonId query any false Proton ID to get stats for (optional, defaults to current proton)
startTime query any false Start time for stats
endTime query any false End time for stats
responseTimeQuantile query number false Response time quantile (e.g., 0.95 for p95)
slowRequestsThreshold query integer false Slow requests threshold in milliseconds
resolution query OtelMetricResolution false Time resolution for data aggregation

Enumerated Values

Parameter Value
metric_name [totalRequests, requestsOverN, requestsPerMinute, concurrentRequests, responseTime, totalErrorRate]
resolution [PT1M, PT5M, PT1H, P1D, P7D, P1M]

Example responses

200 Response

{
  "additionalProperties": false,
  "description": "Proton request metric over time.",
  "properties": {
    "buckets": {
      "description": "Time-bucketed metric values with flexible structure.",
      "items": {
        "additionalProperties": true,
        "type": "object"
      },
      "title": "Buckets",
      "type": "array"
    },
    "metric": {
      "description": "Metric names for workload statistics.",
      "enum": [
        "totalRequests",
        "requestsOverN",
        "requestsPerMinute",
        "concurrentRequests",
        "responseTime",
        "totalErrorRate"
      ],
      "title": "WorkloadStatsMetricName",
      "type": "string"
    },
    "summary": {
      "additionalProperties": false,
      "description": "Summary information for proton statistics.",
      "properties": {
        "period": {
          "additionalProperties": false,
          "description": "Time period definition.",
          "properties": {
            "end": {
              "anyOf": [
                {
                  "format": "date-time",
                  "type": "string"
                },
                {
                  "type": "null"
                }
              ],
              "description": "Period end time.",
              "title": "End"
            },
            "start": {
              "anyOf": [
                {
                  "format": "date-time",
                  "type": "string"
                },
                {
                  "type": "null"
                }
              ],
              "description": "Period start time.",
              "title": "Start"
            }
          },
          "title": "Period",
          "type": "object"
        },
        "protonId": {
          "description": "Proton ID.",
          "title": "Protonid",
          "type": "string"
        }
      },
      "required": [
        "protonId",
        "period"
      ],
      "title": "Summary",
      "type": "object"
    }
  },
  "required": [
    "metric",
    "summary"
  ],
  "title": "ProtonRequestMetricOverTime",
  "type": "object"
}

Responses

Status Meaning Description Schema
200 OK Successful Response ProtonRequestMetricOverTime
400 Bad Request Bad request None
401 Unauthorized Unauthenticated None
403 Forbidden Insufficient permissions None
404 Not Found Workload not found None
422 Unprocessable Entity Validation Error HTTPValidationError
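
Both `metric_name` and `resolution` accept only the enumerated values listed above, so a client can reject a typo before making the call. The following is a small sketch of that check; the helper name `metric_path` is hypothetical. The resolution values appear to follow ISO 8601 duration notation (for example, `PT1M` for one minute and `P1M` for one month), although the reference does not say so explicitly.

```python
METRIC_NAMES = (
    "totalRequests", "requestsOverN", "requestsPerMinute",
    "concurrentRequests", "responseTime", "totalErrorRate",
)
RESOLUTIONS = ("PT1M", "PT5M", "PT1H", "P1D", "P7D", "P1M")


def metric_path(workload_id, metric_name, resolution=None):
    """Build the path for GET /workloads/{workload_id}/stats/{metric_name}."""
    if metric_name not in METRIC_NAMES:
        raise ValueError(f"unknown metric_name: {metric_name!r}")
    path = f"/workloads/{workload_id}/stats/{metric_name}"
    if resolution is None:
        return path  # let the server pick its default resolution
    if resolution not in RESOLUTIONS:
        raise ValueError(f"unknown resolution: {resolution!r}")
    return f"{path}?resolution={resolution}"
```

For instance, `metric_path("w1", "responseTime", "PT5M")` yields `/workloads/w1/stats/responseTime?resolution=PT5M`.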

Stop Workload by Workload ID

Operation path: POST /workloads/{workload_id}/stop

Stop a workload by stopping its proton.

Parameters

Name In Type Required Description
workload_id path string true Workload ID

Example responses

202 Response

{
  "additionalProperties": false,
  "description": "Acknowledgement returned by asynchronous workload lifecycle operations (start/stop). The operation has been accepted and queued. Poll ``GET /workloads/{workload_id}`` to observe the resulting status transition.",
  "properties": {
    "status": {
      "description": "Human-readable description of the operation outcome.",
      "title": "Status",
      "type": "string"
    },
    "trackVia": {
      "description": "URL to poll in order to observe the status transition.",
      "title": "Track Via",
      "type": "string"
    },
    "workloadId": {
      "description": "ID of the workload on which the operation was requested.",
      "title": "Workload ID",
      "type": "string"
    }
  },
  "required": [
    "status",
    "workloadId",
    "trackVia"
  ],
  "title": "WorkloadOperationResponse",
  "type": "object"
}

Responses

Status Meaning Description Schema
202 Accepted Successful Response WorkloadOperationResponse
400 Bad Request Bad request None
401 Unauthorized Unauthenticated None
403 Forbidden Insufficient permissions None
404 Not Found Workload not found None
422 Unprocessable Entity Validation Error HTTPValidationError

Schemas

ANY_PERMISSION

{
  "const": "*",
  "type": "string"
}

Properties

Name Type Required Restrictions Description
anonymous string false none

ArtifactInfoFormatted

{
  "description": "Artifact basic information.",
  "properties": {
    "artifactRepositoryId": {
      "anyOf": [
        {
          "type": "string"
        },
        {
          "type": "null"
        }
      ],
      "description": "ID of the artifact repository this artifact belongs to (for versioning).",
      "title": "Artifactrepositoryid"
    },
    "id": {
      "description": "Unique identifier of the entity.",
      "title": "Id",
      "type": "string"
    },
    "name": {
      "anyOf": [
        {
          "type": "string"
        },
        {
          "type": "null"
        }
      ],
      "description": "Name of the entity.",
      "title": "Name"
    },
    "status": {
      "anyOf": [
        {
          "enum": [
            "draft",
            "locked"
          ],
          "title": "ArtifactStatus",
          "type": "string"
        },
        {
          "type": "null"
        }
      ],
      "description": "Artifact status."
    },
    "templateId": {
      "anyOf": [
        {
          "type": "string"
        },
        {
          "type": "null"
        }
      ],
      "description": "ID of the template used to create this artifact.",
      "title": "Templateid"
    },
    "type": {
      "anyOf": [
        {
          "description": "Discriminator for the artifact spec variant. Used to label the workload, which may be used to prioritize the best matching operator available in the cluster for scheduling. Defaults to ``service`` when omitted. - ``service``: generic service artifact. - ``nim``: NVIDIA NIM model artifact.",
          "enum": [
            "service",
            "nim"
          ],
          "title": "ArtifactType",
          "type": "string"
        },
        {
          "type": "null"
        }
      ],
      "description": "Artifact type."
    },
    "version": {
      "anyOf": [
        {
          "type": "integer"
        },
        {
          "type": "null"
        }
      ],
      "description": "Version number of the artifact (set only for locked artifacts).",
      "title": "Version"
    }
  },
  "required": [
    "id"
  ],
  "title": "ArtifactInfoFormatted",
  "type": "object"
}

ArtifactInfoFormatted

Properties

Name Type Required Restrictions Description
artifactRepositoryId any false ID of the artifact repository this artifact belongs to (for versioning).

anyOf

Name Type Required Restrictions Description
» anonymous string false none

or

Name Type Required Restrictions Description
» anonymous null false none

continued

Name Type Required Restrictions Description
id string true Unique identifier of the entity.
name any false Name of the entity.

anyOf

Name Type Required Restrictions Description
» anonymous string false none

or

Name Type Required Restrictions Description
» anonymous null false none

continued

Name Type Required Restrictions Description
status any false Artifact status.

anyOf

Name Type Required Restrictions Description
» anonymous ArtifactStatus false none

or

Name Type Required Restrictions Description
» anonymous null false none

continued

Name Type Required Restrictions Description
templateId any false ID of the template used to create this artifact.

anyOf

Name Type Required Restrictions Description
» anonymous string false none

or

Name Type Required Restrictions Description
» anonymous null false none

continued

Name Type Required Restrictions Description
type any false Artifact type.

anyOf

Name Type Required Restrictions Description
» anonymous ArtifactType false Discriminator for the artifact spec variant. Used to label the workload, which may be used to prioritize the best matching operator available in the cluster for scheduling. Defaults to service when omitted. - service: generic service artifact. - nim: NVIDIA NIM model artifact.

or

Name Type Required Restrictions Description
» anonymous null false none

continued

Name Type Required Restrictions Description
version any false Version number of the artifact (set only for locked artifacts).

anyOf

Name Type Required Restrictions Description
» anonymous integer false none

or

Name Type Required Restrictions Description
» anonymous null false none

ArtifactStatus

{
  "enum": [
    "draft",
    "locked"
  ],
  "title": "ArtifactStatus",
  "type": "string"
}

ArtifactStatus

Properties

Name Type Required Restrictions Description
ArtifactStatus string false none

Enumerated Values

Property Value
ArtifactStatus [draft, locked]

ArtifactType

{
  "description": "Discriminator for the artifact spec variant. Used to label the workload, which may be used to prioritize the best matching operator available in the cluster for scheduling. Defaults to ``service`` when omitted. - ``service``: generic service artifact. - ``nim``: NVIDIA NIM model artifact.",
  "enum": [
    "service",
    "nim"
  ],
  "title": "ArtifactType",
  "type": "string"
}

ArtifactType

Properties

Name Type Required Restrictions Description
ArtifactType string false Discriminator for the artifact spec variant. Used to label the workload, which may be used to prioritize the best matching operator available in the cluster for scheduling. Defaults to service when omitted. - service: generic service artifact. - nim: NVIDIA NIM model artifact.

Enumerated Values

Property Value
ArtifactType [service, nim]

AutoscalingPolicy

{
  "additionalProperties": false,
  "description": "Base class for autoscaling policies.",
  "properties": {
    "maxCount": {
      "description": "Maximum number of replicas.",
      "minimum": 0,
      "title": "Max Count",
      "type": "integer"
    },
    "minCount": {
      "description": "Minimum number of replicas.",
      "minimum": 0,
      "title": "Min Count",
      "type": "integer"
    },
    "priority": {
      "anyOf": [
        {
          "type": "integer"
        },
        {
          "type": "null"
        }
      ],
      "description": "Policy priority when multiple policies are defined.",
      "title": "Priority"
    },
    "scalingMetric": {
      "anyOf": [
        {
          "oneOf": [
            {
              "const": "cpuAverageUtilization",
              "description": "Scale replicas to maintain a target average CPU utilization across pods.",
              "title": "CPU Average Utilization"
            },
            {
              "const": "httpRequestsConcurrency",
              "description": "Scale replicas based on HTTP request concurrency using an external HTTP-aware autoscaler. The platform manages the underlying autoscaling resources on your behalf. This scaling option will scale to zero replicas when the proton is idle.",
              "title": "HTTP Requests Concurrency"
            },
            {
              "const": "gpuCacheUtilization",
              "description": "Scales replicas based on model-specific GPU memory cache utilization. This signal reflects how the model's KV cache is used during inference, when such metrics are exposed by the serving runtime. High cache utilization may indicate memory pressure and can be used to trigger scale-out to maintain throughput. Applicable to NIM Artifacts only.",
              "title": "GPU Cache Utilization"
            },
            {
              "const": "gpuRequestQueueDepth",
              "description": "Scales replicas based on the depth of the inference request queue. This metric represents the number of incoming requests waiting to be processed by the inference service. Increasing queue depth may indicate insufficient capacity and can be used to trigger additional replicas to reduce latency. Applicable to NIM Artifacts only.",
              "title": "GPU Request Queue Depth"
            }
          ],
          "title": "ScalingMetricType",
          "type": "string"
        },
        {
          "type": "string"
        }
      ],
      "description": "Metric used for scaling decisions. Use one of the predefined values for standard autoscaling, or provide a custom metric name for NIM 2.0 workloads (e.g. 'vllm:kv_cache_usage_perc'). Custom metric names are only supported for NIM artifacts.",
      "title": "Scaling Metric"
    },
    "target": {
      "description": "Target value for the scaling metric.",
      "minimum": 0,
      "title": "Target",
      "type": "number"
    }
  },
  "required": [
    "scalingMetric",
    "target",
    "minCount",
    "maxCount"
  ],
  "title": "AutoscalingPolicy",
  "type": "object"
}

AutoscalingPolicy

Properties

Name Type Required Restrictions Description
maxCount integer true minimum: 0 Maximum number of replicas.
minCount integer true minimum: 0 Minimum number of replicas.
priority any false Policy priority when multiple policies are defined.

anyOf

Name Type Required Restrictions Description
» anonymous integer false none

or

Name Type Required Restrictions Description
» anonymous null false none

continued

Name Type Required Restrictions Description
scalingMetric any true Metric used for scaling decisions. Use one of the predefined values for standard autoscaling, or provide a custom metric name for NIM 2.0 workloads (e.g. 'vllm:kv_cache_usage_perc'). Custom metric names are only supported for NIM artifacts.

anyOf

Name Type Required Restrictions Description
» anonymous ScalingMetricType false none

or

Name Type Required Restrictions Description
» anonymous string false none

continued

Name Type Required Restrictions Description
target number true minimum: 0 Target value for the scaling metric.
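
The required fields and `minimum: 0` restrictions of `AutoscalingPolicy` lend themselves to a quick client-side check before a policy is submitted. The sketch below is illustrative only, and the helper name `check_policy` is hypothetical. The rule that `minCount` must not exceed `maxCount` is an assumption added for the example; the schema itself does not state it.

```python
REQUIRED_FIELDS = ("scalingMetric", "target", "minCount", "maxCount")


def check_policy(policy):
    """Return a list of problems found in an autoscaling policy dict."""
    problems = [f"missing {key}" for key in REQUIRED_FIELDS if key not in policy]
    if problems:
        return problems
    # The schema sets minimum: 0 on target, minCount and maxCount.
    for key in ("target", "minCount", "maxCount"):
        if policy[key] < 0:
            problems.append(f"{key} must be >= 0")
    # Assumption, not stated in the schema: the replica range must be ordered.
    if policy["minCount"] > policy["maxCount"]:
        problems.append("minCount exceeds maxCount")
    return problems
```

A policy such as `{"scalingMetric": "cpuAverageUtilization", "target": 70, "minCount": 1, "maxCount": 5}` passes this check with no reported problems.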

AutoscalingProperties

{
  "additionalProperties": false,
  "description": "Autoscaling configuration for a proton.",
  "properties": {
    "enabled": {
      "default": true,
      "description": "Whether autoscaling is enabled.",
      "title": "Enabled",
      "type": "boolean"
    },
    "policies": {
      "items": {
        "additionalProperties": false,
        "description": "Base class for autoscaling policies.",
        "properties": {
          "maxCount": {
            "description": "Maximum number of replicas.",
            "minimum": 0,
            "title": "Max Count",
            "type": "integer"
          },
          "minCount": {
            "description": "Minimum number of replicas.",
            "minimum": 0,
            "title": "Min Count",
            "type": "integer"
          },
          "priority": {
            "anyOf": [
              {
                "type": "integer"
              },
              {
                "type": "null"
              }
            ],
            "description": "Policy priority when multiple policies are defined.",
            "title": "Priority"
          },
          "scalingMetric": {
            "anyOf": [
              {
                "oneOf": [
                  {
                    "const": "cpuAverageUtilization",
                    "description": "Scale replicas to maintain a target average CPU utilization across pods.",
                    "title": "CPU Average Utilization"
                  },
                  {
                    "const": "httpRequestsConcurrency",
                    "description": "Scale replicas based on HTTP request concurrency using an external HTTP-aware autoscaler. The platform manages the underlying autoscaling resources on your behalf. This scaling option will scale to zero replicas when the proton is idle.",
                    "title": "HTTP Requests Concurrency"
                  },
                  {
                    "const": "gpuCacheUtilization",
                    "description": "Scales replicas based on model-specific GPU memory cache utilization. This signal reflects how the model's KV cache is used during inference, when such metrics are exposed by the serving runtime. High cache utilization may indicate memory pressure and can be used to trigger scale-out to maintain throughput. Applicable to NIM Artifacts only.",
                    "title": "GPU Cache Utilization"
                  },
                  {
                    "const": "gpuRequestQueueDepth",
                    "description": "Scales replicas based on the depth of the inference request queue. This metric represents the number of incoming requests waiting to be processed by the inference service. Increasing queue depth may indicate insufficient capacity and can be used to trigger additional replicas to reduce latency. Applicable to NIM Artifacts only.",
                    "title": "GPU Request Queue Depth"
                  }
                ],
                "title": "ScalingMetricType",
                "type": "string"
              },
              {
                "type": "string"
              }
            ],
            "description": "Metric used for scaling decisions. Use one of the predefined values for standard autoscaling, or provide a custom metric name for NIM 2.0 workloads (e.g. 'vllm:kv_cache_usage_perc'). Custom metric names are only supported for NIM artifacts.",
            "title": "Scaling Metric"
          },
          "target": {
            "description": "Target value for the scaling metric.",
            "minimum": 0,
            "title": "Target",
            "type": "number"
          }
        },
        "required": [
          "scalingMetric",
          "target",
          "minCount",
          "maxCount"
        ],
        "title": "AutoscalingPolicy",
        "type": "object"
      },
      "title": "Policies",
      "type": "array"
    }
  },
  "required": [
    "policies"
  ],
  "title": "AutoscalingProperties",
  "type": "object"
}

AutoscalingProperties

Properties

Name Type Required Restrictions Description
enabled boolean false Whether autoscaling is enabled.
policies [AutoscalingPolicy] true [Base class for autoscaling policies.]

BundleSelectionPolicy

{
  "enum": [
    "availability"
  ],
  "title": "BundleSelectionPolicy",
  "type": "string"
}

BundleSelectionPolicy

Properties

Name Type Required Restrictions Description
BundleSelectionPolicy string false none

Enumerated Values

Property Value
BundleSelectionPolicy availability

Capabilities

{
  "additionalProperties": false,
  "description": "Linux capabilities to add or drop from the container.",
  "properties": {
    "add": {
      "anyOf": [
        {
          "items": {
            "type": "string"
          },
          "type": "array"
        },
        {
          "type": "null"
        }
      ],
      "description": "Capabilities to add.",
      "title": "Add"
    },
    "drop": {
      "anyOf": [
        {
          "items": {
            "type": "string"
          },
          "type": "array"
        },
        {
          "type": "null"
        }
      ],
      "description": "Capabilities to drop.",
      "title": "Drop"
    }
  },
  "title": "Capabilities",
  "type": "object"
}

Capabilities

Properties

Name Type Required Restrictions Description
add any false Capabilities to add.

anyOf

Name Type Required Restrictions Description
» anonymous [string] false none

or

Name Type Required Restrictions Description
» anonymous null false none

continued

Name Type Required Restrictions Description
drop any false Capabilities to drop.

anyOf

Name Type Required Restrictions Description
» anonymous [string] false none

or

Name Type Required Restrictions Description
» anonymous null false none

CodeRef

{
  "additionalProperties": false,
  "properties": {
    "datarobot": {
      "additionalProperties": false,
      "properties": {
        "catalogId": {
          "title": "Catalogid",
          "type": "string"
        },
        "catalogVersionId": {
          "title": "Catalogversionid",
          "type": "string"
        }
      },
      "required": [
        "catalogId",
        "catalogVersionId"
      ],
      "title": "DataRobotCodeRef",
      "type": "object"
    },
    "provider": {
      "const": "datarobot",
      "default": "datarobot",
      "title": "Provider",
      "type": "string"
    },
    "type": {
      "const": "datarobot",
      "default": "datarobot",
      "title": "Type",
      "type": "string"
    }
  },
  "required": [
    "datarobot"
  ],
  "title": "CodeRef",
  "type": "object"
}

CodeRef

Properties

Name Type Required Restrictions Description
datarobot DataRobotCodeRef true none
provider string false none
type string false none

CommonSortQueryParams

{
  "anyOf": [
    {
      "const": "name",
      "description": "Sort by name in ascending order (A-Z)",
      "title": "Name Ascending",
      "type": "string"
    },
    {
      "const": "-name",
      "description": "Sort by name in descending order (Z-A)",
      "title": "Name Descending",
      "type": "string"
    },
    {
      "const": "createdAt",
      "description": "Sort by creation date in ascending order (oldest first)",
      "title": "Creation Date Ascending",
      "type": "string"
    },
    {
      "const": "-createdAt",
      "description": "Sort by creation date in descending order (newest first)",
      "title": "Creation Date Descending",
      "type": "string"
    },
    {
      "const": "updatedAt",
      "description": "Sort by update date in ascending order (oldest first)",
      "title": "Update Date Ascending",
      "type": "string"
    },
    {
      "const": "-updatedAt",
      "description": "Sort by update date in descending order (newest first)",
      "title": "Update Date Descending",
      "type": "string"
    }
  ]
}

Properties

anyOf

Name Type Required Restrictions Description
anonymous string false Sort by name in ascending order (A-Z)

or

Name Type Required Restrictions Description
anonymous string false Sort by name in descending order (Z-A)

or

Name Type Required Restrictions Description
anonymous string false Sort by creation date in ascending order (oldest first)

or

Name Type Required Restrictions Description
anonymous string false Sort by creation date in descending order (newest first)

or

Name Type Required Restrictions Description
anonymous string false Sort by update date in ascending order (oldest first)

or

Name Type Required Restrictions Description
anonymous string false Sort by update date in descending order (newest first)
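
The six sort values above can be checked client-side before a request is sent. The sketch below assumes that the `orderBy` query parameter of the list endpoints accepts CommonSortQueryParams values; the function name `build_list_query` and its defaults are hypothetical.

```python
from urllib.parse import urlencode

# Allowed values copied from the CommonSortQueryParams schema.
# A leading "-" means descending order.
ALLOWED_ORDER_BY = {
    "name", "-name",
    "createdAt", "-createdAt",
    "updatedAt", "-updatedAt",
}


def build_list_query(order_by: str, limit: int = 20, offset: int = 0) -> str:
    """Return a URL query string, rejecting unknown sort keys early."""
    if order_by not in ALLOWED_ORDER_BY:
        raise ValueError(f"unsupported orderBy value: {order_by!r}")
    return urlencode({"orderBy": order_by, "limit": limit, "offset": offset})


query = build_list_query("-createdAt")  # newest first
```

The resulting string can be appended to a list URL such as `/workloads?` followed by `query`.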

Container

{
  "additionalProperties": false,
  "properties": {
    "build": {
      "anyOf": [
        {
          "additionalProperties": false,
          "description": "Build reference embedded in a container spec when an image build is triggered.",
          "properties": {
            "artifactImageBuildId": {
              "description": "Artifact image build id.",
              "title": "Artifactimagebuildid",
              "type": "string"
            },
            "createdAt": {
              "description": "Build creation timestamp (UTC).",
              "format": "date-time",
              "title": "Createdat",
              "type": "string"
            },
            "status": {
              "description": "Image build reported status at submit time.",
              "title": "Status",
              "type": "string"
            }
          },
          "required": [
            "artifactImageBuildId",
            "status",
            "createdAt"
          ],
          "title": "ContainerBuildInfo",
          "type": "object"
        },
        {
          "type": "null"
        }
      ],
      "description": "Server-set image build metadata (e.g. after lock or draft build trigger). The workload API clears this on artifact create/update before persistence; clients must not rely on sending it."
    },
    "description": {
      "default": "",
      "description": "Description of the container.",
      "title": "Description",
      "type": "string"
    },
    "entrypoint": {
      "anyOf": [
        {
          "items": {
            "type": "string"
          },
          "type": "array"
        },
        {
          "type": "null"
        }
      ],
      "description": "Runtime entrypoint override for the container command. Independent of the build entrypoint.",
      "title": "Entrypoint"
    },
    "environmentVars": {
      "default": [],
      "description": "Environment variables.",
      "items": {
        "anyOf": [
          {
            "properties": {
              "name": {
                "description": "Name of the environment variable.",
                "title": "Name",
                "type": "string"
              },
              "source": {
                "const": "string",
                "default": "string",
                "title": "Source",
                "type": "string"
              },
              "value": {
                "description": "Value of the environment variable.",
                "title": "Value",
                "type": "string"
              }
            },
            "required": [
              "name",
              "value"
            ],
            "title": "StringEnvironmentVariable",
            "type": "object"
          },
          {
            "properties": {
              "drCredentialId": {
                "description": "ID of the DataRobot credential to use.",
                "title": "DR Credential ID",
                "type": "string"
              },
              "key": {
                "description": "Key within the credential.",
                "title": "Key",
                "type": "string"
              },
              "name": {
                "description": "Name of the environment variable.",
                "title": "Name",
                "type": "string"
              },
              "source": {
                "const": "dr-credential",
                "title": "Source",
                "type": "string"
              }
            },
            "required": [
              "source",
              "name",
              "drCredentialId",
              "key"
            ],
            "title": "CredentialEnvironmentVariable",
            "type": "object"
          },
          {
            "description": "A platform-managed DataRobot API token injected as an environment variable. The token value is resolved at proton creation (find-or-create a per-workload ``workload <workloadid>`` API key scoped to the invoking user); no value or ID is supplied by the user.",
            "properties": {
              "name": {
                "description": "Name of the environment variable.",
                "title": "Name",
                "type": "string"
              },
              "source": {
                "const": "dr-api-token",
                "title": "Source",
                "type": "string"
              }
            },
            "required": [
              "source",
              "name"
            ],
            "title": "DrApiTokenEnvironmentVariable",
            "type": "object"
          }
        ]
      },
      "title": "Environmentvars",
      "type": "array"
    },
    "imageBuildConfig": {
      "anyOf": [
        {
          "additionalProperties": false,
          "description": "User-provided configuration for server-side image builds from source code.",
          "properties": {
            "codeRef": {
              "anyOf": [
                {
                  "additionalProperties": false,
                  "properties": {
                    "datarobot": {
                      "additionalProperties": false,
                      "properties": {
                        "catalogId": {
                          "title": "Catalogid",
                          "type": "string"
                        },
                        "catalogVersionId": {
                          "title": "Catalogversionid",
                          "type": "string"
                        }
                      },
                      "required": [
                        "catalogId",
                        "catalogVersionId"
                      ],
                      "title": "DataRobotCodeRef",
                      "type": "object"
                    },
                    "provider": {
                      "const": "datarobot",
                      "default": "datarobot",
                      "title": "Provider",
                      "type": "string"
                    },
                    "type": {
                      "const": "datarobot",
                      "default": "datarobot",
                      "title": "Type",
                      "type": "string"
                    }
                  },
                  "required": [
                    "datarobot"
                  ],
                  "title": "CodeRef",
                  "type": "object"
                },
                {
                  "type": "null"
                }
              ],
              "description": "Reference to source code (e.g. files API catalog). Optional at create time; required before build or lock."
            },
            "dockerfile": {
              "description": "How the Dockerfile is obtained. Defaults to using ./Dockerfile from the source code.",
              "discriminator": {
                "mapping": {
                  "generated": "#/components/schemas/GeneratedDockerfile",
                  "provided": "#/components/schemas/ProvidedDockerfile"
                },
                "propertyName": "source"
              },
              "oneOf": [
                {
                  "additionalProperties": false,
                  "description": "User supplies a Dockerfile in the uploaded source code.",
                  "properties": {
                    "path": {
                      "default": "./Dockerfile",
                      "description": "Relative path to the Dockerfile in the source code. Defaults to ./Dockerfile.",
                      "title": "Path",
                      "type": "string"
                    },
                    "source": {
                      "const": "provided",
                      "default": "provided",
                      "title": "Source",
                      "type": "string"
                    }
                  },
                  "title": "ProvidedDockerfile",
                  "type": "object"
                },
                {
                  "additionalProperties": false,
                  "description": "System generates a Dockerfile from execution environment metadata.",
                  "properties": {
                    "entrypoint": {
                      "description": "Entrypoint baked into the generated Dockerfile CMD (e.g. [\"python\", \"app.py\"]).",
                      "items": {
                        "type": "string"
                      },
                      "minItems": 1,
                      "title": "Entrypoint",
                      "type": "array"
                    },
                    "executionEnvironmentId": {
                      "description": "Execution environment id used to resolve the base Docker image.",
                      "title": "Execution Environment ID",
                      "type": "string"
                    },
                    "executionEnvironmentVersionId": {
                      "description": "Execution environment version id that pins the exact base image tag.",
                      "title": "Execution Environment Version ID",
                      "type": "string"
                    },
                    "source": {
                      "const": "generated",
                      "default": "generated",
                      "title": "Source",
                      "type": "string"
                    }
                  },
                  "required": [
                    "executionEnvironmentId",
                    "executionEnvironmentVersionId",
                    "entrypoint"
                  ],
                  "title": "GeneratedDockerfile",
                  "type": "object"
                }
              ],
              "title": "Dockerfile"
            }
          },
          "title": "ImageBuildConfig",
          "type": "object"
        },
        {
          "type": "null"
        }
      ],
      "description": "Configuration for server-side image builds from source code."
    },
    "imageUri": {
      "anyOf": [
        {
          "type": "string"
        },
        {
          "type": "null"
        }
      ],
      "description": "Docker image URI. Required when imageBuildConfig is not set; server-populated after a successful image build.",
      "title": "Imageuri"
    },
    "livenessProbe": {
      "anyOf": [
        {
          "additionalProperties": false,
          "properties": {
            "failureThreshold": {
              "default": 3,
              "description": "Minimum consecutive failures for the probe to be considered failed.",
              "title": "Failurethreshold",
              "type": "integer"
            },
            "host": {
              "anyOf": [
                {
                  "minLength": 0,
                  "type": "string"
                },
                {
                  "type": "null"
                }
              ],
              "description": "Host name to connect to; defaults to the pod IP.",
              "title": "Host"
            },
            "httpHeaders": {
              "additionalProperties": {
                "type": "string"
              },
              "description": "HTTP headers for probe.",
              "title": "Httpheaders",
              "type": "object"
            },
            "initialDelaySeconds": {
              "default": 30,
              "description": "Number of seconds to wait before the first probe is executed.",
              "title": "Initialdelayseconds",
              "type": "integer"
            },
            "path": {
              "description": "URL path to query for health check.",
              "title": "Path",
              "type": "string"
            },
            "periodSeconds": {
              "default": 30,
              "description": "How often (in seconds) to perform the probe.",
              "title": "Periodseconds",
              "type": "integer"
            },
            "port": {
              "default": 8080,
              "description": "Port number to access on the container.",
              "maximum": 65535,
              "minimum": 1,
              "title": "Port",
              "type": "integer"
            },
            "scheme": {
              "default": "HTTP",
              "description": "Scheme to use for connecting to the host.",
              "enum": [
                "HTTP",
                "HTTPS"
              ],
              "title": "Scheme",
              "type": "string"
            },
            "timeoutSeconds": {
              "default": 30,
              "description": "Number of seconds after which the probe times out.",
              "title": "Timeoutseconds",
              "type": "integer"
            }
          },
          "required": [
            "path"
          ],
          "title": "ProbeConfig",
          "type": "object"
        },
        {
          "type": "null"
        }
      ],
      "description": "Container liveness check configuration."
    },
    "name": {
      "anyOf": [
        {
          "type": "string"
        },
        {
          "type": "null"
        }
      ],
      "description": "Name of the container. Lowercase letters, digits, and hyphens only; must start with a lowercase letter and end with a letter or digit; max 63 characters.",
      "title": "Name"
    },
    "port": {
      "anyOf": [
        {
          "maximum": 65535,
          "minimum": 1024,
          "type": "integer"
        },
        {
          "type": "null"
        }
      ],
      "description": "Container access port. When set, must be >= 1024 for security and platform compatibility reasons. Primary containers must define a port; non-primary containers must omit it.",
      "title": "Port"
    },
    "primary": {
      "anyOf": [
        {
          "type": "boolean"
        },
        {
          "type": "null"
        }
      ],
      "default": false,
      "description": "Whether this is the primary container.",
      "title": "Primary"
    },
    "readinessProbe": {
      "anyOf": [
        {
          "additionalProperties": false,
          "properties": {
            "failureThreshold": {
              "default": 3,
              "description": "Minimum consecutive failures for the probe to be considered failed.",
              "title": "Failurethreshold",
              "type": "integer"
            },
            "host": {
              "anyOf": [
                {
                  "minLength": 0,
                  "type": "string"
                },
                {
                  "type": "null"
                }
              ],
              "description": "Host name to connect to; defaults to the pod IP.",
              "title": "Host"
            },
            "httpHeaders": {
              "additionalProperties": {
                "type": "string"
              },
              "description": "HTTP headers for probe.",
              "title": "Httpheaders",
              "type": "object"
            },
            "initialDelaySeconds": {
              "default": 30,
              "description": "Number of seconds to wait before the first probe is executed.",
              "title": "Initialdelayseconds",
              "type": "integer"
            },
            "path": {
              "description": "URL path to query for health check.",
              "title": "Path",
              "type": "string"
            },
            "periodSeconds": {
              "default": 30,
              "description": "How often (in seconds) to perform the probe.",
              "title": "Periodseconds",
              "type": "integer"
            },
            "port": {
              "default": 8080,
              "description": "Port number to access on the container.",
              "maximum": 65535,
              "minimum": 1,
              "title": "Port",
              "type": "integer"
            },
            "scheme": {
              "default": "HTTP",
              "description": "Scheme to use for connecting to the host.",
              "enum": [
                "HTTP",
                "HTTPS"
              ],
              "title": "Scheme",
              "type": "string"
            },
            "timeoutSeconds": {
              "default": 30,
              "description": "Number of seconds after which the probe times out.",
              "title": "Timeoutseconds",
              "type": "integer"
            }
          },
          "required": [
            "path"
          ],
          "title": "ProbeConfig",
          "type": "object"
        },
        {
          "type": "null"
        }
      ],
      "description": "Container readiness check configuration."
    },
    "securityContext": {
      "anyOf": [
        {
          "additionalProperties": false,
          "description": "Container-level security context. Lets workload creators tighten security constraints beyond the platform defaults. runAsNonRoot and runAsUser are enforced by the platform and are not user-settable. Elevated fields (capabilities.add, allowPrivilegeEscalation=true, seccompProfile.type=Unconfined) require the MLOps admin role; regular users may only tighten defaults: drop capabilities, enable read-only rootfs, or set a RuntimeDefault/Localhost seccomp profile.",
          "properties": {
            "allowPrivilegeEscalation": {
              "anyOf": [
                {
                  "type": "boolean"
                },
                {
                  "type": "null"
                }
              ],
              "description": "Whether a process can gain more privileges than its parent. Setting this to true requires the MLOps admin role.",
              "title": "Allowprivilegeescalation"
            },
            "capabilities": {
              "anyOf": [
                {
                  "additionalProperties": false,
                  "description": "Linux capabilities to add or drop from the container.",
                  "properties": {
                    "add": {
                      "anyOf": [
                        {
                          "items": {
                            "type": "string"
                          },
                          "type": "array"
                        },
                        {
                          "type": "null"
                        }
                      ],
                      "description": "Capabilities to add.",
                      "title": "Add"
                    },
                    "drop": {
                      "anyOf": [
                        {
                          "items": {
                            "type": "string"
                          },
                          "type": "array"
                        },
                        {
                          "type": "null"
                        }
                      ],
                      "description": "Capabilities to drop.",
                      "title": "Drop"
                    }
                  },
                  "title": "Capabilities",
                  "type": "object"
                },
                {
                  "type": "null"
                }
              ],
              "description": "Linux capabilities to add or drop."
            },
            "readOnlyRootFilesystem": {
              "anyOf": [
                {
                  "type": "boolean"
                },
                {
                  "type": "null"
                }
              ],
              "description": "Whether the root filesystem is read-only.",
              "title": "Readonlyrootfilesystem"
            },
            "seccompProfile": {
              "anyOf": [
                {
                  "additionalProperties": false,
                  "description": "Seccomp profile configuration.",
                  "properties": {
                    "localhostProfile": {
                      "anyOf": [
                        {
                          "type": "string"
                        },
                        {
                          "type": "null"
                        }
                      ],
                      "description": "Path to a seccomp profile on the node. Only valid when type is Localhost.",
                      "title": "Localhostprofile"
                    },
                    "type": {
                      "description": "Allowed seccomp profile types.",
                      "enum": [
                        "RuntimeDefault",
                        "Unconfined",
                        "Localhost"
                      ],
                      "title": "SeccompProfileType",
                      "type": "string"
                    }
                  },
                  "required": [
                    "type"
                  ],
                  "title": "SeccompProfile",
                  "type": "object"
                },
                {
                  "type": "null"
                }
              ],
              "description": "Seccomp profile for the container."
            }
          },
          "title": "SecurityContext",
          "type": "object"
        },
        {
          "type": "null"
        }
      ],
      "description": "Container security context."
    },
    "startupProbe": {
      "anyOf": [
        {
          "additionalProperties": false,
          "properties": {
            "failureThreshold": {
              "default": 3,
              "description": "Minimum consecutive failures for the probe to be considered failed.",
              "title": "Failurethreshold",
              "type": "integer"
            },
            "host": {
              "anyOf": [
                {
                  "minLength": 0,
                  "type": "string"
                },
                {
                  "type": "null"
                }
              ],
              "description": "Host name to connect to; defaults to the pod IP.",
              "title": "Host"
            },
            "httpHeaders": {
              "additionalProperties": {
                "type": "string"
              },
              "description": "HTTP headers for probe.",
              "title": "Httpheaders",
              "type": "object"
            },
            "initialDelaySeconds": {
              "default": 30,
              "description": "Number of seconds to wait before the first probe is executed.",
              "title": "Initialdelayseconds",
              "type": "integer"
            },
            "path": {
              "description": "URL path to query for health check.",
              "title": "Path",
              "type": "string"
            },
            "periodSeconds": {
              "default": 30,
              "description": "How often (in seconds) to perform the probe.",
              "title": "Periodseconds",
              "type": "integer"
            },
            "port": {
              "default": 8080,
              "description": "Port number to access on the container.",
              "maximum": 65535,
              "minimum": 1,
              "title": "Port",
              "type": "integer"
            },
            "scheme": {
              "default": "HTTP",
              "description": "Scheme to use for connecting to the host.",
              "enum": [
                "HTTP",
                "HTTPS"
              ],
              "title": "Scheme",
              "type": "string"
            },
            "timeoutSeconds": {
              "default": 30,
              "description": "Number of seconds after which the probe times out.",
              "title": "Timeoutseconds",
              "type": "integer"
            }
          },
          "required": [
            "path"
          ],
          "title": "ProbeConfig",
          "type": "object"
        },
        {
          "type": "null"
        }
      ],
      "description": "Container startup check configuration."
    }
  },
  "title": "Container",
  "type": "object"
}

Container

Properties

Name Type Required Restrictions Description
build any false Server-set image build metadata (e.g. after lock or draft build trigger). The workload API clears this on artifact create/update before persistence; clients must not rely on sending it.

anyOf

Name Type Required Restrictions Description
» anonymous ContainerBuildInfo false Build reference embedded in a container spec when an image build is triggered.

or

Name Type Required Restrictions Description
» anonymous null false none

continued

Name Type Required Restrictions Description
description string false Description of the container.
entrypoint any false Runtime entrypoint override for the container command. Independent of the build entrypoint.

anyOf

Name Type Required Restrictions Description
» anonymous [string] false none

or

Name Type Required Restrictions Description
» anonymous null false none

continued

Name Type Required Restrictions Description
environmentVars [anyOf] false Environment variables.

anyOf

Name Type Required Restrictions Description
» anonymous StringEnvironmentVariable false none

or

Name Type Required Restrictions Description
» anonymous CredentialEnvironmentVariable false none

or

Name Type Required Restrictions Description
» anonymous DrApiTokenEnvironmentVariable false A platform-managed DataRobot API token injected as an environment variable. The token value is resolved at proton creation (find-or-create a per-workload workload <workloadid> API key scoped to the invoking user); no value or ID is supplied by the user.

continued

Name Type Required Restrictions Description
imageBuildConfig any false Configuration for server-side image builds from source code.

anyOf

Name Type Required Restrictions Description
» anonymous ImageBuildConfig false User-provided configuration for server-side image builds from source code.

or

Name Type Required Restrictions Description
» anonymous null false none

continued

Name Type Required Restrictions Description
imageUri any false Docker image URI. Required when imageBuildConfig is not set; server-populated after a successful image build.

anyOf

Name Type Required Restrictions Description
» anonymous string false none

or

Name Type Required Restrictions Description
» anonymous null false none

continued

Name Type Required Restrictions Description
livenessProbe any false Container liveness check configuration.

anyOf

Name Type Required Restrictions Description
» anonymous ProbeConfig false none

or

Name Type Required Restrictions Description
» anonymous null false none

continued

Name Type Required Restrictions Description
name any false Name of the container. Lowercase letters, digits, and hyphens only; must start with a lowercase letter and end with a letter or digit; max 63 characters.

anyOf

Name Type Required Restrictions Description
» anonymous string false none

or

Name Type Required Restrictions Description
» anonymous null false none

continued

Name Type Required Restrictions Description
port any false Container access port. When set, must be >= 1024 for security and platform compatibility reasons. Primary containers must define a port; non-primary containers must omit it.

anyOf

Name Type Required Restrictions Description
» anonymous integer false maximum: 65535, minimum: 1024 none

or

Name Type Required Restrictions Description
» anonymous null false none

continued

Name Type Required Restrictions Description
primary any false Whether this is the primary container.

anyOf

Name Type Required Restrictions Description
» anonymous boolean false none

or

Name Type Required Restrictions Description
» anonymous null false none

continued

Name Type Required Restrictions Description
readinessProbe any false Container readiness check configuration.

anyOf

Name Type Required Restrictions Description
» anonymous ProbeConfig false none

or

Name Type Required Restrictions Description
» anonymous null false none

continued

Name Type Required Restrictions Description
securityContext any false Container security context.

anyOf

Name Type Required Restrictions Description
» anonymous SecurityContext false Container-level security context. Lets workload creators tighten security constraints beyond the platform defaults. runAsNonRoot and runAsUser are enforced by the platform and are not user-settable. Elevated fields (capabilities.add, allowPrivilegeEscalation=true, seccompProfile.type=Unconfined) require the MLOps admin role; regular users may only tighten defaults: drop capabilities, enable read-only rootfs, or set a RuntimeDefault/Localhost seccomp profile.

or

Name Type Required Restrictions Description
» anonymous null false none

continued

Name Type Required Restrictions Description
startupProbe any false Container startup check configuration.

anyOf

Name Type Required Restrictions Description
» anonymous ProbeConfig false none

or

Name Type Required Restrictions Description
» anonymous null false none
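
Several Container constraints above are stated only in descriptions (name format, port rules for primary containers, and `imageUri` versus `imageBuildConfig`). The sketch below shows one way a client might check them before submitting a payload. The function name `validate_container`, the regular expression, and the sample image URI are assumptions derived from the description text, not server-side validation logic.

```python
import re
from typing import List

# Derived from the "name" description: lowercase letters, digits, and hyphens;
# starts with a lowercase letter; ends with a letter or digit; max 63 characters.
NAME_RE = re.compile(r"[a-z](?:[a-z0-9-]{0,61}[a-z0-9])?")


def validate_container(container: dict) -> List[str]:
    """Return a list of human-readable problems; an empty list means no issues found."""
    errors = []
    name = container.get("name")
    if name is not None and not NAME_RE.fullmatch(name):
        errors.append("name: must match the documented naming rules")

    port = container.get("port")
    primary = bool(container.get("primary", False))
    if primary and port is None:
        errors.append("port: primary containers must define a port")
    if not primary and port is not None:
        errors.append("port: non-primary containers must omit it")
    if port is not None and not 1024 <= port <= 65535:
        errors.append("port: must be between 1024 and 65535")

    if container.get("imageUri") is None and container.get("imageBuildConfig") is None:
        errors.append("imageUri: required when imageBuildConfig is not set")
    return errors


# Sample payload; the image URI is a placeholder.
ok = validate_container({
    "name": "api-server",
    "primary": True,
    "port": 8080,
    "imageUri": "registry.example.com/team/app:1.0",
    "readinessProbe": {"path": "/health"},
})
```

A check like this only catches obvious mistakes early; the server remains the authority on what is accepted.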

ContainerBuildInfo

{
  "additionalProperties": false,
  "description": "Build reference embedded in a container spec when an image build is triggered.",
  "properties": {
    "artifactImageBuildId": {
      "description": "Artifact image build id.",
      "title": "Artifactimagebuildid",
      "type": "string"
    },
    "createdAt": {
      "description": "Build creation timestamp (UTC).",
      "format": "date-time",
      "title": "Createdat",
      "type": "string"
    },
    "status": {
      "description": "Image build reported status at submit time.",
      "title": "Status",
      "type": "string"
    }
  },
  "required": [
    "artifactImageBuildId",
    "status",
    "createdAt"
  ],
  "title": "ContainerBuildInfo",
  "type": "object"
}

ContainerBuildInfo

Properties

Name Type Required Restrictions Description
artifactImageBuildId string true Artifact image build id.
createdAt string(date-time) true Build creation timestamp (UTC).
status string true Image build reported status at submit time.
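
A small sketch of reading a ContainerBuildInfo object from a response. It checks the three required fields and parses `createdAt` into a timezone-aware datetime. The function name `parse_build_info`, the returned key names, and the sample values (including the status string) are hypothetical; this reference does not list the possible status values.

```python
from datetime import datetime, timezone

REQUIRED_FIELDS = ("artifactImageBuildId", "status", "createdAt")


def parse_build_info(raw: dict) -> dict:
    """Validate required fields and convert createdAt to an aware datetime."""
    missing = [key for key in REQUIRED_FIELDS if key not in raw]
    if missing:
        raise ValueError(f"missing required fields: {missing}")
    # Older Python versions reject a trailing "Z" in fromisoformat(), so normalize it.
    created_at = datetime.fromisoformat(raw["createdAt"].replace("Z", "+00:00"))
    if created_at.tzinfo is None:
        # Assumption: a timestamp without an offset is UTC, per the field description.
        created_at = created_at.replace(tzinfo=timezone.utc)
    return {
        "build_id": raw["artifactImageBuildId"],
        "status": raw["status"],
        "created_at": created_at,
    }


# Placeholder values for illustration only.
info = parse_build_info({
    "artifactImageBuildId": "example-build-id",
    "status": "submitted",
    "createdAt": "2024-01-02T03:04:05Z",
})
```

Since the description says the status is the one reported at submit time, a client that needs the current state would probably have to query the build separately.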

ContainerGroup

{
  "additionalProperties": false,
  "properties": {
    "containers": {
      "default": [],
      "description": "List of containers making this container group.",
      "items": {
        "additionalProperties": false,
        "properties": {
          "build": {
            "anyOf": [
              {
                "additionalProperties": false,
                "description": "Build reference embedded in a container spec when an image build is triggered.",
                "properties": {
                  "artifactImageBuildId": {
                    "description": "Artifact image build ID.",
                    "title": "Artifactimagebuildid",
                    "type": "string"
                  },
                  "createdAt": {
                    "description": "Build creation timestamp (UTC).",
                    "format": "date-time",
                    "title": "Createdat",
                    "type": "string"
                  },
                  "status": {
                    "description": "Image build reported status at submit time.",
                    "title": "Status",
                    "type": "string"
                  }
                },
                "required": [
                  "artifactImageBuildId",
                  "status",
                  "createdAt"
                ],
                "title": "ContainerBuildInfo",
                "type": "object"
              },
              {
                "type": "null"
              }
            ],
            "description": "Server-set image build metadata (e.g. after lock or draft build trigger). The workload API clears this on artifact create/update before persistence; clients must not rely on sending it."
          },
          "description": {
            "default": "",
            "description": "Description of the container.",
            "title": "Description",
            "type": "string"
          },
          "entrypoint": {
            "anyOf": [
              {
                "items": {
                  "type": "string"
                },
                "type": "array"
              },
              {
                "type": "null"
              }
            ],
            "description": "Runtime entrypoint override for the container command. Independent of the build entrypoint.",
            "title": "Entrypoint"
          },
          "environmentVars": {
            "default": [],
            "description": "Environment variables.",
            "items": {
              "anyOf": [
                {
                  "properties": {
                    "name": {
                      "description": "Name of the environment variable.",
                      "title": "Name",
                      "type": "string"
                    },
                    "source": {
                      "const": "string",
                      "default": "string",
                      "title": "Source",
                      "type": "string"
                    },
                    "value": {
                      "description": "Value of the environment variable.",
                      "title": "Value",
                      "type": "string"
                    }
                  },
                  "required": [
                    "name",
                    "value"
                  ],
                  "title": "StringEnvironmentVariable",
                  "type": "object"
                },
                {
                  "properties": {
                    "drCredentialId": {
                      "description": "ID of the DataRobot credential to use.",
                      "title": "DR Credential ID",
                      "type": "string"
                    },
                    "key": {
                      "description": "Key within the credential.",
                      "title": "Key",
                      "type": "string"
                    },
                    "name": {
                      "description": "Name of the environment variable.",
                      "title": "Name",
                      "type": "string"
                    },
                    "source": {
                      "const": "dr-credential",
                      "title": "Source",
                      "type": "string"
                    }
                  },
                  "required": [
                    "source",
                    "name",
                    "drCredentialId",
                    "key"
                  ],
                  "title": "CredentialEnvironmentVariable",
                  "type": "object"
                },
                {
                  "description": "A platform-managed DataRobot API token injected as an environment variable. The token value is resolved at proton creation (find-or-create a per-workload ``workload <workloadid>`` API key scoped to the invoking user); no value or ID is supplied by the user.",
                  "properties": {
                    "name": {
                      "description": "Name of the environment variable.",
                      "title": "Name",
                      "type": "string"
                    },
                    "source": {
                      "const": "dr-api-token",
                      "title": "Source",
                      "type": "string"
                    }
                  },
                  "required": [
                    "source",
                    "name"
                  ],
                  "title": "DrApiTokenEnvironmentVariable",
                  "type": "object"
                }
              ]
            },
            "title": "Environmentvars",
            "type": "array"
          },
          "imageBuildConfig": {
            "anyOf": [
              {
                "additionalProperties": false,
                "description": "User-provided configuration for server-side image builds from source code.",
                "properties": {
                  "codeRef": {
                    "anyOf": [
                      {
                        "additionalProperties": false,
                        "properties": {
                          "datarobot": {
                            "additionalProperties": false,
                            "properties": {
                              "catalogId": {
                                "title": "Catalogid",
                                "type": "string"
                              },
                              "catalogVersionId": {
                                "title": "Catalogversionid",
                                "type": "string"
                              }
                            },
                            "required": [
                              "catalogId",
                              "catalogVersionId"
                            ],
                            "title": "DataRobotCodeRef",
                            "type": "object"
                          },
                          "provider": {
                            "const": "datarobot",
                            "default": "datarobot",
                            "title": "Provider",
                            "type": "string"
                          },
                          "type": {
                            "const": "datarobot",
                            "default": "datarobot",
                            "title": "Type",
                            "type": "string"
                          }
                        },
                        "required": [
                          "datarobot"
                        ],
                        "title": "CodeRef",
                        "type": "object"
                      },
                      {
                        "type": "null"
                      }
                    ],
                    "description": "Reference to source code (e.g. files API catalog). Optional at create time; required before build or lock."
                  },
                  "dockerfile": {
                    "description": "How the Dockerfile is obtained. Defaults to using ./Dockerfile from the source code.",
                    "discriminator": {
                      "mapping": {
                        "generated": "#/components/schemas/GeneratedDockerfile",
                        "provided": "#/components/schemas/ProvidedDockerfile"
                      },
                      "propertyName": "source"
                    },
                    "oneOf": [
                      {
                        "additionalProperties": false,
                        "description": "User supplies a Dockerfile in the uploaded source code.",
                        "properties": {
                          "path": {
                            "default": "./Dockerfile",
                            "description": "Relative path to the Dockerfile in the source code. Defaults to ./Dockerfile.",
                            "title": "Path",
                            "type": "string"
                          },
                          "source": {
                            "const": "provided",
                            "default": "provided",
                            "title": "Source",
                            "type": "string"
                          }
                        },
                        "title": "ProvidedDockerfile",
                        "type": "object"
                      },
                      {
                        "additionalProperties": false,
                        "description": "System generates a Dockerfile from execution environment metadata.",
                        "properties": {
                          "entrypoint": {
                            "description": "Entrypoint baked into the generated Dockerfile CMD (e.g. [\"python\", \"app.py\"]).",
                            "items": {
                              "type": "string"
                            },
                            "minItems": 1,
                            "title": "Entrypoint",
                            "type": "array"
                          },
                          "executionEnvironmentId": {
                            "description": "Execution environment ID used to resolve the base Docker image.",
                            "title": "Execution Environment ID",
                            "type": "string"
                          },
                          "executionEnvironmentVersionId": {
                            "description": "Execution environment version ID that pins the exact base image tag.",
                            "title": "Execution Environment Version ID",
                            "type": "string"
                          },
                          "source": {
                            "const": "generated",
                            "default": "generated",
                            "title": "Source",
                            "type": "string"
                          }
                        },
                        "required": [
                          "executionEnvironmentId",
                          "executionEnvironmentVersionId",
                          "entrypoint"
                        ],
                        "title": "GeneratedDockerfile",
                        "type": "object"
                      }
                    ],
                    "title": "Dockerfile"
                  }
                },
                "title": "ImageBuildConfig",
                "type": "object"
              },
              {
                "type": "null"
              }
            ],
            "description": "Configuration for server-side image builds from source code."
          },
          "imageUri": {
            "anyOf": [
              {
                "type": "string"
              },
              {
                "type": "null"
              }
            ],
            "description": "Docker image URI. Required when imageBuildConfig is not set; server-populated after a successful image build.",
            "title": "Imageuri"
          },
          "livenessProbe": {
            "anyOf": [
              {
                "additionalProperties": false,
                "properties": {
                  "failureThreshold": {
                    "default": 3,
                    "description": "Minimum consecutive failures for the probe to be considered failed.",
                    "title": "Failurethreshold",
                    "type": "integer"
                  },
                  "host": {
                    "anyOf": [
                      {
                        "minLength": 0,
                        "type": "string"
                      },
                      {
                        "type": "null"
                      }
                    ],
                    "description": "Host name to connect to; defaults to the pod IP.",
                    "title": "Host"
                  },
                  "httpHeaders": {
                    "additionalProperties": {
                      "type": "string"
                    },
                    "description": "HTTP headers for probe.",
                    "title": "Httpheaders",
                    "type": "object"
                  },
                  "initialDelaySeconds": {
                    "default": 30,
                    "description": "Number of seconds to wait before the first probe is executed.",
                    "title": "Initialdelayseconds",
                    "type": "integer"
                  },
                  "path": {
                    "description": "URL path to query for health check.",
                    "title": "Path",
                    "type": "string"
                  },
                  "periodSeconds": {
                    "default": 30,
                    "description": "How often (in seconds) to perform the probe.",
                    "title": "Periodseconds",
                    "type": "integer"
                  },
                  "port": {
                    "default": 8080,
                    "description": "Port number to access on the container.",
                    "maximum": 65535,
                    "minimum": 1,
                    "title": "Port",
                    "type": "integer"
                  },
                  "scheme": {
                    "default": "HTTP",
                    "description": "Scheme to use for connecting to the host.",
                    "enum": [
                      "HTTP",
                      "HTTPS"
                    ],
                    "title": "Scheme",
                    "type": "string"
                  },
                  "timeoutSeconds": {
                    "default": 30,
                    "description": "Number of seconds after which the probe times out.",
                    "title": "Timeoutseconds",
                    "type": "integer"
                  }
                },
                "required": [
                  "path"
                ],
                "title": "ProbeConfig",
                "type": "object"
              },
              {
                "type": "null"
              }
            ],
            "description": "Container liveness check configuration."
          },
          "name": {
            "anyOf": [
              {
                "type": "string"
              },
              {
                "type": "null"
              }
            ],
            "description": "Name of the container. Lowercase letters, digits, and hyphens only; must start with a lowercase letter and end with a letter or digit; max 63 characters.",
            "title": "Name"
          },
          "port": {
            "anyOf": [
              {
                "maximum": 65535,
                "minimum": 1024,
                "type": "integer"
              },
              {
                "type": "null"
              }
            ],
            "description": "Container access port. When set, must be >= 1024 for security and platform compatibility reasons. Primary containers must define a port; non-primary containers must omit it.",
            "title": "Port"
          },
          "primary": {
            "anyOf": [
              {
                "type": "boolean"
              },
              {
                "type": "null"
              }
            ],
            "default": false,
            "description": "Whether this is the primary container.",
            "title": "Primary"
          },
          "readinessProbe": {
            "anyOf": [
              {
                "additionalProperties": false,
                "properties": {
                  "failureThreshold": {
                    "default": 3,
                    "description": "Minimum consecutive failures for the probe to be considered failed.",
                    "title": "Failurethreshold",
                    "type": "integer"
                  },
                  "host": {
                    "anyOf": [
                      {
                        "minLength": 0,
                        "type": "string"
                      },
                      {
                        "type": "null"
                      }
                    ],
                    "description": "Host name to connect to; defaults to the pod IP.",
                    "title": "Host"
                  },
                  "httpHeaders": {
                    "additionalProperties": {
                      "type": "string"
                    },
                    "description": "HTTP headers for probe.",
                    "title": "Httpheaders",
                    "type": "object"
                  },
                  "initialDelaySeconds": {
                    "default": 30,
                    "description": "Number of seconds to wait before the first probe is executed.",
                    "title": "Initialdelayseconds",
                    "type": "integer"
                  },
                  "path": {
                    "description": "URL path to query for health check.",
                    "title": "Path",
                    "type": "string"
                  },
                  "periodSeconds": {
                    "default": 30,
                    "description": "How often (in seconds) to perform the probe.",
                    "title": "Periodseconds",
                    "type": "integer"
                  },
                  "port": {
                    "default": 8080,
                    "description": "Port number to access on the container.",
                    "maximum": 65535,
                    "minimum": 1,
                    "title": "Port",
                    "type": "integer"
                  },
                  "scheme": {
                    "default": "HTTP",
                    "description": "Scheme to use for connecting to the host.",
                    "enum": [
                      "HTTP",
                      "HTTPS"
                    ],
                    "title": "Scheme",
                    "type": "string"
                  },
                  "timeoutSeconds": {
                    "default": 30,
                    "description": "Number of seconds after which the probe times out.",
                    "title": "Timeoutseconds",
                    "type": "integer"
                  }
                },
                "required": [
                  "path"
                ],
                "title": "ProbeConfig",
                "type": "object"
              },
              {
                "type": "null"
              }
            ],
            "description": "Container readiness check configuration."
          },
          "securityContext": {
            "anyOf": [
              {
                "additionalProperties": false,
                "description": "Container-level security context. Lets workload creators tighten security constraints beyond the platform defaults. runAsNonRoot and runAsUser are enforced by the platform and are not user-settable. Elevated fields (capabilities.add, allowPrivilegeEscalation=true, seccompProfile.type=Unconfined) require the MLOps admin role; regular users may only tighten defaults: drop capabilities, enable read-only rootfs, or set a RuntimeDefault/Localhost seccomp profile.",
                "properties": {
                  "allowPrivilegeEscalation": {
                    "anyOf": [
                      {
                        "type": "boolean"
                      },
                      {
                        "type": "null"
                      }
                    ],
                    "description": "Whether a process can gain more privileges than its parent. Requires the MLOps admin role to set to true.",
                    "title": "Allowprivilegeescalation"
                  },
                  "capabilities": {
                    "anyOf": [
                      {
                        "additionalProperties": false,
                        "description": "Linux capabilities to add or drop from the container.",
                        "properties": {
                          "add": {
                            "anyOf": [
                              {
                                "items": {
                                  "type": "string"
                                },
                                "type": "array"
                              },
                              {
                                "type": "null"
                              }
                            ],
                            "description": "Capabilities to add.",
                            "title": "Add"
                          },
                          "drop": {
                            "anyOf": [
                              {
                                "items": {
                                  "type": "string"
                                },
                                "type": "array"
                              },
                              {
                                "type": "null"
                              }
                            ],
                            "description": "Capabilities to drop.",
                            "title": "Drop"
                          }
                        },
                        "title": "Capabilities",
                        "type": "object"
                      },
                      {
                        "type": "null"
                      }
                    ],
                    "description": "Linux capabilities to add or drop."
                  },
                  "readOnlyRootFilesystem": {
                    "anyOf": [
                      {
                        "type": "boolean"
                      },
                      {
                        "type": "null"
                      }
                    ],
                    "description": "Whether the root filesystem is read-only.",
                    "title": "Readonlyrootfilesystem"
                  },
                  "seccompProfile": {
                    "anyOf": [
                      {
                        "additionalProperties": false,
                        "description": "Seccomp profile configuration.",
                        "properties": {
                          "localhostProfile": {
                            "anyOf": [
                              {
                                "type": "string"
                              },
                              {
                                "type": "null"
                              }
                            ],
                            "description": "Path to a seccomp profile on the node. Only valid when type is Localhost.",
                            "title": "Localhostprofile"
                          },
                          "type": {
                            "description": "Allowed seccomp profile types.",
                            "enum": [
                              "RuntimeDefault",
                              "Unconfined",
                              "Localhost"
                            ],
                            "title": "SeccompProfileType",
                            "type": "string"
                          }
                        },
                        "required": [
                          "type"
                        ],
                        "title": "SeccompProfile",
                        "type": "object"
                      },
                      {
                        "type": "null"
                      }
                    ],
                    "description": "Seccomp profile for the container."
                  }
                },
                "title": "SecurityContext",
                "type": "object"
              },
              {
                "type": "null"
              }
            ],
            "description": "Container security context."
          },
          "startupProbe": {
            "anyOf": [
              {
                "additionalProperties": false,
                "properties": {
                  "failureThreshold": {
                    "default": 3,
                    "description": "Minimum consecutive failures for the probe to be considered failed.",
                    "title": "Failurethreshold",
                    "type": "integer"
                  },
                  "host": {
                    "anyOf": [
                      {
                        "minLength": 0,
                        "type": "string"
                      },
                      {
                        "type": "null"
                      }
                    ],
                    "description": "Host name to connect to; defaults to the pod IP.",
                    "title": "Host"
                  },
                  "httpHeaders": {
                    "additionalProperties": {
                      "type": "string"
                    },
                    "description": "HTTP headers for probe.",
                    "title": "Httpheaders",
                    "type": "object"
                  },
                  "initialDelaySeconds": {
                    "default": 30,
                    "description": "Number of seconds to wait before the first probe is executed.",
                    "title": "Initialdelayseconds",
                    "type": "integer"
                  },
                  "path": {
                    "description": "URL path to query for health check.",
                    "title": "Path",
                    "type": "string"
                  },
                  "periodSeconds": {
                    "default": 30,
                    "description": "How often (in seconds) to perform the probe.",
                    "title": "Periodseconds",
                    "type": "integer"
                  },
                  "port": {
                    "default": 8080,
                    "description": "Port number to access on the container.",
                    "maximum": 65535,
                    "minimum": 1,
                    "title": "Port",
                    "type": "integer"
                  },
                  "scheme": {
                    "default": "HTTP",
                    "description": "Scheme to use for connecting to the host.",
                    "enum": [
                      "HTTP",
                      "HTTPS"
                    ],
                    "title": "Scheme",
                    "type": "string"
                  },
                  "timeoutSeconds": {
                    "default": 30,
                    "description": "Number of seconds after which the probe times out.",
                    "title": "Timeoutseconds",
                    "type": "integer"
                  }
                },
                "required": [
                  "path"
                ],
                "title": "ProbeConfig",
                "type": "object"
              },
              {
                "type": "null"
              }
            ],
            "description": "Container startup check configuration."
          }
        },
        "title": "Container",
        "type": "object"
      },
      "title": "Containers",
      "type": "array"
    },
    "name": {
      "default": "default",
      "description": "Name of the container group. Used as the lookup key for runtime overrides. Lowercase letters, digits, and hyphens only; must start with a lowercase letter and end with a letter or digit; max 63 characters.",
      "title": "Name",
      "type": "string"
    }
  },
  "title": "ContainerGroup",
  "type": "object"
}

ContainerGroup

Properties

Name Type Required Restrictions Description
containers [Container] false List of containers making this container group.
name string false Name of the container group. Used as the lookup key for runtime overrides. Lowercase letters, digits, and hyphens only; must start with a lowercase letter and end with a letter or digit; max 63 characters.
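
The naming and port rules in the ContainerGroup schema can be checked client-side before a request is sent. The sketch below is one possible reading of those rules in Python. The regular expression is derived from the prose description (the schema itself publishes no pattern), and the function name `validate_container` is hypothetical.

```python
import re
from typing import Any, Dict, List

# Lowercase letters, digits, hyphens; starts with a lowercase letter,
# ends with a letter or digit; at most 63 characters in total.
NAME_RE = re.compile(r"[a-z](?:[a-z0-9-]{0,61}[a-z0-9])?")


def validate_container(container: Dict[str, Any]) -> List[str]:
    """Return human-readable problems found in a container spec."""
    errors: List[str] = []
    name = container.get("name")
    if name is not None and not NAME_RE.fullmatch(name):
        errors.append(f"invalid container name: {name!r}")
    port = container.get("port")
    primary = bool(container.get("primary"))
    if primary and port is None:
        errors.append("primary containers must define a port")
    if not primary and port is not None:
        errors.append("non-primary containers must omit port")
    if port is not None and not 1024 <= port <= 65535:
        errors.append("port must be between 1024 and 65535")
    if container.get("imageUri") is None and container.get("imageBuildConfig") is None:
        errors.append("imageUri is required when imageBuildConfig is not set")
    return errors


# Illustrative group; image URIs are placeholders.
group = {
    "name": "default",
    "containers": [
        {"name": "web", "primary": True, "port": 8080, "imageUri": "registry.example.com/web:1"},
        {"name": "Sidecar_1", "port": 9000, "imageUri": "registry.example.com/sidecar:1"},
    ],
}
problems = [validate_container(c) for c in group["containers"]]
```

Here the first container passes, while the second is flagged for its name and for declaring a port without being primary.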

ContainerOverride

{
  "additionalProperties": false,
  "description": "Runtime diff targeting a single named container within a group.",
  "properties": {
    "name": {
      "description": "Container name. Must match a container declared in the artifact group.",
      "title": "Name",
      "type": "string"
    },
    "resourceAllocation": {
      "anyOf": [
        {
          "additionalProperties": false,
          "description": "Per-container resource allocation declared at runtime.",
          "properties": {
            "cpu": {
              "anyOf": [
                {
                  "minimum": 0.1,
                  "type": "number"
                },
                {
                  "type": "null"
                }
              ],
              "description": "CPU cores allocated to this container.",
              "title": "Cpu"
            },
            "gpu": {
              "anyOf": [
                {
                  "minimum": 0,
                  "type": "number"
                },
                {
                  "type": "null"
                }
              ],
              "description": "GPUs allocated to this container.",
              "title": "Gpu"
            },
            "memory": {
              "anyOf": [
                {
                  "pattern": "^\\s*(\\d*\\.?\\d+)\\s*(\\w+)?",
                  "type": "string"
                },
                {
                  "minimum": 0,
                  "type": "integer"
                },
                {
                  "type": "null"
                }
              ],
              "description": "RAM allocated to this container. Accepts a human-readable string with one of: B, KB, MB, GB (1000-based), e.g. '8GB', '512MB'. Also accepts raw byte integers.",
              "examples": [
                "8GB",
                "512MB"
              ],
              "title": "Memory"
            }
          },
          "title": "ResourceAllocation",
          "type": "object"
        },
        {
          "type": "null"
        }
      ],
      "description": "Resource allocation for this container. Required for multi-container groups."
    }
  },
  "required": [
    "name"
  ],
  "title": "ContainerOverride",
  "type": "object"
}

ContainerOverride

Properties

Name Type Required Restrictions Description
name string true Container name. Must match a container declared in the artifact group.
resourceAllocation any false Resource allocation for this container. Required for multi-container groups.

anyOf

Name Type Required Restrictions Description
» anonymous ResourceAllocation false Per-container resource allocation declared at runtime.

or

Name Type Required Restrictions Description
» anonymous null false none
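
The `memory` field of ResourceAllocation accepts either a raw byte integer or a string with a 1000-based unit. The Python sketch below shows one way a client might normalize both forms to bytes. It assumes units are case-insensitive and that a string without a unit means bytes; neither is stated by the schema, and the server's own parsing may differ.

```python
import re
from typing import Union

# 1000-based multipliers, per the field description.
_UNITS = {"B": 1, "KB": 1000, "MB": 1000**2, "GB": 1000**3}
# Same shape as the schema pattern, anchored at the end as well.
_MEMORY_RE = re.compile(r"^\s*(\d*\.?\d+)\s*(\w+)?\s*$")


def memory_to_bytes(value: Union[str, int]) -> int:
    """Convert a memory value such as '8GB' or 512000000 to bytes."""
    if isinstance(value, bool):
        raise TypeError("memory must be a string or an integer")
    if isinstance(value, int):
        if value < 0:
            raise ValueError("memory must not be negative")
        return value
    match = _MEMORY_RE.match(value)
    if match is None:
        raise ValueError(f"unrecognized memory value: {value!r}")
    number, unit = match.groups()
    # Assumption: a missing unit means bytes, and units are case-insensitive.
    unit = (unit or "B").upper()
    if unit not in _UNITS:
        raise ValueError(f"unsupported unit: {unit!r}")
    return int(float(number) * _UNITS[unit])


override = {"name": "web", "resourceAllocation": {"cpu": 0.5, "memory": "512MB"}}
memory_bytes = memory_to_bytes(override["resourceAllocation"]["memory"])
```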

ContainerStatus

{
  "description": "Lifecycle state of a container within a deployment replica.",
  "enum": [
    "running",
    "waiting",
    "terminated",
    "unknown"
  ],
  "title": "ContainerStatus",
  "type": "string"
}

ContainerStatus

Properties

Name Type Required Restrictions Description
ContainerStatus string false Lifecycle state of a container within a deployment replica.

Enumerated Values

Property Value
ContainerStatus [running, waiting, terminated, unknown]

ContainerStatusDetail

{
  "additionalProperties": false,
  "properties": {
    "image": {
      "title": "Image",
      "type": "string"
    },
    "name": {
      "title": "Name",
      "type": "string"
    },
    "ready": {
      "title": "Ready",
      "type": "boolean"
    },
    "restartCount": {
      "title": "Restartcount",
      "type": "integer"
    },
    "startedAt": {
      "anyOf": [
        {
          "type": "string"
        },
        {
          "type": "null"
        }
      ],
      "title": "Startedat"
    },
    "status": {
      "description": "Lifecycle state of a container within a deployment replica.",
      "enum": [
        "running",
        "waiting",
        "terminated",
        "unknown"
      ],
      "title": "ContainerStatus",
      "type": "string"
    }
  },
  "required": [
    "name",
    "status",
    "startedAt",
    "ready",
    "restartCount",
    "image"
  ],
  "title": "ContainerStatusDetail",
  "type": "object"
}

ContainerStatusDetail

Properties

Name Type Required Restrictions Description
image string true none
name string true none
ready boolean true none
restartCount integer true none
startedAt any true none

anyOf

Name Type Required Restrictions Description
» anonymous string false none

or

Name Type Required Restrictions Description
» anonymous null false none

continued

Name Type Required Restrictions Description
status ContainerStatus true Lifecycle state of a container within a deployment replica.
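
The ContainerStatus and ContainerStatusDetail schemas can be mirrored on the client to catch malformed payloads early. The sketch below is one possible representation, not an official client: the class names reuse the schema titles, while parse_container_status_detail and the sample values are hypothetical. It enforces the six required keys, the closed set of properties (additionalProperties is false), and the four enumerated status values.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Any, Mapping, Optional


class ContainerStatus(str, Enum):
    """Lifecycle state of a container within a deployment replica."""

    RUNNING = "running"
    WAITING = "waiting"
    TERMINATED = "terminated"
    UNKNOWN = "unknown"


@dataclass(frozen=True)
class ContainerStatusDetail:
    image: str
    name: str
    ready: bool
    restart_count: int
    started_at: Optional[str]  # string or null in the schema
    status: ContainerStatus


# Keys listed under "required" in the ContainerStatusDetail schema.
REQUIRED_KEYS = ("name", "status", "startedAt", "ready", "restartCount", "image")


def parse_container_status_detail(payload: Mapping[str, Any]) -> ContainerStatusDetail:
    """Validate a decoded JSON object and build a typed record (hypothetical helper)."""
    missing = [key for key in REQUIRED_KEYS if key not in payload]
    if missing:
        raise ValueError(f"missing required keys: {missing}")
    extra = sorted(set(payload) - set(REQUIRED_KEYS))
    if extra:
        raise ValueError(f"unexpected keys: {extra}")
    return ContainerStatusDetail(
        image=payload["image"],
        name=payload["name"],
        ready=payload["ready"],
        restart_count=payload["restartCount"],
        started_at=payload["startedAt"],
        # Raises ValueError for any value outside the enumerated set.
        status=ContainerStatus(payload["status"]),
    )


# Sample payload with made-up values, for illustration only.
sample = {
    "image": "registry.example.com/app:1.0",
    "name": "inference",
    "ready": False,
    "restartCount": 2,
    "startedAt": None,
    "status": "waiting",
}
detail = parse_container_status_detail(sample)
```

Note that startedAt is required but nullable, so a payload must include the key even when the container has not started.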

CreateWorkloadRequest

{
  "additionalProperties": false,
  "description": "Request to create a new workload.",
  "properties": {
    "artifact": {
      "anyOf": [
        {
          "additionalProperties": false,
          "description": "Request to create an artifact. An artifact is always created as a draft.",
          "properties": {
            "artifactRepositoryId": {
              "anyOf": [
                {
                  "type": "string"
                },
                {
                  "type": "null"
                }
              ],
              "description": "ID of the artifact repository this artifact belongs to (for versioning support).",
              "title": "Artifactrepositoryid"
            },
            "description": {
              "default": "",
              "description": "Description of the artifact.",
              "title": "Description",
              "type": "string"
            },
            "name": {
              "description": "Name of the artifact.",
              "maxLength": 5000,
              "minLength": 1,
              "title": "Name",
              "type": "string"
            },
            "spec": {
              "description": "Artifact specification.",
              "discriminator": {
                "mapping": {
                  "nim": "#/components/schemas/NimArtifactSpec",
                  "service": "#/components/schemas/ServiceArtifactSpec"
                },
                "propertyName": "type"
              },
              "oneOf": [
                {
                  "additionalProperties": false,
                  "properties": {
                    "containerGroups": {
                      "default": [],
                      "description": "List of container groups.",
                      "items": {
                        "additionalProperties": false,
                        "properties": {
                          "containers": {
                            "default": [],
                            "description": "List of containers making this container group.",
                            "items": {
                              "additionalProperties": false,
                              "properties": {
                                "build": {
                                  "anyOf": [
                                    {
                                      "additionalProperties": false,
                                      "description": "Build reference embedded in a container spec when an image build is triggered.",
                                      "properties": {
                                        "artifactImageBuildId": {
                                          "description": "Artifact image build ID.",
                                          "title": "Artifactimagebuildid",
                                          "type": "string"
                                        },
                                        "createdAt": {
                                          "description": "Build creation timestamp (UTC).",
                                          "format": "date-time",
                                          "title": "Createdat",
                                          "type": "string"
                                        },
                                        "status": {
                                          "description": "Image build reported status at submit time.",
                                          "title": "Status",
                                          "type": "string"
                                        }
                                      },
                                      "required": [
                                        "artifactImageBuildId",
                                        "status",
                                        "createdAt"
                                      ],
                                      "title": "ContainerBuildInfo",
                                      "type": "object"
                                    },
                                    {
                                      "type": "null"
                                    }
                                  ],
                                  "description": "Server-set image build metadata (e.g. after lock or draft build trigger). The workload API clears this on artifact create/update before persistence; clients must not rely on sending it."
                                },
                                "description": {
                                  "default": "",
                                  "description": "Description of the container.",
                                  "title": "Description",
                                  "type": "string"
                                },
                                "entrypoint": {
                                  "anyOf": [
                                    {
                                      "items": {
                                        "type": "string"
                                      },
                                      "type": "array"
                                    },
                                    {
                                      "type": "null"
                                    }
                                  ],
                                  "description": "Runtime entrypoint override for the container command. Independent of the build entrypoint.",
                                  "title": "Entrypoint"
                                },
                                "environmentVars": {
                                  "default": [],
                                  "description": "Environment variables.",
                                  "items": {
                                    "anyOf": [
                                      {
                                        "properties": {
                                          "name": {
                                            "description": "Name of the environment variable.",
                                            "title": "Name",
                                            "type": "string"
                                          },
                                          "source": {
                                            "const": "string",
                                            "default": "string",
                                            "title": "Source",
                                            "type": "string"
                                          },
                                          "value": {
                                            "description": "Value of the environment variable.",
                                            "title": "Value",
                                            "type": "string"
                                          }
                                        },
                                        "required": [
                                          "name",
                                          "value"
                                        ],
                                        "title": "StringEnvironmentVariable",
                                        "type": "object"
                                      },
                                      {
                                        "properties": {
                                          "drCredentialId": {
                                            "description": "ID of the DataRobot credential to use.",
                                            "title": "DR Credential ID",
                                            "type": "string"
                                          },
                                          "key": {
                                            "description": "Key within the credential.",
                                            "title": "Key",
                                            "type": "string"
                                          },
                                          "name": {
                                            "description": "Name of the environment variable.",
                                            "title": "Name",
                                            "type": "string"
                                          },
                                          "source": {
                                            "const": "dr-credential",
                                            "title": "Source",
                                            "type": "string"
                                          }
                                        },
                                        "required": [
                                          "source",
                                          "name",
                                          "drCredentialId",
                                          "key"
                                        ],
                                        "title": "CredentialEnvironmentVariable",
                                        "type": "object"
                                      },
                                      {
                                        "description": "A platform-managed DataRobot API token injected as an environment variable. The token value is resolved at proton creation (find-or-create a per-workload ``workload <workloadid>`` API key scoped to the invoking user); no value or ID is supplied by the user.",
                                        "properties": {
                                          "name": {
                                            "description": "Name of the environment variable.",
                                            "title": "Name",
                                            "type": "string"
                                          },
                                          "source": {
                                            "const": "dr-api-token",
                                            "title": "Source",
                                            "type": "string"
                                          }
                                        },
                                        "required": [
                                          "source",
                                          "name"
                                        ],
                                        "title": "DrApiTokenEnvironmentVariable",
                                        "type": "object"
                                      }
                                    ]
                                  },
                                  "title": "Environmentvars",
                                  "type": "array"
                                },
                                "imageBuildConfig": {
                                  "anyOf": [
                                    {
                                      "additionalProperties": false,
                                      "description": "User-provided configuration for server-side image builds from source code.",
                                      "properties": {
                                        "codeRef": {
                                          "anyOf": [
                                            {
                                              "additionalProperties": false,
                                              "properties": {
                                                "datarobot": {
                                                  "additionalProperties": false,
                                                  "properties": {
                                                    "catalogId": {
                                                      "title": "Catalogid",
                                                      "type": "string"
                                                    },
                                                    "catalogVersionId": {
                                                      "title": "Catalogversionid",
                                                      "type": "string"
                                                    }
                                                  },
                                                  "required": [
                                                    "catalogId",
                                                    "catalogVersionId"
                                                  ],
                                                  "title": "DataRobotCodeRef",
                                                  "type": "object"
                                                },
                                                "provider": {
                                                  "const": "datarobot",
                                                  "default": "datarobot",
                                                  "title": "Provider",
                                                  "type": "string"
                                                },
                                                "type": {
                                                  "const": "datarobot",
                                                  "default": "datarobot",
                                                  "title": "Type",
                                                  "type": "string"
                                                }
                                              },
                                              "required": [
                                                "datarobot"
                                              ],
                                              "title": "CodeRef",
                                              "type": "object"
                                            },
                                            {
                                              "type": "null"
                                            }
                                          ],
                                          "description": "Reference to source code (e.g. files API catalog). Optional at create time; required before build or lock."
                                        },
                                        "dockerfile": {
                                          "description": "How the Dockerfile is obtained. Defaults to using ./Dockerfile from the source code.",
                                          "discriminator": {
                                            "mapping": {
                                              "generated": "#/components/schemas/GeneratedDockerfile",
                                              "provided": "#/components/schemas/ProvidedDockerfile"
                                            },
                                            "propertyName": "source"
                                          },
                                          "oneOf": [
                                            {
                                              "additionalProperties": false,
                                              "description": "User supplies a Dockerfile in the uploaded source code.",
                                              "properties": {
                                                "path": {
                                                  "default": "./Dockerfile",
                                                  "description": "Relative path to the Dockerfile in the source code. Defaults to ./Dockerfile.",
                                                  "title": "Path",
                                                  "type": "string"
                                                },
                                                "source": {
                                                  "const": "provided",
                                                  "default": "provided",
                                                  "title": "Source",
                                                  "type": "string"
                                                }
                                              },
                                              "title": "ProvidedDockerfile",
                                              "type": "object"
                                            },
                                            {
                                              "additionalProperties": false,
                                              "description": "System generates a Dockerfile from execution environment metadata.",
                                              "properties": {
                                                "entrypoint": {
                                                  "description": "Entrypoint baked into the generated Dockerfile CMD (e.g. [\"python\", \"app.py\"]).",
                                                  "items": {
                                                    "type": "string"
                                                  },
                                                  "minItems": 1,
                                                  "title": "Entrypoint",
                                                  "type": "array"
                                                },
                                                "executionEnvironmentId": {
                                                  "description": "Execution environment ID used to resolve the base Docker image.",
                                                  "title": "Execution Environment ID",
                                                  "type": "string"
                                                },
                                                "executionEnvironmentVersionId": {
                                                  "description": "Execution environment version ID that pins the exact base image tag.",
                                                  "title": "Execution Environment Version ID",
                                                  "type": "string"
                                                },
                                                "source": {
                                                  "const": "generated",
                                                  "default": "generated",
                                                  "title": "Source",
                                                  "type": "string"
                                                }
                                              },
                                              "required": [
                                                "executionEnvironmentId",
                                                "executionEnvironmentVersionId",
                                                "entrypoint"
                                              ],
                                              "title": "GeneratedDockerfile",
                                              "type": "object"
                                            }
                                          ],
                                          "title": "Dockerfile"
                                        }
                                      },
                                      "title": "ImageBuildConfig",
                                      "type": "object"
                                    },
                                    {
                                      "type": "null"
                                    }
                                  ],
                                  "description": "Configuration for server-side image builds from source code."
                                },
                                "imageUri": {
                                  "anyOf": [
                                    {
                                      "type": "string"
                                    },
                                    {
                                      "type": "null"
                                    }
                                  ],
                                  "description": "Docker image URI. Required when imageBuildConfig is not set; server-populated after a successful image build.",
                                  "title": "Imageuri"
                                },
                                "livenessProbe": {
                                  "anyOf": [
                                    {
                                      "additionalProperties": false,
                                      "properties": {
                                        "failureThreshold": {
                                          "default": 3,
                                          "description": "Minimum consecutive failures for the probe to be considered failed.",
                                          "title": "Failurethreshold",
                                          "type": "integer"
                                        },
                                        "host": {
                                          "anyOf": [
                                            {
                                              "minLength": 0,
                                              "type": "string"
                                            },
                                            {
                                              "type": "null"
                                            }
                                          ],
                                          "description": "Host name to connect to; defaults to the pod IP.",
                                          "title": "Host"
                                        },
                                        "httpHeaders": {
                                          "additionalProperties": {
                                            "type": "string"
                                          },
                                          "description": "HTTP headers for probe.",
                                          "title": "Httpheaders",
                                          "type": "object"
                                        },
                                        "initialDelaySeconds": {
                                          "default": 30,
                                          "description": "Number of seconds to wait before the first probe is executed.",
                                          "title": "Initialdelayseconds",
                                          "type": "integer"
                                        },
                                        "path": {
                                          "description": "URL path to query for the health check.",
                                          "title": "Path",
                                          "type": "string"
                                        },
                                        "periodSeconds": {
                                          "default": 30,
                                          "description": "How often (in seconds) to perform the probe.",
                                          "title": "Periodseconds",
                                          "type": "integer"
                                        },
                                        "port": {
                                          "default": 8080,
                                          "description": "Port number to access on the container.",
                                          "maximum": 65535,
                                          "minimum": 1,
                                          "title": "Port",
                                          "type": "integer"
                                        },
                                        "scheme": {
                                          "default": "HTTP",
                                          "description": "Scheme to use for connecting to the host.",
                                          "enum": [
                                            "HTTP",
                                            "HTTPS"
                                          ],
                                          "title": "Scheme",
                                          "type": "string"
                                        },
                                        "timeoutSeconds": {
                                          "default": 30,
                                          "description": "Number of seconds after which the probe times out.",
                                          "title": "Timeoutseconds",
                                          "type": "integer"
                                        }
                                      },
                                      "required": [
                                        "path"
                                      ],
                                      "title": "ProbeConfig",
                                      "type": "object"
                                    },
                                    {
                                      "type": "null"
                                    }
                                  ],
                                  "description": "Container liveness check configuration."
                                },
                                "name": {
                                  "anyOf": [
                                    {
                                      "type": "string"
                                    },
                                    {
                                      "type": "null"
                                    }
                                  ],
                                  "description": "Name of the container. Lowercase letters, digits, and hyphens only; must start with a lowercase letter and end with a letter or digit; max 63 characters.",
                                  "title": "Name"
                                },
                                "port": {
                                  "anyOf": [
                                    {
                                      "maximum": 65535,
                                      "minimum": 1024,
                                      "type": "integer"
                                    },
                                    {
                                      "type": "null"
                                    }
                                  ],
                                  "description": "Container access port. When set, must be >= 1024 for security and platform compatibility reasons. Primary containers must define a port; non-primary containers must omit it.",
                                  "title": "Port"
                                },
                                "primary": {
                                  "anyOf": [
                                    {
                                      "type": "boolean"
                                    },
                                    {
                                      "type": "null"
                                    }
                                  ],
                                  "default": false,
                                  "description": "Whether this is the primary container.",
                                  "title": "Primary"
                                },
                                "readinessProbe": {
                                  "anyOf": [
                                    {
                                      "additionalProperties": false,
                                      "properties": {
                                        "failureThreshold": {
                                          "default": 3,
                                          "description": "Minimum consecutive failures for the probe to be considered failed.",
                                          "title": "Failurethreshold",
                                          "type": "integer"
                                        },
                                        "host": {
                                          "anyOf": [
                                            {
                                              "minLength": 0,
                                              "type": "string"
                                            },
                                            {
                                              "type": "null"
                                            }
                                          ],
                                          "description": "Host name to connect to; defaults to the pod IP.",
                                          "title": "Host"
                                        },
                                        "httpHeaders": {
                                          "additionalProperties": {
                                            "type": "string"
                                          },
                                          "description": "HTTP headers for probe.",
                                          "title": "Httpheaders",
                                          "type": "object"
                                        },
                                        "initialDelaySeconds": {
                                          "default": 30,
                                          "description": "Number of seconds to wait before the first probe is executed.",
                                          "title": "Initialdelayseconds",
                                          "type": "integer"
                                        },
                                        "path": {
                                          "description": "URL path to query for the health check.",
                                          "title": "Path",
                                          "type": "string"
                                        },
                                        "periodSeconds": {
                                          "default": 30,
                                          "description": "How often (in seconds) to perform the probe.",
                                          "title": "Periodseconds",
                                          "type": "integer"
                                        },
                                        "port": {
                                          "default": 8080,
                                          "description": "Port number to access on the container.",
                                          "maximum": 65535,
                                          "minimum": 1,
                                          "title": "Port",
                                          "type": "integer"
                                        },
                                        "scheme": {
                                          "default": "HTTP",
                                          "description": "Scheme to use for connecting to the host.",
                                          "enum": [
                                            "HTTP",
                                            "HTTPS"
                                          ],
                                          "title": "Scheme",
                                          "type": "string"
                                        },
                                        "timeoutSeconds": {
                                          "default": 30,
                                          "description": "Number of seconds after which the probe times out.",
                                          "title": "Timeoutseconds",
                                          "type": "integer"
                                        }
                                      },
                                      "required": [
                                        "path"
                                      ],
                                      "title": "ProbeConfig",
                                      "type": "object"
                                    },
                                    {
                                      "type": "null"
                                    }
                                  ],
                                  "description": "Container readiness check configuration."
                                },
                                "securityContext": {
                                  "anyOf": [
                                    {
                                      "additionalProperties": false,
                                      "description": "Container-level security context. lets workload creators tighten security constraints beyond the platform defaults. runasnonroot and runasuser are enforced by the platform and are not user-settable. elevated fields (capabilities.add, allowprivilegeescalation=true, seccompprofile.type=unconfined) require the mlops admin role; regular users may only tighten defaults — drop capabilities, enable read-only rootfs, or set a runtimedefault/localhost seccomp profile.",
                                      "properties": {
                                        "allowPrivilegeEscalation": {
                                          "anyOf": [
                                            {
                                              "type": "boolean"
                                            },
                                            {
                                              "type": "null"
                                            }
                                          ],
                                          "description": "Whether a process can gain more privileges than its parent. requires the mlops admin role to set to true.",
                                          "title": "Allowprivilegeescalation"
                                        },
                                        "capabilities": {
                                          "anyOf": [
                                            {
                                              "additionalProperties": false,
                                              "description": "Linux capabilities to add or drop from the container.",
                                              "properties": {
                                                "add": {
                                                  "anyOf": [
                                                    {
                                                      "items": {
                                                        "type": "string"
                                                      },
                                                      "type": "array"
                                                    },
                                                    {
                                                      "type": "null"
                                                    }
                                                  ],
                                                  "description": "Capabilities to add.",
                                                  "title": "Add"
                                                },
                                                "drop": {
                                                  "anyOf": [
                                                    {
                                                      "items": {
                                                        "type": "string"
                                                      },
                                                      "type": "array"
                                                    },
                                                    {
                                                      "type": "null"
                                                    }
                                                  ],
                                                  "description": "Capabilities to drop.",
                                                  "title": "Drop"
                                                }
                                              },
                                              "title": "Capabilities",
                                              "type": "object"
                                            },
                                            {
                                              "type": "null"
                                            }
                                          ],
                                          "description": "Linux capabilities to add or drop."
                                        },
                                        "readOnlyRootFilesystem": {
                                          "anyOf": [
                                            {
                                              "type": "boolean"
                                            },
                                            {
                                              "type": "null"
                                            }
                                          ],
                                          "description": "Whether the root filesystem is read-only.",
                                          "title": "Readonlyrootfilesystem"
                                        },
                                        "seccompProfile": {
                                          "anyOf": [
                                            {
                                              "additionalProperties": false,
                                              "description": "Seccomp profile configuration.",
                                              "properties": {
                                                "localhostProfile": {
                                                  "anyOf": [
                                                    {
                                                      "type": "string"
                                                    },
                                                    {
                                                      "type": "null"
                                                    }
                                                  ],
                                                  "description": "Path to a seccomp profile on the node. only valid when type is localhost.",
                                                  "title": "Localhostprofile"
                                                },
                                                "type": {
                                                  "description": "Allowed seccomp profile types.",
                                                  "enum": [
                                                    "RuntimeDefault",
                                                    "Unconfined",
                                                    "Localhost"
                                                  ],
                                                  "title": "SeccompProfileType",
                                                  "type": "string"
                                                }
                                              },
                                              "required": [
                                                "type"
                                              ],
                                              "title": "SeccompProfile",
                                              "type": "object"
                                            },
                                            {
                                              "type": "null"
                                            }
                                          ],
                                          "description": "Seccomp profile for the container."
                                        }
                                      },
                                      "title": "SecurityContext",
                                      "type": "object"
                                    },
                                    {
                                      "type": "null"
                                    }
                                  ],
                                  "description": "Container security context."
                                },
                                "startupProbe": {
                                  "anyOf": [
                                    {
                                      "additionalProperties": false,
                                      "properties": {
                                        "failureThreshold": {
                                          "default": 3,
                                          "description": "Minimum consecutive failures for the probe to be considered failed.",
                                          "title": "Failurethreshold",
                                          "type": "integer"
                                        },
                                        "host": {
                                          "anyOf": [
                                            {
                                              "minLength": 0,
                                              "type": "string"
                                            },
                                            {
                                              "type": "null"
                                            }
                                          ],
                                          "description": "Host name to connect to, defaults to the pod ip.",
                                          "title": "Host"
                                        },
                                        "httpHeaders": {
                                          "additionalProperties": {
                                            "type": "string"
                                          },
                                          "description": "HTTP headers for probe.",
                                          "title": "Httpheaders",
                                          "type": "object"
                                        },
                                        "initialDelaySeconds": {
                                          "default": 30,
                                          "description": "Number of seconds to wait before the first probe is executed.",
                                          "title": "Initialdelayseconds",
                                          "type": "integer"
                                        },
                                        "path": {
                                          "description": "Url path to query for health check.",
                                          "title": "Path",
                                          "type": "string"
                                        },
                                        "periodSeconds": {
                                          "default": 30,
                                          "description": "How often (in seconds) to perform the probe.",
                                          "title": "Periodseconds",
                                          "type": "integer"
                                        },
                                        "port": {
                                          "default": 8080,
                                          "description": "Port number to access on the container.",
                                          "maximum": 65535,
                                          "minimum": 1,
                                          "title": "Port",
                                          "type": "integer"
                                        },
                                        "scheme": {
                                          "default": "HTTP",
                                          "description": "Scheme to use for connecting to the host.",
                                          "enum": [
                                            "HTTP",
                                            "HTTPS"
                                          ],
                                          "title": "Scheme",
                                          "type": "string"
                                        },
                                        "timeoutSeconds": {
                                          "default": 30,
                                          "description": "Number of seconds after which the probe times out.",
                                          "title": "Timeoutseconds",
                                          "type": "integer"
                                        }
                                      },
                                      "required": [
                                        "path"
                                      ],
                                      "title": "ProbeConfig",
                                      "type": "object"
                                    },
                                    {
                                      "type": "null"
                                    }
                                  ],
                                  "description": "Container startup check configuration."
                                }
                              },
                              "title": "Container",
                              "type": "object"
                            },
                            "title": "Containers",
                            "type": "array"
                          },
                          "name": {
                            "default": "default",
                            "description": "Name of the container group. used as the lookup key for runtime overrides. lowercase letters, digits, and hyphens only; must start with a lowercase letter and end with a letter or digit; max 63 characters.",
                            "title": "Name",
                            "type": "string"
                          }
                        },
                        "title": "ContainerGroup",
                        "type": "object"
                      },
                      "title": "Containergroups",
                      "type": "array"
                    },
                    "type": {
                      "const": "service",
                      "default": "service",
                      "description": "Artifact type discriminator. injected automatically from the top-level `type` field — do not set this directly.",
                      "title": "Type",
                      "type": "string"
                    }
                  },
                  "title": "ServiceArtifactSpec",
                  "type": "object"
                },
                {
                  "additionalProperties": false,
                  "properties": {
                    "containerGroups": {
                      "default": [],
                      "description": "List of container groups.",
                      "items": {
                        "additionalProperties": false,
                        "properties": {
                          "containers": {
                            "default": [],
                            "description": "List of containers making this container group.",
                            "items": {
                              "additionalProperties": false,
                              "properties": {
                                "build": {
                                  "anyOf": [
                                    {
                                      "additionalProperties": false,
                                      "description": "Build reference embedded in a container spec when an image build is triggered.",
                                      "properties": {
                                        "artifactImageBuildId": {
                                          "description": "Artifact image build id.",
                                          "title": "Artifactimagebuildid",
                                          "type": "string"
                                        },
                                        "createdAt": {
                                          "description": "Build creation timestamp (utc).",
                                          "format": "date-time",
                                          "title": "Createdat",
                                          "type": "string"
                                        },
                                        "status": {
                                          "description": "Image build reported status at submit time.",
                                          "title": "Status",
                                          "type": "string"
                                        }
                                      },
                                      "required": [
                                        "artifactImageBuildId",
                                        "status",
                                        "createdAt"
                                      ],
                                      "title": "ContainerBuildInfo",
                                      "type": "object"
                                    },
                                    {
                                      "type": "null"
                                    }
                                  ],
                                  "description": "Server-set image build metadata (e.g. after lock or draft build trigger). workload API clears this on artifact create/update before persistence; clients must not rely on sending it."
                                },
                                "description": {
                                  "default": "",
                                  "description": "Description of the container.",
                                  "title": "Description",
                                  "type": "string"
                                },
                                "entrypoint": {
                                  "anyOf": [
                                    {
                                      "items": {
                                        "type": "string"
                                      },
                                      "type": "array"
                                    },
                                    {
                                      "type": "null"
                                    }
                                  ],
                                  "description": "Runtime entrypoint override for the container command. independent of build entrypoint.",
                                  "title": "Entrypoint"
                                },
                                "environmentVars": {
                                  "default": [],
                                  "description": "Environment variables.",
                                  "items": {
                                    "anyOf": [
                                      {
                                        "properties": {
                                          "name": {
                                            "description": "Name of the environment variable.",
                                            "title": "Name",
                                            "type": "string"
                                          },
                                          "source": {
                                            "const": "string",
                                            "default": "string",
                                            "title": "Source",
                                            "type": "string"
                                          },
                                          "value": {
                                            "description": "Value of the environment variable.",
                                            "title": "Value",
                                            "type": "string"
                                          }
                                        },
                                        "required": [
                                          "name",
                                          "value"
                                        ],
                                        "title": "StringEnvironmentVariable",
                                        "type": "object"
                                      },
                                      {
                                        "properties": {
                                          "drCredentialId": {
                                            "description": "Id of the datarobot credential to use.",
                                            "title": "DR Credential ID",
                                            "type": "string"
                                          },
                                          "key": {
                                            "description": "Key within the credential.",
                                            "title": "Key",
                                            "type": "string"
                                          },
                                          "name": {
                                            "description": "Name of the environment variable.",
                                            "title": "Name",
                                            "type": "string"
                                          },
                                          "source": {
                                            "const": "dr-credential",
                                            "title": "Source",
                                            "type": "string"
                                          }
                                        },
                                        "required": [
                                          "source",
                                          "name",
                                          "drCredentialId",
                                          "key"
                                        ],
                                        "title": "CredentialEnvironmentVariable",
                                        "type": "object"
                                      },
                                      {
                                        "description": "A platform-managed datarobot API token injected as an environment variable. the token value is resolved at proton creation (find-or-create a per-workload ``workload <workloadid>`` API key scoped to the invoking user); no value or id is supplied by the user.",
                                        "properties": {
                                          "name": {
                                            "description": "Name of the environment variable.",
                                            "title": "Name",
                                            "type": "string"
                                          },
                                          "source": {
                                            "const": "dr-api-token",
                                            "title": "Source",
                                            "type": "string"
                                          }
                                        },
                                        "required": [
                                          "source",
                                          "name"
                                        ],
                                        "title": "DrApiTokenEnvironmentVariable",
                                        "type": "object"
                                      }
                                    ]
                                  },
                                  "title": "Environmentvars",
                                  "type": "array"
                                },
                                "imageBuildConfig": {
                                  "anyOf": [
                                    {
                                      "additionalProperties": false,
                                      "description": "User-provided configuration for server-side image builds from source code.",
                                      "properties": {
                                        "codeRef": {
                                          "anyOf": [
                                            {
                                              "additionalProperties": false,
                                              "properties": {
                                                "datarobot": {
                                                  "additionalProperties": false,
                                                  "properties": {
                                                    "catalogId": {
                                                      "title": "Catalogid",
                                                      "type": "string"
                                                    },
                                                    "catalogVersionId": {
                                                      "title": "Catalogversionid",
                                                      "type": "string"
                                                    }
                                                  },
                                                  "required": [
                                                    "catalogId",
                                                    "catalogVersionId"
                                                  ],
                                                  "title": "DataRobotCodeRef",
                                                  "type": "object"
                                                },
                                                "provider": {
                                                  "const": "datarobot",
                                                  "default": "datarobot",
                                                  "title": "Provider",
                                                  "type": "string"
                                                },
                                                "type": {
                                                  "const": "datarobot",
                                                  "default": "datarobot",
                                                  "title": "Type",
                                                  "type": "string"
                                                }
                                              },
                                              "required": [
                                                "datarobot"
                                              ],
                                              "title": "CodeRef",
                                              "type": "object"
                                            },
                                            {
                                              "type": "null"
                                            }
                                          ],
                                          "description": "Reference to source code (e.g. files API catalog). optional at create time; required before build or lock."
                                        },
                                        "dockerfile": {
                                          "description": "How the dockerfile is obtained. defaults to using ./dockerfile from the source code.",
                                          "discriminator": {
                                            "mapping": {
                                              "generated": "#/components/schemas/GeneratedDockerfile",
                                              "provided": "#/components/schemas/ProvidedDockerfile"
                                            },
                                            "propertyName": "source"
                                          },
                                          "oneOf": [
                                            {
                                              "additionalProperties": false,
                                              "description": "User supplies a dockerfile in the uploaded source code.",
                                              "properties": {
                                                "path": {
                                                  "default": "./Dockerfile",
                                                  "description": "Relative path to the dockerfile in the source code. defaults to ./dockerfile.",
                                                  "title": "Path",
                                                  "type": "string"
                                                },
                                                "source": {
                                                  "const": "provided",
                                                  "default": "provided",
                                                  "title": "Source",
                                                  "type": "string"
                                                }
                                              },
                                              "title": "ProvidedDockerfile",
                                              "type": "object"
                                            },
                                            {
                                              "additionalProperties": false,
                                              "description": "System generates a dockerfile from execution environment metadata.",
                                              "properties": {
                                                "entrypoint": {
                                                  "description": "Entrypoint baked into the generated dockerfile cmd (e.g. [\"python\", \"app.py\"]).",
                                                  "items": {
                                                    "type": "string"
                                                  },
                                                  "minItems": 1,
                                                  "title": "Entrypoint",
                                                  "type": "array"
                                                },
                                                "executionEnvironmentId": {
                                                  "description": "Execution environment id used to resolve the base Docker image.",
                                                  "title": "Execution Environment ID",
                                                  "type": "string"
                                                },
                                                "executionEnvironmentVersionId": {
                                                  "description": "Execution environment version id that pins the exact base image tag.",
                                                  "title": "Execution Environment Version ID",
                                                  "type": "string"
                                                },
                                                "source": {
                                                  "const": "generated",
                                                  "default": "generated",
                                                  "title": "Source",
                                                  "type": "string"
                                                }
                                              },
                                              "required": [
                                                "executionEnvironmentId",
                                                "executionEnvironmentVersionId",
                                                "entrypoint"
                                              ],
                                              "title": "GeneratedDockerfile",
                                              "type": "object"
                                            }
                                          ],
                                          "title": "Dockerfile"
                                        }
                                      },
                                      "title": "ImageBuildConfig",
                                      "type": "object"
                                    },
                                    {
                                      "type": "null"
                                    }
                                  ],
                                  "description": "Configuration for server-side image builds from source code."
                                },
                                "imageUri": {
                                  "anyOf": [
                                    {
                                      "type": "string"
                                    },
                                    {
                                      "type": "null"
                                    }
                                  ],
                                  "description": "Docker image uri. required when imagebuildconfig is not set; server-populated after a successful image build.",
                                  "title": "Imageuri"
                                },
                                "livenessProbe": {
                                  "anyOf": [
                                    {
                                      "additionalProperties": false,
                                      "properties": {
                                        "failureThreshold": {
                                          "default": 3,
                                          "description": "Minimum consecutive failures for the probe to be considered failed.",
                                          "title": "Failurethreshold",
                                          "type": "integer"
                                        },
                                        "host": {
                                          "anyOf": [
                                            {
                                              "minLength": 0,
                                              "type": "string"
                                            },
                                            {
                                              "type": "null"
                                            }
                                          ],
                                          "description": "Host name to connect to, defaults to the pod ip.",
                                          "title": "Host"
                                        },
                                        "httpHeaders": {
                                          "additionalProperties": {
                                            "type": "string"
                                          },
                                          "description": "HTTP headers for probe.",
                                          "title": "Httpheaders",
                                          "type": "object"
                                        },
                                        "initialDelaySeconds": {
                                          "default": 30,
                                          "description": "Number of seconds to wait before the first probe is executed.",
                                          "title": "Initialdelayseconds",
                                          "type": "integer"
                                        },
                                        "path": {
                                          "description": "Url path to query for health check.",
                                          "title": "Path",
                                          "type": "string"
                                        },
                                        "periodSeconds": {
                                          "default": 30,
                                          "description": "How often (in seconds) to perform the probe.",
                                          "title": "Periodseconds",
                                          "type": "integer"
                                        },
                                        "port": {
                                          "default": 8080,
                                          "description": "Port number to access on the container.",
                                          "maximum": 65535,
                                          "minimum": 1,
                                          "title": "Port",
                                          "type": "integer"
                                        },
                                        "scheme": {
                                          "default": "HTTP",
                                          "description": "Scheme to use for connecting to the host.",
                                          "enum": [
                                            "HTTP",
                                            "HTTPS"
                                          ],
                                          "title": "Scheme",
                                          "type": "string"
                                        },
                                        "timeoutSeconds": {
                                          "default": 30,
                                          "description": "Number of seconds after which the probe times out.",
                                          "title": "Timeoutseconds",
                                          "type": "integer"
                                        }
                                      },
                                      "required": [
                                        "path"
                                      ],
                                      "title": "ProbeConfig",
                                      "type": "object"
                                    },
                                    {
                                      "type": "null"
                                    }
                                  ],
                                  "description": "Container liveness check configuration."
                                },
                                "name": {
                                  "anyOf": [
                                    {
                                      "type": "string"
                                    },
                                    {
                                      "type": "null"
                                    }
                                  ],
                                  "description": "Name of the container. lowercase letters, digits, and hyphens only; must start with a lowercase letter and end with a letter or digit; max 63 characters.",
                                  "title": "Name"
                                },
                                "port": {
                                  "anyOf": [
                                    {
                                      "maximum": 65535,
                                      "minimum": 1024,
                                      "type": "integer"
                                    },
                                    {
                                      "type": "null"
                                    }
                                  ],
                                  "description": "Container access port. when set, must be >= 1024 for security and platform compatibility reasons. primary containers must define a port; non-primary containers must omit it.",
                                  "title": "Port"
                                },
                                "primary": {
                                  "anyOf": [
                                    {
                                      "type": "boolean"
                                    },
                                    {
                                      "type": "null"
                                    }
                                  ],
                                  "default": false,
                                  "description": "Whether this is the primary container.",
                                  "title": "Primary"
                                },
                                "readinessProbe": {
                                  "anyOf": [
                                    {
                                      "additionalProperties": false,
                                      "properties": {
                                        "failureThreshold": {
                                          "default": 3,
                                          "description": "Minimum consecutive failures for the probe to be considered failed.",
                                          "title": "Failurethreshold",
                                          "type": "integer"
                                        },
                                        "host": {
                                          "anyOf": [
                                            {
                                              "minLength": 0,
                                              "type": "string"
                                            },
                                            {
                                              "type": "null"
                                            }
                                          ],
                                          "description": "Host name to connect to, defaults to the pod ip.",
                                          "title": "Host"
                                        },
                                        "httpHeaders": {
                                          "additionalProperties": {
                                            "type": "string"
                                          },
                                          "description": "HTTP headers for probe.",
                                          "title": "Httpheaders",
                                          "type": "object"
                                        },
                                        "initialDelaySeconds": {
                                          "default": 30,
                                          "description": "Number of seconds to wait before the first probe is executed.",
                                          "title": "Initialdelayseconds",
                                          "type": "integer"
                                        },
                                        "path": {
                                          "description": "Url path to query for health check.",
                                          "title": "Path",
                                          "type": "string"
                                        },
                                        "periodSeconds": {
                                          "default": 30,
                                          "description": "How often (in seconds) to perform the probe.",
                                          "title": "Periodseconds",
                                          "type": "integer"
                                        },
                                        "port": {
                                          "default": 8080,
                                          "description": "Port number to access on the container.",
                                          "maximum": 65535,
                                          "minimum": 1,
                                          "title": "Port",
                                          "type": "integer"
                                        },
                                        "scheme": {
                                          "default": "HTTP",
                                          "description": "Scheme to use for connecting to the host.",
                                          "enum": [
                                            "HTTP",
                                            "HTTPS"
                                          ],
                                          "title": "Scheme",
                                          "type": "string"
                                        },
                                        "timeoutSeconds": {
                                          "default": 30,
                                          "description": "Number of seconds after which the probe times out.",
                                          "title": "Timeoutseconds",
                                          "type": "integer"
                                        }
                                      },
                                      "required": [
                                        "path"
                                      ],
                                      "title": "ProbeConfig",
                                      "type": "object"
                                    },
                                    {
                                      "type": "null"
                                    }
                                  ],
                                  "description": "Container readiness check configuration."
                                },
                                "securityContext": {
                                  "anyOf": [
                                    {
                                      "additionalProperties": false,
                                      "description": "Container-level security context. lets workload creators tighten security constraints beyond the platform defaults. runasnonroot and runasuser are enforced by the platform and are not user-settable. elevated fields (capabilities.add, allowprivilegeescalation=true, seccompprofile.type=unconfined) require the mlops admin role; regular users may only tighten defaults — drop capabilities, enable read-only rootfs, or set a runtimedefault/localhost seccomp profile.",
                                      "properties": {
                                        "allowPrivilegeEscalation": {
                                          "anyOf": [
                                            {
                                              "type": "boolean"
                                            },
                                            {
                                              "type": "null"
                                            }
                                          ],
                                          "description": "Whether a process can gain more privileges than its parent. requires the mlops admin role to set to true.",
                                          "title": "Allowprivilegeescalation"
                                        },
                                        "capabilities": {
                                          "anyOf": [
                                            {
                                              "additionalProperties": false,
                                              "description": "Linux capabilities to add or drop from the container.",
                                              "properties": {
                                                "add": {
                                                  "anyOf": [
                                                    {
                                                      "items": {
                                                        "type": "string"
                                                      },
                                                      "type": "array"
                                                    },
                                                    {
                                                      "type": "null"
                                                    }
                                                  ],
                                                  "description": "Capabilities to add.",
                                                  "title": "Add"
                                                },
                                                "drop": {
                                                  "anyOf": [
                                                    {
                                                      "items": {
                                                        "type": "string"
                                                      },
                                                      "type": "array"
                                                    },
                                                    {
                                                      "type": "null"
                                                    }
                                                  ],
                                                  "description": "Capabilities to drop.",
                                                  "title": "Drop"
                                                }
                                              },
                                              "title": "Capabilities",
                                              "type": "object"
                                            },
                                            {
                                              "type": "null"
                                            }
                                          ],
                                          "description": "Linux capabilities to add or drop."
                                        },
                                        "readOnlyRootFilesystem": {
                                          "anyOf": [
                                            {
                                              "type": "boolean"
                                            },
                                            {
                                              "type": "null"
                                            }
                                          ],
                                          "description": "Whether the root filesystem is read-only.",
                                          "title": "Readonlyrootfilesystem"
                                        },
                                        "seccompProfile": {
                                          "anyOf": [
                                            {
                                              "additionalProperties": false,
                                              "description": "Seccomp profile configuration.",
                                              "properties": {
                                                "localhostProfile": {
                                                  "anyOf": [
                                                    {
                                                      "type": "string"
                                                    },
                                                    {
                                                      "type": "null"
                                                    }
                                                  ],
                                                  "description": "Path to a seccomp profile on the node. only valid when type is localhost.",
                                                  "title": "Localhostprofile"
                                                },
                                                "type": {
                                                  "description": "Allowed seccomp profile types.",
                                                  "enum": [
                                                    "RuntimeDefault",
                                                    "Unconfined",
                                                    "Localhost"
                                                  ],
                                                  "title": "SeccompProfileType",
                                                  "type": "string"
                                                }
                                              },
                                              "required": [
                                                "type"
                                              ],
                                              "title": "SeccompProfile",
                                              "type": "object"
                                            },
                                            {
                                              "type": "null"
                                            }
                                          ],
                                          "description": "Seccomp profile for the container."
                                        }
                                      },
                                      "title": "SecurityContext",
                                      "type": "object"
                                    },
                                    {
                                      "type": "null"
                                    }
                                  ],
                                  "description": "Container security context."
                                },
                                "startupProbe": {
                                  "anyOf": [
                                    {
                                      "additionalProperties": false,
                                      "properties": {
                                        "failureThreshold": {
                                          "default": 3,
                                          "description": "Minimum consecutive failures for the probe to be considered failed.",
                                          "title": "Failurethreshold",
                                          "type": "integer"
                                        },
                                        "host": {
                                          "anyOf": [
                                            {
                                              "minLength": 0,
                                              "type": "string"
                                            },
                                            {
                                              "type": "null"
                                            }
                                          ],
                                          "description": "Host name to connect to, defaults to the pod ip.",
                                          "title": "Host"
                                        },
                                        "httpHeaders": {
                                          "additionalProperties": {
                                            "type": "string"
                                          },
                                          "description": "HTTP headers for probe.",
                                          "title": "Httpheaders",
                                          "type": "object"
                                        },
                                        "initialDelaySeconds": {
                                          "default": 30,
                                          "description": "Number of seconds to wait before the first probe is executed.",
                                          "title": "Initialdelayseconds",
                                          "type": "integer"
                                        },
                                        "path": {
                                          "description": "Url path to query for health check.",
                                          "title": "Path",
                                          "type": "string"
                                        },
                                        "periodSeconds": {
                                          "default": 30,
                                          "description": "How often (in seconds) to perform the probe.",
                                          "title": "Periodseconds",
                                          "type": "integer"
                                        },
                                        "port": {
                                          "default": 8080,
                                          "description": "Port number to access on the container.",
                                          "maximum": 65535,
                                          "minimum": 1,
                                          "title": "Port",
                                          "type": "integer"
                                        },
                                        "scheme": {
                                          "default": "HTTP",
                                          "description": "Scheme to use for connecting to the host.",
                                          "enum": [
                                            "HTTP",
                                            "HTTPS"
                                          ],
                                          "title": "Scheme",
                                          "type": "string"
                                        },
                                        "timeoutSeconds": {
                                          "default": 30,
                                          "description": "Number of seconds after which the probe times out.",
                                          "title": "Timeoutseconds",
                                          "type": "integer"
                                        }
                                      },
                                      "required": [
                                        "path"
                                      ],
                                      "title": "ProbeConfig",
                                      "type": "object"
                                    },
                                    {
                                      "type": "null"
                                    }
                                  ],
                                  "description": "Container startup check configuration."
                                }
                              },
                              "title": "Container",
                              "type": "object"
                            },
                            "title": "Containers",
                            "type": "array"
                          },
                          "name": {
                            "default": "default",
                            "description": "Name of the container group. used as the lookup key for runtime overrides. lowercase letters, digits, and hyphens only; must start with a lowercase letter and end with a letter or digit; max 63 characters.",
                            "title": "Name",
                            "type": "string"
                          }
                        },
                        "title": "ContainerGroup",
                        "type": "object"
                      },
                      "title": "Containergroups",
                      "type": "array"
                    },
                    "storage": {
                      "anyOf": [
                        {
                          "additionalProperties": false,
                          "description": "Model weight storage configuration for nim artifacts.",
                          "properties": {
                            "mode": {
                              "default": "dedicatedPvc",
                              "description": "Storage mode for model weights. `dedicatedpvc` (default) provisions a separate pvc owned exclusively by this workload. `nimcache` reuses a single cluster-wide pvc per model image, shared across all workloads using the same model.",
                              "enum": [
                                "dedicatedPvc",
                                "nimCache"
                              ],
                              "title": "Mode",
                              "type": "string"
                            },
                            "pvcSize": {
                              "anyOf": [
                                {
                                  "pattern": "^\\d+(\\.\\d+)?(Gi|Mi|Ti)$",
                                  "type": "string"
                                },
                                {
                                  "type": "null"
                                }
                              ],
                              "description": "PVC size for dedicated storage (e.g. '150Gi'). Only applies when mode is `dedicatedPvc`. When omitted, the platform-configured default is used.",
                              "title": "Pvcsize"
                            }
                          },
                          "title": "NimStorageConfig",
                          "type": "object"
                        },
                        {
                          "type": "null"
                        }
                      ],
                      "description": "Model weight storage configuration. When omitted, defaults to a dedicated per-workload PVC provisioned exclusively for this workload."
                    },
                    "templateId": {
                      "anyOf": [
                        {
                          "type": "string"
                        },
                        {
                          "type": "null"
                        }
                      ],
                      "description": "ID of the template used to create this NIM artifact.",
                      "title": "Templateid"
                    },
                    "type": {
                      "const": "nim",
                      "default": "nim",
                      "description": "Artifact type discriminator. Injected automatically from the top-level `type` field; do not set this directly.",
                      "title": "Type",
                      "type": "string"
                    }
                  },
                  "title": "NimArtifactSpec",
                  "type": "object"
                }
              ],
              "title": "Spec"
            },
            "status": {
              "enum": [
                "draft",
                "locked"
              ],
              "title": "ArtifactStatus",
              "type": "string"
            },
            "type": {
              "description": "Discriminator for the artifact spec variant. Used to label the workload, which may be used to prioritize the best matching operator available in the cluster for scheduling. Defaults to ``service`` when omitted. - ``service``: generic service artifact. - ``nim``: NVIDIA NIM model artifact.",
              "enum": [
                "service",
                "nim"
              ],
              "title": "ArtifactType",
              "type": "string"
            }
          },
          "required": [
            "name",
            "spec"
          ],
          "title": "InputArtifact",
          "type": "object"
        },
        {
          "type": "null"
        }
      ],
      "description": "Inline artifact spec to create and deploy in one step.",
      "title": "Artifact"
    },
    "artifactId": {
      "anyOf": [
        {
          "type": "string"
        },
        {
          "type": "null"
        }
      ],
      "description": "ID of an existing artifact to deploy.",
      "title": "Artifact ID"
    },
    "description": {
      "anyOf": [
        {
          "type": "string"
        },
        {
          "type": "null"
        }
      ],
      "description": "Workload description.",
      "title": "Description"
    },
    "importance": {
      "description": "Importance level for workloads.",
      "enum": [
        "critical",
        "high",
        "moderate",
        "low"
      ],
      "title": "WorkloadImportance",
      "type": "string"
    },
    "name": {
      "description": "Workload name.",
      "maxLength": 5000,
      "minLength": 1,
      "title": "Name",
      "type": "string"
    },
    "runtime": {
      "additionalProperties": false,
      "description": "Runtime configuration for a workload. For service and NIM artifacts, all configuration is scoped inside ``containerGroups``, each identified by name matching the artifact topology.",
      "properties": {
        "containerGroups": {
          "description": "Per-group runtime configuration. Each entry's name must match a group in the artifact.",
          "items": {
            "additionalProperties": false,
            "description": "Runtime configuration for a single container group.",
            "properties": {
              "autoscaling": {
                "anyOf": [
                  {
                    "additionalProperties": false,
                    "description": "Autoscaling configuration for a proton.",
                    "properties": {
                      "enabled": {
                        "default": true,
                        "description": "Whether autoscaling is enabled.",
                        "title": "Enabled",
                        "type": "boolean"
                      },
                      "policies": {
                        "items": {
                          "additionalProperties": false,
                          "description": "Base class for autoscaling policies.",
                          "properties": {
                            "maxCount": {
                              "description": "Maximum number of replicas.",
                              "minimum": 0,
                              "title": "Max Count",
                              "type": "integer"
                            },
                            "minCount": {
                              "description": "Minimum number of replicas.",
                              "minimum": 0,
                              "title": "Min Count",
                              "type": "integer"
                            },
                            "priority": {
                              "anyOf": [
                                {
                                  "type": "integer"
                                },
                                {
                                  "type": "null"
                                }
                              ],
                              "description": "Policy priority when multiple policies are defined.",
                              "title": "Priority"
                            },
                            "scalingMetric": {
                              "anyOf": [
                                {
                                  "oneOf": [
                                    {
                                      "const": "cpuAverageUtilization",
                                      "description": "Scale replicas to maintain a target average CPU utilization across pods.",
                                      "title": "CPU Average Utilization"
                                    },
                                    {
                                      "const": "httpRequestsConcurrency",
                                      "description": "Scale replicas based on HTTP request concurrency using an external HTTP-aware autoscaler. The platform manages the underlying autoscaling resources on your behalf. This scaling option will scale to zero replicas when the proton is idle.",
                                      "title": "HTTP Requests Concurrency"
                                    },
                                    {
                                      "const": "gpuCacheUtilization",
                                      "description": "Scales replicas based on model-specific GPU memory cache utilization. This signal reflects how the model's KV cache is used during inference, when such metrics are exposed by the serving runtime. High cache utilization may indicate memory pressure and can be used to trigger scale-out to maintain throughput. Applicable to NIM Artifacts only.",
                                      "title": "GPU Cache Utilization"
                                    },
                                    {
                                      "const": "gpuRequestQueueDepth",
                                      "description": "Scales replicas based on the depth of the inference request queue. This metric represents the number of incoming requests waiting to be processed by the inference service. Increasing queue depth may indicate insufficient capacity and can be used to trigger additional replicas to reduce latency. Applicable to NIM Artifacts only.",
                                      "title": "GPU Request Queue Depth"
                                    }
                                  ],
                                  "title": "ScalingMetricType",
                                  "type": "string"
                                },
                                {
                                  "type": "string"
                                }
                              ],
                              "description": "Metric used for scaling decisions. Use one of the predefined values for standard autoscaling, or provide a custom metric name for NIM 2.0 workloads (e.g. 'vllm:kv_cache_usage_perc'). Custom metric names are only supported for NIM artifacts.",
                              "title": "Scaling Metric"
                            },
                            "target": {
                              "description": "Target value for the scaling metric.",
                              "minimum": 0,
                              "title": "Target",
                              "type": "number"
                            }
                          },
                          "required": [
                            "scalingMetric",
                            "target",
                            "minCount",
                            "maxCount"
                          ],
                          "title": "AutoscalingPolicy",
                          "type": "object"
                        },
                        "title": "Policies",
                        "type": "array"
                      }
                    },
                    "required": [
                      "policies"
                    ],
                    "title": "AutoscalingProperties",
                    "type": "object"
                  },
                  {
                    "type": "null"
                  }
                ],
                "description": "Autoscaling configuration for this group. Takes precedence over replicaCount."
              },
              "bundleSelectionPolicy": {
                "enum": [
                  "availability"
                ],
                "title": "BundleSelectionPolicy",
                "type": "string"
              },
              "containers": {
                "description": "Per-container overrides for this group.",
                "items": {
                  "additionalProperties": false,
                  "description": "Runtime diff targeting a single named container within a group.",
                  "properties": {
                    "name": {
                      "description": "Container name. Must match a container declared in the artifact group.",
                      "title": "Name",
                      "type": "string"
                    },
                    "resourceAllocation": {
                      "anyOf": [
                        {
                          "additionalProperties": false,
                          "description": "Per-container resource allocation declared at runtime.",
                          "properties": {
                            "cpu": {
                              "anyOf": [
                                {
                                  "minimum": 0.1,
                                  "type": "number"
                                },
                                {
                                  "type": "null"
                                }
                              ],
                              "description": "CPU cores allocated to this container.",
                              "title": "Cpu"
                            },
                            "gpu": {
                              "anyOf": [
                                {
                                  "minimum": 0,
                                  "type": "number"
                                },
                                {
                                  "type": "null"
                                }
                              ],
                              "description": "GPUs allocated to this container.",
                              "title": "Gpu"
                            },
                            "memory": {
                              "anyOf": [
                                {
                                  "pattern": "^\\s*(\\d*\\.?\\d+)\\s*(\\w+)?",
                                  "type": "string"
                                },
                                {
                                  "minimum": 0,
                                  "type": "integer"
                                },
                                {
                                  "type": "null"
                                }
                              ],
                              "description": "RAM allocated to this container. Accepts a human-readable string with one of the units B, KB, MB, GB (1000-based), e.g. '8GB', '512MB'. Also accepts raw byte integers.",
                              "examples": [
                                "8GB",
                                "512MB"
                              ],
                              "title": "Memory"
                            }
                          },
                          "title": "ResourceAllocation",
                          "type": "object"
                        },
                        {
                          "type": "null"
                        }
                      ],
                      "description": "Resource allocation for this container. Required for multi-container groups."
                    }
                  },
                  "required": [
                    "name"
                  ],
                  "title": "ContainerOverride",
                  "type": "object"
                },
                "title": "Containers",
                "type": "array"
              },
              "name": {
                "default": "default",
                "description": "Group name. Must match a container group name declared in the artifact.",
                "title": "Name",
                "type": "string"
              },
              "replicaCount": {
                "anyOf": [
                  {
                    "minimum": 1,
                    "type": "integer"
                  },
                  {
                    "type": "null"
                  }
                ],
                "default": 1,
                "description": "Number of replicas. Cannot be set alongside autoscaling.enabled=true.",
                "title": "Replicacount"
              },
              "resolvedBundle": {
                "anyOf": [
                  {
                    "description": "Bundle details returned in the runtime response after scheduling.",
                    "properties": {
                      "cpuCount": {
                        "description": "Number of CPU cores.",
                        "title": "CPU Count",
                        "type": "number"
                      },
                      "gpuCount": {
                        "default": 0,
                        "description": "Number of GPU units.",
                        "title": "GPU Count",
                        "type": "integer"
                      },
                      "gpuMaker": {
                        "anyOf": [
                          {
                            "type": "string"
                          },
                          {
                            "type": "null"
                          }
                        ],
                        "description": "GPU manufacturer.",
                        "title": "GPU Maker"
                      },
                      "gpuTypeLabel": {
                        "anyOf": [
                          {
                            "type": "string"
                          },
                          {
                            "type": "null"
                          }
                        ],
                        "description": "GPU type label.",
                        "title": "GPU Type Label"
                      },
                      "id": {
                        "description": "Bundle identifier that was selected.",
                        "title": "Id",
                        "type": "string"
                      },
                      "memoryBytes": {
                        "description": "Memory size in bytes.",
                        "title": "Memory Bytes",
                        "type": "integer"
                      }
                    },
                    "required": [
                      "id",
                      "cpuCount",
                      "memoryBytes"
                    ],
                    "title": "ResolvedBundle",
                    "type": "object"
                  },
                  {
                    "type": "null"
                  }
                ],
                "description": "Full details of the bundle selected at scheduling time. Read-only.",
                "readOnly": true
              },
              "resourceBundles": {
                "description": "Ordered list of bundle IDs. One is selected at scheduling time.",
                "items": {
                  "type": "string"
                },
                "title": "Resourcebundles",
                "type": "array"
              }
            },
            "title": "GroupRuntime",
            "type": "object"
          },
          "title": "Containergroups",
          "type": "array"
        }
      },
      "title": "WorkloadRuntime",
      "type": "object"
    }
  },
  "required": [
    "name"
  ],
  "title": "CreateWorkloadRequest",
  "type": "object"
}
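
The `pvcSize` pattern in the schema above can be checked on the client before a request is sent. The following is a minimal sketch in Python; the helper name `is_valid_pvc_size` is hypothetical and not part of the API. Note that the unit suffix is case-sensitive.

```python
import re

# Pattern copied from the pvcSize field of NimStorageConfig in the schema above.
PVC_SIZE_PATTERN = re.compile(r"^\d+(\.\d+)?(Gi|Mi|Ti)$")


def is_valid_pvc_size(value):
    """Return True if value is None (platform default applies) or matches the pattern."""
    if value is None:
        return True
    return isinstance(value, str) and PVC_SIZE_PATTERN.fullmatch(value) is not None


print(is_valid_pvc_size("150Gi"))  # True
print(is_valid_pvc_size("150gi"))  # False: the suffix must be Gi, Mi, or Ti
```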

CreateWorkloadRequest

Properties

Name Type Required Restrictions Description
artifact any false Inline artifact spec to create and deploy in one step.

anyOf

Name Type Required Restrictions Description
» anonymous InputArtifact false Request to create an artifact. An artifact is always created as a draft.

or

Name Type Required Restrictions Description
» anonymous null false none

continued

Name Type Required Restrictions Description
artifactId any false ID of an existing artifact to deploy.

anyOf

Name Type Required Restrictions Description
» anonymous string false none

or

Name Type Required Restrictions Description
» anonymous null false none

continued

Name Type Required Restrictions Description
description any false Workload description.

anyOf

Name Type Required Restrictions Description
» anonymous string false none

or

Name Type Required Restrictions Description
» anonymous null false none

continued

Name Type Required Restrictions Description
importance WorkloadImportance false Workload importance level.
name string true maxLength: 5000
minLength: 1
Workload name.
runtime WorkloadRuntime false Runtime configuration.
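
To illustrate how these fields fit together, the sketch below builds a `CreateWorkloadRequest` body as a plain Python dictionary. The helper `build_create_workload_request` and the example values (workload name, artifact ID, target of 70) are hypothetical; only the field names and constraints are taken from the schema. The unit of `target` is not specified in this section, so treat the value as a placeholder.

```python
import json


def build_create_workload_request(name, artifact_id, min_count, max_count, target):
    """Build a request body that deploys an existing artifact with CPU-based autoscaling."""
    # The schema requires `name` to be 1 to 5000 characters long.
    if not 1 <= len(name) <= 5000:
        raise ValueError("name must be between 1 and 5000 characters")
    policy = {
        # AutoscalingPolicy requires all four of these fields.
        "scalingMetric": "cpuAverageUtilization",
        "target": target,
        "minCount": min_count,
        "maxCount": max_count,
    }
    group = {
        # Must match a container group name declared in the artifact.
        "name": "default",
        # replicaCount is omitted: the schema states it cannot be set
        # alongside autoscaling.enabled=true.
        "autoscaling": {"enabled": True, "policies": [policy]},
    }
    return {
        "name": name,
        "artifactId": artifact_id,
        "importance": "moderate",
        "runtime": {"containerGroups": [group]},
    }


body = build_create_workload_request("demo-workload", "hypothetical-artifact-id", 1, 3, 70)
print(json.dumps(body, indent=2))
```

The resulting dictionary would be sent as the JSON body of the create request; the HTTP call itself is left out because the endpoint and authentication details are outside this section.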

CredentialEnvironmentVariable

{
  "properties": {
    "drCredentialId": {
      "description": "ID of the DataRobot credential to use.",
      "title": "DR Credential ID",
      "type": "string"
    },
    "key": {
      "description": "Key within the credential.",
      "title": "Key",
      "type": "string"
    },
    "name": {
      "description": "Name of the environment variable.",
      "title": "Name",
      "type": "string"
    },
    "source": {
      "const": "dr-credential",
      "title": "Source",
      "type": "string"
    }
  },
  "required": [
    "source",
    "name",
    "drCredentialId",
    "key"
  ],
  "title": "CredentialEnvironmentVariable",
  "type": "object"
}

CredentialEnvironmentVariable

Properties

Name Type Required Restrictions Description
drCredentialId string true ID of the DataRobot credential to use.
key string true Key within the credential.
name string true Name of the environment variable.
source string true none

DataRobotCodeRef

{
  "additionalProperties": false,
  "properties": {
    "catalogId": {
      "title": "Catalogid",
      "type": "string"
    },
    "catalogVersionId": {
      "title": "Catalogversionid",
      "type": "string"
    }
  },
  "required": [
    "catalogId",
    "catalogVersionId"
  ],
  "title": "DataRobotCodeRef",
  "type": "object"
}

DataRobotCodeRef

Properties

Name Type Required Restrictions Description
catalogId string true none
catalogVersionId string true none

DrApiTokenEnvironmentVariable

{
  "description": "A platform-managed DataRobot API token injected as an environment variable. The token value is resolved at proton creation (find-or-create a per-workload ``workload <workloadid>`` API key scoped to the invoking user); no value or ID is supplied by the user.",
  "properties": {
    "name": {
      "description": "Name of the environment variable.",
      "title": "Name",
      "type": "string"
    },
    "source": {
      "const": "dr-api-token",
      "title": "Source",
      "type": "string"
    }
  },
  "required": [
    "source",
    "name"
  ],
  "title": "DrApiTokenEnvironmentVariable",
  "type": "object"
}

DrApiTokenEnvironmentVariable

Properties

Name Type Required Restrictions Description
name string true Name of the environment variable.
source string true none
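
Both environment variable schemas above are distinguished by the constant `source` field. The sketch below shows one way a client could check an entry against the required keys of its variant. The helper `missing_env_var_keys` is hypothetical, and it only knows the two variants documented here; any other `source` value the API may accept is rejected by this sketch.

```python
# Required keys per `source` value, copied from the two schemas above.
REQUIRED_KEYS = {
    "dr-credential": {"source", "name", "drCredentialId", "key"},
    "dr-api-token": {"source", "name"},
}


def missing_env_var_keys(entry):
    """Return the sorted list of required keys that are absent from an entry."""
    source = entry.get("source")
    if source not in REQUIRED_KEYS:
        raise ValueError(f"unsupported source: {source!r}")
    return sorted(REQUIRED_KEYS[source] - set(entry))


token_var = {"source": "dr-api-token", "name": "DATAROBOT_API_TOKEN"}
credential_var = {"source": "dr-credential", "name": "DB_PASSWORD"}

print(missing_env_var_keys(token_var))       # []
print(missing_env_var_keys(credential_var))  # ['drCredentialId', 'key']
```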

GeneratedDockerfile

{
  "additionalProperties": false,
  "description": "System generates a Dockerfile from execution environment metadata.",
  "properties": {
    "entrypoint": {
      "description": "Entrypoint baked into the generated Dockerfile CMD (e.g. [\"python\", \"app.py\"]).",
      "items": {
        "type": "string"
      },
      "minItems": 1,
      "title": "Entrypoint",
      "type": "array"
    },
    "executionEnvironmentId": {
      "description": "Execution environment ID used to resolve the base Docker image.",
      "title": "Execution Environment ID",
      "type": "string"
    },
    "executionEnvironmentVersionId": {
      "description": "Execution environment version ID that pins the exact base image tag.",
      "title": "Execution Environment Version ID",
      "type": "string"
    },
    "source": {
      "const": "generated",
      "default": "generated",
      "title": "Source",
      "type": "string"
    }
  },
  "required": [
    "executionEnvironmentId",
    "executionEnvironmentVersionId",
    "entrypoint"
  ],
  "title": "GeneratedDockerfile",
  "type": "object"
}

GeneratedDockerfile

Properties

Name Type Required Restrictions Description
entrypoint [string] true minItems: 1
Entrypoint baked into the generated Dockerfile CMD (e.g. ["python", "app.py"]).
executionEnvironmentId string true Execution environment ID used to resolve the base Docker image.
executionEnvironmentVersionId string true Execution environment version ID that pins the exact base image tag.
source string false none
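
As a rough illustration, a `GeneratedDockerfile` object could be assembled as follows. The helper name `generated_dockerfile` and the ID values are hypothetical; the `minItems: 1` restriction on `entrypoint` is enforced locally so that an empty command is caught before the request is sent.

```python
def generated_dockerfile(env_id, env_version_id, entrypoint):
    """Build a GeneratedDockerfile object, enforcing the entrypoint restrictions."""
    # A bare string would be split into characters by list(), so reject it early.
    if isinstance(entrypoint, str):
        raise ValueError("entrypoint must be a list of strings, not a single string")
    entrypoint = list(entrypoint)
    if not entrypoint or not all(isinstance(part, str) for part in entrypoint):
        raise ValueError("entrypoint must contain at least one string")
    return {
        "source": "generated",  # constant; also the schema default
        "executionEnvironmentId": env_id,
        "executionEnvironmentVersionId": env_version_id,
        "entrypoint": entrypoint,
    }


spec = generated_dockerfile("hypothetical-env-id", "hypothetical-version-id", ["python", "app.py"])
print(spec["entrypoint"])  # ['python', 'app.py']
```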

GrantAccessControlWithId

{
  "additionalProperties": false,
  "description": "Grant access control request using ID for recipient identification. Can be used for users, groups, or organizations.",
  "properties": {
    "id": {
      "description": "The ID of the recipient.",
      "title": "Id",
      "type": "string"
    },
    "role": {
      "description": "External sharing roles representing the permission level a user, group, or organization holds on an entity. These roles map to internal permissions and are used in sharing APIs.",
      "enum": [
        "NO_ROLE",
        "OWNER",
        "READ_WRITE",
        "EDITOR",
        "USER",
        "DATA_SCIENTIST",
        "ADMIN",
        "READ_ONLY",
        "CONSUMER",
        "OBSERVER"
      ],
      "title": "SharingRole",
      "type": "string"
    },
    "shareRecipientType": {
      "description": "Enum of possible subject types.",
      "enum": [
        "user",
        "group",
        "organization",
        "role"
      ],
      "title": "SubjectType",
      "type": "string"
    }
  },
  "required": [
    "shareRecipientType",
    "role",
    "id"
  ],
  "title": "GrantAccessControlWithId",
  "type": "object"
}

GrantAccessControlWithId

Properties

Name Type Required Restrictions Description
id string true The ID of the recipient.
role SharingRole true The role of the recipient on this entity.
shareRecipientType SubjectType true The type of the recipient.

GrantAccessControlWithUsername

{
  "additionalProperties": false,
  "description": "Grant access control request using username for user identification.",
  "properties": {
    "role": {
      "description": "External sharing roles representing the permission level a user, group, or organization holds on an entity. These roles map to internal permissions and are used in sharing APIs.",
      "enum": [
        "NO_ROLE",
        "OWNER",
        "READ_WRITE",
        "EDITOR",
        "USER",
        "DATA_SCIENTIST",
        "ADMIN",
        "READ_ONLY",
        "CONSUMER",
        "OBSERVER"
      ],
      "title": "SharingRole",
      "type": "string"
    },
    "shareRecipientType": {
      "description": "Enum of possible subject types.",
      "enum": [
        "user",
        "group",
        "organization",
        "role"
      ],
      "title": "SubjectType",
      "type": "string"
    },
    "username": {
      "description": "Username of the user to update the access role for.",
      "title": "Username",
      "type": "string"
    }
  },
  "required": [
    "shareRecipientType",
    "role",
    "username"
  ],
  "title": "GrantAccessControlWithUsername",
  "type": "object"
}

GrantAccessControlWithUsername

Properties

Name Type Required Restrictions Description
role SharingRole true The role of the recipient on this entity.
shareRecipientType SubjectType true The type of the recipient.
username string true Username of the user to update the access role for.
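
The two grant schemas differ only in how the recipient is identified: `id` in `GrantAccessControlWithId` and `username` in `GrantAccessControlWithUsername`. The sketch below builds either variant from the same helper. The function `grant_entry` and the example recipient values are hypothetical; the enum values are copied from the schemas above.

```python
# Enum values copied from the SharingRole and SubjectType schemas above.
SHARING_ROLES = {
    "NO_ROLE", "OWNER", "READ_WRITE", "EDITOR", "USER",
    "DATA_SCIENTIST", "ADMIN", "READ_ONLY", "CONSUMER", "OBSERVER",
}
SUBJECT_TYPES = {"user", "group", "organization", "role"}


def grant_entry(role, recipient_type, recipient_id=None, username=None):
    """Build a grant object identified by exactly one of recipient_id or username."""
    if role not in SHARING_ROLES:
        raise ValueError(f"unknown role: {role!r}")
    if recipient_type not in SUBJECT_TYPES:
        raise ValueError(f"unknown shareRecipientType: {recipient_type!r}")
    if (recipient_id is None) == (username is None):
        raise ValueError("provide exactly one of recipient_id or username")
    entry = {"shareRecipientType": recipient_type, "role": role}
    if recipient_id is not None:
        entry["id"] = recipient_id  # GrantAccessControlWithId
    else:
        entry["username"] = username  # GrantAccessControlWithUsername
    return entry


by_id = grant_entry("READ_ONLY", "group", recipient_id="hypothetical-group-id")
by_name = grant_entry("OWNER", "user", username="user@example.com")
print(by_id)
print(by_name)
```

Because both schemas set `additionalProperties` to false, sending `id` and `username` together would match neither variant, which is why the helper rejects that combination.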

GroupRuntime

{
  "additionalProperties": false,
  "description": "Runtime configuration for a single container group.",
  "properties": {
    "autoscaling": {
      "anyOf": [
        {
          "additionalProperties": false,
          "description": "Autoscaling configuration for a proton.",
          "properties": {
            "enabled": {
              "default": true,
              "description": "Whether autoscaling is enabled.",
              "title": "Enabled",
              "type": "boolean"
            },
            "policies": {
              "items": {
                "additionalProperties": false,
                "description": "Base class for autoscaling policies.",
                "properties": {
                  "maxCount": {
                    "description": "Maximum number of replicas.",
                    "minimum": 0,
                    "title": "Max Count",
                    "type": "integer"
                  },
                  "minCount": {
                    "description": "Minimum number of replicas.",
                    "minimum": 0,
                    "title": "Min Count",
                    "type": "integer"
                  },
                  "priority": {
                    "anyOf": [
                      {
                        "type": "integer"
                      },
                      {
                        "type": "null"
                      }
                    ],
                    "description": "Policy priority when multiple policies are defined.",
                    "title": "Priority"
                  },
                  "scalingMetric": {
                    "anyOf": [
                      {
                        "oneOf": [
                          {
                            "const": "cpuAverageUtilization",
                            "description": "Scale replicas to maintain a target average CPU utilization across pods.",
                            "title": "CPU Average Utilization"
                          },
                          {
                            "const": "httpRequestsConcurrency",
                            "description": "Scale replicas based on HTTP request concurrency using an external HTTP-aware autoscaler. The platform manages the underlying autoscaling resources on your behalf. This scaling option will scale to zero replicas when the proton is idle.",
                            "title": "HTTP Requests Concurrency"
                          },
                          {
                            "const": "gpuCacheUtilization",
                            "description": "Scales replicas based on model-specific GPU memory cache utilization. This signal reflects how the model's KV cache is used during inference, when such metrics are exposed by the serving runtime. High cache utilization may indicate memory pressure and can be used to trigger scale-out to maintain throughput. Applicable to NIM Artifacts only.",
                            "title": "GPU Cache Utilization"
                          },
                          {
                            "const": "gpuRequestQueueDepth",
                            "description": "Scales replicas based on the depth of the inference request queue. This metric represents the number of incoming requests waiting to be processed by the inference service. Increasing queue depth may indicate insufficient capacity and can be used to trigger additional replicas to reduce latency. Applicable to NIM Artifacts only.",
                            "title": "GPU Request Queue Depth"
                          }
                        ],
                        "title": "ScalingMetricType",
                        "type": "string"
                      },
                      {
                        "type": "string"
                      }
                    ],
                    "description": "Metric used for scaling decisions. Use one of the predefined values for standard autoscaling, or provide a custom metric name for NIM 2.0 workloads (e.g. 'vllm:kv_cache_usage_perc'). Custom metric names are only supported for NIM artifacts.",
                    "title": "Scaling Metric"
                  },
                  "target": {
                    "description": "Target value for the scaling metric.",
                    "minimum": 0,
                    "title": "Target",
                    "type": "number"
                  }
                },
                "required": [
                  "scalingMetric",
                  "target",
                  "minCount",
                  "maxCount"
                ],
                "title": "AutoscalingPolicy",
                "type": "object"
              },
              "title": "Policies",
              "type": "array"
            }
          },
          "required": [
            "policies"
          ],
          "title": "AutoscalingProperties",
          "type": "object"
        },
        {
          "type": "null"
        }
      ],
      "description": "Autoscaling configuration for this group. Takes precedence over replicaCount."
    },
    "bundleSelectionPolicy": {
      "enum": [
        "availability"
      ],
      "title": "BundleSelectionPolicy",
      "type": "string"
    },
    "containers": {
      "description": "Per-container overrides for this group.",
      "items": {
        "additionalProperties": false,
        "description": "Runtime diff targeting a single named container within a group.",
        "properties": {
          "name": {
            "description": "Container name. Must match a container declared in the artifact group.",
            "title": "Name",
            "type": "string"
          },
          "resourceAllocation": {
            "anyOf": [
              {
                "additionalProperties": false,
                "description": "Per-container resource allocation declared at runtime.",
                "properties": {
                  "cpu": {
                    "anyOf": [
                      {
                        "minimum": 0.1,
                        "type": "number"
                      },
                      {
                        "type": "null"
                      }
                    ],
                    "description": "Cpu cores allocated to this container.",
                    "title": "Cpu"
                  },
                  "gpu": {
                    "anyOf": [
                      {
                        "minimum": 0,
                        "type": "number"
                      },
                      {
                        "type": "null"
                      }
                    ],
                    "description": "Gpus allocated to this container.",
                    "title": "Gpu"
                  },
                  "memory": {
                    "anyOf": [
                      {
                        "pattern": "^\\s*(\\d*\\.?\\d+)\\s*(\\w+)?",
                        "type": "string"
                      },
                      {
                        "minimum": 0,
                        "type": "integer"
                      },
                      {
                        "type": "null"
                      }
                    ],
                    "description": "Ram allocated to this container. accepts a human-readable string with one of: b, kb, mb, gb (1000-based) — e.g. '8gb', '512mb'. also accepts raw byte integers.",
                    "examples": [
                      "8GB",
                      "512MB"
                    ],
                    "title": "Memory"
                  }
                },
                "title": "ResourceAllocation",
                "type": "object"
              },
              {
                "type": "null"
              }
            ],
            "description": "Resource allocation for this container. required for multi-container groups."
          }
        },
        "required": [
          "name"
        ],
        "title": "ContainerOverride",
        "type": "object"
      },
      "title": "Containers",
      "type": "array"
    },
    "name": {
      "default": "default",
      "description": "Group name. must match a container group name declared in the artifact.",
      "title": "Name",
      "type": "string"
    },
    "replicaCount": {
      "anyOf": [
        {
          "minimum": 1,
          "type": "integer"
        },
        {
          "type": "null"
        }
      ],
      "default": 1,
      "description": "Number of replicas. cannot be set alongside autoscaling.enabled=true.",
      "title": "Replicacount"
    },
    "resolvedBundle": {
      "anyOf": [
        {
          "description": "Bundle details returned in the runtime response after scheduling.",
          "properties": {
            "cpuCount": {
              "description": "Number of cpu cores.",
              "title": "CPU Count",
              "type": "number"
            },
            "gpuCount": {
              "default": 0,
              "description": "Number of gpu units.",
              "title": "GPU Count",
              "type": "integer"
            },
            "gpuMaker": {
              "anyOf": [
                {
                  "type": "string"
                },
                {
                  "type": "null"
                }
              ],
              "description": "Gpu manufacturer.",
              "title": "GPU Maker"
            },
            "gpuTypeLabel": {
              "anyOf": [
                {
                  "type": "string"
                },
                {
                  "type": "null"
                }
              ],
              "description": "Gpu type label.",
              "title": "GPU Type Label"
            },
            "id": {
              "description": "Bundle identifier that was selected.",
              "title": "Id",
              "type": "string"
            },
            "memoryBytes": {
              "description": "Memory size in bytes.",
              "title": "Memory Bytes",
              "type": "integer"
            }
          },
          "required": [
            "id",
            "cpuCount",
            "memoryBytes"
          ],
          "title": "ResolvedBundle",
          "type": "object"
        },
        {
          "type": "null"
        }
      ],
      "description": "Full details of the bundle selected at scheduling time. read-only.",
      "readOnly": true
    },
    "resourceBundles": {
      "description": "Ordered list of bundle ids. one is selected at scheduling time.",
      "items": {
        "type": "string"
      },
      "title": "Resourcebundles",
      "type": "array"
    }
  },
  "title": "GroupRuntime",
  "type": "object"
}

GroupRuntime

Properties

Name Type Required Restrictions Description
autoscaling any false Autoscaling configuration for this group. Takes precedence over replicaCount.

anyOf

Name Type Required Restrictions Description
» anonymous AutoscalingProperties false Autoscaling configuration for a proton.

or

Name Type Required Restrictions Description
» anonymous null false none

continued

Name Type Required Restrictions Description
bundleSelectionPolicy BundleSelectionPolicy false How to select among resourceBundles. Default: availability.
containers [ContainerOverride] false Per-container overrides for this group.
name string false Group name. Must match a container group name declared in the artifact.
replicaCount any false Number of replicas. Cannot be set alongside autoscaling.enabled=true.

anyOf

Name Type Required Restrictions Description
» anonymous integer false minimum: 1 none

or

Name Type Required Restrictions Description
» anonymous null false none

continued

Name Type Required Restrictions Description
resolvedBundle any false Full details of the bundle selected at scheduling time. Read-only.

anyOf

Name Type Required Restrictions Description
» anonymous ResolvedBundle false Bundle details returned in the runtime response after scheduling.

or

Name Type Required Restrictions Description
» anonymous null false none

continued

Name Type Required Restrictions Description
resourceBundles [string] false Ordered list of bundle IDs. One is selected at scheduling time.
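
The memory field of ResourceAllocation accepts either a unit string or a raw byte integer. The following is a minimal client-side sketch, not part of the API, of how such values might be normalized to bytes before a GroupRuntime payload is built. The helper name memory_to_bytes, the treatment of a unit-less string as bytes, and the container name "main" are assumptions made for illustration; the server's own parsing may differ.

```python
import re

# Pattern copied from the ResourceAllocation.memory schema above.
MEMORY_PATTERN = re.compile(r"^\s*(\d*\.?\d+)\s*(\w+)?")

# 1000-based multipliers, as stated in the memory field description.
UNIT_FACTORS = {"b": 1, "kb": 1000, "mb": 1000**2, "gb": 1000**3}


def memory_to_bytes(value):
    """Convert a memory value (unit string or raw byte integer) to bytes.

    Hypothetical helper for client-side checks only.
    """
    if isinstance(value, int):
        if value < 0:
            raise ValueError("memory must be >= 0")
        return value
    match = MEMORY_PATTERN.match(value)
    if match is None:
        raise ValueError(f"unrecognized memory value: {value!r}")
    number, unit = match.groups()
    # Assumption: a string without a unit is interpreted as bytes.
    factor = UNIT_FACTORS.get((unit or "b").lower())
    if factor is None:
        raise ValueError(f"unsupported unit: {unit!r}")
    return int(float(number) * factor)


# Example GroupRuntime payload with a fixed replica count and one
# per-container override. "main" is a placeholder container name.
group_runtime = {
    "name": "default",
    "replicaCount": 2,
    "containers": [
        {
            "name": "main",
            "resourceAllocation": {"cpu": 1.0, "gpu": 0, "memory": "8GB"},
        }
    ],
}
```

Converting to bytes on the client makes it easier to compare a requested allocation against the memoryBytes value reported in resolvedBundle once scheduling has happened.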

HTTPValidationError

{
  "properties": {
    "detail": {
      "items": {
        "properties": {
          "ctx": {
            "title": "Context",
            "type": "object"
          },
          "input": {
            "title": "Input"
          },
          "loc": {
            "items": {
              "anyOf": [
                {
                  "type": "string"
                },
                {
                  "type": "integer"
                }
              ]
            },
            "title": "Location",
            "type": "array"
          },
          "msg": {
            "title": "Message",
            "type": "string"
          },
          "type": {
            "title": "Error Type",
            "type": "string"
          }
        },
        "required": [
          "loc",
          "msg",
          "type"
        ],
        "title": "ValidationError",
        "type": "object"
      },
      "title": "Detail",
      "type": "array"
    }
  },
  "title": "HTTPValidationError",
  "type": "object"
}

HTTPValidationError

Properties

Name Type Required Restrictions Description
detail [ValidationError] false none
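
Each entry in detail carries a loc path, a msg, and a type, so a client can flatten the errors into readable lines. The sketch below assumes the response body has already been decoded into a Python dict; the function name format_validation_errors and the output format are illustrative choices, not part of the API.

```python
def format_validation_errors(body):
    """Flatten an HTTPValidationError body into one line per error.

    Hypothetical helper: joins each "loc" path with dots and appends
    the message and error type.
    """
    lines = []
    for error in body.get("detail", []):
        # "loc" items may be strings (field names) or integers (list indexes).
        location = ".".join(str(part) for part in error["loc"])
        lines.append(f"{location}: {error['msg']} ({error['type']})")
    return lines
```

Because detail is not a required property, the helper falls back to an empty list when it is absent.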

ImageBuildConfig

{
  "additionalProperties": false,
  "description": "User-provided configuration for server-side image builds from source code.",
  "properties": {
    "codeRef": {
      "anyOf": [
        {
          "additionalProperties": false,
          "properties": {
            "datarobot": {
              "additionalProperties": false,
              "properties": {
                "catalogId": {
                  "title": "Catalogid",
                  "type": "string"
                },
                "catalogVersionId": {
                  "title": "Catalogversionid",
                  "type": "string"
                }
              },
              "required": [
                "catalogId",
                "catalogVersionId"
              ],
              "title": "DataRobotCodeRef",
              "type": "object"
            },
            "provider": {
              "const": "datarobot",
              "default": "datarobot",
              "title": "Provider",
              "type": "string"
            },
            "type": {
              "const": "datarobot",
              "default": "datarobot",
              "title": "Type",
              "type": "string"
            }
          },
          "required": [
            "datarobot"
          ],
          "title": "CodeRef",
          "type": "object"
        },
        {
          "type": "null"
        }
      ],
      "description": "Reference to source code (e.g. files API catalog). optional at create time; required before build or lock."
    },
    "dockerfile": {
      "description": "How the dockerfile is obtained. defaults to using ./dockerfile from the source code.",
      "discriminator": {
        "mapping": {
          "generated": "#/components/schemas/GeneratedDockerfile",
          "provided": "#/components/schemas/ProvidedDockerfile"
        },
        "propertyName": "source"
      },
      "oneOf": [
        {
          "additionalProperties": false,
          "description": "User supplies a dockerfile in the uploaded source code.",
          "properties": {
            "path": {
              "default": "./Dockerfile",
              "description": "Relative path to the dockerfile in the source code. defaults to ./dockerfile.",
              "title": "Path",
              "type": "string"
            },
            "source": {
              "const": "provided",
              "default": "provided",
              "title": "Source",
              "type": "string"
            }
          },
          "title": "ProvidedDockerfile",
          "type": "object"
        },
        {
          "additionalProperties": false,
          "description": "System generates a dockerfile from execution environment metadata.",
          "properties": {
            "entrypoint": {
              "description": "Entrypoint baked into the generated dockerfile cmd (e.g. [\"python\", \"app.py\"]).",
              "items": {
                "type": "string"
              },
              "minItems": 1,
              "title": "Entrypoint",
              "type": "array"
            },
            "executionEnvironmentId": {
              "description": "Execution environment id used to resolve the base Docker image.",
              "title": "Execution Environment ID",
              "type": "string"
            },
            "executionEnvironmentVersionId": {
              "description": "Execution environment version id that pins the exact base image tag.",
              "title": "Execution Environment Version ID",
              "type": "string"
            },
            "source": {
              "const": "generated",
              "default": "generated",
              "title": "Source",
              "type": "string"
            }
          },
          "required": [
            "executionEnvironmentId",
            "executionEnvironmentVersionId",
            "entrypoint"
          ],
          "title": "GeneratedDockerfile",
          "type": "object"
        }
      ],
      "title": "Dockerfile"
    }
  },
  "title": "ImageBuildConfig",
  "type": "object"
}

ImageBuildConfig

Properties

Name Type Required Restrictions Description
codeRef any false Reference to source code (e.g. files API catalog). Optional at create time; required before build or lock.

anyOf

Name Type Required Restrictions Description
» anonymous CodeRef false none

or

Name Type Required Restrictions Description
» anonymous null false none

continued

Name Type Required Restrictions Description
dockerfile any false How the Dockerfile is obtained. Defaults to using ./Dockerfile from the source code.

oneOf

Name Type Required Restrictions Description
» anonymous ProvidedDockerfile false User supplies a Dockerfile in the uploaded source code.

xor

Name Type Required Restrictions Description
» anonymous GeneratedDockerfile false System generates a Dockerfile from execution environment metadata.
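
The dockerfile property is a discriminated union keyed on source, so a payload is either a ProvidedDockerfile or a GeneratedDockerfile, never a mix of both. The sketch below shows one way a client might assemble each variant as a plain dict. The helper names and the example IDs used in the comments are hypothetical; only the property names and constraints come from the schema above.

```python
def provided_dockerfile(path="./Dockerfile"):
    """Build a ProvidedDockerfile payload (source = "provided")."""
    return {"source": "provided", "path": path}


def generated_dockerfile(execution_environment_id,
                         execution_environment_version_id,
                         entrypoint):
    """Build a GeneratedDockerfile payload (source = "generated")."""
    # The schema declares minItems: 1 for entrypoint.
    if not entrypoint:
        raise ValueError("entrypoint needs at least one item")
    return {
        "source": "generated",
        "executionEnvironmentId": execution_environment_id,
        "executionEnvironmentVersionId": execution_environment_version_id,
        "entrypoint": list(entrypoint),
    }


def image_build_config(dockerfile, catalog_id=None, catalog_version_id=None):
    """Build an ImageBuildConfig payload.

    codeRef is left as null unless both catalog IDs are supplied,
    since DataRobotCodeRef requires catalogId and catalogVersionId.
    """
    if (catalog_id is None) != (catalog_version_id is None):
        raise ValueError("catalog_id and catalog_version_id go together")
    code_ref = None
    if catalog_id is not None:
        code_ref = {
            "datarobot": {
                "catalogId": catalog_id,
                "catalogVersionId": catalog_version_id,
            }
        }
    return {"codeRef": code_ref, "dockerfile": dockerfile}


# Usage with placeholder IDs:
# image_build_config(
#     generated_dockerfile("env-id", "env-version-id", ["python", "app.py"]),
#     catalog_id="catalog-id",
#     catalog_version_id="catalog-version-id",
# )
```

Leaving codeRef as null matches the schema's note that the reference is optional at create time but required before build or lock.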

InputArtifact

{
  "additionalProperties": false,
  "description": "Request to create an artifact. an artifact is always created as a draft.",
  "properties": {
    "artifactRepositoryId": {
      "anyOf": [
        {
          "type": "string"
        },
        {
          "type": "null"
        }
      ],
      "description": "Id of the artifact repository this artifact belongs to (for versioning support).",
      "title": "Artifactrepositoryid"
    },
    "description": {
      "default": "",
      "description": "Description of the artifact.",
      "title": "Description",
      "type": "string"
    },
    "name": {
      "description": "Name of the artifact.",
      "maxLength": 5000,
      "minLength": 1,
      "title": "Name",
      "type": "string"
    },
    "spec": {
      "description": "Artifact specification.",
      "discriminator": {
        "mapping": {
          "nim": "#/components/schemas/NimArtifactSpec",
          "service": "#/components/schemas/ServiceArtifactSpec"
        },
        "propertyName": "type"
      },
      "oneOf": [
        {
          "additionalProperties": false,
          "properties": {
            "containerGroups": {
              "default": [],
              "description": "List of container groups.",
              "items": {
                "additionalProperties": false,
                "properties": {
                  "containers": {
                    "default": [],
                    "description": "List of containers making this container group.",
                    "items": {
                      "additionalProperties": false,
                      "properties": {
                        "build": {
                          "anyOf": [
                            {
                              "additionalProperties": false,
                              "description": "Build reference embedded in a container spec when an image build is triggered.",
                              "properties": {
                                "artifactImageBuildId": {
                                  "description": "Artifact image build id.",
                                  "title": "Artifactimagebuildid",
                                  "type": "string"
                                },
                                "createdAt": {
                                  "description": "Build creation timestamp (utc).",
                                  "format": "date-time",
                                  "title": "Createdat",
                                  "type": "string"
                                },
                                "status": {
                                  "description": "Image build reported status at submit time.",
                                  "title": "Status",
                                  "type": "string"
                                }
                              },
                              "required": [
                                "artifactImageBuildId",
                                "status",
                                "createdAt"
                              ],
                              "title": "ContainerBuildInfo",
                              "type": "object"
                            },
                            {
                              "type": "null"
                            }
                          ],
                          "description": "Server-set image build metadata (e.g. after lock or draft build trigger). workload API clears this on artifact create/update before persistence; clients must not rely on sending it."
                        },
                        "description": {
                          "default": "",
                          "description": "Description of the container.",
                          "title": "Description",
                          "type": "string"
                        },
                        "entrypoint": {
                          "anyOf": [
                            {
                              "items": {
                                "type": "string"
                              },
                              "type": "array"
                            },
                            {
                              "type": "null"
                            }
                          ],
                          "description": "Runtime entrypoint override for the container command. independent of build entrypoint.",
                          "title": "Entrypoint"
                        },
                        "environmentVars": {
                          "default": [],
                          "description": "Environment variables.",
                          "items": {
                            "anyOf": [
                              {
                                "properties": {
                                  "name": {
                                    "description": "Name of the environment variable.",
                                    "title": "Name",
                                    "type": "string"
                                  },
                                  "source": {
                                    "const": "string",
                                    "default": "string",
                                    "title": "Source",
                                    "type": "string"
                                  },
                                  "value": {
                                    "description": "Value of the environment variable.",
                                    "title": "Value",
                                    "type": "string"
                                  }
                                },
                                "required": [
                                  "name",
                                  "value"
                                ],
                                "title": "StringEnvironmentVariable",
                                "type": "object"
                              },
                              {
                                "properties": {
                                  "drCredentialId": {
                                    "description": "Id of the datarobot credential to use.",
                                    "title": "DR Credential ID",
                                    "type": "string"
                                  },
                                  "key": {
                                    "description": "Key within the credential.",
                                    "title": "Key",
                                    "type": "string"
                                  },
                                  "name": {
                                    "description": "Name of the environment variable.",
                                    "title": "Name",
                                    "type": "string"
                                  },
                                  "source": {
                                    "const": "dr-credential",
                                    "title": "Source",
                                    "type": "string"
                                  }
                                },
                                "required": [
                                  "source",
                                  "name",
                                  "drCredentialId",
                                  "key"
                                ],
                                "title": "CredentialEnvironmentVariable",
                                "type": "object"
                              },
                              {
                                "description": "A platform-managed datarobot API token injected as an environment variable. the token value is resolved at proton creation (find-or-create a per-workload ``workload <workloadid>`` API key scoped to the invoking user); no value or id is supplied by the user.",
                                "properties": {
                                  "name": {
                                    "description": "Name of the environment variable.",
                                    "title": "Name",
                                    "type": "string"
                                  },
                                  "source": {
                                    "const": "dr-api-token",
                                    "title": "Source",
                                    "type": "string"
                                  }
                                },
                                "required": [
                                  "source",
                                  "name"
                                ],
                                "title": "DrApiTokenEnvironmentVariable",
                                "type": "object"
                              }
                            ]
                          },
                          "title": "Environmentvars",
                          "type": "array"
                        },
                        "imageBuildConfig": {
                          "anyOf": [
                            {
                              "additionalProperties": false,
                              "description": "User-provided configuration for server-side image builds from source code.",
                              "properties": {
                                "codeRef": {
                                  "anyOf": [
                                    {
                                      "additionalProperties": false,
                                      "properties": {
                                        "datarobot": {
                                          "additionalProperties": false,
                                          "properties": {
                                            "catalogId": {
                                              "title": "Catalogid",
                                              "type": "string"
                                            },
                                            "catalogVersionId": {
                                              "title": "Catalogversionid",
                                              "type": "string"
                                            }
                                          },
                                          "required": [
                                            "catalogId",
                                            "catalogVersionId"
                                          ],
                                          "title": "DataRobotCodeRef",
                                          "type": "object"
                                        },
                                        "provider": {
                                          "const": "datarobot",
                                          "default": "datarobot",
                                          "title": "Provider",
                                          "type": "string"
                                        },
                                        "type": {
                                          "const": "datarobot",
                                          "default": "datarobot",
                                          "title": "Type",
                                          "type": "string"
                                        }
                                      },
                                      "required": [
                                        "datarobot"
                                      ],
                                      "title": "CodeRef",
                                      "type": "object"
                                    },
                                    {
                                      "type": "null"
                                    }
                                  ],
                                  "description": "Reference to source code (e.g. files API catalog). optional at create time; required before build or lock."
                                },
                                "dockerfile": {
                                  "description": "How the dockerfile is obtained. defaults to using ./dockerfile from the source code.",
                                  "discriminator": {
                                    "mapping": {
                                      "generated": "#/components/schemas/GeneratedDockerfile",
                                      "provided": "#/components/schemas/ProvidedDockerfile"
                                    },
                                    "propertyName": "source"
                                  },
                                  "oneOf": [
                                    {
                                      "additionalProperties": false,
                                      "description": "User supplies a dockerfile in the uploaded source code.",
                                      "properties": {
                                        "path": {
                                          "default": "./Dockerfile",
                                          "description": "Relative path to the dockerfile in the source code. defaults to ./dockerfile.",
                                          "title": "Path",
                                          "type": "string"
                                        },
                                        "source": {
                                          "const": "provided",
                                          "default": "provided",
                                          "title": "Source",
                                          "type": "string"
                                        }
                                      },
                                      "title": "ProvidedDockerfile",
                                      "type": "object"
                                    },
                                    {
                                      "additionalProperties": false,
                                      "description": "System generates a dockerfile from execution environment metadata.",
                                      "properties": {
                                        "entrypoint": {
                                          "description": "Entrypoint baked into the generated dockerfile cmd (e.g. [\"python\", \"app.py\"]).",
                                          "items": {
                                            "type": "string"
                                          },
                                          "minItems": 1,
                                          "title": "Entrypoint",
                                          "type": "array"
                                        },
                                        "executionEnvironmentId": {
                                          "description": "Execution environment id used to resolve the base Docker image.",
                                          "title": "Execution Environment ID",
                                          "type": "string"
                                        },
                                        "executionEnvironmentVersionId": {
                                          "description": "Execution environment version id that pins the exact base image tag.",
                                          "title": "Execution Environment Version ID",
                                          "type": "string"
                                        },
                                        "source": {
                                          "const": "generated",
                                          "default": "generated",
                                          "title": "Source",
                                          "type": "string"
                                        }
                                      },
                                      "required": [
                                        "executionEnvironmentId",
                                        "executionEnvironmentVersionId",
                                        "entrypoint"
                                      ],
                                      "title": "GeneratedDockerfile",
                                      "type": "object"
                                    }
                                  ],
                                  "title": "Dockerfile"
                                }
                              },
                              "title": "ImageBuildConfig",
                              "type": "object"
                            },
                            {
                              "type": "null"
                            }
                          ],
                          "description": "Configuration for server-side image builds from source code."
                        },
                        "imageUri": {
                          "anyOf": [
                            {
                              "type": "string"
                            },
                            {
                              "type": "null"
                            }
                          ],
                          "description": "Docker image uri. required when imagebuildconfig is not set; server-populated after a successful image build.",
                          "title": "Imageuri"
                        },
                        "livenessProbe": {
                          "anyOf": [
                            {
                              "additionalProperties": false,
                              "properties": {
                                "failureThreshold": {
                                  "default": 3,
                                  "description": "Minimum consecutive failures for the probe to be considered failed.",
                                  "title": "Failurethreshold",
                                  "type": "integer"
                                },
                                "host": {
                                  "anyOf": [
                                    {
                                      "minLength": 0,
                                      "type": "string"
                                    },
                                    {
                                      "type": "null"
                                    }
                                  ],
                                  "description": "Host name to connect to, defaults to the pod ip.",
                                  "title": "Host"
                                },
                                "httpHeaders": {
                                  "additionalProperties": {
                                    "type": "string"
                                  },
                                  "description": "HTTP headers for probe.",
                                  "title": "Httpheaders",
                                  "type": "object"
                                },
                                "initialDelaySeconds": {
                                  "default": 30,
                                  "description": "Number of seconds to wait before the first probe is executed.",
                                  "title": "Initialdelayseconds",
                                  "type": "integer"
                                },
                                "path": {
                                  "description": "URL path to query for the health check.",
                                  "title": "Path",
                                  "type": "string"
                                },
                                "periodSeconds": {
                                  "default": 30,
                                  "description": "How often (in seconds) to perform the probe.",
                                  "title": "Periodseconds",
                                  "type": "integer"
                                },
                                "port": {
                                  "default": 8080,
                                  "description": "Port number to access on the container.",
                                  "maximum": 65535,
                                  "minimum": 1,
                                  "title": "Port",
                                  "type": "integer"
                                },
                                "scheme": {
                                  "default": "HTTP",
                                  "description": "Scheme to use for connecting to the host.",
                                  "enum": [
                                    "HTTP",
                                    "HTTPS"
                                  ],
                                  "title": "Scheme",
                                  "type": "string"
                                },
                                "timeoutSeconds": {
                                  "default": 30,
                                  "description": "Number of seconds after which the probe times out.",
                                  "title": "Timeoutseconds",
                                  "type": "integer"
                                }
                              },
                              "required": [
                                "path"
                              ],
                              "title": "ProbeConfig",
                              "type": "object"
                            },
                            {
                              "type": "null"
                            }
                          ],
                          "description": "Container liveness check configuration."
                        },
                        "name": {
                          "anyOf": [
                            {
                              "type": "string"
                            },
                            {
                              "type": "null"
                            }
                          ],
                          "description": "Name of the container. Lowercase letters, digits, and hyphens only; must start with a lowercase letter and end with a letter or digit; max 63 characters.",
                          "title": "Name"
                        },
                        "port": {
                          "anyOf": [
                            {
                              "maximum": 65535,
                              "minimum": 1024,
                              "type": "integer"
                            },
                            {
                              "type": "null"
                            }
                          ],
                          "description": "Container access port. When set, must be >= 1024 for security and platform compatibility reasons. Primary containers must define a port; non-primary containers must omit it.",
                          "title": "Port"
                        },
                        "primary": {
                          "anyOf": [
                            {
                              "type": "boolean"
                            },
                            {
                              "type": "null"
                            }
                          ],
                          "default": false,
                          "description": "Whether this is the primary container.",
                          "title": "Primary"
                        },
                        "readinessProbe": {
                          "anyOf": [
                            {
                              "additionalProperties": false,
                              "properties": {
                                "failureThreshold": {
                                  "default": 3,
                                  "description": "Minimum consecutive failures for the probe to be considered failed.",
                                  "title": "Failurethreshold",
                                  "type": "integer"
                                },
                                "host": {
                                  "anyOf": [
                                    {
                                      "minLength": 0,
                                      "type": "string"
                                    },
                                    {
                                      "type": "null"
                                    }
                                  ],
                                  "description": "Host name to connect to; defaults to the pod IP.",
                                  "title": "Host"
                                },
                                "httpHeaders": {
                                  "additionalProperties": {
                                    "type": "string"
                                  },
                                  "description": "HTTP headers for probe.",
                                  "title": "Httpheaders",
                                  "type": "object"
                                },
                                "initialDelaySeconds": {
                                  "default": 30,
                                  "description": "Number of seconds to wait before the first probe is executed.",
                                  "title": "Initialdelayseconds",
                                  "type": "integer"
                                },
                                "path": {
                                  "description": "URL path to query for the health check.",
                                  "title": "Path",
                                  "type": "string"
                                },
                                "periodSeconds": {
                                  "default": 30,
                                  "description": "How often (in seconds) to perform the probe.",
                                  "title": "Periodseconds",
                                  "type": "integer"
                                },
                                "port": {
                                  "default": 8080,
                                  "description": "Port number to access on the container.",
                                  "maximum": 65535,
                                  "minimum": 1,
                                  "title": "Port",
                                  "type": "integer"
                                },
                                "scheme": {
                                  "default": "HTTP",
                                  "description": "Scheme to use for connecting to the host.",
                                  "enum": [
                                    "HTTP",
                                    "HTTPS"
                                  ],
                                  "title": "Scheme",
                                  "type": "string"
                                },
                                "timeoutSeconds": {
                                  "default": 30,
                                  "description": "Number of seconds after which the probe times out.",
                                  "title": "Timeoutseconds",
                                  "type": "integer"
                                }
                              },
                              "required": [
                                "path"
                              ],
                              "title": "ProbeConfig",
                              "type": "object"
                            },
                            {
                              "type": "null"
                            }
                          ],
                          "description": "Container readiness check configuration."
                        },
                        "securityContext": {
                          "anyOf": [
                            {
                              "additionalProperties": false,
                              "description": "Container-level security context. Lets workload creators tighten security constraints beyond the platform defaults. The platform enforces runAsNonRoot and runAsUser; they are not user-settable. Elevated fields (capabilities.add, allowPrivilegeEscalation=true, seccompProfile.type=Unconfined) require the MLOps admin role; regular users may only tighten defaults: drop capabilities, enable a read-only root filesystem, or set a RuntimeDefault or Localhost seccomp profile.",
                              "properties": {
                                "allowPrivilegeEscalation": {
                                  "anyOf": [
                                    {
                                      "type": "boolean"
                                    },
                                    {
                                      "type": "null"
                                    }
                                  ],
                                  "description": "Whether a process can gain more privileges than its parent. Requires the MLOps admin role to set to true.",
                                  "title": "Allowprivilegeescalation"
                                },
                                "capabilities": {
                                  "anyOf": [
                                    {
                                      "additionalProperties": false,
                                      "description": "Linux capabilities to add or drop from the container.",
                                      "properties": {
                                        "add": {
                                          "anyOf": [
                                            {
                                              "items": {
                                                "type": "string"
                                              },
                                              "type": "array"
                                            },
                                            {
                                              "type": "null"
                                            }
                                          ],
                                          "description": "Capabilities to add.",
                                          "title": "Add"
                                        },
                                        "drop": {
                                          "anyOf": [
                                            {
                                              "items": {
                                                "type": "string"
                                              },
                                              "type": "array"
                                            },
                                            {
                                              "type": "null"
                                            }
                                          ],
                                          "description": "Capabilities to drop.",
                                          "title": "Drop"
                                        }
                                      },
                                      "title": "Capabilities",
                                      "type": "object"
                                    },
                                    {
                                      "type": "null"
                                    }
                                  ],
                                  "description": "Linux capabilities to add or drop."
                                },
                                "readOnlyRootFilesystem": {
                                  "anyOf": [
                                    {
                                      "type": "boolean"
                                    },
                                    {
                                      "type": "null"
                                    }
                                  ],
                                  "description": "Whether the root filesystem is read-only.",
                                  "title": "Readonlyrootfilesystem"
                                },
                                "seccompProfile": {
                                  "anyOf": [
                                    {
                                      "additionalProperties": false,
                                      "description": "Seccomp profile configuration.",
                                      "properties": {
                                        "localhostProfile": {
                                          "anyOf": [
                                            {
                                              "type": "string"
                                            },
                                            {
                                              "type": "null"
                                            }
                                          ],
                                          "description": "Path to a seccomp profile on the node. Only valid when type is Localhost.",
                                          "title": "Localhostprofile"
                                        },
                                        "type": {
                                          "description": "Allowed seccomp profile types.",
                                          "enum": [
                                            "RuntimeDefault",
                                            "Unconfined",
                                            "Localhost"
                                          ],
                                          "title": "SeccompProfileType",
                                          "type": "string"
                                        }
                                      },
                                      "required": [
                                        "type"
                                      ],
                                      "title": "SeccompProfile",
                                      "type": "object"
                                    },
                                    {
                                      "type": "null"
                                    }
                                  ],
                                  "description": "Seccomp profile for the container."
                                }
                              },
                              "title": "SecurityContext",
                              "type": "object"
                            },
                            {
                              "type": "null"
                            }
                          ],
                          "description": "Container security context."
                        },
                        "startupProbe": {
                          "anyOf": [
                            {
                              "additionalProperties": false,
                              "properties": {
                                "failureThreshold": {
                                  "default": 3,
                                  "description": "Minimum consecutive failures for the probe to be considered failed.",
                                  "title": "Failurethreshold",
                                  "type": "integer"
                                },
                                "host": {
                                  "anyOf": [
                                    {
                                      "minLength": 0,
                                      "type": "string"
                                    },
                                    {
                                      "type": "null"
                                    }
                                  ],
                                  "description": "Host name to connect to; defaults to the pod IP.",
                                  "title": "Host"
                                },
                                "httpHeaders": {
                                  "additionalProperties": {
                                    "type": "string"
                                  },
                                  "description": "HTTP headers for probe.",
                                  "title": "Httpheaders",
                                  "type": "object"
                                },
                                "initialDelaySeconds": {
                                  "default": 30,
                                  "description": "Number of seconds to wait before the first probe is executed.",
                                  "title": "Initialdelayseconds",
                                  "type": "integer"
                                },
                                "path": {
                                  "description": "URL path to query for the health check.",
                                  "title": "Path",
                                  "type": "string"
                                },
                                "periodSeconds": {
                                  "default": 30,
                                  "description": "How often (in seconds) to perform the probe.",
                                  "title": "Periodseconds",
                                  "type": "integer"
                                },
                                "port": {
                                  "default": 8080,
                                  "description": "Port number to access on the container.",
                                  "maximum": 65535,
                                  "minimum": 1,
                                  "title": "Port",
                                  "type": "integer"
                                },
                                "scheme": {
                                  "default": "HTTP",
                                  "description": "Scheme to use for connecting to the host.",
                                  "enum": [
                                    "HTTP",
                                    "HTTPS"
                                  ],
                                  "title": "Scheme",
                                  "type": "string"
                                },
                                "timeoutSeconds": {
                                  "default": 30,
                                  "description": "Number of seconds after which the probe times out.",
                                  "title": "Timeoutseconds",
                                  "type": "integer"
                                }
                              },
                              "required": [
                                "path"
                              ],
                              "title": "ProbeConfig",
                              "type": "object"
                            },
                            {
                              "type": "null"
                            }
                          ],
                          "description": "Container startup check configuration."
                        }
                      },
                      "title": "Container",
                      "type": "object"
                    },
                    "title": "Containers",
                    "type": "array"
                  },
                  "name": {
                    "default": "default",
                    "description": "Name of the container group. Used as the lookup key for runtime overrides. Lowercase letters, digits, and hyphens only; must start with a lowercase letter and end with a letter or digit; max 63 characters.",
                    "title": "Name",
                    "type": "string"
                  }
                },
                "title": "ContainerGroup",
                "type": "object"
              },
              "title": "Containergroups",
              "type": "array"
            },
            "type": {
              "const": "service",
              "default": "service",
              "description": "Artifact type discriminator. Injected automatically from the top-level `type` field; do not set this directly.",
              "title": "Type",
              "type": "string"
            }
          },
          "title": "ServiceArtifactSpec",
          "type": "object"
        },
        {
          "additionalProperties": false,
          "properties": {
            "containerGroups": {
              "default": [],
              "description": "List of container groups.",
              "items": {
                "additionalProperties": false,
                "properties": {
                  "containers": {
                    "default": [],
                    "description": "List of containers making this container group.",
                    "items": {
                      "additionalProperties": false,
                      "properties": {
                        "build": {
                          "anyOf": [
                            {
                              "additionalProperties": false,
                              "description": "Build reference embedded in a container spec when an image build is triggered.",
                              "properties": {
                                "artifactImageBuildId": {
                                  "description": "Artifact image build ID.",
                                  "title": "Artifactimagebuildid",
                                  "type": "string"
                                },
                                "createdAt": {
                                  "description": "Build creation timestamp (UTC).",
                                  "format": "date-time",
                                  "title": "Createdat",
                                  "type": "string"
                                },
                                "status": {
                                  "description": "Status reported by the image build at submit time.",
                                  "title": "Status",
                                  "type": "string"
                                }
                              },
                              "required": [
                                "artifactImageBuildId",
                                "status",
                                "createdAt"
                              ],
                              "title": "ContainerBuildInfo",
                              "type": "object"
                            },
                            {
                              "type": "null"
                            }
                          ],
                          "description": "Server-set image build metadata (e.g. after a lock or draft build trigger). The workload API clears this on artifact create/update before persistence; clients must not rely on sending it."
                        },
                        "description": {
                          "default": "",
                          "description": "Description of the container.",
                          "title": "Description",
                          "type": "string"
                        },
                        "entrypoint": {
                          "anyOf": [
                            {
                              "items": {
                                "type": "string"
                              },
                              "type": "array"
                            },
                            {
                              "type": "null"
                            }
                          ],
                          "description": "Runtime entrypoint override for the container command. Independent of the build entrypoint.",
                          "title": "Entrypoint"
                        },
                        "environmentVars": {
                          "default": [],
                          "description": "Environment variables.",
                          "items": {
                            "anyOf": [
                              {
                                "properties": {
                                  "name": {
                                    "description": "Name of the environment variable.",
                                    "title": "Name",
                                    "type": "string"
                                  },
                                  "source": {
                                    "const": "string",
                                    "default": "string",
                                    "title": "Source",
                                    "type": "string"
                                  },
                                  "value": {
                                    "description": "Value of the environment variable.",
                                    "title": "Value",
                                    "type": "string"
                                  }
                                },
                                "required": [
                                  "name",
                                  "value"
                                ],
                                "title": "StringEnvironmentVariable",
                                "type": "object"
                              },
                              {
                                "properties": {
                                  "drCredentialId": {
                                    "description": "ID of the DataRobot credential to use.",
                                    "title": "DR Credential ID",
                                    "type": "string"
                                  },
                                  "key": {
                                    "description": "Key within the credential.",
                                    "title": "Key",
                                    "type": "string"
                                  },
                                  "name": {
                                    "description": "Name of the environment variable.",
                                    "title": "Name",
                                    "type": "string"
                                  },
                                  "source": {
                                    "const": "dr-credential",
                                    "title": "Source",
                                    "type": "string"
                                  }
                                },
                                "required": [
                                  "source",
                                  "name",
                                  "drCredentialId",
                                  "key"
                                ],
                                "title": "CredentialEnvironmentVariable",
                                "type": "object"
                              },
                              {
                                "description": "A platform-managed DataRobot API token injected as an environment variable. The token value is resolved at proton creation (find-or-create a per-workload ``workload <workloadid>`` API key scoped to the invoking user); no value or ID is supplied by the user.",
                                "properties": {
                                  "name": {
                                    "description": "Name of the environment variable.",
                                    "title": "Name",
                                    "type": "string"
                                  },
                                  "source": {
                                    "const": "dr-api-token",
                                    "title": "Source",
                                    "type": "string"
                                  }
                                },
                                "required": [
                                  "source",
                                  "name"
                                ],
                                "title": "DrApiTokenEnvironmentVariable",
                                "type": "object"
                              }
                            ]
                          },
                          "title": "Environmentvars",
                          "type": "array"
                        },
                        "imageBuildConfig": {
                          "anyOf": [
                            {
                              "additionalProperties": false,
                              "description": "User-provided configuration for server-side image builds from source code.",
                              "properties": {
                                "codeRef": {
                                  "anyOf": [
                                    {
                                      "additionalProperties": false,
                                      "properties": {
                                        "datarobot": {
                                          "additionalProperties": false,
                                          "properties": {
                                            "catalogId": {
                                              "title": "Catalogid",
                                              "type": "string"
                                            },
                                            "catalogVersionId": {
                                              "title": "Catalogversionid",
                                              "type": "string"
                                            }
                                          },
                                          "required": [
                                            "catalogId",
                                            "catalogVersionId"
                                          ],
                                          "title": "DataRobotCodeRef",
                                          "type": "object"
                                        },
                                        "provider": {
                                          "const": "datarobot",
                                          "default": "datarobot",
                                          "title": "Provider",
                                          "type": "string"
                                        },
                                        "type": {
                                          "const": "datarobot",
                                          "default": "datarobot",
                                          "title": "Type",
                                          "type": "string"
                                        }
                                      },
                                      "required": [
                                        "datarobot"
                                      ],
                                      "title": "CodeRef",
                                      "type": "object"
                                    },
                                    {
                                      "type": "null"
                                    }
                                  ],
                                  "description": "Reference to source code (e.g. files API catalog). optional at create time; required before build or lock."
                                },
                                "dockerfile": {
                                  "description": "How the dockerfile is obtained. defaults to using ./dockerfile from the source code.",
                                  "discriminator": {
                                    "mapping": {
                                      "generated": "#/components/schemas/GeneratedDockerfile",
                                      "provided": "#/components/schemas/ProvidedDockerfile"
                                    },
                                    "propertyName": "source"
                                  },
                                  "oneOf": [
                                    {
                                      "additionalProperties": false,
                                      "description": "User supplies a dockerfile in the uploaded source code.",
                                      "properties": {
                                        "path": {
                                          "default": "./Dockerfile",
                                          "description": "Relative path to the dockerfile in the source code. defaults to ./dockerfile.",
                                          "title": "Path",
                                          "type": "string"
                                        },
                                        "source": {
                                          "const": "provided",
                                          "default": "provided",
                                          "title": "Source",
                                          "type": "string"
                                        }
                                      },
                                      "title": "ProvidedDockerfile",
                                      "type": "object"
                                    },
                                    {
                                      "additionalProperties": false,
                                      "description": "System generates a dockerfile from execution environment metadata.",
                                      "properties": {
                                        "entrypoint": {
                                          "description": "Entrypoint baked into the generated dockerfile cmd (e.g. [\"python\", \"app.py\"]).",
                                          "items": {
                                            "type": "string"
                                          },
                                          "minItems": 1,
                                          "title": "Entrypoint",
                                          "type": "array"
                                        },
                                        "executionEnvironmentId": {
                                          "description": "Execution environment id used to resolve the base Docker image.",
                                          "title": "Execution Environment ID",
                                          "type": "string"
                                        },
                                        "executionEnvironmentVersionId": {
                                          "description": "Execution environment version id that pins the exact base image tag.",
                                          "title": "Execution Environment Version ID",
                                          "type": "string"
                                        },
                                        "source": {
                                          "const": "generated",
                                          "default": "generated",
                                          "title": "Source",
                                          "type": "string"
                                        }
                                      },
                                      "required": [
                                        "executionEnvironmentId",
                                        "executionEnvironmentVersionId",
                                        "entrypoint"
                                      ],
                                      "title": "GeneratedDockerfile",
                                      "type": "object"
                                    }
                                  ],
                                  "title": "Dockerfile"
                                }
                              },
                              "title": "ImageBuildConfig",
                              "type": "object"
                            },
                            {
                              "type": "null"
                            }
                          ],
                          "description": "Configuration for server-side image builds from source code."
                        },
                        "imageUri": {
                          "anyOf": [
                            {
                              "type": "string"
                            },
                            {
                              "type": "null"
                            }
                          ],
                          "description": "Docker image uri. required when imagebuildconfig is not set; server-populated after a successful image build.",
                          "title": "Imageuri"
                        },
                        "livenessProbe": {
                          "anyOf": [
                            {
                              "additionalProperties": false,
                              "properties": {
                                "failureThreshold": {
                                  "default": 3,
                                  "description": "Minimum consecutive failures for the probe to be considered failed.",
                                  "title": "Failurethreshold",
                                  "type": "integer"
                                },
                                "host": {
                                  "anyOf": [
                                    {
                                      "minLength": 0,
                                      "type": "string"
                                    },
                                    {
                                      "type": "null"
                                    }
                                  ],
                                  "description": "Host name to connect to, defaults to the pod ip.",
                                  "title": "Host"
                                },
                                "httpHeaders": {
                                  "additionalProperties": {
                                    "type": "string"
                                  },
                                  "description": "HTTP headers for probe.",
                                  "title": "Httpheaders",
                                  "type": "object"
                                },
                                "initialDelaySeconds": {
                                  "default": 30,
                                  "description": "Number of seconds to wait before the first probe is executed.",
                                  "title": "Initialdelayseconds",
                                  "type": "integer"
                                },
                                "path": {
                                  "description": "Url path to query for health check.",
                                  "title": "Path",
                                  "type": "string"
                                },
                                "periodSeconds": {
                                  "default": 30,
                                  "description": "How often (in seconds) to perform the probe.",
                                  "title": "Periodseconds",
                                  "type": "integer"
                                },
                                "port": {
                                  "default": 8080,
                                  "description": "Port number to access on the container.",
                                  "maximum": 65535,
                                  "minimum": 1,
                                  "title": "Port",
                                  "type": "integer"
                                },
                                "scheme": {
                                  "default": "HTTP",
                                  "description": "Scheme to use for connecting to the host.",
                                  "enum": [
                                    "HTTP",
                                    "HTTPS"
                                  ],
                                  "title": "Scheme",
                                  "type": "string"
                                },
                                "timeoutSeconds": {
                                  "default": 30,
                                  "description": "Number of seconds after which the probe times out.",
                                  "title": "Timeoutseconds",
                                  "type": "integer"
                                }
                              },
                              "required": [
                                "path"
                              ],
                              "title": "ProbeConfig",
                              "type": "object"
                            },
                            {
                              "type": "null"
                            }
                          ],
                          "description": "Container liveness check configuration."
                        },
                        "name": {
                          "anyOf": [
                            {
                              "type": "string"
                            },
                            {
                              "type": "null"
                            }
                          ],
                          "description": "Name of the container. lowercase letters, digits, and hyphens only; must start with a lowercase letter and end with a letter or digit; max 63 characters.",
                          "title": "Name"
                        },
                        "port": {
                          "anyOf": [
                            {
                              "maximum": 65535,
                              "minimum": 1024,
                              "type": "integer"
                            },
                            {
                              "type": "null"
                            }
                          ],
                          "description": "Container access port. when set, must be >= 1024 for security and platform compatibility reasons. primary containers must define a port; non-primary containers must omit it.",
                          "title": "Port"
                        },
                        "primary": {
                          "anyOf": [
                            {
                              "type": "boolean"
                            },
                            {
                              "type": "null"
                            }
                          ],
                          "default": false,
                          "description": "Whether this is the primary container.",
                          "title": "Primary"
                        },
                        "readinessProbe": {
                          "anyOf": [
                            {
                              "additionalProperties": false,
                              "properties": {
                                "failureThreshold": {
                                  "default": 3,
                                  "description": "Minimum consecutive failures for the probe to be considered failed.",
                                  "title": "Failurethreshold",
                                  "type": "integer"
                                },
                                "host": {
                                  "anyOf": [
                                    {
                                      "minLength": 0,
                                      "type": "string"
                                    },
                                    {
                                      "type": "null"
                                    }
                                  ],
                                  "description": "Host name to connect to, defaults to the pod ip.",
                                  "title": "Host"
                                },
                                "httpHeaders": {
                                  "additionalProperties": {
                                    "type": "string"
                                  },
                                  "description": "HTTP headers for probe.",
                                  "title": "Httpheaders",
                                  "type": "object"
                                },
                                "initialDelaySeconds": {
                                  "default": 30,
                                  "description": "Number of seconds to wait before the first probe is executed.",
                                  "title": "Initialdelayseconds",
                                  "type": "integer"
                                },
                                "path": {
                                  "description": "Url path to query for health check.",
                                  "title": "Path",
                                  "type": "string"
                                },
                                "periodSeconds": {
                                  "default": 30,
                                  "description": "How often (in seconds) to perform the probe.",
                                  "title": "Periodseconds",
                                  "type": "integer"
                                },
                                "port": {
                                  "default": 8080,
                                  "description": "Port number to access on the container.",
                                  "maximum": 65535,
                                  "minimum": 1,
                                  "title": "Port",
                                  "type": "integer"
                                },
                                "scheme": {
                                  "default": "HTTP",
                                  "description": "Scheme to use for connecting to the host.",
                                  "enum": [
                                    "HTTP",
                                    "HTTPS"
                                  ],
                                  "title": "Scheme",
                                  "type": "string"
                                },
                                "timeoutSeconds": {
                                  "default": 30,
                                  "description": "Number of seconds after which the probe times out.",
                                  "title": "Timeoutseconds",
                                  "type": "integer"
                                }
                              },
                              "required": [
                                "path"
                              ],
                              "title": "ProbeConfig",
                              "type": "object"
                            },
                            {
                              "type": "null"
                            }
                          ],
                          "description": "Container readiness check configuration."
                        },
                        "securityContext": {
                          "anyOf": [
                            {
                              "additionalProperties": false,
                              "description": "Container-level security context. lets workload creators tighten security constraints beyond the platform defaults. runasnonroot and runasuser are enforced by the platform and are not user-settable. elevated fields (capabilities.add, allowprivilegeescalation=true, seccompprofile.type=unconfined) require the mlops admin role; regular users may only tighten defaults — drop capabilities, enable read-only rootfs, or set a runtimedefault/localhost seccomp profile.",
                              "properties": {
                                "allowPrivilegeEscalation": {
                                  "anyOf": [
                                    {
                                      "type": "boolean"
                                    },
                                    {
                                      "type": "null"
                                    }
                                  ],
                                  "description": "Whether a process can gain more privileges than its parent. requires the mlops admin role to set to true.",
                                  "title": "Allowprivilegeescalation"
                                },
                                "capabilities": {
                                  "anyOf": [
                                    {
                                      "additionalProperties": false,
                                      "description": "Linux capabilities to add or drop from the container.",
                                      "properties": {
                                        "add": {
                                          "anyOf": [
                                            {
                                              "items": {
                                                "type": "string"
                                              },
                                              "type": "array"
                                            },
                                            {
                                              "type": "null"
                                            }
                                          ],
                                          "description": "Capabilities to add.",
                                          "title": "Add"
                                        },
                                        "drop": {
                                          "anyOf": [
                                            {
                                              "items": {
                                                "type": "string"
                                              },
                                              "type": "array"
                                            },
                                            {
                                              "type": "null"
                                            }
                                          ],
                                          "description": "Capabilities to drop.",
                                          "title": "Drop"
                                        }
                                      },
                                      "title": "Capabilities",
                                      "type": "object"
                                    },
                                    {
                                      "type": "null"
                                    }
                                  ],
                                  "description": "Linux capabilities to add or drop."
                                },
                                "readOnlyRootFilesystem": {
                                  "anyOf": [
                                    {
                                      "type": "boolean"
                                    },
                                    {
                                      "type": "null"
                                    }
                                  ],
                                  "description": "Whether the root filesystem is read-only.",
                                  "title": "Readonlyrootfilesystem"
                                },
                                "seccompProfile": {
                                  "anyOf": [
                                    {
                                      "additionalProperties": false,
                                      "description": "Seccomp profile configuration.",
                                      "properties": {
                                        "localhostProfile": {
                                          "anyOf": [
                                            {
                                              "type": "string"
                                            },
                                            {
                                              "type": "null"
                                            }
                                          ],
                                          "description": "Path to a seccomp profile on the node. only valid when type is localhost.",
                                          "title": "Localhostprofile"
                                        },
                                        "type": {
                                          "description": "Allowed seccomp profile types.",
                                          "enum": [
                                            "RuntimeDefault",
                                            "Unconfined",
                                            "Localhost"
                                          ],
                                          "title": "SeccompProfileType",
                                          "type": "string"
                                        }
                                      },
                                      "required": [
                                        "type"
                                      ],
                                      "title": "SeccompProfile",
                                      "type": "object"
                                    },
                                    {
                                      "type": "null"
                                    }
                                  ],
                                  "description": "Seccomp profile for the container."
                                }
                              },
                              "title": "SecurityContext",
                              "type": "object"
                            },
                            {
                              "type": "null"
                            }
                          ],
                          "description": "Container security context."
                        },
                        "startupProbe": {
                          "anyOf": [
                            {
                              "additionalProperties": false,
                              "properties": {
                                "failureThreshold": {
                                  "default": 3,
                                  "description": "Minimum consecutive failures for the probe to be considered failed.",
                                  "title": "Failurethreshold",
                                  "type": "integer"
                                },
                                "host": {
                                  "anyOf": [
                                    {
                                      "minLength": 0,
                                      "type": "string"
                                    },
                                    {
                                      "type": "null"
                                    }
                                  ],
                                  "description": "Host name to connect to, defaults to the pod ip.",
                                  "title": "Host"
                                },
                                "httpHeaders": {
                                  "additionalProperties": {
                                    "type": "string"
                                  },
                                  "description": "HTTP headers for probe.",
                                  "title": "Httpheaders",
                                  "type": "object"
                                },
                                "initialDelaySeconds": {
                                  "default": 30,
                                  "description": "Number of seconds to wait before the first probe is executed.",
                                  "title": "Initialdelayseconds",
                                  "type": "integer"
                                },
                                "path": {
                                  "description": "Url path to query for health check.",
                                  "title": "Path",
                                  "type": "string"
                                },
                                "periodSeconds": {
                                  "default": 30,
                                  "description": "How often (in seconds) to perform the probe.",
                                  "title": "Periodseconds",
                                  "type": "integer"
                                },
                                "port": {
                                  "default": 8080,
                                  "description": "Port number to access on the container.",
                                  "maximum": 65535,
                                  "minimum": 1,
                                  "title": "Port",
                                  "type": "integer"
                                },
                                "scheme": {
                                  "default": "HTTP",
                                  "description": "Scheme to use for connecting to the host.",
                                  "enum": [
                                    "HTTP",
                                    "HTTPS"
                                  ],
                                  "title": "Scheme",
                                  "type": "string"
                                },
                                "timeoutSeconds": {
                                  "default": 30,
                                  "description": "Number of seconds after which the probe times out.",
                                  "title": "Timeoutseconds",
                                  "type": "integer"
                                }
                              },
                              "required": [
                                "path"
                              ],
                              "title": "ProbeConfig",
                              "type": "object"
                            },
                            {
                              "type": "null"
                            }
                          ],
                          "description": "Container startup check configuration."
                        }
                      },
                      "title": "Container",
                      "type": "object"
                    },
                    "title": "Containers",
                    "type": "array"
                  },
                  "name": {
                    "default": "default",
                    "description": "Name of the container group. used as the lookup key for runtime overrides. lowercase letters, digits, and hyphens only; must start with a lowercase letter and end with a letter or digit; max 63 characters.",
                    "title": "Name",
                    "type": "string"
                  }
                },
                "title": "ContainerGroup",
                "type": "object"
              },
              "title": "Containergroups",
              "type": "array"
            },
            "storage": {
              "anyOf": [
                {
                  "additionalProperties": false,
                  "description": "Model weight storage configuration for nim artifacts.",
                  "properties": {
                    "mode": {
                      "default": "dedicatedPvc",
                      "description": "Storage mode for model weights. `dedicatedpvc` (default) provisions a separate pvc owned exclusively by this workload. `nimcache` reuses a single cluster-wide pvc per model image, shared across all workloads using the same model.",
                      "enum": [
                        "dedicatedPvc",
                        "nimCache"
                      ],
                      "title": "Mode",
                      "type": "string"
                    },
                    "pvcSize": {
                      "anyOf": [
                        {
                          "pattern": "^\\d+(\\.\\d+)?(Gi|Mi|Ti)$",
                          "type": "string"
                        },
                        {
                          "type": "null"
                        }
                      ],
                      "description": "Pvc size for dedicated storage (e.g. '150gi'). only applies when mode is `dedicatedpvc`. when omitted, the platform-configured default is used.",
                      "title": "Pvcsize"
                    }
                  },
                  "title": "NimStorageConfig",
                  "type": "object"
                },
                {
                  "type": "null"
                }
              ],
              "description": "Model weight storage configuration. when omitted, defaults to a dedicated per-workload pvc provisioned exclusively for this workload."
            },
            "templateId": {
              "anyOf": [
                {
                  "type": "string"
                },
                {
                  "type": "null"
                }
              ],
              "description": "Id of the template used to create this nim artifact.",
              "title": "Templateid"
            },
            "type": {
              "const": "nim",
              "default": "nim",
              "description": "Artifact type discriminator. injected automatically from the top-level `type` field — do not set this directly.",
              "title": "Type",
              "type": "string"
            }
          },
          "title": "NimArtifactSpec",
          "type": "object"
        }
      ],
      "title": "Spec"
    },
    "status": {
      "enum": [
        "draft",
        "locked"
      ],
      "title": "ArtifactStatus",
      "type": "string"
    },
    "type": {
      "description": "Discriminator for the artifact spec variant. used to label the workload, which may be used to prioritize the best matching operator available in the cluster for scheduling. defaults to ``service`` when omitted. - ``service``: generic service artifact. - ``nim``: nvidia nim model artifact.",
      "enum": [
        "service",
        "nim"
      ],
      "title": "ArtifactType",
      "type": "string"
    }
  },
  "required": [
    "name",
    "spec"
  ],
  "title": "InputArtifact",
  "type": "object"
}
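
The `pvcSize` pattern in the schema above can be checked on the client before a request is sent. The following is a minimal sketch, not part of the API: the helper name `is_valid_pvc_size` is hypothetical, and the only thing taken from the schema is the regular expression itself. It treats `None` as valid because the schema allows `null`, in which case the platform-configured default applies.

```python
import re

# Pattern copied from the `pvcSize` schema entry; re.ASCII keeps \d limited
# to 0-9, matching how the pattern is normally read in JSON Schema.
PVC_SIZE_PATTERN = re.compile(r"^\d+(\.\d+)?(Gi|Mi|Ti)$", re.ASCII)


def is_valid_pvc_size(value):
    """Hypothetical helper: True if value is None or matches the pattern."""
    if value is None:
        # null is allowed by the schema; the platform default is used.
        return True
    # fullmatch rejects trailing newlines that a bare `$` would tolerate.
    return isinstance(value, str) and PVC_SIZE_PATTERN.fullmatch(value) is not None


if __name__ == "__main__":
    for candidate in ("150Gi", "1.5Ti", "150gi", "150 Gi", None):
        print(repr(candidate), is_valid_pvc_size(candidate))
```

Note that the unit suffix is case-sensitive: `150Gi` matches, while `150gi` does not.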

InputArtifact

Properties

Name Type Required Restrictions Description
artifactRepositoryId any false ID of the artifact repository this artifact belongs to (for versioning support).

anyOf

Name Type Required Restrictions Description
» anonymous string false none

or

Name Type Required Restrictions Description
» anonymous null false none

continued

Name Type Required Restrictions Description
description string false Description of the artifact.
name string true maxLength: 5000, minLength: 1 Name of the artifact.
spec any true Artifact specification.

oneOf

Name Type Required Restrictions Description
» anonymous ServiceArtifactSpec false none

xor

Name Type Required Restrictions Description
» anonymous NimArtifactSpec false none

continued

Name Type Required Restrictions Description
status ArtifactStatus false Artifact status.
type ArtifactType false Artifact type.
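
The top-level rules in the table above (required `name` and `spec`, the `name` length limits, and the `status` and `type` enums) can be sketched as a small client-side check. This is an illustrative sketch under stated assumptions, not the service's validation logic: `check_input_artifact` is a hypothetical helper, the contents of `spec` are not validated, and the example `spec` body is a placeholder that may not satisfy the full `NimArtifactSpec` schema.

```python
import json

# Enum values taken from the ArtifactType and ArtifactStatus schemas.
ARTIFACT_TYPES = {"service", "nim"}
ARTIFACT_STATUSES = {"draft", "locked"}


def check_input_artifact(payload):
    """Hypothetical helper: list problems against the top-level InputArtifact rules."""
    problems = []
    # `name` and `spec` are the only required top-level fields.
    for field in ("name", "spec"):
        if field not in payload:
            problems.append(f"missing required field: {field}")
    if "name" in payload:
        name = payload["name"]
        if not (isinstance(name, str) and 1 <= len(name) <= 5000):
            problems.append("name must be a string of 1 to 5000 characters")
    if "type" in payload and payload["type"] not in ARTIFACT_TYPES:
        problems.append("type must be one of: nim, service")
    if "status" in payload and payload["status"] not in ARTIFACT_STATUSES:
        problems.append("status must be one of: draft, locked")
    return problems


if __name__ == "__main__":
    # Placeholder payload; the spec body is illustrative only.
    artifact = {"name": "demo-nim", "type": "nim", "spec": {"containerGroups": []}}
    print(check_input_artifact(artifact))
    print(json.dumps(artifact, indent=2))
```

Returning a list of problems rather than raising on the first one lets a caller report every issue at once before submitting the request.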

NimArtifactSpec

{
  "additionalProperties": false,
  "properties": {
    "containerGroups": {
      "default": [],
      "description": "List of container groups.",
      "items": {
        "additionalProperties": false,
        "properties": {
          "containers": {
            "default": [],
            "description": "List of containers making this container group.",
            "items": {
              "additionalProperties": false,
              "properties": {
                "build": {
                  "anyOf": [
                    {
                      "additionalProperties": false,
                      "description": "Build reference embedded in a container spec when an image build is triggered.",
                      "properties": {
                        "artifactImageBuildId": {
                          "description": "Artifact image build id.",
                          "title": "Artifactimagebuildid",
                          "type": "string"
                        },
                        "createdAt": {
                          "description": "Build creation timestamp (utc).",
                          "format": "date-time",
                          "title": "Createdat",
                          "type": "string"
                        },
                        "status": {
                          "description": "Image build reported status at submit time.",
                          "title": "Status",
                          "type": "string"
                        }
                      },
                      "required": [
                        "artifactImageBuildId",
                        "status",
                        "createdAt"
                      ],
                      "title": "ContainerBuildInfo",
                      "type": "object"
                    },
                    {
                      "type": "null"
                    }
                  ],
                  "description": "Server-set image build metadata (e.g. after lock or draft build trigger). workload API clears this on artifact create/update before persistence; clients must not rely on sending it."
                },
                "description": {
                  "default": "",
                  "description": "Description of the container.",
                  "title": "Description",
                  "type": "string"
                },
                "entrypoint": {
                  "anyOf": [
                    {
                      "items": {
                        "type": "string"
                      },
                      "type": "array"
                    },
                    {
                      "type": "null"
                    }
                  ],
                  "description": "Runtime entrypoint override for the container command. independent of build entrypoint.",
                  "title": "Entrypoint"
                },
                "environmentVars": {
                  "default": [],
                  "description": "Environment variables.",
                  "items": {
                    "anyOf": [
                      {
                        "properties": {
                          "name": {
                            "description": "Name of the environment variable.",
                            "title": "Name",
                            "type": "string"
                          },
                          "source": {
                            "const": "string",
                            "default": "string",
                            "title": "Source",
                            "type": "string"
                          },
                          "value": {
                            "description": "Value of the environment variable.",
                            "title": "Value",
                            "type": "string"
                          }
                        },
                        "required": [
                          "name",
                          "value"
                        ],
                        "title": "StringEnvironmentVariable",
                        "type": "object"
                      },
                      {
                        "properties": {
                          "drCredentialId": {
                            "description": "Id of the datarobot credential to use.",
                            "title": "DR Credential ID",
                            "type": "string"
                          },
                          "key": {
                            "description": "Key within the credential.",
                            "title": "Key",
                            "type": "string"
                          },
                          "name": {
                            "description": "Name of the environment variable.",
                            "title": "Name",
                            "type": "string"
                          },
                          "source": {
                            "const": "dr-credential",
                            "title": "Source",
                            "type": "string"
                          }
                        },
                        "required": [
                          "source",
                          "name",
                          "drCredentialId",
                          "key"
                        ],
                        "title": "CredentialEnvironmentVariable",
                        "type": "object"
                      },
                      {
                        "description": "A platform-managed datarobot API token injected as an environment variable. the token value is resolved at proton creation (find-or-create a per-workload ``workload <workloadid>`` API key scoped to the invoking user); no value or id is supplied by the user.",
                        "properties": {
                          "name": {
                            "description": "Name of the environment variable.",
                            "title": "Name",
                            "type": "string"
                          },
                          "source": {
                            "const": "dr-api-token",
                            "title": "Source",
                            "type": "string"
                          }
                        },
                        "required": [
                          "source",
                          "name"
                        ],
                        "title": "DrApiTokenEnvironmentVariable",
                        "type": "object"
                      }
                    ]
                  },
                  "title": "Environmentvars",
                  "type": "array"
                },
                "imageBuildConfig": {
                  "anyOf": [
                    {
                      "additionalProperties": false,
                      "description": "User-provided configuration for server-side image builds from source code.",
                      "properties": {
                        "codeRef": {
                          "anyOf": [
                            {
                              "additionalProperties": false,
                              "properties": {
                                "datarobot": {
                                  "additionalProperties": false,
                                  "properties": {
                                    "catalogId": {
                                      "title": "Catalogid",
                                      "type": "string"
                                    },
                                    "catalogVersionId": {
                                      "title": "Catalogversionid",
                                      "type": "string"
                                    }
                                  },
                                  "required": [
                                    "catalogId",
                                    "catalogVersionId"
                                  ],
                                  "title": "DataRobotCodeRef",
                                  "type": "object"
                                },
                                "provider": {
                                  "const": "datarobot",
                                  "default": "datarobot",
                                  "title": "Provider",
                                  "type": "string"
                                },
                                "type": {
                                  "const": "datarobot",
                                  "default": "datarobot",
                                  "title": "Type",
                                  "type": "string"
                                }
                              },
                              "required": [
                                "datarobot"
                              ],
                              "title": "CodeRef",
                              "type": "object"
                            },
                            {
                              "type": "null"
                            }
                          ],
                          "description": "Reference to source code (e.g. files API catalog). optional at create time; required before build or lock."
                        },
                        "dockerfile": {
                          "description": "How the dockerfile is obtained. defaults to using ./dockerfile from the source code.",
                          "discriminator": {
                            "mapping": {
                              "generated": "#/components/schemas/GeneratedDockerfile",
                              "provided": "#/components/schemas/ProvidedDockerfile"
                            },
                            "propertyName": "source"
                          },
                          "oneOf": [
                            {
                              "additionalProperties": false,
                              "description": "User supplies a dockerfile in the uploaded source code.",
                              "properties": {
                                "path": {
                                  "default": "./Dockerfile",
                                  "description": "Relative path to the dockerfile in the source code. defaults to ./dockerfile.",
                                  "title": "Path",
                                  "type": "string"
                                },
                                "source": {
                                  "const": "provided",
                                  "default": "provided",
                                  "title": "Source",
                                  "type": "string"
                                }
                              },
                              "title": "ProvidedDockerfile",
                              "type": "object"
                            },
                            {
                              "additionalProperties": false,
                              "description": "System generates a dockerfile from execution environment metadata.",
                              "properties": {
                                "entrypoint": {
                                  "description": "Entrypoint baked into the generated dockerfile cmd (e.g. [\"python\", \"app.py\"]).",
                                  "items": {
                                    "type": "string"
                                  },
                                  "minItems": 1,
                                  "title": "Entrypoint",
                                  "type": "array"
                                },
                                "executionEnvironmentId": {
                                  "description": "Execution environment id used to resolve the base Docker image.",
                                  "title": "Execution Environment ID",
                                  "type": "string"
                                },
                                "executionEnvironmentVersionId": {
                                  "description": "Execution environment version id that pins the exact base image tag.",
                                  "title": "Execution Environment Version ID",
                                  "type": "string"
                                },
                                "source": {
                                  "const": "generated",
                                  "default": "generated",
                                  "title": "Source",
                                  "type": "string"
                                }
                              },
                              "required": [
                                "executionEnvironmentId",
                                "executionEnvironmentVersionId",
                                "entrypoint"
                              ],
                              "title": "GeneratedDockerfile",
                              "type": "object"
                            }
                          ],
                          "title": "Dockerfile"
                        }
                      },
                      "title": "ImageBuildConfig",
                      "type": "object"
                    },
                    {
                      "type": "null"
                    }
                  ],
                  "description": "Configuration for server-side image builds from source code."
                },
                "imageUri": {
                  "anyOf": [
                    {
                      "type": "string"
                    },
                    {
                      "type": "null"
                    }
                  ],
                  "description": "Docker image uri. required when imagebuildconfig is not set; server-populated after a successful image build.",
                  "title": "Imageuri"
                },
                "livenessProbe": {
                  "anyOf": [
                    {
                      "additionalProperties": false,
                      "properties": {
                        "failureThreshold": {
                          "default": 3,
                          "description": "Minimum consecutive failures for the probe to be considered failed.",
                          "title": "Failurethreshold",
                          "type": "integer"
                        },
                        "host": {
                          "anyOf": [
                            {
                              "minLength": 0,
                              "type": "string"
                            },
                            {
                              "type": "null"
                            }
                          ],
                          "description": "Host name to connect to, defaults to the pod ip.",
                          "title": "Host"
                        },
                        "httpHeaders": {
                          "additionalProperties": {
                            "type": "string"
                          },
                          "description": "HTTP headers for probe.",
                          "title": "Httpheaders",
                          "type": "object"
                        },
                        "initialDelaySeconds": {
                          "default": 30,
                          "description": "Number of seconds to wait before the first probe is executed.",
                          "title": "Initialdelayseconds",
                          "type": "integer"
                        },
                        "path": {
                          "description": "Url path to query for health check.",
                          "title": "Path",
                          "type": "string"
                        },
                        "periodSeconds": {
                          "default": 30,
                          "description": "How often (in seconds) to perform the probe.",
                          "title": "Periodseconds",
                          "type": "integer"
                        },
                        "port": {
                          "default": 8080,
                          "description": "Port number to access on the container.",
                          "maximum": 65535,
                          "minimum": 1,
                          "title": "Port",
                          "type": "integer"
                        },
                        "scheme": {
                          "default": "HTTP",
                          "description": "Scheme to use for connecting to the host.",
                          "enum": [
                            "HTTP",
                            "HTTPS"
                          ],
                          "title": "Scheme",
                          "type": "string"
                        },
                        "timeoutSeconds": {
                          "default": 30,
                          "description": "Number of seconds after which the probe times out.",
                          "title": "Timeoutseconds",
                          "type": "integer"
                        }
                      },
                      "required": [
                        "path"
                      ],
                      "title": "ProbeConfig",
                      "type": "object"
                    },
                    {
                      "type": "null"
                    }
                  ],
                  "description": "Container liveness check configuration."
                },
                "name": {
                  "anyOf": [
                    {
                      "type": "string"
                    },
                    {
                      "type": "null"
                    }
                  ],
                  "description": "Name of the container. lowercase letters, digits, and hyphens only; must start with a lowercase letter and end with a letter or digit; max 63 characters.",
                  "title": "Name"
                },
                "port": {
                  "anyOf": [
                    {
                      "maximum": 65535,
                      "minimum": 1024,
                      "type": "integer"
                    },
                    {
                      "type": "null"
                    }
                  ],
                  "description": "Container access port. when set, must be >= 1024 for security and platform compatibility reasons. primary containers must define a port; non-primary containers must omit it.",
                  "title": "Port"
                },
                "primary": {
                  "anyOf": [
                    {
                      "type": "boolean"
                    },
                    {
                      "type": "null"
                    }
                  ],
                  "default": false,
                  "description": "Whether this is the primary container.",
                  "title": "Primary"
                },
                "readinessProbe": {
                  "anyOf": [
                    {
                      "additionalProperties": false,
                      "properties": {
                        "failureThreshold": {
                          "default": 3,
                          "description": "Minimum consecutive failures for the probe to be considered failed.",
                          "title": "Failurethreshold",
                          "type": "integer"
                        },
                        "host": {
                          "anyOf": [
                            {
                              "minLength": 0,
                              "type": "string"
                            },
                            {
                              "type": "null"
                            }
                          ],
                          "description": "Host name to connect to, defaults to the pod ip.",
                          "title": "Host"
                        },
                        "httpHeaders": {
                          "additionalProperties": {
                            "type": "string"
                          },
                          "description": "HTTP headers for probe.",
                          "title": "Httpheaders",
                          "type": "object"
                        },
                        "initialDelaySeconds": {
                          "default": 30,
                          "description": "Number of seconds to wait before the first probe is executed.",
                          "title": "Initialdelayseconds",
                          "type": "integer"
                        },
                        "path": {
                          "description": "Url path to query for health check.",
                          "title": "Path",
                          "type": "string"
                        },
                        "periodSeconds": {
                          "default": 30,
                          "description": "How often (in seconds) to perform the probe.",
                          "title": "Periodseconds",
                          "type": "integer"
                        },
                        "port": {
                          "default": 8080,
                          "description": "Port number to access on the container.",
                          "maximum": 65535,
                          "minimum": 1,
                          "title": "Port",
                          "type": "integer"
                        },
                        "scheme": {
                          "default": "HTTP",
                          "description": "Scheme to use for connecting to the host.",
                          "enum": [
                            "HTTP",
                            "HTTPS"
                          ],
                          "title": "Scheme",
                          "type": "string"
                        },
                        "timeoutSeconds": {
                          "default": 30,
                          "description": "Number of seconds after which the probe times out.",
                          "title": "Timeoutseconds",
                          "type": "integer"
                        }
                      },
                      "required": [
                        "path"
                      ],
                      "title": "ProbeConfig",
                      "type": "object"
                    },
                    {
                      "type": "null"
                    }
                  ],
                  "description": "Container readiness check configuration."
                },
                "securityContext": {
                  "anyOf": [
                    {
                      "additionalProperties": false,
                      "description": "Container-level security context. lets workload creators tighten security constraints beyond the platform defaults. runasnonroot and runasuser are enforced by the platform and are not user-settable. elevated fields (capabilities.add, allowprivilegeescalation=true, seccompprofile.type=unconfined) require the mlops admin role; regular users may only tighten defaults — drop capabilities, enable read-only rootfs, or set a runtimedefault/localhost seccomp profile.",
                      "properties": {
                        "allowPrivilegeEscalation": {
                          "anyOf": [
                            {
                              "type": "boolean"
                            },
                            {
                              "type": "null"
                            }
                          ],
                          "description": "Whether a process can gain more privileges than its parent. requires the mlops admin role to set to true.",
                          "title": "Allowprivilegeescalation"
                        },
                        "capabilities": {
                          "anyOf": [
                            {
                              "additionalProperties": false,
                              "description": "Linux capabilities to add or drop from the container.",
                              "properties": {
                                "add": {
                                  "anyOf": [
                                    {
                                      "items": {
                                        "type": "string"
                                      },
                                      "type": "array"
                                    },
                                    {
                                      "type": "null"
                                    }
                                  ],
                                  "description": "Capabilities to add.",
                                  "title": "Add"
                                },
                                "drop": {
                                  "anyOf": [
                                    {
                                      "items": {
                                        "type": "string"
                                      },
                                      "type": "array"
                                    },
                                    {
                                      "type": "null"
                                    }
                                  ],
                                  "description": "Capabilities to drop.",
                                  "title": "Drop"
                                }
                              },
                              "title": "Capabilities",
                              "type": "object"
                            },
                            {
                              "type": "null"
                            }
                          ],
                          "description": "Linux capabilities to add or drop."
                        },
                        "readOnlyRootFilesystem": {
                          "anyOf": [
                            {
                              "type": "boolean"
                            },
                            {
                              "type": "null"
                            }
                          ],
                          "description": "Whether the root filesystem is read-only.",
                          "title": "Readonlyrootfilesystem"
                        },
                        "seccompProfile": {
                          "anyOf": [
                            {
                              "additionalProperties": false,
                              "description": "Seccomp profile configuration.",
                              "properties": {
                                "localhostProfile": {
                                  "anyOf": [
                                    {
                                      "type": "string"
                                    },
                                    {
                                      "type": "null"
                                    }
                                  ],
                                  "description": "Path to a seccomp profile on the node. only valid when type is localhost.",
                                  "title": "Localhostprofile"
                                },
                                "type": {
                                  "description": "Allowed seccomp profile types.",
                                  "enum": [
                                    "RuntimeDefault",
                                    "Unconfined",
                                    "Localhost"
                                  ],
                                  "title": "SeccompProfileType",
                                  "type": "string"
                                }
                              },
                              "required": [
                                "type"
                              ],
                              "title": "SeccompProfile",
                              "type": "object"
                            },
                            {
                              "type": "null"
                            }
                          ],
                          "description": "Seccomp profile for the container."
                        }
                      },
                      "title": "SecurityContext",
                      "type": "object"
                    },
                    {
                      "type": "null"
                    }
                  ],
                  "description": "Container security context."
                },
                "startupProbe": {
                  "anyOf": [
                    {
                      "additionalProperties": false,
                      "properties": {
                        "failureThreshold": {
                          "default": 3,
                          "description": "Minimum consecutive failures for the probe to be considered failed.",
                          "title": "Failurethreshold",
                          "type": "integer"
                        },
                        "host": {
                          "anyOf": [
                            {
                              "minLength": 0,
                              "type": "string"
                            },
                            {
                              "type": "null"
                            }
                          ],
                          "description": "Host name to connect to, defaults to the pod ip.",
                          "title": "Host"
                        },
                        "httpHeaders": {
                          "additionalProperties": {
                            "type": "string"
                          },
                          "description": "HTTP headers for probe.",
                          "title": "Httpheaders",
                          "type": "object"
                        },
                        "initialDelaySeconds": {
                          "default": 30,
                          "description": "Number of seconds to wait before the first probe is executed.",
                          "title": "Initialdelayseconds",
                          "type": "integer"
                        },
                        "path": {
                          "description": "URL path to query for the health check.",
                          "title": "Path",
                          "type": "string"
                        },
                        "periodSeconds": {
                          "default": 30,
                          "description": "How often (in seconds) to perform the probe.",
                          "title": "Periodseconds",
                          "type": "integer"
                        },
                        "port": {
                          "default": 8080,
                          "description": "Port number to access on the container.",
                          "maximum": 65535,
                          "minimum": 1,
                          "title": "Port",
                          "type": "integer"
                        },
                        "scheme": {
                          "default": "HTTP",
                          "description": "Scheme to use for connecting to the host.",
                          "enum": [
                            "HTTP",
                            "HTTPS"
                          ],
                          "title": "Scheme",
                          "type": "string"
                        },
                        "timeoutSeconds": {
                          "default": 30,
                          "description": "Number of seconds after which the probe times out.",
                          "title": "Timeoutseconds",
                          "type": "integer"
                        }
                      },
                      "required": [
                        "path"
                      ],
                      "title": "ProbeConfig",
                      "type": "object"
                    },
                    {
                      "type": "null"
                    }
                  ],
                  "description": "Container startup check configuration."
                }
              },
              "title": "Container",
              "type": "object"
            },
            "title": "Containers",
            "type": "array"
          },
          "name": {
            "default": "default",
            "description": "Name of the container group. Used as the lookup key for runtime overrides. Lowercase letters, digits, and hyphens only; must start with a lowercase letter and end with a letter or digit; max 63 characters.",
            "title": "Name",
            "type": "string"
          }
        },
        "title": "ContainerGroup",
        "type": "object"
      },
      "title": "Containergroups",
      "type": "array"
    },
    "storage": {
      "anyOf": [
        {
          "additionalProperties": false,
          "description": "Model weight storage configuration for NIM artifacts.",
          "properties": {
            "mode": {
              "default": "dedicatedPvc",
              "description": "Storage mode for model weights. `dedicatedPvc` (default) provisions a separate PVC owned exclusively by this workload. `nimCache` reuses a single cluster-wide PVC per model image, shared across all workloads using the same model.",
              "enum": [
                "dedicatedPvc",
                "nimCache"
              ],
              "title": "Mode",
              "type": "string"
            },
            "pvcSize": {
              "anyOf": [
                {
                  "pattern": "^\\d+(\\.\\d+)?(Gi|Mi|Ti)$",
                  "type": "string"
                },
                {
                  "type": "null"
                }
              ],
              "description": "PVC size for dedicated storage (e.g. '150Gi'). Only applies when mode is `dedicatedPvc`. When omitted, the platform-configured default is used.",
              "title": "Pvcsize"
            }
          },
          "title": "NimStorageConfig",
          "type": "object"
        },
        {
          "type": "null"
        }
      ],
      "description": "Model weight storage configuration. When omitted, defaults to a dedicated per-workload PVC provisioned exclusively for this workload."
    },
    "templateId": {
      "anyOf": [
        {
          "type": "string"
        },
        {
          "type": "null"
        }
      ],
      "description": "ID of the template used to create this NIM artifact.",
      "title": "Templateid"
    },
    "type": {
      "const": "nim",
      "default": "nim",
      "description": "Artifact type discriminator. Injected automatically from the top-level `type` field; do not set this directly.",
      "title": "Type",
      "type": "string"
    }
  },
  "title": "NimArtifactSpec",
  "type": "object"
}

NimArtifactSpec

Properties

Name Type Required Restrictions Description
containerGroups [ContainerGroup] false List of container groups.
storage any false Model weight storage configuration. When omitted, defaults to a dedicated per-workload PVC provisioned exclusively for this workload.

anyOf

Name Type Required Restrictions Description
» anonymous NimStorageConfig false Model weight storage configuration for NIM artifacts.

or

Name Type Required Restrictions Description
» anonymous null false none

continued

Name Type Required Restrictions Description
templateId any false ID of the template used to create this NIM artifact.

anyOf

Name Type Required Restrictions Description
» anonymous string false none

or

Name Type Required Restrictions Description
» anonymous null false none

continued

Name Type Required Restrictions Description
type string false Artifact type discriminator. Injected automatically from the top-level type field; do not set this directly.

NimStorageConfig

{
  "additionalProperties": false,
  "description": "Model weight storage configuration for NIM artifacts.",
  "properties": {
    "mode": {
      "default": "dedicatedPvc",
      "description": "Storage mode for model weights. `dedicatedPvc` (default) provisions a separate PVC owned exclusively by this workload. `nimCache` reuses a single cluster-wide PVC per model image, shared across all workloads using the same model.",
      "enum": [
        "dedicatedPvc",
        "nimCache"
      ],
      "title": "Mode",
      "type": "string"
    },
    "pvcSize": {
      "anyOf": [
        {
          "pattern": "^\\d+(\\.\\d+)?(Gi|Mi|Ti)$",
          "type": "string"
        },
        {
          "type": "null"
        }
      ],
      "description": "PVC size for dedicated storage (e.g. '150Gi'). Only applies when mode is `dedicatedPvc`. When omitted, the platform-configured default is used.",
      "title": "Pvcsize"
    }
  },
  "title": "NimStorageConfig",
  "type": "object"
}

NimStorageConfig

Properties

Name Type Required Restrictions Description
mode string false Storage mode for model weights. dedicatedPvc (default) provisions a separate PVC owned exclusively by this workload. nimCache reuses a single cluster-wide PVC per model image, shared across all workloads using the same model.
pvcSize any false PVC size for dedicated storage (e.g. '150Gi'). Only applies when mode is dedicatedPvc. When omitted, the platform-configured default is used.

anyOf

Name Type Required Restrictions Description
» anonymous string false none

or

Name Type Required Restrictions Description
» anonymous null false none

Enumerated Values

Property Value
mode [dedicatedPvc, nimCache]
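
The constraints above can be checked on the client before a request is sent. The following is a minimal sketch, not part of the API: the helper name `build_nim_storage` is hypothetical, and only the enum values and the `pvcSize` pattern are taken from the schema.

```python
import re

# Enum values and size pattern copied from the NimStorageConfig schema.
STORAGE_MODES = {"dedicatedPvc", "nimCache"}
PVC_SIZE_PATTERN = re.compile(r"\d+(\.\d+)?(Gi|Mi|Ti)")


def build_nim_storage(mode="dedicatedPvc", pvc_size=None):
    """Return a NimStorageConfig-shaped dict, or raise ValueError.

    Hypothetical client-side helper; the server remains the authority.
    """
    if mode not in STORAGE_MODES:
        raise ValueError(f"unknown storage mode: {mode!r}")
    # fullmatch applies the schema's ^...$ anchoring to the whole string.
    if pvc_size is not None and not PVC_SIZE_PATTERN.fullmatch(pvc_size):
        raise ValueError(f"pvcSize must look like '150Gi', got {pvc_size!r}")
    config = {"mode": mode}
    if pvc_size is not None:
        config["pvcSize"] = pvc_size
    return config


storage = build_nim_storage(pvc_size="150Gi")
```

Note that the unit suffix is case-sensitive: `150Gi` matches the pattern, while `150gi` does not.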

OtelMetricResolution

{
  "enum": [
    "PT1M",
    "PT5M",
    "PT1H",
    "P1D",
    "P7D",
    "P1M"
  ],
  "title": "OtelMetricResolution",
  "type": "string"
}

OtelMetricResolution

Properties

Name Type Required Restrictions Description
OtelMetricResolution string false none

Enumerated Values

Property Value
OtelMetricResolution [PT1M, PT5M, PT1H, P1D, P7D, P1M]
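
The enumerated values have the form of ISO 8601 durations. Assuming that reading is correct (the schema itself does not say so), a client could map them to fixed intervals as sketched below. The function name is hypothetical, and `P1M` is left without a fixed length because a calendar month varies.

```python
from datetime import timedelta

# ASSUMPTION: values are ISO 8601 durations (PT1M = 1 minute, P1D = 1 day, ...).
RESOLUTIONS = {
    "PT1M": timedelta(minutes=1),
    "PT5M": timedelta(minutes=5),
    "PT1H": timedelta(hours=1),
    "P1D": timedelta(days=1),
    "P7D": timedelta(days=7),
    "P1M": None,  # one calendar month: no fixed number of seconds
}


def resolution_to_timedelta(value):
    """Return the fixed interval for a resolution, or None for P1M."""
    if value not in RESOLUTIONS:
        raise ValueError(f"not an OtelMetricResolution value: {value!r}")
    return RESOLUTIONS[value]


five_minutes = resolution_to_timedelta("PT5M")
```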

Period

{
  "additionalProperties": false,
  "description": "Time period definition.",
  "properties": {
    "end": {
      "anyOf": [
        {
          "format": "date-time",
          "type": "string"
        },
        {
          "type": "null"
        }
      ],
      "description": "Period end time.",
      "title": "End"
    },
    "start": {
      "anyOf": [
        {
          "format": "date-time",
          "type": "string"
        },
        {
          "type": "null"
        }
      ],
      "description": "Period start time.",
      "title": "Start"
    }
  },
  "title": "Period",
  "type": "object"
}

Period

Properties

Name Type Required Restrictions Description
end any false Period end time.

anyOf

Name Type Required Restrictions Description
» anonymous string(date-time) false none

or

Name Type Required Restrictions Description
» anonymous null false none

continued

Name Type Required Restrictions Description
start any false Period start time.

anyOf

Name Type Required Restrictions Description
» anonymous string(date-time) false none

or

Name Type Required Restrictions Description
» anonymous null false none
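
Both fields are nullable date-time strings, so an open-ended period leaves one side as null. The sketch below builds such an object from Python datetimes. The helper name `make_period` is hypothetical, and the check that `end` is not earlier than `start` is an assumption; the schema does not state that constraint.

```python
from datetime import datetime, timezone


def make_period(start=None, end=None):
    """Build a Period-shaped dict from optional timezone-aware datetimes."""
    # ASSUMPTION: an end earlier than the start is never intended.
    if start is not None and end is not None and end < start:
        raise ValueError("end must not be earlier than start")
    return {
        "start": start.isoformat() if start is not None else None,
        "end": end.isoformat() if end is not None else None,
    }


# Open-ended period: everything since the start of 2024 (UTC).
period = make_period(start=datetime(2024, 1, 1, tzinfo=timezone.utc))
```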

ProbeConfig

{
  "additionalProperties": false,
  "properties": {
    "failureThreshold": {
      "default": 3,
      "description": "Minimum consecutive failures for the probe to be considered failed.",
      "title": "Failurethreshold",
      "type": "integer"
    },
    "host": {
      "anyOf": [
        {
          "minLength": 0,
          "type": "string"
        },
        {
          "type": "null"
        }
      ],
      "description": "Host name to connect to; defaults to the pod IP.",
      "title": "Host"
    },
    "httpHeaders": {
      "additionalProperties": {
        "type": "string"
      },
      "description": "HTTP headers for probe.",
      "title": "Httpheaders",
      "type": "object"
    },
    "initialDelaySeconds": {
      "default": 30,
      "description": "Number of seconds to wait before the first probe is executed.",
      "title": "Initialdelayseconds",
      "type": "integer"
    },
    "path": {
      "description": "URL path to query for the health check.",
      "title": "Path",
      "type": "string"
    },
    "periodSeconds": {
      "default": 30,
      "description": "How often (in seconds) to perform the probe.",
      "title": "Periodseconds",
      "type": "integer"
    },
    "port": {
      "default": 8080,
      "description": "Port number to access on the container.",
      "maximum": 65535,
      "minimum": 1,
      "title": "Port",
      "type": "integer"
    },
    "scheme": {
      "default": "HTTP",
      "description": "Scheme to use for connecting to the host.",
      "enum": [
        "HTTP",
        "HTTPS"
      ],
      "title": "Scheme",
      "type": "string"
    },
    "timeoutSeconds": {
      "default": 30,
      "description": "Number of seconds after which the probe times out.",
      "title": "Timeoutseconds",
      "type": "integer"
    }
  },
  "required": [
    "path"
  ],
  "title": "ProbeConfig",
  "type": "object"
}

ProbeConfig

Properties

Name Type Required Restrictions Description
failureThreshold integer false Minimum consecutive failures for the probe to be considered failed.
host any false Host name to connect to; defaults to the pod IP.

anyOf

Name Type Required Restrictions Description
» anonymous string false none

or

Name Type Required Restrictions Description
» anonymous null false none

continued

Name Type Required Restrictions Description
httpHeaders object false HTTP headers for probe.
» additionalProperties string false none
initialDelaySeconds integer false Number of seconds to wait before the first probe is executed.
path string true URL path to query for the health check.
periodSeconds integer false How often (in seconds) to perform the probe.
port integer false maximum: 65535, minimum: 1 Port number to access on the container.
scheme string false Scheme to use for connecting to the host.
timeoutSeconds integer false Number of seconds after which the probe times out.

Enumerated Values

Property Value
scheme [HTTP, HTTPS]
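
Only `path` is required; every other field falls back to the defaults listed in the schema. The sketch below mirrors those defaults in a hypothetical client-side dataclass and serializes it with the schema's camelCase names. The `failure_budget_seconds` estimate assumes Kubernetes-style probe semantics (initial delay, then one attempt per period until the failure threshold is reached), which this reference does not confirm.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class ProbeConfig:
    """Hypothetical client-side mirror of the ProbeConfig schema."""

    path: str  # the only required field
    port: int = 8080
    scheme: str = "HTTP"
    host: Optional[str] = None
    http_headers: Dict[str, str] = field(default_factory=dict)
    initial_delay_seconds: int = 30
    period_seconds: int = 30
    timeout_seconds: int = 30
    failure_threshold: int = 3

    def __post_init__(self):
        # Restrictions taken from the schema: port range and scheme enum.
        if not 1 <= self.port <= 65535:
            raise ValueError("port must be between 1 and 65535")
        if self.scheme not in ("HTTP", "HTTPS"):
            raise ValueError("scheme must be HTTP or HTTPS")

    def to_payload(self):
        """Serialize using the camelCase property names from the schema."""
        return {
            "path": self.path,
            "port": self.port,
            "scheme": self.scheme,
            "host": self.host,
            "httpHeaders": self.http_headers,
            "initialDelaySeconds": self.initial_delay_seconds,
            "periodSeconds": self.period_seconds,
            "timeoutSeconds": self.timeout_seconds,
            "failureThreshold": self.failure_threshold,
        }

    def failure_budget_seconds(self):
        # ASSUMPTION: Kubernetes-style semantics, not confirmed by this reference.
        return self.initial_delay_seconds + self.failure_threshold * self.period_seconds


probe = ProbeConfig(path="/healthz")
payload = probe.to_payload()
```

Under that assumption, the default settings give a container roughly 30 + 3 × 30 = 120 seconds before the probe is considered failed.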

ProtonFormatted

{
  "additionalProperties": false,
  "properties": {
    "artifactId": {
      "description": "ID of the artifact deployed by this proton.",
      "title": "Artifactid",
      "type": "string"
    },
    "createdAt": {
      "description": "Timestamp of when the entity was created.",
      "format": "date-time",
      "title": "Created At",
      "type": "string"
    },
    "creator": {
      "anyOf": [
        {
          "additionalProperties": false,
          "description": "User information embedded in API responses.",
          "properties": {
            "email": {
              "anyOf": [
                {
                  "type": "string"
                },
                {
                  "type": "null"
                }
              ],
              "description": "User email address.",
              "title": "Email"
            },
            "fullName": {
              "anyOf": [
                {
                  "type": "string"
                },
                {
                  "type": "null"
                }
              ],
              "description": "User's full name.",
              "title": "Full Name"
            },
            "id": {
              "description": "User ID associated with this resource.",
              "title": "User ID",
              "type": "string"
            },
            "userhash": {
              "anyOf": [
                {
                  "type": "string"
                },
                {
                  "type": "null"
                }
              ],
              "description": "User's gravatar hash.",
              "title": "Userhash"
            },
            "username": {
              "anyOf": [
                {
                  "type": "string"
                },
                {
                  "type": "null"
                }
              ],
              "description": "Username.",
              "title": "Username"
            }
          },
          "required": [
            "id"
          ],
          "title": "UserData",
          "type": "object"
        },
        {
          "type": "null"
        }
      ],
      "description": "Owner user details including id, username and email."
    },
    "endpoint": {
      "anyOf": [
        {
          "type": "string"
        },
        {
          "type": "null"
        }
      ],
      "description": "API endpoint to use to send service requests.",
      "title": "Endpoint"
    },
    "id": {
      "description": "Unique identifier of the entity.",
      "title": "ID",
      "type": "string"
    },
    "name": {
      "description": "Name of the entity.",
      "title": "Name",
      "type": "string"
    },
    "role": {
      "anyOf": [
        {
          "enum": [
            "active",
            "candidate"
          ],
          "title": "ProtonRole",
          "type": "string"
        },
        {
          "type": "null"
        }
      ],
      "description": "Role of the proton within its workload, either 'active' or 'candidate'."
    },
    "runningSince": {
      "anyOf": [
        {
          "format": "date-time",
          "type": "string"
        },
        {
          "type": "null"
        }
      ],
      "description": "Timestamp of when the proton entered running status.",
      "title": "Runningsince"
    },
    "runtime": {
      "additionalProperties": false,
      "description": "Runtime configuration for a workload. For service and NIM artifacts, all configuration is scoped inside `containerGroups`, each identified by a name matching the artifact topology.",
      "properties": {
        "containerGroups": {
          "description": "Per-group runtime configuration. Each entry's name must match a group in the artifact.",
          "items": {
            "additionalProperties": false,
            "description": "Runtime configuration for a single container group.",
            "properties": {
              "autoscaling": {
                "anyOf": [
                  {
                    "additionalProperties": false,
                    "description": "Autoscaling configuration for a proton.",
                    "properties": {
                      "enabled": {
                        "default": true,
                        "description": "Whether autoscaling is enabled.",
                        "title": "Enabled",
                        "type": "boolean"
                      },
                      "policies": {
                        "items": {
                          "additionalProperties": false,
                          "description": "Base class for autoscaling policies.",
                          "properties": {
                            "maxCount": {
                              "description": "Maximum number of replicas.",
                              "minimum": 0,
                              "title": "Max Count",
                              "type": "integer"
                            },
                            "minCount": {
                              "description": "Minimum number of replicas.",
                              "minimum": 0,
                              "title": "Min Count",
                              "type": "integer"
                            },
                            "priority": {
                              "anyOf": [
                                {
                                  "type": "integer"
                                },
                                {
                                  "type": "null"
                                }
                              ],
                              "description": "Policy priority when multiple policies are defined.",
                              "title": "Priority"
                            },
                            "scalingMetric": {
                              "anyOf": [
                                {
                                  "oneOf": [
                                    {
                                      "const": "cpuAverageUtilization",
                                      "description": "Scale replicas to maintain a target average CPU utilization across pods.",
                                      "title": "CPU Average Utilization"
                                    },
                                    {
                                      "const": "httpRequestsConcurrency",
                                      "description": "Scale replicas based on HTTP request concurrency using an external HTTP-aware autoscaler. The platform manages the underlying autoscaling resources on your behalf. This scaling option will scale to zero replicas when the proton is idle.",
                                      "title": "HTTP Requests Concurrency"
                                    },
                                    {
                                      "const": "gpuCacheUtilization",
                                      "description": "Scales replicas based on model-specific GPU memory cache utilization. This signal reflects how the model's KV cache is used during inference, when such metrics are exposed by the serving runtime. High cache utilization may indicate memory pressure and can be used to trigger scale-out to maintain throughput. Applicable to NIM Artifacts only.",
                                      "title": "GPU Cache Utilization"
                                    },
                                    {
                                      "const": "gpuRequestQueueDepth",
                                      "description": "Scales replicas based on the depth of the inference request queue. This metric represents the number of incoming requests waiting to be processed by the inference service. Increasing queue depth may indicate insufficient capacity and can be used to trigger additional replicas to reduce latency. Applicable to NIM Artifacts only.",
                                      "title": "GPU Request Queue Depth"
                                    }
                                  ],
                                  "title": "ScalingMetricType",
                                  "type": "string"
                                },
                                {
                                  "type": "string"
                                }
                              ],
                              "description": "Metric used for scaling decisions. Use one of the predefined values for standard autoscaling, or provide a custom metric name for NIM 2.0 workloads (e.g. 'vllm:kv_cache_usage_perc'). Custom metric names are only supported for NIM artifacts.",
                              "title": "Scaling Metric"
                            },
                            "target": {
                              "description": "Target value for the scaling metric.",
                              "minimum": 0,
                              "title": "Target",
                              "type": "number"
                            }
                          },
                          "required": [
                            "scalingMetric",
                            "target",
                            "minCount",
                            "maxCount"
                          ],
                          "title": "AutoscalingPolicy",
                          "type": "object"
                        },
                        "title": "Policies",
                        "type": "array"
                      }
                    },
                    "required": [
                      "policies"
                    ],
                    "title": "AutoscalingProperties",
                    "type": "object"
                  },
                  {
                    "type": "null"
                  }
                ],
                "description": "Autoscaling configuration for this group. Takes precedence over `replicaCount`."
              },
              "bundleSelectionPolicy": {
                "enum": [
                  "availability"
                ],
                "title": "BundleSelectionPolicy",
                "type": "string"
              },
              "containers": {
                "description": "Per-container overrides for this group.",
                "items": {
                  "additionalProperties": false,
                  "description": "Runtime diff targeting a single named container within a group.",
                  "properties": {
                    "name": {
                      "description": "Container name. Must match a container declared in the artifact group.",
                      "title": "Name",
                      "type": "string"
                    },
                    "resourceAllocation": {
                      "anyOf": [
                        {
                          "additionalProperties": false,
                          "description": "Per-container resource allocation declared at runtime.",
                          "properties": {
                            "cpu": {
                              "anyOf": [
                                {
                                  "minimum": 0.1,
                                  "type": "number"
                                },
                                {
                                  "type": "null"
                                }
                              ],
                              "description": "CPU cores allocated to this container.",
                              "title": "Cpu"
                            },
                            "gpu": {
                              "anyOf": [
                                {
                                  "minimum": 0,
                                  "type": "number"
                                },
                                {
                                  "type": "null"
                                }
                              ],
                              "description": "GPUs allocated to this container.",
                              "title": "Gpu"
                            },
                            "memory": {
                              "anyOf": [
                                {
                                  "pattern": "^\\s*(\\d*\\.?\\d+)\\s*(\\w+)?",
                                  "type": "string"
                                },
                                {
                                  "minimum": 0,
                                  "type": "integer"
                                },
                                {
                                  "type": "null"
                                }
                              ],
                              "description": "RAM allocated to this container. Accepts a human-readable string with one of: B, KB, MB, GB (1000-based), e.g. '8GB', '512MB'. Also accepts raw byte integers.",
                              "examples": [
                                "8GB",
                                "512MB"
                              ],
                              "title": "Memory"
                            }
                          },
                          "title": "ResourceAllocation",
                          "type": "object"
                        },
                        {
                          "type": "null"
                        }
                      ],
                      "description": "Resource allocation for this container. Required for multi-container groups."
                    }
                  },
                  "required": [
                    "name"
                  ],
                  "title": "ContainerOverride",
                  "type": "object"
                },
                "title": "Containers",
                "type": "array"
              },
              "name": {
                "default": "default",
                "description": "Group name. Must match a container group name declared in the artifact.",
                "title": "Name",
                "type": "string"
              },
              "replicaCount": {
                "anyOf": [
                  {
                    "minimum": 1,
                    "type": "integer"
                  },
                  {
                    "type": "null"
                  }
                ],
                "default": 1,
                "description": "Number of replicas. Cannot be set alongside autoscaling.enabled=true.",
                "title": "Replicacount"
              },
              "resolvedBundle": {
                "anyOf": [
                  {
                    "description": "Bundle details returned in the runtime response after scheduling.",
                    "properties": {
                      "cpuCount": {
                        "description": "Number of CPU cores.",
                        "title": "CPU Count",
                        "type": "number"
                      },
                      "gpuCount": {
                        "default": 0,
                        "description": "Number of GPU units.",
                        "title": "GPU Count",
                        "type": "integer"
                      },
                      "gpuMaker": {
                        "anyOf": [
                          {
                            "type": "string"
                          },
                          {
                            "type": "null"
                          }
                        ],
                        "description": "GPU manufacturer.",
                        "title": "GPU Maker"
                      },
                      "gpuTypeLabel": {
                        "anyOf": [
                          {
                            "type": "string"
                          },
                          {
                            "type": "null"
                          }
                        ],
                        "description": "GPU type label.",
                        "title": "GPU Type Label"
                      },
                      "id": {
                        "description": "Bundle identifier that was selected.",
                        "title": "Id",
                        "type": "string"
                      },
                      "memoryBytes": {
                        "description": "Memory size in bytes.",
                        "title": "Memory Bytes",
                        "type": "integer"
                      }
                    },
                    "required": [
                      "id",
                      "cpuCount",
                      "memoryBytes"
                    ],
                    "title": "ResolvedBundle",
                    "type": "object"
                  },
                  {
                    "type": "null"
                  }
                ],
                "description": "Full details of the bundle selected at scheduling time. Read-only.",
                "readOnly": true
              },
              "resourceBundles": {
                "description": "Ordered list of bundle IDs. One is selected at scheduling time.",
                "items": {
                  "type": "string"
                },
                "title": "Resourcebundles",
                "type": "array"
              }
            },
            "title": "GroupRuntime",
            "type": "object"
          },
          "title": "Containergroups",
          "type": "array"
        }
      },
      "title": "WorkloadRuntime",
      "type": "object"
    },
    "status": {
      "enum": [
        "unknown",
        "submitted",
        "initializing",
        "provisioning",
        "launching",
        "running",
        "suspended",
        "warming",
        "draining",
        "interrupted",
        "restarting",
        "stopping",
        "stopped",
        "errored",
        "terminated"
      ],
      "title": "ProtonStatus",
      "type": "string"
    },
    "updatedAt": {
      "description": "Timestamp of when the entity was last updated.",
      "format": "date-time",
      "title": "Updated At",
      "type": "string"
    },
    "workloadId": {
      "anyOf": [
        {
          "type": "string"
        },
        {
          "type": "null"
        }
      ],
      "description": "ID of the workload this proton belongs to.",
      "title": "Workloadid"
    }
  },
  "required": [
    "id",
    "name",
    "createdAt",
    "updatedAt",
    "status",
    "artifactId"
  ],
  "title": "ProtonFormatted",
  "type": "object"
}

ProtonFormatted

Properties

Name Type Required Restrictions Description
artifactId string true ID of the artifact deployed by this proton.
createdAt string(date-time) true Timestamp of when the entity was created.
creator any false Owner user details including id, username and email.

anyOf

Name Type Required Restrictions Description
» anonymous UserData false User information embedded in API responses.

or

Name Type Required Restrictions Description
» anonymous null false none

continued

Name Type Required Restrictions Description
endpoint any false API endpoint to use to send service requests.

anyOf

Name Type Required Restrictions Description
» anonymous string false none

or

Name Type Required Restrictions Description
» anonymous null false none

continued

Name Type Required Restrictions Description
id string true Unique identifier of the entity.
name string true Name of the entity.
role any false Role of the proton within its workload, either 'active' or 'candidate'.

anyOf

Name Type Required Restrictions Description
» anonymous ProtonRole false none

or

Name Type Required Restrictions Description
» anonymous null false none

continued

Name Type Required Restrictions Description
runningSince any false Timestamp of when the proton entered running status.

anyOf

Name Type Required Restrictions Description
» anonymous string(date-time) false none

or

Name Type Required Restrictions Description
» anonymous null false none

continued

Name Type Required Restrictions Description
runtime WorkloadRuntime false Runtime configuration for this proton.
status ProtonStatus true Proton status.
updatedAt string(date-time) true Timestamp of when the entity was last updated.
workloadId any false Id of the workload this proton belongs to.

anyOf

Name Type Required Restrictions Description
» anonymous string false none

or

Name Type Required Restrictions Description
» anonymous null false none
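
A client may want to check a record against the ProtonFormatted schema before using it. The following is a minimal sketch, not part of the API: the field lists are copied from the schema above, while the helper names and the sample record are hypothetical.

```python
from typing import Any

# Field names copied from the "required" list of the ProtonFormatted schema.
REQUIRED_FIELDS = ("id", "name", "createdAt", "updatedAt", "status", "artifactId")
# Fields the schema declares as nullable (anyOf with null).
NULLABLE_FIELDS = ("creator", "endpoint", "role", "runningSince", "workloadId")


def missing_required(proton: dict[str, Any]) -> list[str]:
    """Return the required field names that are absent from a proton record."""
    return [field for field in REQUIRED_FIELDS if field not in proton]


def normalize(proton: dict[str, Any]) -> dict[str, Any]:
    """Return a copy in which absent nullable fields are set to None."""
    result = dict(proton)
    for field in NULLABLE_FIELDS:
        result.setdefault(field, None)
    return result


# Hypothetical record, used only for illustration.
example = {
    "id": "p-1",
    "name": "demo",
    "createdAt": "2024-01-01T00:00:00Z",
    "updatedAt": "2024-01-01T00:05:00Z",
    "status": "running",
    "artifactId": "a-1",
}
```

Treating an absent nullable field the same as an explicit null is a client-side convenience assumed here, not something the schema mandates.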

ProtonListResponse

{
  "additionalProperties": false,
  "properties": {
    "count": {
      "description": "The number of records on this page.",
      "title": "Count",
      "type": "integer"
    },
    "data": {
      "description": "The list of records.",
      "items": {
        "additionalProperties": false,
        "properties": {
          "artifactId": {
            "description": "Id of the artifact deployed by this proton.",
            "title": "Artifactid",
            "type": "string"
          },
          "createdAt": {
            "description": "Timestamp of when the entity was created.",
            "format": "date-time",
            "title": "Created At",
            "type": "string"
          },
          "creator": {
            "anyOf": [
              {
                "additionalProperties": false,
                "description": "User information embedded in API responses.",
                "properties": {
                  "email": {
                    "anyOf": [
                      {
                        "type": "string"
                      },
                      {
                        "type": "null"
                      }
                    ],
                    "description": "User email address.",
                    "title": "Email"
                  },
                  "fullName": {
                    "anyOf": [
                      {
                        "type": "string"
                      },
                      {
                        "type": "null"
                      }
                    ],
                    "description": "User's full name.",
                    "title": "Full Name"
                  },
                  "id": {
                    "description": "User id associated with this resource.",
                    "title": "User ID",
                    "type": "string"
                  },
                  "userhash": {
                    "anyOf": [
                      {
                        "type": "string"
                      },
                      {
                        "type": "null"
                      }
                    ],
                    "description": "User's gravatar hash.",
                    "title": "Userhash"
                  },
                  "username": {
                    "anyOf": [
                      {
                        "type": "string"
                      },
                      {
                        "type": "null"
                      }
                    ],
                    "description": "Username.",
                    "title": "Username"
                  }
                },
                "required": [
                  "id"
                ],
                "title": "UserData",
                "type": "object"
              },
              {
                "type": "null"
              }
            ],
            "description": "Owner user details including id, username and email."
          },
          "endpoint": {
            "anyOf": [
              {
                "type": "string"
              },
              {
                "type": "null"
              }
            ],
            "description": "API endpoint to use to send service requests.",
            "title": "Endpoint"
          },
          "id": {
            "description": "Unique identifier of the entity.",
            "title": "ID",
            "type": "string"
          },
          "name": {
            "description": "Name of the entity.",
            "title": "Name",
            "type": "string"
          },
          "role": {
            "anyOf": [
              {
                "enum": [
                  "active",
                  "candidate"
                ],
                "title": "ProtonRole",
                "type": "string"
              },
              {
                "type": "null"
              }
            ],
            "description": "Role of the proton within its workload, either 'active' or 'candidate'."
          },
          "runningSince": {
            "anyOf": [
              {
                "format": "date-time",
                "type": "string"
              },
              {
                "type": "null"
              }
            ],
            "description": "Timestamp of when the proton entered running status.",
            "title": "Runningsince"
          },
          "runtime": {
            "additionalProperties": false,
            "description": "Runtime configuration for a workload. For service and NIM artifacts, all configuration is scoped inside `containerGroups`, each identified by name matching the artifact topology.",
            "properties": {
              "containerGroups": {
                "description": "Per-group runtime configuration. Each entry's name must match a group in the artifact.",
                "items": {
                  "additionalProperties": false,
                  "description": "Runtime configuration for a single container group.",
                  "properties": {
                    "autoscaling": {
                      "anyOf": [
                        {
                          "additionalProperties": false,
                          "description": "Autoscaling configuration for a proton.",
                          "properties": {
                            "enabled": {
                              "default": true,
                              "description": "Whether autoscaling is enabled.",
                              "title": "Enabled",
                              "type": "boolean"
                            },
                            "policies": {
                              "items": {
                                "additionalProperties": false,
                                "description": "Base class for autoscaling policies.",
                                "properties": {
                                  "maxCount": {
                                    "description": "Maximum number of replicas.",
                                    "minimum": 0,
                                    "title": "Max Count",
                                    "type": "integer"
                                  },
                                  "minCount": {
                                    "description": "Minimum number of replicas.",
                                    "minimum": 0,
                                    "title": "Min Count",
                                    "type": "integer"
                                  },
                                  "priority": {
                                    "anyOf": [
                                      {
                                        "type": "integer"
                                      },
                                      {
                                        "type": "null"
                                      }
                                    ],
                                    "description": "Policy priority when multiple policies are defined.",
                                    "title": "Priority"
                                  },
                                  "scalingMetric": {
                                    "anyOf": [
                                      {
                                        "oneOf": [
                                          {
                                            "const": "cpuAverageUtilization",
                                            "description": "Scale replicas to maintain a target average CPU utilization across pods.",
                                            "title": "CPU Average Utilization"
                                          },
                                          {
                                            "const": "httpRequestsConcurrency",
                                            "description": "Scale replicas based on HTTP request concurrency using an external HTTP-aware autoscaler. The platform manages the underlying autoscaling resources on your behalf. This scaling option will scale to zero replicas when the proton is idle.",
                                            "title": "HTTP Requests Concurrency"
                                          },
                                          {
                                            "const": "gpuCacheUtilization",
                                            "description": "Scales replicas based on model-specific GPU memory cache utilization. This signal reflects how the model's KV cache is used during inference, when such metrics are exposed by the serving runtime. High cache utilization may indicate memory pressure and can be used to trigger scale-out to maintain throughput. Applicable to NIM Artifacts only.",
                                            "title": "GPU Cache Utilization"
                                          },
                                          {
                                            "const": "gpuRequestQueueDepth",
                                            "description": "Scales replicas based on the depth of the inference request queue. This metric represents the number of incoming requests waiting to be processed by the inference service. Increasing queue depth may indicate insufficient capacity and can be used to trigger additional replicas to reduce latency. Applicable to NIM Artifacts only.",
                                            "title": "GPU Request Queue Depth"
                                          }
                                        ],
                                        "title": "ScalingMetricType",
                                        "type": "string"
                                      },
                                      {
                                        "type": "string"
                                      }
                                    ],
                                    "description": "Metric used for scaling decisions. Use one of the predefined values for standard autoscaling, or provide a custom metric name for NIM 2.0 workloads (e.g. 'vllm:kv_cache_usage_perc'). Custom metric names are only supported for NIM artifacts.",
                                    "title": "Scaling Metric"
                                  },
                                  "target": {
                                    "description": "Target value for the scaling metric.",
                                    "minimum": 0,
                                    "title": "Target",
                                    "type": "number"
                                  }
                                },
                                "required": [
                                  "scalingMetric",
                                  "target",
                                  "minCount",
                                  "maxCount"
                                ],
                                "title": "AutoscalingPolicy",
                                "type": "object"
                              },
                              "title": "Policies",
                              "type": "array"
                            }
                          },
                          "required": [
                            "policies"
                          ],
                          "title": "AutoscalingProperties",
                          "type": "object"
                        },
                        {
                          "type": "null"
                        }
                      ],
                      "description": "Autoscaling configuration for this group. Takes precedence over replicaCount."
                    },
                    "bundleSelectionPolicy": {
                      "enum": [
                        "availability"
                      ],
                      "title": "BundleSelectionPolicy",
                      "type": "string"
                    },
                    "containers": {
                      "description": "Per-container overrides for this group.",
                      "items": {
                        "additionalProperties": false,
                        "description": "Runtime diff targeting a single named container within a group.",
                        "properties": {
                          "name": {
                            "description": "Container name. Must match a container declared in the artifact group.",
                            "title": "Name",
                            "type": "string"
                          },
                          "resourceAllocation": {
                            "anyOf": [
                              {
                                "additionalProperties": false,
                                "description": "Per-container resource allocation declared at runtime.",
                                "properties": {
                                  "cpu": {
                                    "anyOf": [
                                      {
                                        "minimum": 0.1,
                                        "type": "number"
                                      },
                                      {
                                        "type": "null"
                                      }
                                    ],
                                    "description": "CPU cores allocated to this container.",
                                    "title": "Cpu"
                                  },
                                  "gpu": {
                                    "anyOf": [
                                      {
                                        "minimum": 0,
                                        "type": "number"
                                      },
                                      {
                                        "type": "null"
                                      }
                                    ],
                                    "description": "GPUs allocated to this container.",
                                    "title": "Gpu"
                                  },
                                  "memory": {
                                    "anyOf": [
                                      {
                                        "pattern": "^\\s*(\\d*\\.?\\d+)\\s*(\\w+)?",
                                        "type": "string"
                                      },
                                      {
                                        "minimum": 0,
                                        "type": "integer"
                                      },
                                      {
                                        "type": "null"
                                      }
                                    ],
                                    "description": "RAM allocated to this container. Accepts a human-readable string with one of: B, KB, MB, GB (1000-based), e.g. '8GB', '512MB'. Also accepts raw byte integers.",
                                    "examples": [
                                      "8GB",
                                      "512MB"
                                    ],
                                    "title": "Memory"
                                  }
                                },
                                "title": "ResourceAllocation",
                                "type": "object"
                              },
                              {
                                "type": "null"
                              }
                            ],
                            "description": "Resource allocation for this container. Required for multi-container groups."
                          }
                        },
                        "required": [
                          "name"
                        ],
                        "title": "ContainerOverride",
                        "type": "object"
                      },
                      "title": "Containers",
                      "type": "array"
                    },
                    "name": {
                      "default": "default",
                      "description": "Group name. Must match a container group name declared in the artifact.",
                      "title": "Name",
                      "type": "string"
                    },
                    "replicaCount": {
                      "anyOf": [
                        {
                          "minimum": 1,
                          "type": "integer"
                        },
                        {
                          "type": "null"
                        }
                      ],
                      "default": 1,
                      "description": "Number of replicas. Cannot be set alongside autoscaling.enabled=true.",
                      "title": "Replicacount"
                    },
                    "resolvedBundle": {
                      "anyOf": [
                        {
                          "description": "Bundle details returned in the runtime response after scheduling.",
                          "properties": {
                            "cpuCount": {
                              "description": "Number of CPU cores.",
                              "title": "CPU Count",
                              "type": "number"
                            },
                            "gpuCount": {
                              "default": 0,
                              "description": "Number of GPU units.",
                              "title": "GPU Count",
                              "type": "integer"
                            },
                            "gpuMaker": {
                              "anyOf": [
                                {
                                  "type": "string"
                                },
                                {
                                  "type": "null"
                                }
                              ],
                              "description": "GPU manufacturer.",
                              "title": "GPU Maker"
                            },
                            "gpuTypeLabel": {
                              "anyOf": [
                                {
                                  "type": "string"
                                },
                                {
                                  "type": "null"
                                }
                              ],
                              "description": "GPU type label.",
                              "title": "GPU Type Label"
                            },
                            "id": {
                              "description": "Bundle identifier that was selected.",
                              "title": "Id",
                              "type": "string"
                            },
                            "memoryBytes": {
                              "description": "Memory size in bytes.",
                              "title": "Memory Bytes",
                              "type": "integer"
                            }
                          },
                          "required": [
                            "id",
                            "cpuCount",
                            "memoryBytes"
                          ],
                          "title": "ResolvedBundle",
                          "type": "object"
                        },
                        {
                          "type": "null"
                        }
                      ],
                      "description": "Full details of the bundle selected at scheduling time. Read-only.",
                      "readOnly": true
                    },
                    "resourceBundles": {
                      "description": "Ordered list of bundle IDs. One is selected at scheduling time.",
                      "items": {
                        "type": "string"
                      },
                      "title": "Resourcebundles",
                      "type": "array"
                    }
                  },
                  "title": "GroupRuntime",
                  "type": "object"
                },
                "title": "Containergroups",
                "type": "array"
              }
            },
            "title": "WorkloadRuntime",
            "type": "object"
          },
          "status": {
            "enum": [
              "unknown",
              "submitted",
              "initializing",
              "provisioning",
              "launching",
              "running",
              "suspended",
              "warming",
              "draining",
              "interrupted",
              "restarting",
              "stopping",
              "stopped",
              "errored",
              "terminated"
            ],
            "title": "ProtonStatus",
            "type": "string"
          },
          "updatedAt": {
            "description": "Timestamp of when the entity was last updated.",
            "format": "date-time",
            "title": "Updated At",
            "type": "string"
          },
          "workloadId": {
            "anyOf": [
              {
                "type": "string"
              },
              {
                "type": "null"
              }
            ],
            "description": "Id of the workload this proton belongs to.",
            "title": "Workloadid"
          }
        },
        "required": [
          "id",
          "name",
          "createdAt",
          "updatedAt",
          "status",
          "artifactId"
        ],
        "title": "ProtonFormatted",
        "type": "object"
      },
      "title": "Data",
      "type": "array"
    },
    "next": {
      "anyOf": [
        {
          "type": "string"
        },
        {
          "type": "null"
        }
      ],
      "description": "The URL to the next page, or `null` if there is no such page.",
      "title": "Next"
    },
    "previous": {
      "anyOf": [
        {
          "type": "string"
        },
        {
          "type": "null"
        }
      ],
      "description": "The URL to the previous page, or `null` if there is no such page.",
      "title": "Previous"
    },
    "totalCount": {
      "description": "The total number of records.",
      "title": "Totalcount",
      "type": "integer"
    }
  },
  "required": [
    "totalCount",
    "count",
    "next",
    "previous",
    "data"
  ],
  "title": "ProtonListResponse",
  "type": "object"
}
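
The `memory` field of ResourceAllocation in the schema above accepts either a unit-suffixed string or a raw byte integer. The sketch below shows one way a client could normalize both forms to bytes. The regular expression is copied from the schema; the unit table follows the description (1000-based B, KB, MB, GB). The function name, the case-insensitive unit matching, and the treatment of a missing unit as bytes are assumptions.

```python
import re

# Pattern copied from the schema's "memory" property.
MEMORY_PATTERN = re.compile(r"^\s*(\d*\.?\d+)\s*(\w+)?")
# 1000-based factors, per the field description.
UNIT_FACTORS = {"b": 1, "kb": 1000, "mb": 1000**2, "gb": 1000**3}


def memory_to_bytes(value):
    """Convert a memory value (unit string or raw byte integer) to bytes."""
    if isinstance(value, int):
        if value < 0:
            raise ValueError("memory must be non-negative")
        return value
    match = MEMORY_PATTERN.match(value)
    if match is None:
        raise ValueError(f"unrecognized memory value: {value!r}")
    number, unit = match.groups()
    # Assumption: a string without a unit is interpreted as bytes.
    unit = (unit or "b").lower()
    if unit not in UNIT_FACTORS:
        raise ValueError(f"unsupported unit in memory value: {value!r}")
    return round(float(number) * UNIT_FACTORS[unit])
```

For example, `memory_to_bytes("8GB")` returns 8000000000 and `memory_to_bytes("512MB")` returns 512000000. The server's own parsing may differ in edge cases.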

ProtonListResponse

Properties

Name Type Required Restrictions Description
count integer true The number of records on this page.
data [ProtonFormatted] true The list of records.
next any true The URL to the next page, or null if there is no such page.

anyOf

Name Type Required Restrictions Description
» anonymous string false none

or

Name Type Required Restrictions Description
» anonymous null false none

continued

Name Type Required Restrictions Description
previous any true The URL to the previous page, or null if there is no such page.

anyOf

Name Type Required Restrictions Description
» anonymous string false none

or

Name Type Required Restrictions Description
» anonymous null false none

continued

Name Type Required Restrictions Description
totalCount integer true The total number of records.
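
Because `next` is null on the last page, a client can walk all pages by following it until it is null. The sketch below illustrates the loop under stated assumptions: `fetch_page` is a hypothetical callable that takes a URL and returns the decoded ProtonListResponse body, and the URLs and records in `PAGES` are made up for the demonstration.

```python
from typing import Callable, Iterator, Optional


def iter_protons(first_url: str, fetch_page: Callable[[str], dict]) -> Iterator[dict]:
    """Yield every record by following `next` until it is null (None)."""
    url: Optional[str] = first_url
    while url is not None:
        page = fetch_page(url)
        yield from page["data"]
        url = page["next"]


# Hypothetical two-page result set standing in for real HTTP calls.
PAGES = {
    "https://example.invalid/page/1": {
        "count": 2,
        "totalCount": 3,
        "previous": None,
        "next": "https://example.invalid/page/2",
        "data": [{"id": "p-1"}, {"id": "p-2"}],
    },
    "https://example.invalid/page/2": {
        "count": 1,
        "totalCount": 3,
        "previous": "https://example.invalid/page/1",
        "next": None,
        "data": [{"id": "p-3"}],
    },
}

ids = [p["id"] for p in iter_protons("https://example.invalid/page/1", PAGES.__getitem__)]
```

In a real client, `fetch_page` would issue an authenticated GET request and decode the JSON body. Injecting it keeps the pagination logic independent of any particular HTTP library.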

ProtonRequestMetricOverTime

{
  "additionalProperties": false,
  "description": "Proton request metric over time.",
  "properties": {
    "buckets": {
      "description": "Time-bucketed metric values with flexible structure.",
      "items": {
        "additionalProperties": true,
        "type": "object"
      },
      "title": "Buckets",
      "type": "array"
    },
    "metric": {
      "description": "Metric names for workload statistics.",
      "enum": [
        "totalRequests",
        "requestsOverN",
        "requestsPerMinute",
        "concurrentRequests",
        "responseTime",
        "totalErrorRate"
      ],
      "title": "WorkloadStatsMetricName",
      "type": "string"
    },
    "summary": {
      "additionalProperties": false,
      "description": "Summary information for proton statistics.",
      "properties": {
        "period": {
          "additionalProperties": false,
          "description": "Time period definition.",
          "properties": {
            "end": {
              "anyOf": [
                {
                  "format": "date-time",
                  "type": "string"
                },
                {
                  "type": "null"
                }
              ],
              "description": "Period end time.",
              "title": "End"
            },
            "start": {
              "anyOf": [
                {
                  "format": "date-time",
                  "type": "string"
                },
                {
                  "type": "null"
                }
              ],
              "description": "Period start time.",
              "title": "Start"
            }
          },
          "title": "Period",
          "type": "object"
        },
        "protonId": {
          "description": "Proton id.",
          "title": "Protonid",
          "type": "string"
        }
      },
      "required": [
        "protonId",
        "period"
      ],
      "title": "Summary",
      "type": "object"
    }
  },
  "required": [
    "metric",
    "summary"
  ],
  "title": "ProtonRequestMetricOverTime",
  "type": "object"
}

ProtonRequestMetricOverTime

Properties

Name Type Required Restrictions Description
buckets [object] false Time-bucketed metric values with flexible structure.
metric WorkloadStatsMetricName true Metric name being tracked.
summary Summary true Summary information for the metric.

ProtonRequestStats

{
  "additionalProperties": false,
  "description": "Proton request statistics with time period.",
  "properties": {
    "metrics": {
      "additionalProperties": false,
      "description": "Detailed request metrics.",
      "properties": {
        "concurrentRequests": {
          "default": 0,
          "description": "Current concurrent requests.",
          "title": "Concurrentrequests",
          "type": "integer"
        },
        "requestsPerMinute": {
          "default": 0,
          "description": "Average requests per minute.",
          "title": "Requestsperminute",
          "type": "integer"
        },
        "responseTime": {
          "default": 0,
          "description": "Average response time in milliseconds.",
          "title": "Responsetime",
          "type": "integer"
        },
        "serverErrorRate": {
          "default": 0,
          "description": "Server error rate.",
          "title": "Servererrorrate",
          "type": "number"
        },
        "serverErrors": {
          "default": 0,
          "description": "Number of server errors (5xx).",
          "title": "Servererrors",
          "type": "integer"
        },
        "slowRequests": {
          "default": 0,
          "description": "Number of slow requests exceeding threshold.",
          "title": "Slowrequests",
          "type": "integer"
        },
        "totalErrorRate": {
          "default": 0,
          "description": "Total error rate.",
          "title": "Totalerrorrate",
          "type": "number"
        },
        "totalRequests": {
          "default": 0,
          "description": "Total number of requests.",
          "title": "Totalrequests",
          "type": "integer"
        },
        "userErrorRate": {
          "default": 0,
          "description": "User error rate.",
          "title": "Usererrorrate",
          "type": "number"
        },
        "userErrors": {
          "default": 0,
          "description": "Number of user errors (4xx).",
          "title": "Usererrors",
          "type": "integer"
        }
      },
      "title": "RequestMetrics",
      "type": "object"
    },
    "period": {
      "anyOf": [
        {
          "additionalProperties": false,
          "description": "Time period definition.",
          "properties": {
            "end": {
              "anyOf": [
                {
                  "format": "date-time",
                  "type": "string"
                },
                {
                  "type": "null"
                }
              ],
              "description": "Period end time.",
              "title": "End"
            },
            "start": {
              "anyOf": [
                {
                  "format": "date-time",
                  "type": "string"
                },
                {
                  "type": "null"
                }
              ],
              "description": "Period start time.",
              "title": "Start"
            }
          },
          "title": "Period",
          "type": "object"
        },
        {
          "type": "null"
        }
      ],
      "description": "Time period."
    }
  },
  "title": "ProtonRequestStats",
  "type": "object"
}

ProtonRequestStats

Properties

Name Type Required Restrictions Description
metrics RequestMetrics false Request metrics.
period any false Time period.

anyOf

Name Type Required Restrictions Description
» anonymous Period false Time period definition.

or

Name Type Required Restrictions Description
» anonymous null false none
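
Neither `metrics` nor `period` is required, and `period` may be null, so a consumer has to handle their absence. The sketch below fills in the schema's declared defaults (0 for every metric) and flattens the period. The helper name and the shape of its return value are hypothetical.

```python
# Defaults copied from the RequestMetrics schema: every metric defaults to 0.
METRIC_DEFAULTS = {
    "concurrentRequests": 0,
    "requestsPerMinute": 0,
    "responseTime": 0,
    "serverErrorRate": 0,
    "serverErrors": 0,
    "slowRequests": 0,
    "totalErrorRate": 0,
    "totalRequests": 0,
    "userErrorRate": 0,
    "userErrors": 0,
}


def read_stats(stats: dict) -> dict:
    """Return metrics with defaults applied, plus period bounds (or None)."""
    metrics = {**METRIC_DEFAULTS, **(stats.get("metrics") or {})}
    period = stats.get("period") or {}
    return {
        "metrics": metrics,
        "start": period.get("start"),
        "end": period.get("end"),
    }
```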

ProtonRole

{
  "enum": [
    "active",
    "candidate"
  ],
  "title": "ProtonRole",
  "type": "string"
}

ProtonRole

Properties

Name Type Required Restrictions Description
ProtonRole string false none

Enumerated Values

Property Value
ProtonRole [active, candidate]

ProtonStatus

{
  "enum": [
    "unknown",
    "submitted",
    "initializing",
    "provisioning",
    "launching",
    "running",
    "suspended",
    "warming",
    "draining",
    "interrupted",
    "restarting",
    "stopping",
    "stopped",
    "errored",
    "terminated"
  ],
  "title": "ProtonStatus",
  "type": "string"
}

ProtonStatus

Properties

Name Type Required Restrictions Description
ProtonStatus string false none

Enumerated Values

Property Value
ProtonStatus [unknown, submitted, initializing, provisioning, launching, running, suspended, warming, draining, interrupted, restarting, stopping, stopped, errored, terminated]
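
On the client side, the enumerated values can be mirrored in an enum so that unexpected status strings fail loudly. This is a minimal sketch: the values are copied from the list above, and the Python class and member names are illustrative.

```python
from enum import Enum


class ProtonStatus(str, Enum):
    """Client-side mirror of the ProtonStatus enumerated values."""

    UNKNOWN = "unknown"
    SUBMITTED = "submitted"
    INITIALIZING = "initializing"
    PROVISIONING = "provisioning"
    LAUNCHING = "launching"
    RUNNING = "running"
    SUSPENDED = "suspended"
    WARMING = "warming"
    DRAINING = "draining"
    INTERRUPTED = "interrupted"
    RESTARTING = "restarting"
    STOPPING = "stopping"
    STOPPED = "stopped"
    ERRORED = "errored"
    TERMINATED = "terminated"


def parse_status(value: str) -> ProtonStatus:
    """Convert an API status string; raises ValueError for unlisted values."""
    return ProtonStatus(value)
```

Which statuses count as final is not stated here, so the sketch does not classify them.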

ProvidedDockerfile

{
  "additionalProperties": false,
  "description": "User supplies a dockerfile in the uploaded source code.",
  "properties": {
    "path": {
      "default": "./Dockerfile",
      "description": "Relative path to the Dockerfile in the source code. Defaults to ./Dockerfile.",
      "title": "Path",
      "type": "string"
    },
    "source": {
      "const": "provided",
      "default": "provided",
      "title": "Source",
      "type": "string"
    }
  },
  "title": "ProvidedDockerfile",
  "type": "object"
}

ProvidedDockerfile

Properties

Name Type Required Restrictions Description
path string false Relative path to the Dockerfile in the source code. Defaults to ./Dockerfile.
source string false none
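
As a minimal sketch of how a client might build a ProvidedDockerfile object, the Python helper below fills in the documented defaults. The function name provided_dockerfile and the client-side check that the path is relative are assumptions for illustration. The server may validate the path differently.

```python
from pathlib import PurePosixPath


def provided_dockerfile(path: str = "./Dockerfile") -> dict:
    """Build a ProvidedDockerfile payload using the documented defaults."""
    # The schema describes `path` as relative, so reject absolute paths early.
    if PurePosixPath(path).is_absolute():
        raise ValueError("path must be relative to the uploaded source code")
    # `source` is a constant in the schema and must be "provided".
    return {"source": "provided", "path": path}
```

Calling provided_dockerfile() with no arguments yields the default path, while provided_dockerfile("docker/Dockerfile.gpu") points at a file elsewhere in the upload.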

RelatedEntitiesResponse

{
  "additionalProperties": false,
  "description": "Response containing related entities.",
  "properties": {
    "count": {
      "default": 0,
      "description": "Total number of related entities.",
      "title": "Count",
      "type": "integer"
    },
    "data": {
      "description": "List of related entities.",
      "items": {
        "anyOf": [
          {
            "additionalProperties": false,
            "description": "Related entity item.",
            "properties": {
              "createdAt": {
                "description": "Timestamp of when the entity was created.",
                "format": "date-time",
                "title": "Created At",
                "type": "string"
              },
              "creator": {
                "anyOf": [
                  {
                    "additionalProperties": false,
                    "description": "User information embedded in API responses.",
                    "properties": {
                      "email": {
                        "anyOf": [
                          {
                            "type": "string"
                          },
                          {
                            "type": "null"
                          }
                        ],
                        "description": "User email address.",
                        "title": "Email"
                      },
                      "fullName": {
                        "anyOf": [
                          {
                            "type": "string"
                          },
                          {
                            "type": "null"
                          }
                        ],
                        "description": "User's full name.",
                        "title": "Full Name"
                      },
                      "id": {
                        "description": "User ID associated with this resource.",
                        "title": "User ID",
                        "type": "string"
                      },
                      "userhash": {
                        "anyOf": [
                          {
                            "type": "string"
                          },
                          {
                            "type": "null"
                          }
                        ],
                        "description": "User's gravatar hash.",
                        "title": "Userhash"
                      },
                      "username": {
                        "anyOf": [
                          {
                            "type": "string"
                          },
                          {
                            "type": "null"
                          }
                        ],
                        "description": "Username.",
                        "title": "Username"
                      }
                    },
                    "required": [
                      "id"
                    ],
                    "title": "UserData",
                    "type": "object"
                  },
                  {
                    "type": "null"
                  }
                ],
                "description": "Owner user details including id, username and email.",
                "title": "Creator"
              },
              "id": {
                "description": "Unique identifier of the entity.",
                "title": "ID",
                "type": "string"
              },
              "name": {
                "description": "Name of the entity.",
                "title": "Name",
                "type": "string"
              },
              "permissions": {
                "anyOf": [
                  {
                    "items": {
                      "description": "Represents the particular role a user, group or organization holds on an entity.",
                      "enum": [
                        "CAN_VIEW",
                        "CAN_UPDATE",
                        "CAN_DELETE",
                        "CAN_SHARE",
                        "CAN_MAKE_PREDICTIONS",
                        "CAN_SHARE_ROLE_OWNER",
                        "CAN_SHARE_ROLE_READ_WRITE",
                        "CAN_SHARE_ROLE_READ_ONLY"
                      ],
                      "title": "ResourcePermission",
                      "type": "string"
                    },
                    "type": "array"
                  },
                  {
                    "items": {
                      "const": "*",
                      "type": "string"
                    },
                    "type": "array"
                  }
                ]
              },
              "type": {
                "enum": [
                  "artifact",
                  "artifact_repository",
                  "proton",
                  "workload",
                  "custom_model"
                ],
                "title": "ResourceTypes",
                "type": "string"
              },
              "updatedAt": {
                "description": "Timestamp of when the entity was last updated.",
                "format": "date-time",
                "title": "Updated At",
                "type": "string"
              }
            },
            "required": [
              "id",
              "name",
              "createdAt",
              "updatedAt",
              "type"
            ],
            "title": "RelatedItem",
            "type": "object"
          },
          {
            "additionalProperties": false,
            "description": "Basic information about a related entity, identified by its id and type.",
            "properties": {
              "id": {
                "description": "Unique identifier of the entity.",
                "title": "Id",
                "type": "string"
              },
              "type": {
                "enum": [
                  "artifact",
                  "artifact_repository",
                  "proton",
                  "workload",
                  "custom_model"
                ],
                "title": "ResourceTypes",
                "type": "string"
              }
            },
            "required": [
              "id",
              "type"
            ],
            "title": "RelatedItemID",
            "type": "object"
          }
        ]
      },
      "title": "Data",
      "type": "array"
    }
  },
  "title": "RelatedEntitiesResponse",
  "type": "object"
}

RelatedEntitiesResponse

Properties

Name Type Required Restrictions Description
count integer false Total number of related entities.
data [anyOf] false List of related entities.

anyOf

Name Type Required Restrictions Description
» anonymous RelatedItem false Related entity item.

or

Name Type Required Restrictions Description
» anonymous RelatedItemID false Basic information about a related entity, identified by its id and type.
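
Each entry in data is either a full RelatedItem or an ID-only RelatedItemID, so a client usually needs to tell the two apart. The Python sketch below does this by checking for the fields that only RelatedItem requires (name, createdAt, updatedAt). The helper name split_related is hypothetical.

```python
from typing import Dict, List, Tuple

# Fields that RelatedItem requires and RelatedItemID does not define.
FULL_ITEM_KEYS = {"name", "createdAt", "updatedAt"}


def split_related(response: dict) -> Tuple[List[Dict], List[Dict]]:
    """Split response["data"] into (full RelatedItem entries, ID-only entries)."""
    full: List[Dict] = []
    id_only: List[Dict] = []
    for entry in response.get("data", []):
        if FULL_ITEM_KEYS.issubset(entry):
            full.append(entry)
        else:
            id_only.append(entry)
    return full, id_only
```

Because both schemas set additionalProperties to false, an entry holding only id and type can safely be treated as a RelatedItemID.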

RelatedItem

{
  "additionalProperties": false,
  "description": "Related entity item.",
  "properties": {
    "createdAt": {
      "description": "Timestamp of when the entity was created.",
      "format": "date-time",
      "title": "Created At",
      "type": "string"
    },
    "creator": {
      "anyOf": [
        {
          "additionalProperties": false,
          "description": "User information embedded in API responses.",
          "properties": {
            "email": {
              "anyOf": [
                {
                  "type": "string"
                },
                {
                  "type": "null"
                }
              ],
              "description": "User email address.",
              "title": "Email"
            },
            "fullName": {
              "anyOf": [
                {
                  "type": "string"
                },
                {
                  "type": "null"
                }
              ],
              "description": "User's full name.",
              "title": "Full Name"
            },
            "id": {
              "description": "User ID associated with this resource.",
              "title": "User ID",
              "type": "string"
            },
            "userhash": {
              "anyOf": [
                {
                  "type": "string"
                },
                {
                  "type": "null"
                }
              ],
              "description": "User's gravatar hash.",
              "title": "Userhash"
            },
            "username": {
              "anyOf": [
                {
                  "type": "string"
                },
                {
                  "type": "null"
                }
              ],
              "description": "Username.",
              "title": "Username"
            }
          },
          "required": [
            "id"
          ],
          "title": "UserData",
          "type": "object"
        },
        {
          "type": "null"
        }
      ],
      "description": "Owner user details including id, username and email.",
      "title": "Creator"
    },
    "id": {
      "description": "Unique identifier of the entity.",
      "title": "ID",
      "type": "string"
    },
    "name": {
      "description": "Name of the entity.",
      "title": "Name",
      "type": "string"
    },
    "permissions": {
      "anyOf": [
        {
          "items": {
            "description": "Represents the particular role a user, group or organization holds on an entity.",
            "enum": [
              "CAN_VIEW",
              "CAN_UPDATE",
              "CAN_DELETE",
              "CAN_SHARE",
              "CAN_MAKE_PREDICTIONS",
              "CAN_SHARE_ROLE_OWNER",
              "CAN_SHARE_ROLE_READ_WRITE",
              "CAN_SHARE_ROLE_READ_ONLY"
            ],
            "title": "ResourcePermission",
            "type": "string"
          },
          "type": "array"
        },
        {
          "items": {
            "const": "*",
            "type": "string"
          },
          "type": "array"
        }
      ]
    },
    "type": {
      "enum": [
        "artifact",
        "artifact_repository",
        "proton",
        "workload",
        "custom_model"
      ],
      "title": "ResourceTypes",
      "type": "string"
    },
    "updatedAt": {
      "description": "Timestamp of when the entity was last updated.",
      "format": "date-time",
      "title": "Updated At",
      "type": "string"
    }
  },
  "required": [
    "id",
    "name",
    "createdAt",
    "updatedAt",
    "type"
  ],
  "title": "RelatedItem",
  "type": "object"
}

RelatedItem

Properties

Name Type Required Restrictions Description
createdAt string(date-time) true Timestamp of when the entity was created.
creator any false Owner user details including id, username and email.

anyOf

Name Type Required Restrictions Description
» anonymous UserData false User information embedded in API responses.

or

Name Type Required Restrictions Description
» anonymous null false none

continued

Name Type Required Restrictions Description
id string true Unique identifier of the entity.
name string true Name of the entity.
permissions ResourcePermissions false User permissions for this related entity.
type ResourceTypes true Type of the related entity.
updatedAt string(date-time) true Timestamp of when the entity was last updated.
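
The permissions field is optional and, per the schema, holds either a list of ResourcePermission values or a list containing only "*". A minimal Python sketch of a permission check follows. Treating "*" as a wildcard that grants every permission is an assumption made here, and has_permission is a hypothetical helper name.

```python
def has_permission(item: dict, permission: str) -> bool:
    """Return True if a RelatedItem dict grants the given permission."""
    # `permissions` is optional, so fall back to an empty list.
    granted = item.get("permissions") or []
    # Assumption: "*" means every permission is granted.
    return "*" in granted or permission in granted
```

For example, has_permission(item, "CAN_UPDATE") can gate an edit action in a client UI before a request is sent.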

RelatedItemID

{
  "additionalProperties": false,
  "description": "Basic information about a related entity, identified by its id and type.",
  "properties": {
    "id": {
      "description": "Unique identifier of the entity.",
      "title": "Id",
      "type": "string"
    },
    "type": {
      "enum": [
        "artifact",
        "artifact_repository",
        "proton",
        "workload",
        "custom_model"
      ],
      "title": "ResourceTypes",
      "type": "string"
    }
  },
  "required": [
    "id",
    "type"
  ],
  "title": "RelatedItemID",
  "type": "object"
}

RelatedItemID

Properties

Name Type Required Restrictions Description
id string true Unique identifier of the entity.
type ResourceTypes true Type of the related entity.

Replacement

{
  "additionalProperties": false,
  "description": "Store replacement information for workloads.",
  "properties": {
    "candidateArtifactId": {
      "description": "Candidate artifact ID.",
      "title": "Candidateartifactid",
      "type": "string"
    },
    "candidateProtonIds": {
      "description": "IDs of protons pending promotion during artifact replacement.",
      "items": {
        "type": "string"
      },
      "title": "Candidateprotonids",
      "type": "array"
    },
    "config": {
      "additionalProperties": false,
      "description": "Configuration for workload replacement.",
      "properties": {
        "keepOldVersionMinutes": {
          "default": 0,
          "description": "Duration in minutes to keep the old version during replacement.",
          "title": "Keepoldversionminutes",
          "type": "integer"
        },
        "warmupDurationMinutes": {
          "default": 0,
          "description": "Duration in minutes for the warmup phase during replacement.",
          "title": "Warmupdurationminutes",
          "type": "integer"
        }
      },
      "title": "ReplacementConfig",
      "type": "object"
    },
    "createdAt": {
      "description": "Timestamp of when the entity was created.",
      "format": "date-time",
      "title": "Createdat",
      "type": "string"
    },
    "deletedAt": {
      "anyOf": [
        {
          "format": "date-time",
          "type": "string"
        },
        {
          "type": "null"
        }
      ],
      "description": "Timestamp of when the entity was deleted.",
      "title": "Deletedat"
    },
    "id": {
      "description": "Unique identifier of the entity.",
      "title": "Id",
      "type": "string"
    },
    "isDeleted": {
      "default": false,
      "description": "Whether this entity has been deleted.",
      "title": "Isdeleted",
      "type": "boolean"
    },
    "message": {
      "anyOf": [
        {
          "type": "string"
        },
        {
          "type": "null"
        }
      ],
      "description": "Additional information about the replacement status, such as validation errors or reasons for failure.",
      "title": "Message"
    },
    "name": {
      "description": "Name of the entity.",
      "title": "Name",
      "type": "string"
    },
    "previousProtonIds": {
      "anyOf": [
        {
          "items": {
            "type": "string"
          },
          "type": "array"
        },
        {
          "type": "null"
        }
      ],
      "description": "IDs of protons pending decommissioning during artifact replacement.",
      "title": "Previousprotonids"
    },
    "protonStatuses": {
      "anyOf": [
        {
          "additionalProperties": {
            "additionalProperties": false,
            "properties": {
              "overallStatus": {
                "additionalProperties": false,
                "description": "Overall status as reported by the workload-monitor service.",
                "properties": {
                  "lastUpdated": {
                    "description": "RFC 3339 timestamp of the last state transition.",
                    "title": "Lastupdated",
                    "type": "string"
                  },
                  "state": {
                    "enum": [
                      "unknown",
                      "submitted",
                      "initializing",
                      "provisioning",
                      "launching",
                      "running",
                      "suspended",
                      "warming",
                      "draining",
                      "interrupted",
                      "restarting",
                      "stopping",
                      "stopped",
                      "errored",
                      "terminated"
                    ],
                    "title": "ProtonStatus",
                    "type": "string"
                  },
                  "summary": {
                    "description": "Human-readable description of the current state.",
                    "title": "Summary",
                    "type": "string"
                  }
                },
                "required": [
                  "state",
                  "summary",
                  "lastUpdated"
                ],
                "title": "WorkloadMonitorOverallStatus",
                "type": "object"
              },
              "replicas": {
                "items": {
                  "additionalProperties": false,
                  "properties": {
                    "address": {
                      "title": "Address",
                      "type": "string"
                    },
                    "conditions": {
                      "items": {
                        "additionalProperties": false,
                        "properties": {
                          "lastTransitionTime": {
                            "title": "Lasttransitiontime",
                            "type": "string"
                          },
                          "message": {
                            "default": "",
                            "title": "Message",
                            "type": "string"
                          },
                          "reason": {
                            "default": "",
                            "title": "Reason",
                            "type": "string"
                          },
                          "type": {
                            "title": "Type",
                            "type": "string"
                          },
                          "value": {
                            "anyOf": [
                              {
                                "type": "boolean"
                              },
                              {
                                "type": "null"
                              }
                            ],
                            "title": "Value"
                          }
                        },
                        "required": [
                          "type",
                          "value",
                          "lastTransitionTime"
                        ],
                        "title": "ReplicaConditionDetail",
                        "type": "object"
                      },
                      "title": "Conditions",
                      "type": "array"
                    },
                    "containers": {
                      "items": {
                        "additionalProperties": false,
                        "properties": {
                          "image": {
                            "title": "Image",
                            "type": "string"
                          },
                          "name": {
                            "title": "Name",
                            "type": "string"
                          },
                          "ready": {
                            "title": "Ready",
                            "type": "boolean"
                          },
                          "restartCount": {
                            "title": "Restartcount",
                            "type": "integer"
                          },
                          "startedAt": {
                            "anyOf": [
                              {
                                "type": "string"
                              },
                              {
                                "type": "null"
                              }
                            ],
                            "title": "Startedat"
                          },
                          "status": {
                            "description": "Lifecycle state of a container within a deployment replica.",
                            "enum": [
                              "running",
                              "waiting",
                              "terminated",
                              "unknown"
                            ],
                            "title": "ContainerStatus",
                            "type": "string"
                          }
                        },
                        "required": [
                          "name",
                          "status",
                          "startedAt",
                          "ready",
                          "restartCount",
                          "image"
                        ],
                        "title": "ContainerStatusDetail",
                        "type": "object"
                      },
                      "title": "Containers",
                      "type": "array"
                    },
                    "name": {
                      "title": "Name",
                      "type": "string"
                    },
                    "nodeAddress": {
                      "title": "Nodeaddress",
                      "type": "string"
                    },
                    "startedAt": {
                      "anyOf": [
                        {
                          "type": "string"
                        },
                        {
                          "type": "null"
                        }
                      ],
                      "title": "Startedat"
                    },
                    "status": {
                      "description": "Lifecycle phase of a deployment replica.",
                      "enum": [
                        "pending",
                        "running",
                        "succeeded",
                        "failed",
                        "unknown"
                      ],
                      "title": "ReplicaPhase",
                      "type": "string"
                    }
                  },
                  "required": [
                    "name",
                    "status",
                    "address",
                    "nodeAddress",
                    "startedAt",
                    "conditions",
                    "containers"
                  ],
                  "title": "ReplicaDetail",
                  "type": "object"
                },
                "title": "Replicas",
                "type": "array"
              }
            },
            "required": [
              "overallStatus",
              "replicas"
            ],
            "title": "ReplicaStatusesSnapshot",
            "type": "object"
          },
          "type": "object"
        },
        {
          "type": "null"
        }
      ],
      "description": "Latest known status of candidate protons, used to determine replacement status transitions.",
      "title": "Protonstatuses"
    },
    "runtime": {
      "additionalProperties": false,
      "description": "Runtime configuration for a workload. For service and NIM artifacts, all configuration is scoped inside ``container_groups``, each identified by a name matching the artifact topology.",
      "properties": {
        "containerGroups": {
          "description": "Per-group runtime configuration. Each entry's name must match a group in the artifact.",
          "items": {
            "additionalProperties": false,
            "description": "Runtime configuration for a single container group.",
            "properties": {
              "autoscaling": {
                "anyOf": [
                  {
                    "additionalProperties": false,
                    "description": "Autoscaling configuration for a proton.",
                    "properties": {
                      "enabled": {
                        "default": true,
                        "description": "Whether autoscaling is enabled.",
                        "title": "Enabled",
                        "type": "boolean"
                      },
                      "policies": {
                        "items": {
                          "additionalProperties": false,
                          "description": "Base class for autoscaling policies.",
                          "properties": {
                            "maxCount": {
                              "description": "Maximum number of replicas.",
                              "minimum": 0,
                              "title": "Max Count",
                              "type": "integer"
                            },
                            "minCount": {
                              "description": "Minimum number of replicas.",
                              "minimum": 0,
                              "title": "Min Count",
                              "type": "integer"
                            },
                            "priority": {
                              "anyOf": [
                                {
                                  "type": "integer"
                                },
                                {
                                  "type": "null"
                                }
                              ],
                              "description": "Policy priority when multiple policies are defined.",
                              "title": "Priority"
                            },
                            "scalingMetric": {
                              "anyOf": [
                                {
                                  "oneOf": [
                                    {
                                      "const": "cpuAverageUtilization",
                                      "description": "Scale replicas to maintain a target average CPU utilization across pods.",
                                      "title": "CPU Average Utilization"
                                    },
                                    {
                                      "const": "httpRequestsConcurrency",
                                      "description": "Scale replicas based on HTTP request concurrency using an external HTTP-aware autoscaler. The platform manages the underlying autoscaling resources on your behalf. This scaling option will scale to zero replicas when the proton is idle.",
                                      "title": "HTTP Requests Concurrency"
                                    },
                                    {
                                      "const": "gpuCacheUtilization",
                                      "description": "Scales replicas based on model-specific GPU memory cache utilization. This signal reflects how the model's KV cache is used during inference, when such metrics are exposed by the serving runtime. High cache utilization may indicate memory pressure and can be used to trigger scale-out to maintain throughput. Applicable to NIM Artifacts only.",
                                      "title": "GPU Cache Utilization"
                                    },
                                    {
                                      "const": "gpuRequestQueueDepth",
                                      "description": "Scales replicas based on the depth of the inference request queue. This metric represents the number of incoming requests waiting to be processed by the inference service. Increasing queue depth may indicate insufficient capacity and can be used to trigger additional replicas to reduce latency. Applicable to NIM Artifacts only.",
                                      "title": "GPU Request Queue Depth"
                                    }
                                  ],
                                  "title": "ScalingMetricType",
                                  "type": "string"
                                },
                                {
                                  "type": "string"
                                }
                              ],
                              "description": "Metric used for scaling decisions. Use one of the predefined values for standard autoscaling, or provide a custom metric name for NIM 2.0 workloads (e.g. 'vllm:kv_cache_usage_perc'). Custom metric names are only supported for NIM artifacts.",
                              "title": "Scaling Metric"
                            },
                            "target": {
                              "description": "Target value for the scaling metric.",
                              "minimum": 0,
                              "title": "Target",
                              "type": "number"
                            }
                          },
                          "required": [
                            "scalingMetric",
                            "target",
                            "minCount",
                            "maxCount"
                          ],
                          "title": "AutoscalingPolicy",
                          "type": "object"
                        },
                        "title": "Policies",
                        "type": "array"
                      }
                    },
                    "required": [
                      "policies"
                    ],
                    "title": "AutoscalingProperties",
                    "type": "object"
                  },
                  {
                    "type": "null"
                  }
                ],
                "description": "Autoscaling configuration for this group. Takes precedence over replicaCount."
              },
              "bundleSelectionPolicy": {
                "enum": [
                  "availability"
                ],
                "title": "BundleSelectionPolicy",
                "type": "string"
              },
              "containers": {
                "description": "Per-container overrides for this group.",
                "items": {
                  "additionalProperties": false,
                  "description": "Runtime diff targeting a single named container within a group.",
                  "properties": {
                    "name": {
                      "description": "Container name. Must match a container declared in the artifact group.",
                      "title": "Name",
                      "type": "string"
                    },
                    "resourceAllocation": {
                      "anyOf": [
                        {
                          "additionalProperties": false,
                          "description": "Per-container resource allocation declared at runtime.",
                          "properties": {
                            "cpu": {
                              "anyOf": [
                                {
                                  "minimum": 0.1,
                                  "type": "number"
                                },
                                {
                                  "type": "null"
                                }
                              ],
                              "description": "CPU cores allocated to this container.",
                              "title": "Cpu"
                            },
                            "gpu": {
                              "anyOf": [
                                {
                                  "minimum": 0,
                                  "type": "number"
                                },
                                {
                                  "type": "null"
                                }
                              ],
                              "description": "GPUs allocated to this container.",
                              "title": "Gpu"
                            },
                            "memory": {
                              "anyOf": [
                                {
                                  "pattern": "^\\s*(\\d*\\.?\\d+)\\s*(\\w+)?",
                                  "type": "string"
                                },
                                {
                                  "minimum": 0,
                                  "type": "integer"
                                },
                                {
                                  "type": "null"
                                }
                              ],
                              "description": "RAM allocated to this container. Accepts a human-readable string with one of: B, KB, MB, GB (1000-based), e.g. '8GB', '512MB'. Also accepts raw byte integers.",
                              "examples": [
                                "8GB",
                                "512MB"
                              ],
                              "title": "Memory"
                            }
                          },
                          "title": "ResourceAllocation",
                          "type": "object"
                        },
                        {
                          "type": "null"
                        }
                      ],
                      "description": "Resource allocation for this container. Required for multi-container groups."
                    }
                  },
                  "required": [
                    "name"
                  ],
                  "title": "ContainerOverride",
                  "type": "object"
                },
                "title": "Containers",
                "type": "array"
              },
              "name": {
                "default": "default",
                "description": "Group name. Must match a container group name declared in the artifact.",
                "title": "Name",
                "type": "string"
              },
              "replicaCount": {
                "anyOf": [
                  {
                    "minimum": 1,
                    "type": "integer"
                  },
                  {
                    "type": "null"
                  }
                ],
                "default": 1,
                "description": "Number of replicas. Cannot be set alongside autoscaling.enabled=true.",
                "title": "Replicacount"
              },
              "resolvedBundle": {
                "anyOf": [
                  {
                    "description": "Bundle details returned in the runtime response after scheduling.",
                    "properties": {
                      "cpuCount": {
                        "description": "Number of CPU cores.",
                        "title": "CPU Count",
                        "type": "number"
                      },
                      "gpuCount": {
                        "default": 0,
                        "description": "Number of GPU units.",
                        "title": "GPU Count",
                        "type": "integer"
                      },
                      "gpuMaker": {
                        "anyOf": [
                          {
                            "type": "string"
                          },
                          {
                            "type": "null"
                          }
                        ],
                        "description": "GPU manufacturer.",
                        "title": "GPU Maker"
                      },
                      "gpuTypeLabel": {
                        "anyOf": [
                          {
                            "type": "string"
                          },
                          {
                            "type": "null"
                          }
                        ],
                        "description": "GPU type label.",
                        "title": "GPU Type Label"
                      },
                      "id": {
                        "description": "Bundle identifier that was selected.",
                        "title": "Id",
                        "type": "string"
                      },
                      "memoryBytes": {
                        "description": "Memory size in bytes.",
                        "title": "Memory Bytes",
                        "type": "integer"
                      }
                    },
                    "required": [
                      "id",
                      "cpuCount",
                      "memoryBytes"
                    ],
                    "title": "ResolvedBundle",
                    "type": "object"
                  },
                  {
                    "type": "null"
                  }
                ],
                "description": "Full details of the bundle selected at scheduling time. Read-only.",
                "readOnly": true
              },
              "resourceBundles": {
                "description": "Ordered list of bundle IDs. One is selected at scheduling time.",
                "items": {
                  "type": "string"
                },
                "title": "Resourcebundles",
                "type": "array"
              }
            },
            "title": "GroupRuntime",
            "type": "object"
          },
          "title": "Containergroups",
          "type": "array"
        }
      },
      "title": "WorkloadRuntime",
      "type": "object"
    },
    "status": {
      "description": "Statuses for workload replacement process.",
      "enum": [
        "unknown",
        "submitted",
        "initializing",
        "awaiting_promotion",
        "switching",
        "deleting",
        "completed",
        "errored",
        "cleaning_up"
      ],
      "title": "ReplacementStatus",
      "type": "string"
    },
    "strategy": {
      "description": "Types of replacement strategies. `rolling` - the new proton is deployed alongside the old one, and traffic is switched to the new proton once it is ready. The old proton is then decommissioned.",
      "enum": [
        "rolling"
      ],
      "title": "ReplacementStrategy",
      "type": "string"
    },
    "switchedAt": {
      "anyOf": [
        {
          "format": "date-time",
          "type": "string"
        },
        {
          "type": "null"
        }
      ],
      "description": "Timestamp of when the replacement switch took place.",
      "title": "Switchedat"
    },
    "taskiqLastHeartbeat": {
      "anyOf": [
        {
          "format": "date-time",
          "type": "string"
        },
        {
          "type": "null"
        }
      ],
      "description": "Timestamp of the last taskiq poll for this replacement; used by the cron to detect abandoned taskiq-managed replacements.",
      "title": "Taskiqlastheartbeat"
    },
    "taskiqManaged": {
      "default": false,
      "description": "When true, this replacement is managed by the taskiq worker and should be skipped by the batch cronjob.",
      "title": "Taskiqmanaged",
      "type": "boolean"
    },
    "tenantId": {
      "description": "ID of the tenant this entity belongs to.",
      "format": "uuid4",
      "title": "Tenantid",
      "type": "string"
    },
    "updatedAt": {
      "description": "Timestamp of when the entity was last updated.",
      "format": "date-time",
      "title": "Updatedat",
      "type": "string"
    },
    "userId": {
      "description": "ID of the user who owns this entity.",
      "title": "Userid",
      "type": "string"
    },
    "workloadId": {
      "description": "Workload ID.",
      "title": "Workloadid",
      "type": "string"
    }
  },
  "required": [
    "id",
    "name",
    "createdAt",
    "updatedAt",
    "userId",
    "tenantId",
    "workloadId",
    "candidateArtifactId"
  ],
  "title": "Replacement",
  "type": "object"
}
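
The `memory` field of `ResourceAllocation` above accepts either a raw byte integer or a string matching the schema pattern with 1000-based units. The following is a minimal sketch of how a client might normalize such values to bytes before building a request. `parse_memory` is a hypothetical helper, not part of the API, and treating a value with no unit as bytes is an assumption rather than documented server behavior.

```python
import re

# Pattern copied from the schema's "memory" property.
_MEMORY_RE = re.compile(r"^\s*(\d*\.?\d+)\s*(\w+)?")

# Assumption: only the documented 1000-based units are accepted.
_UNIT_FACTORS = {"b": 1, "kb": 1000, "mb": 1000**2, "gb": 1000**3}


def parse_memory(value):
    """Convert a schema-style memory value to a byte count (or None)."""
    if value is None:
        return None
    if isinstance(value, bool):
        raise TypeError("memory must be a string, an integer, or None")
    if isinstance(value, int):
        if value < 0:
            raise ValueError("memory in bytes must be >= 0")
        return value
    match = _MEMORY_RE.match(value)
    if match is None:
        raise ValueError(f"unrecognized memory value: {value!r}")
    number, unit = match.groups()
    # Assumption: a missing unit means the number is already in bytes.
    unit = (unit or "b").lower()
    if unit not in _UNIT_FACTORS:
        raise ValueError(f"unsupported memory unit: {unit!r}")
    return round(float(number) * _UNIT_FACTORS[unit])


print(parse_memory("8GB"))
print(parse_memory("512MB"))
```

Because the units are 1000-based, `"8GB"` corresponds to 8,000,000,000 bytes rather than 8 GiB.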

Replacement

Properties

Name Type Required Restrictions Description
candidateArtifactId string true Candidate artifact ID.
candidateProtonIds [string] false IDs of protons pending promotion during artifact replacement.
config ReplacementConfig false Configuration for the replacement process, including warmup duration and old version retention time.
createdAt string(date-time) true Timestamp of when the entity was created.
deletedAt any false Timestamp of when the entity was deleted.

anyOf

Name Type Required Restrictions Description
» anonymous string(date-time) false none

or

Name Type Required Restrictions Description
» anonymous null false none

continued

Name Type Required Restrictions Description
id string true Unique identifier of the entity.
isDeleted boolean false Whether this entity has been deleted.
message any false Additional information about the replacement status, such as validation errors or reasons for failure.

anyOf

Name Type Required Restrictions Description
» anonymous string false none

or

Name Type Required Restrictions Description
» anonymous null false none

continued

Name Type Required Restrictions Description
name string true Name of the entity.
previousProtonIds any false IDs of protons pending decommissioning during artifact replacement.

anyOf

Name Type Required Restrictions Description
» anonymous [string] false none

or

Name Type Required Restrictions Description
» anonymous null false none

continued

Name Type Required Restrictions Description
protonStatuses any false Latest known status of candidate protons, used to determine replacement status transitions.

anyOf

Name Type Required Restrictions Description
» anonymous object false none
»» additionalProperties ReplicaStatusesSnapshot false none

or

Name Type Required Restrictions Description
» anonymous null false none

continued

Name Type Required Restrictions Description
runtime WorkloadRuntime false Runtime for the workload; required if there is no active artifact for the workload.
status ReplacementStatus false Replacement status.
strategy ReplacementStrategy false Replacement strategy.
switchedAt any false Timestamp of when the replacement switch took place.

anyOf

Name Type Required Restrictions Description
» anonymous string(date-time) false none

or

Name Type Required Restrictions Description
» anonymous null false none

continued

Name Type Required Restrictions Description
taskiqLastHeartbeat any false Timestamp of the last taskiq poll for this replacement; used by the cron to detect abandoned taskiq-managed replacements.

anyOf

Name Type Required Restrictions Description
» anonymous string(date-time) false none

or

Name Type Required Restrictions Description
» anonymous null false none

continued

Name Type Required Restrictions Description
taskiqManaged boolean false When true, this replacement is managed by the taskiq worker and should be skipped by the batch cronjob.
tenantId string(uuid4) true ID of the tenant this entity belongs to.
updatedAt string(date-time) true Timestamp of when the entity was last updated.
userId string true ID of the user who owns this entity.
workloadId string true Workload ID.
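
As a rough illustration of how a client might consume a `Replacement` record, the sketch below checks the required fields listed in the schema and reports whether the replacement appears finished. The helper names and the example record are hypothetical, and treating `completed` and `errored` as the only terminal statuses is an assumption; the schema does not say which statuses are final.

```python
# Values copied from the ReplacementStatus enum.
REPLACEMENT_STATUSES = frozenset({
    "unknown", "submitted", "initializing", "awaiting_promotion",
    "switching", "deleting", "completed", "errored", "cleaning_up",
})

# Fields listed under "required" in the Replacement schema.
REQUIRED_FIELDS = (
    "id", "name", "createdAt", "updatedAt",
    "userId", "tenantId", "workloadId", "candidateArtifactId",
)

# Assumption (not stated in the schema): these statuses are final.
ASSUMED_TERMINAL = frozenset({"completed", "errored"})


def missing_required_fields(record):
    """Return the required Replacement fields absent from the record."""
    return [field for field in REQUIRED_FIELDS if field not in record]


def summarize_replacement(record):
    """Reduce a Replacement record to the fields a poller would care about."""
    missing = missing_required_fields(record)
    if missing:
        raise ValueError(f"missing required fields: {missing}")
    status = record.get("status") or "unknown"
    if status not in REPLACEMENT_STATUSES:
        raise ValueError(f"unexpected status: {status!r}")
    return {
        "id": record["id"],
        "status": status,
        "finished": status in ASSUMED_TERMINAL,
        "message": record.get("message"),
    }


# Hypothetical record with made-up identifiers, for illustration only.
example = {
    "id": "rep-1", "name": "demo", "createdAt": "2024-01-01T00:00:00Z",
    "updatedAt": "2024-01-01T00:05:00Z", "userId": "user-1",
    "tenantId": "00000000-0000-4000-8000-000000000000",
    "workloadId": "wl-1", "candidateArtifactId": "art-2",
    "status": "errored", "message": "validation failed",
}
print(summarize_replacement(example))
```

The `message` field is passed through unchanged because the schema describes it as carrying validation errors or failure reasons.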

ReplacementConfig

{
  "additionalProperties": false,
  "description": "Configuration for workload replacement.",
  "properties": {
    "keepOldVersionMinutes": {
      "default": 0,
      "description": "Duration in minutes to keep the old version during replacement.",
      "title": "Keepoldversionminutes",
      "type": "integer"
    },
    "warmupDurationMinutes": {
      "default": 0,
      "description": "Duration in minutes for the warmup phase during replacement.",
      "title": "Warmupdurationminutes",
      "type": "integer"
    }
  },
  "title": "ReplacementConfig",
  "type": "object"
}

ReplacementConfig

Properties

Name Type Required Restrictions Description
keepOldVersionMinutes integer false Duration in minutes to keep the old version during replacement.
warmupDurationMinutes integer false Duration in minutes for the warmup phase during replacement.
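
A `ReplacementConfig` payload can be assembled client-side from these two fields. The sketch below is one possible shape, assuming a hypothetical `ReplacementConfig` dataclass on the client; only the field names, integer types, and defaults of 0 come from the schema.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ReplacementConfig:
    """Client-side mirror of the ReplacementConfig schema (hypothetical)."""

    keep_old_version_minutes: int = 0  # schema default: 0
    warmup_duration_minutes: int = 0   # schema default: 0

    def __post_init__(self):
        # The schema types both fields as integers; bool is rejected because
        # it is a subclass of int in Python.
        for name in ("keep_old_version_minutes", "warmup_duration_minutes"):
            value = getattr(self, name)
            if isinstance(value, bool) or not isinstance(value, int):
                raise TypeError(f"{name} must be an integer")

    def to_payload(self):
        """Return the camelCase JSON object described by the schema."""
        return {
            "keepOldVersionMinutes": self.keep_old_version_minutes,
            "warmupDurationMinutes": self.warmup_duration_minutes,
        }


print(ReplacementConfig(warmup_duration_minutes=5).to_payload())
```

Since the schema sets `additionalProperties` to false, the payload carries only these two keys.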

ReplacementHistoryListResponse

{
  "additionalProperties": false,
  "description": "Response model for listing replacement history of a workload.",
  "properties": {
    "count": {
      "description": "The number of records on this page.",
      "title": "Count",
      "type": "integer"
    },
    "data": {
      "description": "The list of records.",
      "items": {
        "additionalProperties": false,
        "description": "Store replacement information for workloads.",
        "properties": {
          "candidateArtifactId": {
            "description": "Candidate artifact ID.",
            "title": "Candidateartifactid",
            "type": "string"
          },
          "candidateProtonIds": {
            "description": "IDs of protons pending promotion during artifact replacement.",
            "items": {
              "type": "string"
            },
            "title": "Candidateprotonids",
            "type": "array"
          },
          "config": {
            "additionalProperties": false,
            "description": "Configuration for workload replacement.",
            "properties": {
              "keepOldVersionMinutes": {
                "default": 0,
                "description": "Duration in minutes to keep the old version during replacement.",
                "title": "Keepoldversionminutes",
                "type": "integer"
              },
              "warmupDurationMinutes": {
                "default": 0,
                "description": "Duration in minutes for the warmup phase during replacement.",
                "title": "Warmupdurationminutes",
                "type": "integer"
              }
            },
            "title": "ReplacementConfig",
            "type": "object"
          },
          "createdAt": {
            "description": "Timestamp of when the entity was created.",
            "format": "date-time",
            "title": "Createdat",
            "type": "string"
          },
          "deletedAt": {
            "anyOf": [
              {
                "format": "date-time",
                "type": "string"
              },
              {
                "type": "null"
              }
            ],
            "description": "Timestamp of when the entity was deleted.",
            "title": "Deletedat"
          },
          "id": {
            "description": "Unique identifier of the entity.",
            "title": "Id",
            "type": "string"
          },
          "isDeleted": {
            "default": false,
            "description": "Whether this entity has been deleted.",
            "title": "Isdeleted",
            "type": "boolean"
          },
          "message": {
            "anyOf": [
              {
                "type": "string"
              },
              {
                "type": "null"
              }
            ],
            "description": "Additional information about the replacement status, such as validation errors or reasons for failure.",
            "title": "Message"
          },
          "name": {
            "description": "Name of the entity.",
            "title": "Name",
            "type": "string"
          },
          "previousProtonIds": {
            "anyOf": [
              {
                "items": {
                  "type": "string"
                },
                "type": "array"
              },
              {
                "type": "null"
              }
            ],
            "description": "IDs of protons pending decommissioning during artifact replacement.",
            "title": "Previousprotonids"
          },
          "protonStatuses": {
            "anyOf": [
              {
                "additionalProperties": {
                  "additionalProperties": false,
                  "properties": {
                    "overallStatus": {
                      "additionalProperties": false,
                      "description": "Overall status as reported by the workload-monitor service.",
                      "properties": {
                        "lastUpdated": {
                          "description": "RFC 3339 timestamp of the last state transition.",
                          "title": "Lastupdated",
                          "type": "string"
                        },
                        "state": {
                          "enum": [
                            "unknown",
                            "submitted",
                            "initializing",
                            "provisioning",
                            "launching",
                            "running",
                            "suspended",
                            "warming",
                            "draining",
                            "interrupted",
                            "restarting",
                            "stopping",
                            "stopped",
                            "errored",
                            "terminated"
                          ],
                          "title": "ProtonStatus",
                          "type": "string"
                        },
                        "summary": {
                          "description": "Human-readable description of the current state.",
                          "title": "Summary",
                          "type": "string"
                        }
                      },
                      "required": [
                        "state",
                        "summary",
                        "lastUpdated"
                      ],
                      "title": "WorkloadMonitorOverallStatus",
                      "type": "object"
                    },
                    "replicas": {
                      "items": {
                        "additionalProperties": false,
                        "properties": {
                          "address": {
                            "title": "Address",
                            "type": "string"
                          },
                          "conditions": {
                            "items": {
                              "additionalProperties": false,
                              "properties": {
                                "lastTransitionTime": {
                                  "title": "Lasttransitiontime",
                                  "type": "string"
                                },
                                "message": {
                                  "default": "",
                                  "title": "Message",
                                  "type": "string"
                                },
                                "reason": {
                                  "default": "",
                                  "title": "Reason",
                                  "type": "string"
                                },
                                "type": {
                                  "title": "Type",
                                  "type": "string"
                                },
                                "value": {
                                  "anyOf": [
                                    {
                                      "type": "boolean"
                                    },
                                    {
                                      "type": "null"
                                    }
                                  ],
                                  "title": "Value"
                                }
                              },
                              "required": [
                                "type",
                                "value",
                                "lastTransitionTime"
                              ],
                              "title": "ReplicaConditionDetail",
                              "type": "object"
                            },
                            "title": "Conditions",
                            "type": "array"
                          },
                          "containers": {
                            "items": {
                              "additionalProperties": false,
                              "properties": {
                                "image": {
                                  "title": "Image",
                                  "type": "string"
                                },
                                "name": {
                                  "title": "Name",
                                  "type": "string"
                                },
                                "ready": {
                                  "title": "Ready",
                                  "type": "boolean"
                                },
                                "restartCount": {
                                  "title": "Restartcount",
                                  "type": "integer"
                                },
                                "startedAt": {
                                  "anyOf": [
                                    {
                                      "type": "string"
                                    },
                                    {
                                      "type": "null"
                                    }
                                  ],
                                  "title": "Startedat"
                                },
                                "status": {
                                  "description": "Lifecycle state of a container within a deployment replica.",
                                  "enum": [
                                    "running",
                                    "waiting",
                                    "terminated",
                                    "unknown"
                                  ],
                                  "title": "ContainerStatus",
                                  "type": "string"
                                }
                              },
                              "required": [
                                "name",
                                "status",
                                "startedAt",
                                "ready",
                                "restartCount",
                                "image"
                              ],
                              "title": "ContainerStatusDetail",
                              "type": "object"
                            },
                            "title": "Containers",
                            "type": "array"
                          },
                          "name": {
                            "title": "Name",
                            "type": "string"
                          },
                          "nodeAddress": {
                            "title": "Nodeaddress",
                            "type": "string"
                          },
                          "startedAt": {
                            "anyOf": [
                              {
                                "type": "string"
                              },
                              {
                                "type": "null"
                              }
                            ],
                            "title": "Startedat"
                          },
                          "status": {
                            "description": "Lifecycle phase of a deployment replica.",
                            "enum": [
                              "pending",
                              "running",
                              "succeeded",
                              "failed",
                              "unknown"
                            ],
                            "title": "ReplicaPhase",
                            "type": "string"
                          }
                        },
                        "required": [
                          "name",
                          "status",
                          "address",
                          "nodeAddress",
                          "startedAt",
                          "conditions",
                          "containers"
                        ],
                        "title": "ReplicaDetail",
                        "type": "object"
                      },
                      "title": "Replicas",
                      "type": "array"
                    }
                  },
                  "required": [
                    "overallStatus",
                    "replicas"
                  ],
                  "title": "ReplicaStatusesSnapshot",
                  "type": "object"
                },
                "type": "object"
              },
              {
                "type": "null"
              }
            ],
            "description": "Latest known status of candidate protons, used to determine replacement status transitions.",
            "title": "Protonstatuses"
          },
          "runtime": {
            "additionalProperties": false,
            "description": "Runtime configuration for a workload. For service and NIM artifacts, all configuration is scoped inside ``container_groups``, each identified by name matching the artifact topology.",
            "properties": {
              "containerGroups": {
                "description": "Per-group runtime configuration. Each entry's name must match a group in the artifact.",
                "items": {
                  "additionalProperties": false,
                  "description": "Runtime configuration for a single container group.",
                  "properties": {
                    "autoscaling": {
                      "anyOf": [
                        {
                          "additionalProperties": false,
                          "description": "Autoscaling configuration for a proton.",
                          "properties": {
                            "enabled": {
                              "default": true,
                              "description": "Whether autoscaling is enabled.",
                              "title": "Enabled",
                              "type": "boolean"
                            },
                            "policies": {
                              "items": {
                                "additionalProperties": false,
                                "description": "Base class for autoscaling policies.",
                                "properties": {
                                  "maxCount": {
                                    "description": "Maximum number of replicas.",
                                    "minimum": 0,
                                    "title": "Max Count",
                                    "type": "integer"
                                  },
                                  "minCount": {
                                    "description": "Minimum number of replicas.",
                                    "minimum": 0,
                                    "title": "Min Count",
                                    "type": "integer"
                                  },
                                  "priority": {
                                    "anyOf": [
                                      {
                                        "type": "integer"
                                      },
                                      {
                                        "type": "null"
                                      }
                                    ],
                                    "description": "Policy priority when multiple policies are defined.",
                                    "title": "Priority"
                                  },
                                  "scalingMetric": {
                                    "anyOf": [
                                      {
                                        "oneOf": [
                                          {
                                            "const": "cpuAverageUtilization",
                                            "description": "Scale replicas to maintain a target average CPU utilization across pods.",
                                            "title": "CPU Average Utilization"
                                          },
                                          {
                                            "const": "httpRequestsConcurrency",
                                            "description": "Scale replicas based on HTTP request concurrency using an external HTTP-aware autoscaler. The platform manages the underlying autoscaling resources on your behalf. This scaling option will scale to zero replicas when the proton is idle.",
                                            "title": "HTTP Requests Concurrency"
                                          },
                                          {
                                            "const": "gpuCacheUtilization",
                                            "description": "Scales replicas based on model-specific GPU memory cache utilization. This signal reflects how the model's KV cache is used during inference, when such metrics are exposed by the serving runtime. High cache utilization may indicate memory pressure and can be used to trigger scale-out to maintain throughput. Applicable to NIM Artifacts only.",
                                            "title": "GPU Cache Utilization"
                                          },
                                          {
                                            "const": "gpuRequestQueueDepth",
                                            "description": "Scales replicas based on the depth of the inference request queue. This metric represents the number of incoming requests waiting to be processed by the inference service. Increasing queue depth may indicate insufficient capacity and can be used to trigger additional replicas to reduce latency. Applicable to NIM Artifacts only.",
                                            "title": "GPU Request Queue Depth"
                                          }
                                        ],
                                        "title": "ScalingMetricType",
                                        "type": "string"
                                      },
                                      {
                                        "type": "string"
                                      }
                                    ],
                                    "description": "Metric used for scaling decisions. Use one of the predefined values for standard autoscaling, or provide a custom metric name for NIM 2.0 workloads (e.g. 'vllm:kv_cache_usage_perc'). Custom metric names are only supported for NIM artifacts.",
                                    "title": "Scaling Metric"
                                  },
                                  "target": {
                                    "description": "Target value for the scaling metric.",
                                    "minimum": 0,
                                    "title": "Target",
                                    "type": "number"
                                  }
                                },
                                "required": [
                                  "scalingMetric",
                                  "target",
                                  "minCount",
                                  "maxCount"
                                ],
                                "title": "AutoscalingPolicy",
                                "type": "object"
                              },
                              "title": "Policies",
                              "type": "array"
                            }
                          },
                          "required": [
                            "policies"
                          ],
                          "title": "AutoscalingProperties",
                          "type": "object"
                        },
                        {
                          "type": "null"
                        }
                      ],
                      "description": "Autoscaling configuration for this group. takes precedence over replicacount."
                    },
                    "bundleSelectionPolicy": {
                      "enum": [
                        "availability"
                      ],
                      "title": "BundleSelectionPolicy",
                      "type": "string"
                    },
                    "containers": {
                      "description": "Per-container overrides for this group.",
                      "items": {
                        "additionalProperties": false,
                        "description": "Runtime diff targeting a single named container within a group.",
                        "properties": {
                          "name": {
                            "description": "Container name. must match a container declared in the artifact group.",
                            "title": "Name",
                            "type": "string"
                          },
                          "resourceAllocation": {
                            "anyOf": [
                              {
                                "additionalProperties": false,
                                "description": "Per-container resource allocation declared at runtime.",
                                "properties": {
                                  "cpu": {
                                    "anyOf": [
                                      {
                                        "minimum": 0.1,
                                        "type": "number"
                                      },
                                      {
                                        "type": "null"
                                      }
                                    ],
                                    "description": "Cpu cores allocated to this container.",
                                    "title": "Cpu"
                                  },
                                  "gpu": {
                                    "anyOf": [
                                      {
                                        "minimum": 0,
                                        "type": "number"
                                      },
                                      {
                                        "type": "null"
                                      }
                                    ],
                                    "description": "Gpus allocated to this container.",
                                    "title": "Gpu"
                                  },
                                  "memory": {
                                    "anyOf": [
                                      {
                                        "pattern": "^\\s*(\\d*\\.?\\d+)\\s*(\\w+)?",
                                        "type": "string"
                                      },
                                      {
                                        "minimum": 0,
                                        "type": "integer"
                                      },
                                      {
                                        "type": "null"
                                      }
                                    ],
                                    "description": "Ram allocated to this container. accepts a human-readable string with one of: b, kb, mb, gb (1000-based) — e.g. '8gb', '512mb'. also accepts raw byte integers.",
                                    "examples": [
                                      "8GB",
                                      "512MB"
                                    ],
                                    "title": "Memory"
                                  }
                                },
                                "title": "ResourceAllocation",
                                "type": "object"
                              },
                              {
                                "type": "null"
                              }
                            ],
                            "description": "Resource allocation for this container. required for multi-container groups."
                          }
                        },
                        "required": [
                          "name"
                        ],
                        "title": "ContainerOverride",
                        "type": "object"
                      },
                      "title": "Containers",
                      "type": "array"
                    },
                    "name": {
                      "default": "default",
                      "description": "Group name. must match a container group name declared in the artifact.",
                      "title": "Name",
                      "type": "string"
                    },
                    "replicaCount": {
                      "anyOf": [
                        {
                          "minimum": 1,
                          "type": "integer"
                        },
                        {
                          "type": "null"
                        }
                      ],
                      "default": 1,
                      "description": "Number of replicas. cannot be set alongside autoscaling.enabled=true.",
                      "title": "Replicacount"
                    },
                    "resolvedBundle": {
                      "anyOf": [
                        {
                          "description": "Bundle details returned in the runtime response after scheduling.",
                          "properties": {
                            "cpuCount": {
                              "description": "Number of cpu cores.",
                              "title": "CPU Count",
                              "type": "number"
                            },
                            "gpuCount": {
                              "default": 0,
                              "description": "Number of gpu units.",
                              "title": "GPU Count",
                              "type": "integer"
                            },
                            "gpuMaker": {
                              "anyOf": [
                                {
                                  "type": "string"
                                },
                                {
                                  "type": "null"
                                }
                              ],
                              "description": "Gpu manufacturer.",
                              "title": "GPU Maker"
                            },
                            "gpuTypeLabel": {
                              "anyOf": [
                                {
                                  "type": "string"
                                },
                                {
                                  "type": "null"
                                }
                              ],
                              "description": "Gpu type label.",
                              "title": "GPU Type Label"
                            },
                            "id": {
                              "description": "Bundle identifier that was selected.",
                              "title": "Id",
                              "type": "string"
                            },
                            "memoryBytes": {
                              "description": "Memory size in bytes.",
                              "title": "Memory Bytes",
                              "type": "integer"
                            }
                          },
                          "required": [
                            "id",
                            "cpuCount",
                            "memoryBytes"
                          ],
                          "title": "ResolvedBundle",
                          "type": "object"
                        },
                        {
                          "type": "null"
                        }
                      ],
                      "description": "Full details of the bundle selected at scheduling time. read-only.",
                      "readOnly": true
                    },
                    "resourceBundles": {
                      "description": "Ordered list of bundle ids. one is selected at scheduling time.",
                      "items": {
                        "type": "string"
                      },
                      "title": "Resourcebundles",
                      "type": "array"
                    }
                  },
                  "title": "GroupRuntime",
                  "type": "object"
                },
                "title": "Containergroups",
                "type": "array"
              }
            },
            "title": "WorkloadRuntime",
            "type": "object"
          },
          "status": {
            "description": "Statuses for workload replacement process.",
            "enum": [
              "unknown",
              "submitted",
              "initializing",
              "awaiting_promotion",
              "switching",
              "deleting",
              "completed",
              "errored",
              "cleaning_up"
            ],
            "title": "ReplacementStatus",
            "type": "string"
          },
          "strategy": {
            "description": "Types of replacement strategies. `rolling` - the new proton is deployed alongside the old one, and trafic is switched to the new proton once it is ready. the old proton is then decommissioned.",
            "enum": [
              "rolling"
            ],
            "title": "ReplacementStrategy",
            "type": "string"
          },
          "switchedAt": {
            "anyOf": [
              {
                "format": "date-time",
                "type": "string"
              },
              {
                "type": "null"
              }
            ],
            "description": "Timestamp of when the replacement take action.",
            "title": "Switchedat"
          },
          "taskiqLastHeartbeat": {
            "anyOf": [
              {
                "format": "date-time",
                "type": "string"
              },
              {
                "type": "null"
              }
            ],
            "description": "Timestamp of the last taskiq poll for this replacement; used by the cron to detect abandoned taskiq-managed replacements.",
            "title": "Taskiqlastheartbeat"
          },
          "taskiqManaged": {
            "default": false,
            "description": "When true, this replacement is managed by the taskiq worker and should be skipped by the batch cronjob.",
            "title": "Taskiqmanaged",
            "type": "boolean"
          },
          "tenantId": {
            "description": "Id of the tenant this entity belongs to.",
            "format": "uuid4",
            "title": "Tenantid",
            "type": "string"
          },
          "updatedAt": {
            "description": "Timestamp of when the entity was last updated.",
            "format": "date-time",
            "title": "Updatedat",
            "type": "string"
          },
          "userId": {
            "description": "Id of the user who owns this entity.",
            "title": "Userid",
            "type": "string"
          },
          "workloadId": {
            "description": "Workload id.",
            "title": "Workloadid",
            "type": "string"
          }
        },
        "required": [
          "id",
          "name",
          "createdAt",
          "updatedAt",
          "userId",
          "tenantId",
          "workloadId",
          "candidateArtifactId"
        ],
        "title": "Replacement",
        "type": "object"
      },
      "title": "Data",
      "type": "array"
    },
    "next": {
      "anyOf": [
        {
          "type": "string"
        },
        {
          "type": "null"
        }
      ],
      "description": "The url to the next page, or `null` if there is no such page.",
      "title": "Next"
    },
    "previous": {
      "anyOf": [
        {
          "type": "string"
        },
        {
          "type": "null"
        }
      ],
      "description": "The url to the previous page, or `null` if there is no such page.",
      "title": "Previous"
    },
    "totalCount": {
      "description": "The total number of records.",
      "title": "Totalcount",
      "type": "integer"
    }
  },
  "required": [
    "totalCount",
    "count",
    "next",
    "previous",
    "data"
  ],
  "title": "ReplacementHistoryListResponse",
  "type": "object"
}

ReplacementHistoryListResponse

Properties

Name Type Required Restrictions Description
count integer true The number of records on this page.
data [Replacement] true The list of records.
next any true The URL to the next page, or null if there is no such page.

anyOf

Name Type Required Restrictions Description
» anonymous string false none

or

Name Type Required Restrictions Description
» anonymous null false none

continued

Name Type Required Restrictions Description
previous any true The URL to the previous page, or null if there is no such page.

anyOf

Name Type Required Restrictions Description
» anonymous string false none

or

Name Type Required Restrictions Description
» anonymous null false none

continued

Name Type Required Restrictions Description
totalCount integer true The total number of records.

ReplacementStatus

{
  "description": "Statuses for workload replacement process.",
  "enum": [
    "unknown",
    "submitted",
    "initializing",
    "awaiting_promotion",
    "switching",
    "deleting",
    "completed",
    "errored",
    "cleaning_up"
  ],
  "title": "ReplacementStatus",
  "type": "string"
}

ReplacementStatus

Properties

Name Type Required Restrictions Description
ReplacementStatus string false Statuses for workload replacement process.

Enumerated Values

Property Value
ReplacementStatus [unknown, submitted, initializing, awaiting_promotion, switching, deleting, completed, errored, cleaning_up]
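
A client might mirror this enum locally so that unexpected values degrade gracefully. The following is a minimal sketch; the enum members come from the list above, but the choice of which statuses count as final (`ASSUMED_FINAL`) is an assumption, since the schema does not say which states end the process.

```python
from enum import Enum


class ReplacementStatus(str, Enum):
    UNKNOWN = "unknown"
    SUBMITTED = "submitted"
    INITIALIZING = "initializing"
    AWAITING_PROMOTION = "awaiting_promotion"
    SWITCHING = "switching"
    DELETING = "deleting"
    COMPLETED = "completed"
    ERRORED = "errored"
    CLEANING_UP = "cleaning_up"


# Assumption: the schema does not state which statuses are final.
ASSUMED_FINAL = frozenset({ReplacementStatus.COMPLETED, ReplacementStatus.ERRORED})


def parse_status(raw: str) -> ReplacementStatus:
    """Map a raw string to the enum, falling back to UNKNOWN for unrecognized values."""
    try:
        return ReplacementStatus(raw)
    except ValueError:
        return ReplacementStatus.UNKNOWN


def is_final(raw: str) -> bool:
    """True when the status is in the assumed set of final statuses."""
    return parse_status(raw) in ASSUMED_FINAL
```

Falling back to `unknown` keeps a polling loop from crashing if the server later adds a status the client does not recognize.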

ReplacementStrategy

{
  "description": "Types of replacement strategies. `rolling` - the new proton is deployed alongside the old one, and trafic is switched to the new proton once it is ready. the old proton is then decommissioned.",
  "enum": [
    "rolling"
  ],
  "title": "ReplacementStrategy",
  "type": "string"
}

ReplacementStrategy

Properties

Name Type Required Restrictions Description
ReplacementStrategy string false Types of replacement strategies. rolling - the new proton is deployed alongside the old one, and traffic is switched to the new proton once it is ready. The old proton is then decommissioned.

Enumerated Values

Property Value
ReplacementStrategy rolling

ReplicaConditionDetail

{
  "additionalProperties": false,
  "properties": {
    "lastTransitionTime": {
      "title": "Lasttransitiontime",
      "type": "string"
    },
    "message": {
      "default": "",
      "title": "Message",
      "type": "string"
    },
    "reason": {
      "default": "",
      "title": "Reason",
      "type": "string"
    },
    "type": {
      "title": "Type",
      "type": "string"
    },
    "value": {
      "anyOf": [
        {
          "type": "boolean"
        },
        {
          "type": "null"
        }
      ],
      "title": "Value"
    }
  },
  "required": [
    "type",
    "value",
    "lastTransitionTime"
  ],
  "title": "ReplicaConditionDetail",
  "type": "object"
}

ReplicaConditionDetail

Properties

Name Type Required Restrictions Description
lastTransitionTime string true none
message string false none
reason string false none
type string true none
value any true none

anyOf

Name Type Required Restrictions Description
» anonymous boolean false none

or

Name Type Required Restrictions Description
» anonymous null false none
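
Since `value` is either a boolean or null, a consumer has three cases to handle rather than two. The sketch below groups condition types by that three-way outcome; reading null as "unknown" is an assumption, and the condition type names in the sample are hypothetical.

```python
from typing import Any, Dict, List


def group_conditions(conditions: List[Dict[str, Any]]) -> Dict[str, List[str]]:
    """Group condition `type` names by their tri-state `value`."""
    groups: Dict[str, List[str]] = {"true": [], "false": [], "unknown": []}
    for condition in conditions:
        value = condition["value"]  # required, but may be None (JSON null)
        if value is None:
            key = "unknown"  # assumption: null means the state is not known
        else:
            key = "true" if value else "false"
        groups[key].append(condition["type"])
    return groups


# Hypothetical sample; `message` and `reason` default to "" when omitted.
sample_conditions = [
    {"type": "Ready", "value": True, "lastTransitionTime": "2024-01-01T00:00:00Z"},
    {"type": "Scheduled", "value": False, "lastTransitionTime": "2024-01-01T00:00:00Z"},
    {"type": "Initialized", "value": None, "lastTransitionTime": "2024-01-01T00:00:00Z"},
]

grouped = group_conditions(sample_conditions)
```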

ReplicaDetail

{
  "additionalProperties": false,
  "properties": {
    "address": {
      "title": "Address",
      "type": "string"
    },
    "conditions": {
      "items": {
        "additionalProperties": false,
        "properties": {
          "lastTransitionTime": {
            "title": "Lasttransitiontime",
            "type": "string"
          },
          "message": {
            "default": "",
            "title": "Message",
            "type": "string"
          },
          "reason": {
            "default": "",
            "title": "Reason",
            "type": "string"
          },
          "type": {
            "title": "Type",
            "type": "string"
          },
          "value": {
            "anyOf": [
              {
                "type": "boolean"
              },
              {
                "type": "null"
              }
            ],
            "title": "Value"
          }
        },
        "required": [
          "type",
          "value",
          "lastTransitionTime"
        ],
        "title": "ReplicaConditionDetail",
        "type": "object"
      },
      "title": "Conditions",
      "type": "array"
    },
    "containers": {
      "items": {
        "additionalProperties": false,
        "properties": {
          "image": {
            "title": "Image",
            "type": "string"
          },
          "name": {
            "title": "Name",
            "type": "string"
          },
          "ready": {
            "title": "Ready",
            "type": "boolean"
          },
          "restartCount": {
            "title": "Restartcount",
            "type": "integer"
          },
          "startedAt": {
            "anyOf": [
              {
                "type": "string"
              },
              {
                "type": "null"
              }
            ],
            "title": "Startedat"
          },
          "status": {
            "description": "Lifecycle state of a container within a deployment replica.",
            "enum": [
              "running",
              "waiting",
              "terminated",
              "unknown"
            ],
            "title": "ContainerStatus",
            "type": "string"
          }
        },
        "required": [
          "name",
          "status",
          "startedAt",
          "ready",
          "restartCount",
          "image"
        ],
        "title": "ContainerStatusDetail",
        "type": "object"
      },
      "title": "Containers",
      "type": "array"
    },
    "name": {
      "title": "Name",
      "type": "string"
    },
    "nodeAddress": {
      "title": "Nodeaddress",
      "type": "string"
    },
    "startedAt": {
      "anyOf": [
        {
          "type": "string"
        },
        {
          "type": "null"
        }
      ],
      "title": "Startedat"
    },
    "status": {
      "description": "Lifecycle phase of a deployment replica.",
      "enum": [
        "pending",
        "running",
        "succeeded",
        "failed",
        "unknown"
      ],
      "title": "ReplicaPhase",
      "type": "string"
    }
  },
  "required": [
    "name",
    "status",
    "address",
    "nodeAddress",
    "startedAt",
    "conditions",
    "containers"
  ],
  "title": "ReplicaDetail",
  "type": "object"
}

ReplicaDetail

Properties

Name Type Required Restrictions Description
address string true none
conditions [ReplicaConditionDetail] true none
containers [ContainerStatusDetail] true none
name string true none
nodeAddress string true none
startedAt any true none

anyOf

Name Type Required Restrictions Description
» anonymous string false none

or

Name Type Required Restrictions Description
» anonymous null false none

continued

Name Type Required Restrictions Description
status ReplicaPhase true Lifecycle phase of a deployment replica.
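
To show how the nested `containers` list might be consumed, here is a small sketch that reports containers that are not ready and sums restart counts for one replica. The helper names and all sample values (names, addresses, images) are hypothetical; only the field names follow the schema.

```python
from typing import Any, Dict, List


def unready_containers(replica: Dict[str, Any]) -> List[str]:
    """Names of containers whose `ready` flag is false."""
    return [c["name"] for c in replica["containers"] if not c["ready"]]


def total_restarts(replica: Dict[str, Any]) -> int:
    """Sum of `restartCount` across all containers in the replica."""
    return sum(c["restartCount"] for c in replica["containers"])


# Hypothetical ReplicaDetail payload.
replica = {
    "name": "replica-0",
    "status": "running",
    "address": "10.0.0.5",
    "nodeAddress": "10.0.1.2",
    "startedAt": "2024-01-01T00:00:00Z",
    "conditions": [],
    "containers": [
        {"name": "server", "status": "running", "startedAt": "2024-01-01T00:00:05Z",
         "ready": True, "restartCount": 0, "image": "example/server:1"},
        {"name": "sidecar", "status": "waiting", "startedAt": None,
         "ready": False, "restartCount": 3, "image": "example/sidecar:1"},
    ],
}

print(unready_containers(replica), total_restarts(replica))
```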

ReplicaPhase

{
  "description": "Lifecycle phase of a deployment replica.",
  "enum": [
    "pending",
    "running",
    "succeeded",
    "failed",
    "unknown"
  ],
  "title": "ReplicaPhase",
  "type": "string"
}

ReplicaPhase

Properties

Name Type Required Restrictions Description
ReplicaPhase string false Lifecycle phase of a deployment replica.

Enumerated Values

Property Value
ReplicaPhase [pending, running, succeeded, failed, unknown]

ReplicaStatusesSnapshot

{
  "additionalProperties": false,
  "properties": {
    "overallStatus": {
      "additionalProperties": false,
      "description": "Overall status as reported by the workload-monitor service.",
      "properties": {
        "lastUpdated": {
          "description": "Rfc3339 timestamp of the last state transition.",
          "title": "Lastupdated",
          "type": "string"
        },
        "state": {
          "enum": [
            "unknown",
            "submitted",
            "initializing",
            "provisioning",
            "launching",
            "running",
            "suspended",
            "warming",
            "draining",
            "interrupted",
            "restarting",
            "stopping",
            "stopped",
            "errored",
            "terminated"
          ],
          "title": "ProtonStatus",
          "type": "string"
        },
        "summary": {
          "description": "Human-readable description of the current state.",
          "title": "Summary",
          "type": "string"
        }
      },
      "required": [
        "state",
        "summary",
        "lastUpdated"
      ],
      "title": "WorkloadMonitorOverallStatus",
      "type": "object"
    },
    "replicas": {
      "items": {
        "additionalProperties": false,
        "properties": {
          "address": {
            "title": "Address",
            "type": "string"
          },
          "conditions": {
            "items": {
              "additionalProperties": false,
              "properties": {
                "lastTransitionTime": {
                  "title": "Lasttransitiontime",
                  "type": "string"
                },
                "message": {
                  "default": "",
                  "title": "Message",
                  "type": "string"
                },
                "reason": {
                  "default": "",
                  "title": "Reason",
                  "type": "string"
                },
                "type": {
                  "title": "Type",
                  "type": "string"
                },
                "value": {
                  "anyOf": [
                    {
                      "type": "boolean"
                    },
                    {
                      "type": "null"
                    }
                  ],
                  "title": "Value"
                }
              },
              "required": [
                "type",
                "value",
                "lastTransitionTime"
              ],
              "title": "ReplicaConditionDetail",
              "type": "object"
            },
            "title": "Conditions",
            "type": "array"
          },
          "containers": {
            "items": {
              "additionalProperties": false,
              "properties": {
                "image": {
                  "title": "Image",
                  "type": "string"
                },
                "name": {
                  "title": "Name",
                  "type": "string"
                },
                "ready": {
                  "title": "Ready",
                  "type": "boolean"
                },
                "restartCount": {
                  "title": "Restartcount",
                  "type": "integer"
                },
                "startedAt": {
                  "anyOf": [
                    {
                      "type": "string"
                    },
                    {
                      "type": "null"
                    }
                  ],
                  "title": "Startedat"
                },
                "status": {
                  "description": "Lifecycle state of a container within a deployment replica.",
                  "enum": [
                    "running",
                    "waiting",
                    "terminated",
                    "unknown"
                  ],
                  "title": "ContainerStatus",
                  "type": "string"
                }
              },
              "required": [
                "name",
                "status",
                "startedAt",
                "ready",
                "restartCount",
                "image"
              ],
              "title": "ContainerStatusDetail",
              "type": "object"
            },
            "title": "Containers",
            "type": "array"
          },
          "name": {
            "title": "Name",
            "type": "string"
          },
          "nodeAddress": {
            "title": "Nodeaddress",
            "type": "string"
          },
          "startedAt": {
            "anyOf": [
              {
                "type": "string"
              },
              {
                "type": "null"
              }
            ],
            "title": "Startedat"
          },
          "status": {
            "description": "Lifecycle phase of a deployment replica.",
            "enum": [
              "pending",
              "running",
              "succeeded",
              "failed",
              "unknown"
            ],
            "title": "ReplicaPhase",
            "type": "string"
          }
        },
        "required": [
          "name",
          "status",
          "address",
          "nodeAddress",
          "startedAt",
          "conditions",
          "containers"
        ],
        "title": "ReplicaDetail",
        "type": "object"
      },
      "title": "Replicas",
      "type": "array"
    }
  },
  "required": [
    "overallStatus",
    "replicas"
  ],
  "title": "ReplicaStatusesSnapshot",
  "type": "object"
}

ReplicaStatusesSnapshot

Properties

Name Type Required Restrictions Description
overallStatus WorkloadMonitorOverallStatus true Overall status as reported by the workload-monitor service.
replicas [ReplicaDetail] true none
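
A snapshot pairs one overall status with a list of replicas, so a common first step is to tally replicas by phase. The sketch below does that; the `looks_healthy` rule (overall state `running` and every replica `running`) is an assumption rather than documented behavior, and the sample replicas are trimmed to the fields the helpers read.

```python
from collections import Counter
from typing import Any, Dict


def count_by_phase(snapshot: Dict[str, Any]) -> Counter:
    """Tally replicas by their `status` (ReplicaPhase)."""
    return Counter(replica["status"] for replica in snapshot["replicas"])


def looks_healthy(snapshot: Dict[str, Any]) -> bool:
    """Assumed health rule: overall state and every replica are `running`."""
    overall_running = snapshot["overallStatus"]["state"] == "running"
    return overall_running and all(r["status"] == "running" for r in snapshot["replicas"])


# Hypothetical snapshot; replicas are trimmed to the fields used here.
snapshot = {
    "overallStatus": {"state": "running", "summary": "2 of 3 replicas running",
                      "lastUpdated": "2024-01-01T00:00:00Z"},
    "replicas": [
        {"name": "replica-0", "status": "running"},
        {"name": "replica-1", "status": "running"},
        {"name": "replica-2", "status": "pending"},
    ],
}

phase_counts = count_by_phase(snapshot)
```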

RequestMetrics

{
  "additionalProperties": false,
  "description": "Detailed request metrics.",
  "properties": {
    "concurrentRequests": {
      "default": 0,
      "description": "Current concurrent requests.",
      "title": "Concurrentrequests",
      "type": "integer"
    },
    "requestsPerMinute": {
      "default": 0,
      "description": "Average requests per minute.",
      "title": "Requestsperminute",
      "type": "integer"
    },
    "responseTime": {
      "default": 0,
      "description": "Average response time in milliseconds.",
      "title": "Responsetime",
      "type": "integer"
    },
    "serverErrorRate": {
      "default": 0,
      "description": "Server error rate.",
      "title": "Servererrorrate",
      "type": "number"
    },
    "serverErrors": {
      "default": 0,
      "description": "Number of server errors (5xx).",
      "title": "Servererrors",
      "type": "integer"
    },
    "slowRequests": {
      "default": 0,
      "description": "Number of slow requests exceeding threshold.",
      "title": "Slowrequests",
      "type": "integer"
    },
    "totalErrorRate": {
      "default": 0,
      "description": "Total error rate.",
      "title": "Totalerrorrate",
      "type": "number"
    },
    "totalRequests": {
      "default": 0,
      "description": "Total number of requests.",
      "title": "Totalrequests",
      "type": "integer"
    },
    "userErrorRate": {
      "default": 0,
      "description": "User error rate.",
      "title": "Usererrorrate",
      "type": "number"
    },
    "userErrors": {
      "default": 0,
      "description": "Number of user errors (4xx).",
      "title": "Usererrors",
      "type": "integer"
    }
  },
  "title": "RequestMetrics",
  "type": "object"
}

RequestMetrics

Properties

Name Type Required Restrictions Description
concurrentRequests integer false Current concurrent requests.
requestsPerMinute integer false Average requests per minute.
responseTime integer false Average response time in milliseconds.
serverErrorRate number false Server error rate.
serverErrors integer false Number of server errors (5xx).
slowRequests integer false Number of slow requests exceeding threshold.
totalErrorRate number false Total error rate.
totalRequests integer false Total number of requests.
userErrorRate number false User error rate.
userErrors integer false Number of user errors (4xx).
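
Every field in RequestMetrics is optional with a default of 0, so a client can fill in missing keys before doing arithmetic. The sketch below does that and derives an error fraction from the raw counts. Whether the server reports `serverErrorRate` and similar fields as fractions or percentages is not stated here, so `error_fraction` is an illustrative calculation, not a reproduction of the server's formula.

```python
from typing import Any, Dict

# Defaults taken from the schema: every field defaults to 0.
REQUEST_METRICS_DEFAULTS: Dict[str, Any] = {
    "concurrentRequests": 0, "requestsPerMinute": 0, "responseTime": 0,
    "serverErrorRate": 0, "serverErrors": 0, "slowRequests": 0,
    "totalErrorRate": 0, "totalRequests": 0, "userErrorRate": 0, "userErrors": 0,
}


def with_defaults(payload: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of the payload with omitted fields set to their defaults."""
    return {**REQUEST_METRICS_DEFAULTS, **payload}


def error_fraction(payload: Dict[str, Any], count_field: str) -> float:
    """Errors divided by total requests, or 0.0 when there were no requests."""
    metrics = with_defaults(payload)
    total = metrics["totalRequests"]
    return metrics[count_field] / total if total else 0.0


# Hypothetical partial payload.
sample = {"totalRequests": 200, "serverErrors": 5, "userErrors": 20}
server_fraction = error_fraction(sample, "serverErrors")
user_fraction = error_fraction(sample, "userErrors")
```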

RequestStats

{
  "additionalProperties": false,
  "description": "Request statistics summary.",
  "properties": {
    "concurrentRequests": {
      "default": 0,
      "description": "Number of concurrent requests.",
      "title": "Concurrentrequests",
      "type": "integer"
    },
    "errorRate": {
      "default": 0,
      "description": "Error rate percentage.",
      "title": "Errorrate",
      "type": "number"
    },
    "errorRates": {
      "description": "Error rates over the last 7 time periods.",
      "items": {
        "type": "integer"
      },
      "title": "Errorrates",
      "type": "array"
    },
    "lastRequestAt": {
      "anyOf": [
        {
          "format": "date-time",
          "type": "string"
        },
        {
          "type": "null"
        }
      ],
      "description": "Timestamp of the last request.",
      "title": "Lastrequestat"
    },
    "requestRates": {
      "description": "Request rates over the last 7 time periods.",
      "items": {
        "type": "integer"
      },
      "title": "Requestrates",
      "type": "array"
    },
    "responseTime": {
      "default": 0,
      "description": "Average response time in milliseconds.",
      "title": "Responsetime",
      "type": "integer"
    },
    "totalRequests": {
      "default": 0,
      "description": "Total number of requests.",
      "title": "Totalrequests",
      "type": "integer"
    }
  },
  "title": "RequestStats",
  "type": "object"
}

RequestStats

Properties

Name Type Required Restrictions Description
concurrentRequests integer false Number of concurrent requests.
errorRate number false Error rate percentage.
errorRates [integer] false Error rates over the last 7 time periods.
lastRequestAt any false Timestamp of the last request.

anyOf

Name Type Required Restrictions Description
» anonymous string(date-time) false none

or

Name Type Required Restrictions Description
» anonymous null false none

continued

Name Type Required Restrictions Description
requestRates [integer] false Request rates over the last 7 time periods.
responseTime integer false Average response time in milliseconds.
totalRequests integer false Total number of requests.

ResolvedBundle

{
  "description": "Bundle details returned in the runtime response after scheduling.",
  "properties": {
    "cpuCount": {
      "description": "Number of cpu cores.",
      "title": "CPU Count",
      "type": "number"
    },
    "gpuCount": {
      "default": 0,
      "description": "Number of gpu units.",
      "title": "GPU Count",
      "type": "integer"
    },
    "gpuMaker": {
      "anyOf": [
        {
          "type": "string"
        },
        {
          "type": "null"
        }
      ],
      "description": "Gpu manufacturer.",
      "title": "GPU Maker"
    },
    "gpuTypeLabel": {
      "anyOf": [
        {
          "type": "string"
        },
        {
          "type": "null"
        }
      ],
      "description": "GPU type label.",
      "title": "GPU Type Label"
    },
    "id": {
      "description": "Bundle identifier that was selected.",
      "title": "Id",
      "type": "string"
    },
    "memoryBytes": {
      "description": "Memory size in bytes.",
      "title": "Memory Bytes",
      "type": "integer"
    }
  },
  "required": [
    "id",
    "cpuCount",
    "memoryBytes"
  ],
  "title": "ResolvedBundle",
  "type": "object"
}

ResolvedBundle

Properties

Name Type Required Restrictions Description
cpuCount number true Number of CPU cores.
gpuCount integer false Number of GPU units.
gpuMaker any false GPU manufacturer.

anyOf

Name Type Required Restrictions Description
» anonymous string false none

or

Name Type Required Restrictions Description
» anonymous null false none

continued

Name Type Required Restrictions Description
gpuTypeLabel any false GPU type label.

anyOf

Name Type Required Restrictions Description
» anonymous string false none

or

Name Type Required Restrictions Description
» anonymous null false none

continued

Name Type Required Restrictions Description
id string true Bundle identifier that was selected.
memoryBytes integer true Memory size in bytes.

ResourceAllocation

{
  "additionalProperties": false,
  "description": "Per-container resource allocation declared at runtime.",
  "properties": {
    "cpu": {
      "anyOf": [
        {
          "minimum": 0.1,
          "type": "number"
        },
        {
          "type": "null"
        }
      ],
      "description": "CPU cores allocated to this container.",
      "title": "Cpu"
    },
    "gpu": {
      "anyOf": [
        {
          "minimum": 0,
          "type": "number"
        },
        {
          "type": "null"
        }
      ],
      "description": "GPUs allocated to this container.",
      "title": "Gpu"
    },
    "memory": {
      "anyOf": [
        {
          "pattern": "^\\s*(\\d*\\.?\\d+)\\s*(\\w+)?",
          "type": "string"
        },
        {
          "minimum": 0,
          "type": "integer"
        },
        {
          "type": "null"
        }
      ],
      "description": "RAM allocated to this container. Accepts a human-readable string with one of: b, kb, mb, gb (1000-based), e.g. '8gb', '512mb'. Also accepts raw byte integers.",
      "examples": [
        "8GB",
        "512MB"
      ],
      "title": "Memory"
    }
  },
  "title": "ResourceAllocation",
  "type": "object"
}

ResourceAllocation

Properties

Name Type Required Restrictions Description
cpu any false CPU cores allocated to this container.

anyOf

Name Type Required Restrictions Description
» anonymous number false minimum: 0.1 none

or

Name Type Required Restrictions Description
» anonymous null false none

continued

Name Type Required Restrictions Description
gpu any false GPUs allocated to this container.

anyOf

Name Type Required Restrictions Description
» anonymous number false minimum: 0 none

or

Name Type Required Restrictions Description
» anonymous null false none

continued

Name Type Required Restrictions Description
memory any false RAM allocated to this container. Accepts a human-readable string with one of: b, kb, mb, gb (1000-based), e.g. '8gb', '512mb'. Also accepts raw byte integers.

anyOf

Name Type Required Restrictions Description
» anonymous string false none

or

Name Type Required Restrictions Description
» anonymous integer false minimum: 0 none

or

Name Type Required Restrictions Description
» anonymous null false none
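
The memory field accepts either a unit string or a raw byte count, so a client may want to normalize both forms before comparing allocations. The following is a minimal sketch, not the server's actual parser: the helper name `memory_to_bytes` is hypothetical, unit matching is assumed to be case-insensitive (the schema examples use "8GB" while the description uses "8gb"), and a bare number without a unit is assumed to mean bytes.

```python
import re

# Pattern copied from the "memory" field of the ResourceAllocation schema.
MEMORY_PATTERN = re.compile(r"^\s*(\d*\.?\d+)\s*(\w+)?")

# 1000-based multipliers, per the field description.
UNIT_MULTIPLIERS = {"b": 1, "kb": 1000, "mb": 1000**2, "gb": 1000**3}


def memory_to_bytes(value):
    """Normalize a ResourceAllocation.memory value to an integer byte count."""
    if value is None:
        return None
    if isinstance(value, int):
        # Raw byte integers are accepted as-is (schema minimum: 0).
        if value < 0:
            raise ValueError("memory must be >= 0")
        return value
    match = MEMORY_PATTERN.match(value)
    if match is None:
        raise ValueError(f"unrecognized memory value: {value!r}")
    number, unit = match.groups()
    # Assumption: a missing unit means bytes; unit case is ignored.
    multiplier = UNIT_MULTIPLIERS.get((unit or "b").lower())
    if multiplier is None:
        raise ValueError(f"unsupported unit: {unit!r}")
    return int(float(number) * multiplier)
```

For example, `memory_to_bytes("512MB")` and `memory_to_bytes(512000000)` yield the same value under these assumptions.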

ResourcePermission

{
  "description": "Represents the particular role a user, group or organization holds on an entity.",
  "enum": [
    "CAN_VIEW",
    "CAN_UPDATE",
    "CAN_DELETE",
    "CAN_SHARE",
    "CAN_MAKE_PREDICTIONS",
    "CAN_SHARE_ROLE_OWNER",
    "CAN_SHARE_ROLE_READ_WRITE",
    "CAN_SHARE_ROLE_READ_ONLY"
  ],
  "title": "ResourcePermission",
  "type": "string"
}

ResourcePermission

Properties

Name Type Required Restrictions Description
ResourcePermission string false Represents the particular role a user, group or organization holds on an entity.

Enumerated Values

Property Value
ResourcePermission [CAN_VIEW, CAN_UPDATE, CAN_DELETE, CAN_SHARE, CAN_MAKE_PREDICTIONS, CAN_SHARE_ROLE_OWNER, CAN_SHARE_ROLE_READ_WRITE, CAN_SHARE_ROLE_READ_ONLY]

ResourcePermissions

{
  "anyOf": [
    {
      "items": {
        "description": "Represents the particular role a user, group or organization holds on an entity.",
        "enum": [
          "CAN_VIEW",
          "CAN_UPDATE",
          "CAN_DELETE",
          "CAN_SHARE",
          "CAN_MAKE_PREDICTIONS",
          "CAN_SHARE_ROLE_OWNER",
          "CAN_SHARE_ROLE_READ_WRITE",
          "CAN_SHARE_ROLE_READ_ONLY"
        ],
        "title": "ResourcePermission",
        "type": "string"
      },
      "type": "array"
    },
    {
      "items": {
        "const": "*",
        "type": "string"
      },
      "type": "array"
    }
  ]
}

Properties

anyOf

Name Type Required Restrictions Description
anonymous [ResourcePermission] false [Represents the particular role a user, group or organization holds on an entity.]

or

Name Type Required Restrictions Description
anonymous [ANY_PERMISSION] false none
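
A ResourcePermissions value is either a list of ResourcePermission names or a list of the wildcard string "*". A minimal sketch of a client-side check follows; the function name `has_permission` is hypothetical, and treating "*" as granting every permission is an assumption based on the ANY_PERMISSION label rather than documented behavior.

```python
# Enumerated values copied from the ResourcePermission schema.
RESOURCE_PERMISSIONS = {
    "CAN_VIEW",
    "CAN_UPDATE",
    "CAN_DELETE",
    "CAN_SHARE",
    "CAN_MAKE_PREDICTIONS",
    "CAN_SHARE_ROLE_OWNER",
    "CAN_SHARE_ROLE_READ_WRITE",
    "CAN_SHARE_ROLE_READ_ONLY",
}


def has_permission(permissions, required):
    """Return True if `required` is granted by a ResourcePermissions list."""
    if required not in RESOURCE_PERMISSIONS:
        raise ValueError(f"unknown permission: {required!r}")
    # Assumption: a list containing "*" grants every permission.
    return "*" in permissions or required in permissions
```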

ResourceTypes

{
  "enum": [
    "artifact",
    "artifact_repository",
    "proton",
    "workload",
    "custom_model"
  ],
  "title": "ResourceTypes",
  "type": "string"
}

ResourceTypes

Properties

Name Type Required Restrictions Description
ResourceTypes string false none

Enumerated Values

Property Value
ResourceTypes [artifact, artifact_repository, proton, workload, custom_model]

ScalingMetricType

{
  "oneOf": [
    {
      "const": "cpuAverageUtilization",
      "description": "Scale replicas to maintain a target average CPU utilization across pods.",
      "title": "CPU Average Utilization"
    },
    {
      "const": "httpRequestsConcurrency",
      "description": "Scale replicas based on HTTP request concurrency using an external HTTP-aware autoscaler. The platform manages the underlying autoscaling resources on your behalf. This scaling option will scale to zero replicas when the proton is idle.",
      "title": "HTTP Requests Concurrency"
    },
    {
      "const": "gpuCacheUtilization",
      "description": "Scales replicas based on model-specific GPU memory cache utilization. This signal reflects how the model's KV cache is used during inference, when such metrics are exposed by the serving runtime. High cache utilization may indicate memory pressure and can be used to trigger scale-out to maintain throughput. Applicable to NIM Artifacts only.",
      "title": "GPU Cache Utilization"
    },
    {
      "const": "gpuRequestQueueDepth",
      "description": "Scales replicas based on the depth of the inference request queue. This metric represents the number of incoming requests waiting to be processed by the inference service. Increasing queue depth may indicate insufficient capacity and can be used to trigger additional replicas to reduce latency. Applicable to NIM Artifacts only.",
      "title": "GPU Request Queue Depth"
    }
  ],
  "title": "ScalingMetricType",
  "type": "string"
}

ScalingMetricType

Properties

Name Type Required Restrictions Description
ScalingMetricType string false none

oneOf

Name Type Required Restrictions Description
anonymous any false Scale replicas to maintain a target average CPU utilization across pods.

xor

Name Type Required Restrictions Description
anonymous any false Scale replicas based on HTTP request concurrency using an external HTTP-aware autoscaler. The platform manages the underlying autoscaling resources on your behalf. This scaling option will scale to zero replicas when the proton is idle.

xor

Name Type Required Restrictions Description
anonymous any false Scales replicas based on model-specific GPU memory cache utilization. This signal reflects how the model's KV cache is used during inference, when such metrics are exposed by the serving runtime. High cache utilization may indicate memory pressure and can be used to trigger scale-out to maintain throughput. Applicable to NIM Artifacts only.

xor

Name Type Required Restrictions Description
anonymous any false Scales replicas based on the depth of the inference request queue. This metric represents the number of incoming requests waiting to be processed by the inference service. Increasing queue depth may indicate insufficient capacity and can be used to trigger additional replicas to reduce latency. Applicable to NIM Artifacts only.
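
Two of the four metrics are described as applicable to NIM Artifacts only, which a client could check before submitting a scaling configuration. This is a sketch under stated assumptions: the helper `check_scaling_metric` and its `artifact_is_nim` flag are hypothetical, and the server may enforce the restriction differently.

```python
# Metric names copied from the ScalingMetricType schema.
SCALING_METRICS = {
    "cpuAverageUtilization",
    "httpRequestsConcurrency",
    "gpuCacheUtilization",
    "gpuRequestQueueDepth",
}

# Metrics whose description says "Applicable to NIM Artifacts only".
NIM_ONLY_METRICS = {"gpuCacheUtilization", "gpuRequestQueueDepth"}


def check_scaling_metric(metric, artifact_is_nim):
    """Return an error message, or None if the metric looks usable.

    `artifact_is_nim` is a hypothetical flag supplied by the caller.
    """
    if metric not in SCALING_METRICS:
        return f"unknown scaling metric: {metric!r}"
    if metric in NIM_ONLY_METRICS and not artifact_is_nim:
        return f"{metric} applies to NIM artifacts only"
    return None
```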

SeccompProfile

{
  "additionalProperties": false,
  "description": "Seccomp profile configuration.",
  "properties": {
    "localhostProfile": {
      "anyOf": [
        {
          "type": "string"
        },
        {
          "type": "null"
        }
      ],
      "description": "Path to a seccomp profile on the node. Only valid when type is Localhost.",
      "title": "Localhostprofile"
    },
    "type": {
      "description": "Allowed seccomp profile types.",
      "enum": [
        "RuntimeDefault",
        "Unconfined",
        "Localhost"
      ],
      "title": "SeccompProfileType",
      "type": "string"
    }
  },
  "required": [
    "type"
  ],
  "title": "SeccompProfile",
  "type": "object"
}

SeccompProfile

Properties

Name Type Required Restrictions Description
localhostProfile any false Path to a seccomp profile on the node. Only valid when type is Localhost.

anyOf

Name Type Required Restrictions Description
» anonymous string false none

or

Name Type Required Restrictions Description
» anonymous null false none

continued

Name Type Required Restrictions Description
type SeccompProfileType true Seccomp profile type.
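
The constraints above (required type, closed set of properties, and localhostProfile only valid for the Localhost type) can be checked client-side before a request is sent. A minimal sketch, assuming the profile is a plain dict; `validate_seccomp_profile` is a hypothetical helper, not part of the API.

```python
# Enumerated values copied from the SeccompProfileType schema.
SECCOMP_TYPES = {"RuntimeDefault", "Unconfined", "Localhost"}

# The schema sets additionalProperties to false, so only these keys are allowed.
ALLOWED_KEYS = {"type", "localhostProfile"}


def validate_seccomp_profile(profile):
    """Return a list of problems found in a SeccompProfile dict (empty if none)."""
    errors = []
    unknown = sorted(set(profile) - ALLOWED_KEYS)
    if unknown:
        errors.append(f"unknown properties: {unknown}")
    profile_type = profile.get("type")
    if profile_type not in SECCOMP_TYPES:
        errors.append(f"invalid or missing type: {profile_type!r}")
    # localhostProfile is documented as valid only when type is Localhost.
    if profile.get("localhostProfile") is not None and profile_type != "Localhost":
        errors.append("localhostProfile is only valid when type is Localhost")
    return errors
```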

SeccompProfileType

{
  "description": "Allowed seccomp profile types.",
  "enum": [
    "RuntimeDefault",
    "Unconfined",
    "Localhost"
  ],
  "title": "SeccompProfileType",
  "type": "string"
}

SeccompProfileType

Properties

Name Type Required Restrictions Description
SeccompProfileType string false Allowed seccomp profile types.

Enumerated Values

Property Value
SeccompProfileType [RuntimeDefault, Unconfined, Localhost]

SecurityContext

{
  "additionalProperties": false,
  "description": "Container-level security context. Lets workload creators tighten security constraints beyond the platform defaults. runAsNonRoot and runAsUser are enforced by the platform and are not user-settable. Elevated fields (capabilities.add, allowPrivilegeEscalation=true, seccompProfile.type=Unconfined) require the MLOps admin role; regular users may only tighten defaults: drop capabilities, enable a read-only root filesystem, or set a RuntimeDefault/Localhost seccomp profile.",
  "properties": {
    "allowPrivilegeEscalation": {
      "anyOf": [
        {
          "type": "boolean"
        },
        {
          "type": "null"
        }
      ],
      "description": "Whether a process can gain more privileges than its parent. Requires the MLOps admin role to set to true.",
      "title": "Allowprivilegeescalation"
    },
    "capabilities": {
      "anyOf": [
        {
          "additionalProperties": false,
          "description": "Linux capabilities to add or drop from the container.",
          "properties": {
            "add": {
              "anyOf": [
                {
                  "items": {
                    "type": "string"
                  },
                  "type": "array"
                },
                {
                  "type": "null"
                }
              ],
              "description": "Capabilities to add.",
              "title": "Add"
            },
            "drop": {
              "anyOf": [
                {
                  "items": {
                    "type": "string"
                  },
                  "type": "array"
                },
                {
                  "type": "null"
                }
              ],
              "description": "Capabilities to drop.",
              "title": "Drop"
            }
          },
          "title": "Capabilities",
          "type": "object"
        },
        {
          "type": "null"
        }
      ],
      "description": "Linux capabilities to add or drop."
    },
    "readOnlyRootFilesystem": {
      "anyOf": [
        {
          "type": "boolean"
        },
        {
          "type": "null"
        }
      ],
      "description": "Whether the root filesystem is read-only.",
      "title": "Readonlyrootfilesystem"
    },
    "seccompProfile": {
      "anyOf": [
        {
          "additionalProperties": false,
          "description": "Seccomp profile configuration.",
          "properties": {
            "localhostProfile": {
              "anyOf": [
                {
                  "type": "string"
                },
                {
                  "type": "null"
                }
              ],
              "description": "Path to a seccomp profile on the node. Only valid when type is Localhost.",
              "title": "Localhostprofile"
            },
            "type": {
              "description": "Allowed seccomp profile types.",
              "enum": [
                "RuntimeDefault",
                "Unconfined",
                "Localhost"
              ],
              "title": "SeccompProfileType",
              "type": "string"
            }
          },
          "required": [
            "type"
          ],
          "title": "SeccompProfile",
          "type": "object"
        },
        {
          "type": "null"
        }
      ],
      "description": "Seccomp profile for the container."
    }
  },
  "title": "SecurityContext",
  "type": "object"
}

SecurityContext

Properties

Name Type Required Restrictions Description
allowPrivilegeEscalation any false Whether a process can gain more privileges than its parent. Requires the MLOps admin role to set to true.

anyOf

Name Type Required Restrictions Description
» anonymous boolean false none

or

Name Type Required Restrictions Description
» anonymous null false none

continued

Name Type Required Restrictions Description
capabilities any false Linux capabilities to add or drop.

anyOf

Name Type Required Restrictions Description
» anonymous Capabilities false Linux capabilities to add or drop from the container.

or

Name Type Required Restrictions Description
» anonymous null false none

continued

Name Type Required Restrictions Description
readOnlyRootFilesystem any false Whether the root filesystem is read-only.

anyOf

Name Type Required Restrictions Description
» anonymous boolean false none

or

Name Type Required Restrictions Description
» anonymous null false none

continued

Name Type Required Restrictions Description
seccompProfile any false Seccomp profile for the container.

anyOf

Name Type Required Restrictions Description
» anonymous SeccompProfile false Seccomp profile configuration.

or

Name Type Required Restrictions Description
» anonymous null false none
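
The SecurityContext description names three elevated settings that require the admin role: capabilities.add, allowPrivilegeEscalation=true, and an Unconfined seccomp profile. A client could flag these before submission, as in the sketch below. The helper `elevated_fields` is hypothetical, and treating an empty `add` list as non-elevated is an assumption.

```python
def elevated_fields(security_context):
    """List the settings in a SecurityContext dict that are described as elevated."""
    found = []
    # Nullable sub-objects are treated as empty.
    capabilities = security_context.get("capabilities") or {}
    if capabilities.get("add"):
        found.append("capabilities.add")
    if security_context.get("allowPrivilegeEscalation") is True:
        found.append("allowPrivilegeEscalation=true")
    seccomp = security_context.get("seccompProfile") or {}
    if seccomp.get("type") == "Unconfined":
        found.append("seccompProfile.type=Unconfined")
    return found
```

A context that only tightens defaults, such as dropping capabilities or enabling a read-only root filesystem, produces an empty list.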

ServiceArtifactSpec

{
  "additionalProperties": false,
  "properties": {
    "containerGroups": {
      "default": [],
      "description": "List of container groups.",
      "items": {
        "additionalProperties": false,
        "properties": {
          "containers": {
            "default": [],
            "description": "List of containers making this container group.",
            "items": {
              "additionalProperties": false,
              "properties": {
                "build": {
                  "anyOf": [
                    {
                      "additionalProperties": false,
                      "description": "Build reference embedded in a container spec when an image build is triggered.",
                      "properties": {
                        "artifactImageBuildId": {
                          "description": "Artifact image build ID.",
                          "title": "Artifactimagebuildid",
                          "type": "string"
                        },
                        "createdAt": {
                          "description": "Build creation timestamp (UTC).",
                          "format": "date-time",
                          "title": "Createdat",
                          "type": "string"
                        },
                        "status": {
                          "description": "Image build reported status at submit time.",
                          "title": "Status",
                          "type": "string"
                        }
                      },
                      "required": [
                        "artifactImageBuildId",
                        "status",
                        "createdAt"
                      ],
                      "title": "ContainerBuildInfo",
                      "type": "object"
                    },
                    {
                      "type": "null"
                    }
                  ],
                  "description": "Server-set image build metadata (e.g. after lock or draft build trigger). The workload API clears this on artifact create/update before persistence; clients must not rely on sending it."
                },
                "description": {
                  "default": "",
                  "description": "Description of the container.",
                  "title": "Description",
                  "type": "string"
                },
                "entrypoint": {
                  "anyOf": [
                    {
                      "items": {
                        "type": "string"
                      },
                      "type": "array"
                    },
                    {
                      "type": "null"
                    }
                  ],
                  "description": "Runtime entrypoint override for the container command. Independent of the build entrypoint.",
                  "title": "Entrypoint"
                },
                "environmentVars": {
                  "default": [],
                  "description": "Environment variables.",
                  "items": {
                    "anyOf": [
                      {
                        "properties": {
                          "name": {
                            "description": "Name of the environment variable.",
                            "title": "Name",
                            "type": "string"
                          },
                          "source": {
                            "const": "string",
                            "default": "string",
                            "title": "Source",
                            "type": "string"
                          },
                          "value": {
                            "description": "Value of the environment variable.",
                            "title": "Value",
                            "type": "string"
                          }
                        },
                        "required": [
                          "name",
                          "value"
                        ],
                        "title": "StringEnvironmentVariable",
                        "type": "object"
                      },
                      {
                        "properties": {
                          "drCredentialId": {
                            "description": "ID of the DataRobot credential to use.",
                            "title": "DR Credential ID",
                            "type": "string"
                          },
                          "key": {
                            "description": "Key within the credential.",
                            "title": "Key",
                            "type": "string"
                          },
                          "name": {
                            "description": "Name of the environment variable.",
                            "title": "Name",
                            "type": "string"
                          },
                          "source": {
                            "const": "dr-credential",
                            "title": "Source",
                            "type": "string"
                          }
                        },
                        "required": [
                          "source",
                          "name",
                          "drCredentialId",
                          "key"
                        ],
                        "title": "CredentialEnvironmentVariable",
                        "type": "object"
                      },
                      {
                        "description": "A platform-managed DataRobot API token injected as an environment variable. The token value is resolved at proton creation (find-or-create a per-workload ``workload <workloadid>`` API key scoped to the invoking user); no value or ID is supplied by the user.",
                        "properties": {
                          "name": {
                            "description": "Name of the environment variable.",
                            "title": "Name",
                            "type": "string"
                          },
                          "source": {
                            "const": "dr-api-token",
                            "title": "Source",
                            "type": "string"
                          }
                        },
                        "required": [
                          "source",
                          "name"
                        ],
                        "title": "DrApiTokenEnvironmentVariable",
                        "type": "object"
                      }
                    ]
                  },
                  "title": "Environmentvars",
                  "type": "array"
                },
                "imageBuildConfig": {
                  "anyOf": [
                    {
                      "additionalProperties": false,
                      "description": "User-provided configuration for server-side image builds from source code.",
                      "properties": {
                        "codeRef": {
                          "anyOf": [
                            {
                              "additionalProperties": false,
                              "properties": {
                                "datarobot": {
                                  "additionalProperties": false,
                                  "properties": {
                                    "catalogId": {
                                      "title": "Catalogid",
                                      "type": "string"
                                    },
                                    "catalogVersionId": {
                                      "title": "Catalogversionid",
                                      "type": "string"
                                    }
                                  },
                                  "required": [
                                    "catalogId",
                                    "catalogVersionId"
                                  ],
                                  "title": "DataRobotCodeRef",
                                  "type": "object"
                                },
                                "provider": {
                                  "const": "datarobot",
                                  "default": "datarobot",
                                  "title": "Provider",
                                  "type": "string"
                                },
                                "type": {
                                  "const": "datarobot",
                                  "default": "datarobot",
                                  "title": "Type",
                                  "type": "string"
                                }
                              },
                              "required": [
                                "datarobot"
                              ],
                              "title": "CodeRef",
                              "type": "object"
                            },
                            {
                              "type": "null"
                            }
                          ],
                          "description": "Reference to source code (e.g. files API catalog). Optional at create time; required before build or lock."
                        },
                        "dockerfile": {
                          "description": "How the Dockerfile is obtained. Defaults to using ./Dockerfile from the source code.",
                          "discriminator": {
                            "mapping": {
                              "generated": "#/components/schemas/GeneratedDockerfile",
                              "provided": "#/components/schemas/ProvidedDockerfile"
                            },
                            "propertyName": "source"
                          },
                          "oneOf": [
                            {
                              "additionalProperties": false,
                              "description": "User supplies a Dockerfile in the uploaded source code.",
                              "properties": {
                                "path": {
                                  "default": "./Dockerfile",
                                  "description": "Relative path to the Dockerfile in the source code. Defaults to ./Dockerfile.",
                                  "title": "Path",
                                  "type": "string"
                                },
                                "source": {
                                  "const": "provided",
                                  "default": "provided",
                                  "title": "Source",
                                  "type": "string"
                                }
                              },
                              "title": "ProvidedDockerfile",
                              "type": "object"
                            },
                            {
                              "additionalProperties": false,
                              "description": "System generates a Dockerfile from execution environment metadata.",
                              "properties": {
                                "entrypoint": {
                                  "description": "Entrypoint baked into the generated Dockerfile CMD (e.g. [\"python\", \"app.py\"]).",
                                  "items": {
                                    "type": "string"
                                  },
                                  "minItems": 1,
                                  "title": "Entrypoint",
                                  "type": "array"
                                },
                                "executionEnvironmentId": {
                                  "description": "Execution environment ID used to resolve the base Docker image.",
                                  "title": "Execution Environment ID",
                                  "type": "string"
                                },
                                "executionEnvironmentVersionId": {
                                  "description": "Execution environment version ID that pins the exact base image tag.",
                                  "title": "Execution Environment Version ID",
                                  "type": "string"
                                },
                                "source": {
                                  "const": "generated",
                                  "default": "generated",
                                  "title": "Source",
                                  "type": "string"
                                }
                              },
                              "required": [
                                "executionEnvironmentId",
                                "executionEnvironmentVersionId",
                                "entrypoint"
                              ],
                              "title": "GeneratedDockerfile",
                              "type": "object"
                            }
                          ],
                          "title": "Dockerfile"
                        }
                      },
                      "title": "ImageBuildConfig",
                      "type": "object"
                    },
                    {
                      "type": "null"
                    }
                  ],
                  "description": "Configuration for server-side image builds from source code."
                },
                "imageUri": {
                  "anyOf": [
                    {
                      "type": "string"
                    },
                    {
                      "type": "null"
                    }
                  ],
                  "description": "Docker image URI. Required when imageBuildConfig is not set; server-populated after a successful image build.",
                  "title": "Imageuri"
                },
                "livenessProbe": {
                  "anyOf": [
                    {
                      "additionalProperties": false,
                      "properties": {
                        "failureThreshold": {
                          "default": 3,
                          "description": "Minimum consecutive failures for the probe to be considered failed.",
                          "title": "Failurethreshold",
                          "type": "integer"
                        },
                        "host": {
                          "anyOf": [
                            {
                              "minLength": 0,
                              "type": "string"
                            },
                            {
                              "type": "null"
                            }
                          ],
                          "description": "Host name to connect to; defaults to the pod IP.",
                          "title": "Host"
                        },
                        "httpHeaders": {
                          "additionalProperties": {
                            "type": "string"
                          },
                          "description": "HTTP headers for probe.",
                          "title": "Httpheaders",
                          "type": "object"
                        },
                        "initialDelaySeconds": {
                          "default": 30,
                          "description": "Number of seconds to wait before the first probe is executed.",
                          "title": "Initialdelayseconds",
                          "type": "integer"
                        },
                        "path": {
                          "description": "URL path to query for health check.",
                          "title": "Path",
                          "type": "string"
                        },
                        "periodSeconds": {
                          "default": 30,
                          "description": "How often (in seconds) to perform the probe.",
                          "title": "Periodseconds",
                          "type": "integer"
                        },
                        "port": {
                          "default": 8080,
                          "description": "Port number to access on the container.",
                          "maximum": 65535,
                          "minimum": 1,
                          "title": "Port",
                          "type": "integer"
                        },
                        "scheme": {
                          "default": "HTTP",
                          "description": "Scheme to use for connecting to the host.",
                          "enum": [
                            "HTTP",
                            "HTTPS"
                          ],
                          "title": "Scheme",
                          "type": "string"
                        },
                        "timeoutSeconds": {
                          "default": 30,
                          "description": "Number of seconds after which the probe times out.",
                          "title": "Timeoutseconds",
                          "type": "integer"
                        }
                      },
                      "required": [
                        "path"
                      ],
                      "title": "ProbeConfig",
                      "type": "object"
                    },
                    {
                      "type": "null"
                    }
                  ],
                  "description": "Container liveness check configuration."
                },
                "name": {
                  "anyOf": [
                    {
                      "type": "string"
                    },
                    {
                      "type": "null"
                    }
                  ],
                  "description": "Name of the container. lowercase letters, digits, and hyphens only; must start with a lowercase letter and end with a letter or digit; max 63 characters.",
                  "title": "Name"
                },
                "port": {
                  "anyOf": [
                    {
                      "maximum": 65535,
                      "minimum": 1024,
                      "type": "integer"
                    },
                    {
                      "type": "null"
                    }
                  ],
                  "description": "Container access port. when set, must be >= 1024 for security and platform compatibility reasons. primary containers must define a port; non-primary containers must omit it.",
                  "title": "Port"
                },
                "primary": {
                  "anyOf": [
                    {
                      "type": "boolean"
                    },
                    {
                      "type": "null"
                    }
                  ],
                  "default": false,
                  "description": "Whether this is the primary container.",
                  "title": "Primary"
                },
                "readinessProbe": {
                  "anyOf": [
                    {
                      "additionalProperties": false,
                      "properties": {
                        "failureThreshold": {
                          "default": 3,
                          "description": "Minimum consecutive failures for the probe to be considered failed.",
                          "title": "Failurethreshold",
                          "type": "integer"
                        },
                        "host": {
                          "anyOf": [
                            {
                              "minLength": 0,
                              "type": "string"
                            },
                            {
                              "type": "null"
                            }
                          ],
                          "description": "Host name to connect to, defaults to the pod ip.",
                          "title": "Host"
                        },
                        "httpHeaders": {
                          "additionalProperties": {
                            "type": "string"
                          },
                          "description": "HTTP headers for probe.",
                          "title": "Httpheaders",
                          "type": "object"
                        },
                        "initialDelaySeconds": {
                          "default": 30,
                          "description": "Number of seconds to wait before the first probe is executed.",
                          "title": "Initialdelayseconds",
                          "type": "integer"
                        },
                        "path": {
                          "description": "Url path to query for health check.",
                          "title": "Path",
                          "type": "string"
                        },
                        "periodSeconds": {
                          "default": 30,
                          "description": "How often (in seconds) to perform the probe.",
                          "title": "Periodseconds",
                          "type": "integer"
                        },
                        "port": {
                          "default": 8080,
                          "description": "Port number to access on the container.",
                          "maximum": 65535,
                          "minimum": 1,
                          "title": "Port",
                          "type": "integer"
                        },
                        "scheme": {
                          "default": "HTTP",
                          "description": "Scheme to use for connecting to the host.",
                          "enum": [
                            "HTTP",
                            "HTTPS"
                          ],
                          "title": "Scheme",
                          "type": "string"
                        },
                        "timeoutSeconds": {
                          "default": 30,
                          "description": "Number of seconds after which the probe times out.",
                          "title": "Timeoutseconds",
                          "type": "integer"
                        }
                      },
                      "required": [
                        "path"
                      ],
                      "title": "ProbeConfig",
                      "type": "object"
                    },
                    {
                      "type": "null"
                    }
                  ],
                  "description": "Container readiness check configuration."
                },
                "securityContext": {
                  "anyOf": [
                    {
                      "additionalProperties": false,
                      "description": "Container-level security context. lets workload creators tighten security constraints beyond the platform defaults. runasnonroot and runasuser are enforced by the platform and are not user-settable. elevated fields (capabilities.add, allowprivilegeescalation=true, seccompprofile.type=unconfined) require the mlops admin role; regular users may only tighten defaults — drop capabilities, enable read-only rootfs, or set a runtimedefault/localhost seccomp profile.",
                      "properties": {
                        "allowPrivilegeEscalation": {
                          "anyOf": [
                            {
                              "type": "boolean"
                            },
                            {
                              "type": "null"
                            }
                          ],
                          "description": "Whether a process can gain more privileges than its parent. requires the mlops admin role to set to true.",
                          "title": "Allowprivilegeescalation"
                        },
                        "capabilities": {
                          "anyOf": [
                            {
                              "additionalProperties": false,
                              "description": "Linux capabilities to add or drop from the container.",
                              "properties": {
                                "add": {
                                  "anyOf": [
                                    {
                                      "items": {
                                        "type": "string"
                                      },
                                      "type": "array"
                                    },
                                    {
                                      "type": "null"
                                    }
                                  ],
                                  "description": "Capabilities to add.",
                                  "title": "Add"
                                },
                                "drop": {
                                  "anyOf": [
                                    {
                                      "items": {
                                        "type": "string"
                                      },
                                      "type": "array"
                                    },
                                    {
                                      "type": "null"
                                    }
                                  ],
                                  "description": "Capabilities to drop.",
                                  "title": "Drop"
                                }
                              },
                              "title": "Capabilities",
                              "type": "object"
                            },
                            {
                              "type": "null"
                            }
                          ],
                          "description": "Linux capabilities to add or drop."
                        },
                        "readOnlyRootFilesystem": {
                          "anyOf": [
                            {
                              "type": "boolean"
                            },
                            {
                              "type": "null"
                            }
                          ],
                          "description": "Whether the root filesystem is read-only.",
                          "title": "Readonlyrootfilesystem"
                        },
                        "seccompProfile": {
                          "anyOf": [
                            {
                              "additionalProperties": false,
                              "description": "Seccomp profile configuration.",
                              "properties": {
                                "localhostProfile": {
                                  "anyOf": [
                                    {
                                      "type": "string"
                                    },
                                    {
                                      "type": "null"
                                    }
                                  ],
                                  "description": "Path to a seccomp profile on the node. only valid when type is localhost.",
                                  "title": "Localhostprofile"
                                },
                                "type": {
                                  "description": "Allowed seccomp profile types.",
                                  "enum": [
                                    "RuntimeDefault",
                                    "Unconfined",
                                    "Localhost"
                                  ],
                                  "title": "SeccompProfileType",
                                  "type": "string"
                                }
                              },
                              "required": [
                                "type"
                              ],
                              "title": "SeccompProfile",
                              "type": "object"
                            },
                            {
                              "type": "null"
                            }
                          ],
                          "description": "Seccomp profile for the container."
                        }
                      },
                      "title": "SecurityContext",
                      "type": "object"
                    },
                    {
                      "type": "null"
                    }
                  ],
                  "description": "Container security context."
                },
                "startupProbe": {
                  "anyOf": [
                    {
                      "additionalProperties": false,
                      "properties": {
                        "failureThreshold": {
                          "default": 3,
                          "description": "Minimum consecutive failures for the probe to be considered failed.",
                          "title": "Failurethreshold",
                          "type": "integer"
                        },
                        "host": {
                          "anyOf": [
                            {
                              "minLength": 0,
                              "type": "string"
                            },
                            {
                              "type": "null"
                            }
                          ],
                          "description": "Host name to connect to, defaults to the pod ip.",
                          "title": "Host"
                        },
                        "httpHeaders": {
                          "additionalProperties": {
                            "type": "string"
                          },
                          "description": "HTTP headers for probe.",
                          "title": "Httpheaders",
                          "type": "object"
                        },
                        "initialDelaySeconds": {
                          "default": 30,
                          "description": "Number of seconds to wait before the first probe is executed.",
                          "title": "Initialdelayseconds",
                          "type": "integer"
                        },
                        "path": {
                          "description": "Url path to query for health check.",
                          "title": "Path",
                          "type": "string"
                        },
                        "periodSeconds": {
                          "default": 30,
                          "description": "How often (in seconds) to perform the probe.",
                          "title": "Periodseconds",
                          "type": "integer"
                        },
                        "port": {
                          "default": 8080,
                          "description": "Port number to access on the container.",
                          "maximum": 65535,
                          "minimum": 1,
                          "title": "Port",
                          "type": "integer"
                        },
                        "scheme": {
                          "default": "HTTP",
                          "description": "Scheme to use for connecting to the host.",
                          "enum": [
                            "HTTP",
                            "HTTPS"
                          ],
                          "title": "Scheme",
                          "type": "string"
                        },
                        "timeoutSeconds": {
                          "default": 30,
                          "description": "Number of seconds after which the probe times out.",
                          "title": "Timeoutseconds",
                          "type": "integer"
                        }
                      },
                      "required": [
                        "path"
                      ],
                      "title": "ProbeConfig",
                      "type": "object"
                    },
                    {
                      "type": "null"
                    }
                  ],
                  "description": "Container startup check configuration."
                }
              },
              "title": "Container",
              "type": "object"
            },
            "title": "Containers",
            "type": "array"
          },
          "name": {
            "default": "default",
            "description": "Name of the container group. used as the lookup key for runtime overrides. lowercase letters, digits, and hyphens only; must start with a lowercase letter and end with a letter or digit; max 63 characters.",
            "title": "Name",
            "type": "string"
          }
        },
        "title": "ContainerGroup",
        "type": "object"
      },
      "title": "Containergroups",
      "type": "array"
    },
    "type": {
      "const": "service",
      "default": "service",
      "description": "Artifact type discriminator. injected automatically from the top-level `type` field — do not set this directly.",
      "title": "Type",
      "type": "string"
    }
  },
  "title": "ServiceArtifactSpec",
  "type": "object"
}

ServiceArtifactSpec

Properties

Name Type Required Restrictions Description
containerGroups [ContainerGroup] false List of container groups.
type string false Artifact type discriminator. Injected automatically from the top-level type field; do not set this directly.
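
The container constraints described in the schema above (name format, port range, the primary/non-primary port rule, and the required probe `path`) can be checked client-side before a request is sent. The following is a minimal sketch, not the platform's own validation: the function name `validate_container` is hypothetical, the regular expression is derived from the wording of the `name` description, and the `livenessProbe` key is assumed by analogy with `readinessProbe` and `startupProbe`.

```python
import re

# Derived from the "name" description: lowercase letters, digits, and hyphens;
# starts with a lowercase letter, ends with a letter or digit, max 63 characters.
NAME_RE = re.compile(r"[a-z](?:[a-z0-9-]{0,61}[a-z0-9])?")

# "livenessProbe" is an assumed key name; the other two appear in the schema.
PROBE_KEYS = ("livenessProbe", "readinessProbe", "startupProbe")


def validate_container(container: dict) -> list[str]:
    """Return a list of problems found; an empty list means none were found."""
    problems = []

    name = container.get("name")
    if name is not None and not NAME_RE.fullmatch(name):
        problems.append(f"invalid name: {name!r}")

    port = container.get("port")
    primary = bool(container.get("primary", False))
    if primary and port is None:
        problems.append("primary container must define a port")
    if not primary and port is not None:
        problems.append("non-primary container must omit port")
    if port is not None and not (1024 <= port <= 65535):
        problems.append(f"port out of range: {port}")

    for key in PROBE_KEYS:
        probe = container.get(key)
        if probe is None:
            continue
        if "path" not in probe:
            problems.append(f"{key} requires 'path'")
        probe_port = probe.get("port", 8080)  # schema default is 8080
        if not (1 <= probe_port <= 65535):
            problems.append(f"{key} port out of range: {probe_port}")

    return problems
```

A server-side check remains authoritative; a helper like this only surfaces obvious mistakes earlier.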

SharedRole

{
  "additionalProperties": false,
  "description": "Represents a recipient (user, group, or organization) with access to an entity. this model is used for listing and managing access control on shared resources, providing information about who has access and what role they have.",
  "properties": {
    "id": {
      "description": "The identifier of the recipient.",
      "title": "Id",
      "type": "string"
    },
    "name": {
      "description": "The name of the recipient.",
      "title": "Name",
      "type": "string"
    },
    "role": {
      "description": "External sharing roles representing the permission level a user, group or organization holds on an entity. these roles map to internal permissions and are used in sharing apis.",
      "enum": [
        "NO_ROLE",
        "OWNER",
        "READ_WRITE",
        "EDITOR",
        "USER",
        "DATA_SCIENTIST",
        "ADMIN",
        "READ_ONLY",
        "CONSUMER",
        "OBSERVER"
      ],
      "title": "SharingRole",
      "type": "string"
    },
    "shareRecipientType": {
      "description": "Enum of possible subject types.",
      "enum": [
        "user",
        "group",
        "organization",
        "role"
      ],
      "title": "SubjectType",
      "type": "string"
    }
  },
  "required": [
    "id",
    "name",
    "shareRecipientType",
    "role"
  ],
  "title": "SharedRole",
  "type": "object"
}

SharedRole

Properties

Name Type Required Restrictions Description
id string true The identifier of the recipient.
name string true The name of the recipient.
role SharingRole true The role of the recipient on this entity.
shareRecipientType SubjectType true The type of the recipient.

SharedRoleListResponse

{
  "additionalProperties": false,
  "properties": {
    "count": {
      "description": "The number of records on this page.",
      "title": "Count",
      "type": "integer"
    },
    "data": {
      "description": "The list of records.",
      "items": {
        "additionalProperties": false,
        "description": "Represents a recipient (user, group, or organization) with access to an entity. this model is used for listing and managing access control on shared resources, providing information about who has access and what role they have.",
        "properties": {
          "id": {
            "description": "The identifier of the recipient.",
            "title": "Id",
            "type": "string"
          },
          "name": {
            "description": "The name of the recipient.",
            "title": "Name",
            "type": "string"
          },
          "role": {
            "description": "External sharing roles representing the permission level a user, group or organization holds on an entity. these roles map to internal permissions and are used in sharing apis.",
            "enum": [
              "NO_ROLE",
              "OWNER",
              "READ_WRITE",
              "EDITOR",
              "USER",
              "DATA_SCIENTIST",
              "ADMIN",
              "READ_ONLY",
              "CONSUMER",
              "OBSERVER"
            ],
            "title": "SharingRole",
            "type": "string"
          },
          "shareRecipientType": {
            "description": "Enum of possible subject types.",
            "enum": [
              "user",
              "group",
              "organization",
              "role"
            ],
            "title": "SubjectType",
            "type": "string"
          }
        },
        "required": [
          "id",
          "name",
          "shareRecipientType",
          "role"
        ],
        "title": "SharedRole",
        "type": "object"
      },
      "title": "Data",
      "type": "array"
    },
    "next": {
      "anyOf": [
        {
          "type": "string"
        },
        {
          "type": "null"
        }
      ],
      "description": "The url to the next page, or `null` if there is no such page.",
      "title": "Next"
    },
    "previous": {
      "anyOf": [
        {
          "type": "string"
        },
        {
          "type": "null"
        }
      ],
      "description": "The url to the previous page, or `null` if there is no such page.",
      "title": "Previous"
    },
    "totalCount": {
      "description": "The total number of records.",
      "title": "Totalcount",
      "type": "integer"
    }
  },
  "required": [
    "totalCount",
    "count",
    "next",
    "previous",
    "data"
  ],
  "title": "SharedRoleListResponse",
  "type": "object"
}

SharedRoleListResponse

Properties

Name Type Required Restrictions Description
count integer true The number of records on this page.
data [SharedRole] true The list of records.
next any true The URL to the next page, or null if there is no such page.

anyOf

Name Type Required Restrictions Description
» anonymous string false none

or

Name Type Required Restrictions Description
» anonymous null false none

continued

Name Type Required Restrictions Description
previous any true The URL to the previous page, or null if there is no such page.

anyOf

Name Type Required Restrictions Description
» anonymous string false none

or

Name Type Required Restrictions Description
» anonymous null false none

continued

Name Type Required Restrictions Description
totalCount integer true The total number of records.
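
Because `next` holds the URL of the following page and is null on the last page, a client can collect all records by following it until it is null. The sketch below illustrates that loop under stated assumptions: `iter_shared_roles` and the `fetch_page` callable are hypothetical names, and `fetch_page` stands in for whatever authenticated HTTP GET the client uses to retrieve and decode one page.

```python
from typing import Callable, Iterator, Optional


def iter_shared_roles(
    first_url: str,
    fetch_page: Callable[[str], dict],
) -> Iterator[dict]:
    """Yield every record in `data`, following `next` until it is null."""
    url: Optional[str] = first_url
    while url is not None:
        page = fetch_page(url)   # decoded SharedRoleListResponse body
        yield from page["data"]  # SharedRole records on this page
        url = page["next"]       # None (JSON null) on the last page
```

Passing the fetch function in keeps the pagination logic independent of any particular HTTP library and makes it easy to exercise with canned pages.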

SharedRolesUpdateRequest

{
  "additionalProperties": false,
  "description": "Request model for updating shared roles on an entity. used to grant access, remove access, or update roles for organizations, groups, or users. up to 100 roles may be set in a single request.",
  "properties": {
    "operation": {
      "const": "updateRoles",
      "description": "Name of the action being taken. the only operation is 'updateroles'.",
      "title": "Operation",
      "type": "string"
    },
    "roles": {
      "description": "Array of grantaccesscontrol objects, up to maximum 100 objects.",
      "items": {
        "anyOf": [
          {
            "additionalProperties": false,
            "description": "Grant access control request using username for user identification.",
            "properties": {
              "role": {
                "description": "External sharing roles representing the permission level a user, group or organization holds on an entity. these roles map to internal permissions and are used in sharing apis.",
                "enum": [
                  "NO_ROLE",
                  "OWNER",
                  "READ_WRITE",
                  "EDITOR",
                  "USER",
                  "DATA_SCIENTIST",
                  "ADMIN",
                  "READ_ONLY",
                  "CONSUMER",
                  "OBSERVER"
                ],
                "title": "SharingRole",
                "type": "string"
              },
              "shareRecipientType": {
                "description": "Enum of possible subject types.",
                "enum": [
                  "user",
                  "group",
                  "organization",
                  "role"
                ],
                "title": "SubjectType",
                "type": "string"
              },
              "username": {
                "description": "Username of the user to update the access role for.",
                "title": "Username",
                "type": "string"
              }
            },
            "required": [
              "shareRecipientType",
              "role",
              "username"
            ],
            "title": "GrantAccessControlWithUsername",
            "type": "object"
          },
          {
            "additionalProperties": false,
            "description": "Grant access control request using id for recipient identification. can be used for users, groups, or organizations.",
            "properties": {
              "id": {
                "description": "The id of the recipient.",
                "title": "Id",
                "type": "string"
              },
              "role": {
                "description": "External sharing roles representing the permission level a user, group or organization holds on an entity. these roles map to internal permissions and are used in sharing apis.",
                "enum": [
                  "NO_ROLE",
                  "OWNER",
                  "READ_WRITE",
                  "EDITOR",
                  "USER",
                  "DATA_SCIENTIST",
                  "ADMIN",
                  "READ_ONLY",
                  "CONSUMER",
                  "OBSERVER"
                ],
                "title": "SharingRole",
                "type": "string"
              },
              "shareRecipientType": {
                "description": "Enum of possible subject types.",
                "enum": [
                  "user",
                  "group",
                  "organization",
                  "role"
                ],
                "title": "SubjectType",
                "type": "string"
              }
            },
            "required": [
              "shareRecipientType",
              "role",
              "id"
            ],
            "title": "GrantAccessControlWithId",
            "type": "object"
          }
        ]
      },
      "maxItems": 100,
      "minItems": 1,
      "title": "Roles",
      "type": "array"
    }
  },
  "required": [
    "operation",
    "roles"
  ],
  "title": "SharedRolesUpdateRequest",
  "type": "object"
}

SharedRolesUpdateRequest

Properties

Name Type Required Restrictions Description
operation string true Name of the action being taken. The only operation is 'updateRoles'.
roles [anyOf] true maxItems: 100, minItems: 1 Array of GrantAccessControl objects, up to a maximum of 100 objects.

anyOf

Name Type Required Restrictions Description
» anonymous GrantAccessControlWithUsername false Grant access control request using username for user identification.

or

Name Type Required Restrictions Description
» anonymous GrantAccessControlWithId false Grant access control request using ID for recipient identification. Can be used for users, groups, or organizations.
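
A request body must carry between 1 and 100 role entries, and each entry identifies its recipient by either `username` or `id`, never both, since both variants set `additionalProperties` to false. The sketch below builds payloads that respect those limits; `build_update_requests` and `check_entry` are hypothetical helpers, not part of the API. Splitting a long list into several requests is an assumption: the schema does not say whether consecutive requests combine the same way a single request would.

```python
SHARING_ROLES = {
    "NO_ROLE", "OWNER", "READ_WRITE", "EDITOR", "USER",
    "DATA_SCIENTIST", "ADMIN", "READ_ONLY", "CONSUMER", "OBSERVER",
}
SUBJECT_TYPES = {"user", "group", "organization", "role"}
ALLOWED_KEYS = {"id", "username", "role", "shareRecipientType"}
MAX_ROLES_PER_REQUEST = 100  # maxItems in the schema


def check_entry(entry: dict) -> None:
    """Raise ValueError if the entry matches neither GrantAccessControl variant."""
    if ("id" in entry) == ("username" in entry):
        raise ValueError("each entry needs exactly one of 'id' or 'username'")
    if set(entry) - ALLOWED_KEYS:
        raise ValueError("unexpected keys in entry (additionalProperties is false)")
    if entry.get("role") not in SHARING_ROLES:
        raise ValueError(f"unknown role: {entry.get('role')!r}")
    if entry.get("shareRecipientType") not in SUBJECT_TYPES:
        raise ValueError(f"unknown recipient type: {entry.get('shareRecipientType')!r}")


def build_update_requests(roles: list[dict]) -> list[dict]:
    """Return one SharedRolesUpdateRequest body per chunk of at most 100 entries."""
    if not roles:
        raise ValueError("at least one role entry is required (minItems: 1)")
    for entry in roles:
        check_entry(entry)
    return [
        {"operation": "updateRoles", "roles": roles[i:i + MAX_ROLES_PER_REQUEST]}
        for i in range(0, len(roles), MAX_ROLES_PER_REQUEST)
    ]
```

For example, 250 valid entries would produce three request bodies holding 100, 100, and 50 entries.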

SharingRole

{
  "description": "External sharing roles representing the permission level a user, group or organization holds on an entity. these roles map to internal permissions and are used in sharing apis.",
  "enum": [
    "NO_ROLE",
    "OWNER",
    "READ_WRITE",
    "EDITOR",
    "USER",
    "DATA_SCIENTIST",
    "ADMIN",
    "READ_ONLY",
    "CONSUMER",
    "OBSERVER"
  ],
  "title": "SharingRole",
  "type": "string"
}

SharingRole

Properties

Name Type Required Restrictions Description
SharingRole string false External sharing roles representing the permission level a user, group or organization holds on an entity. These roles map to internal permissions and are used in sharing APIs.

Enumerated Values

Property Value
SharingRole [NO_ROLE, OWNER, READ_WRITE, EDITOR, USER, DATA_SCIENTIST, ADMIN, READ_ONLY, CONSUMER, OBSERVER]

StartReplacementRequest

{
  "additionalProperties": false,
  "description": "Request to start a replacement for a workload.",
  "properties": {
    "artifactId": {
      "description": "Existing artifact id to deploy.",
      "title": "Artifactid",
      "type": "string"
    },
    "config": {
      "additionalProperties": false,
      "description": "Configuration for workload replacement.",
      "properties": {
        "keepOldVersionMinutes": {
          "default": 0,
          "description": "Duration in minutes to keep the old version during replacement.",
          "title": "Keepoldversionminutes",
          "type": "integer"
        },
        "warmupDurationMinutes": {
          "default": 0,
          "description": "Duration in minutes for the warmup phase during replacement.",
          "title": "Warmupdurationminutes",
          "type": "integer"
        }
      },
      "title": "ReplacementConfig",
      "type": "object"
    },
    "runtime": {
      "anyOf": [
        {
          "additionalProperties": false,
          "description": "Runtime configuration for a workload. for service and nim artifacts, all configuration is scoped inside ``container_groups``, each identified by name matching the artifact topology.",
          "properties": {
            "containerGroups": {
              "description": "Per-group runtime configuration. each entry's name must match a group in the artifact.",
              "items": {
                "additionalProperties": false,
                "description": "Runtime configuration for a single container group.",
                "properties": {
                  "autoscaling": {
                    "anyOf": [
                      {
                        "additionalProperties": false,
                        "description": "Autoscaling configuration for a proton.",
                        "properties": {
                          "enabled": {
                            "default": true,
                            "description": "Whether autoscaling is enabled.",
                            "title": "Enabled",
                            "type": "boolean"
                          },
                          "policies": {
                            "items": {
                              "additionalProperties": false,
                              "description": "Base class for autoscaling policies.",
                              "properties": {
                                "maxCount": {
                                  "description": "Maximum number of replicas.",
                                  "minimum": 0,
                                  "title": "Max Count",
                                  "type": "integer"
                                },
                                "minCount": {
                                  "description": "Minimum number of replicas.",
                                  "minimum": 0,
                                  "title": "Min Count",
                                  "type": "integer"
                                },
                                "priority": {
                                  "anyOf": [
                                    {
                                      "type": "integer"
                                    },
                                    {
                                      "type": "null"
                                    }
                                  ],
                                  "description": "Policy priority when multiple policies are defined.",
                                  "title": "Priority"
                                },
                                "scalingMetric": {
                                  "anyOf": [
                                    {
                                      "oneOf": [
                                        {
                                          "const": "cpuAverageUtilization",
                                          "description": "Scale replicas to maintain a target average CPU utilization across pods.",
                                          "title": "CPU Average Utilization"
                                        },
                                        {
                                          "const": "httpRequestsConcurrency",
                                          "description": "Scale replicas based on HTTP request concurrency using an external HTTP-aware autoscaler. The platform manages the underlying autoscaling resources on your behalf. This scaling option will scale to zero replicas when the proton is idle.",
                                          "title": "HTTP Requests Concurrency"
                                        },
                                        {
                                          "const": "gpuCacheUtilization",
                                          "description": "Scales replicas based on model-specific GPU memory cache utilization. This signal reflects how the model's KV cache is used during inference, when such metrics are exposed by the serving runtime. High cache utilization may indicate memory pressure and can be used to trigger scale-out to maintain throughput. Applicable to NIM Artifacts only.",
                                          "title": "GPU Cache Utilization"
                                        },
                                        {
                                          "const": "gpuRequestQueueDepth",
                                          "description": "Scales replicas based on the depth of the inference request queue. This metric represents the number of incoming requests waiting to be processed by the inference service. Increasing queue depth may indicate insufficient capacity and can be used to trigger additional replicas to reduce latency. Applicable to NIM Artifacts only.",
                                          "title": "GPU Request Queue Depth"
                                        }
                                      ],
                                      "title": "ScalingMetricType",
                                      "type": "string"
                                    },
                                    {
                                      "type": "string"
                                    }
                                  ],
                                  "description": "Metric used for scaling decisions. Use one of the predefined values for standard autoscaling, or provide a custom metric name for NIM 2.0 workloads (e.g. 'vllm:kv_cache_usage_perc'). Custom metric names are only supported for NIM artifacts.",
                                  "title": "Scaling Metric"
                                },
                                "target": {
                                  "description": "Target value for the scaling metric.",
                                  "minimum": 0,
                                  "title": "Target",
                                  "type": "number"
                                }
                              },
                              "required": [
                                "scalingMetric",
                                "target",
                                "minCount",
                                "maxCount"
                              ],
                              "title": "AutoscalingPolicy",
                              "type": "object"
                            },
                            "title": "Policies",
                            "type": "array"
                          }
                        },
                        "required": [
                          "policies"
                        ],
                        "title": "AutoscalingProperties",
                        "type": "object"
                      },
                      {
                        "type": "null"
                      }
                    ],
                    "description": "Autoscaling configuration for this group. Takes precedence over replicaCount."
                  },
                  "bundleSelectionPolicy": {
                    "enum": [
                      "availability"
                    ],
                    "title": "BundleSelectionPolicy",
                    "type": "string"
                  },
                  "containers": {
                    "description": "Per-container overrides for this group.",
                    "items": {
                      "additionalProperties": false,
                      "description": "Runtime diff targeting a single named container within a group.",
                      "properties": {
                        "name": {
                          "description": "Container name. Must match a container declared in the artifact group.",
                          "title": "Name",
                          "type": "string"
                        },
                        "resourceAllocation": {
                          "anyOf": [
                            {
                              "additionalProperties": false,
                              "description": "Per-container resource allocation declared at runtime.",
                              "properties": {
                                "cpu": {
                                  "anyOf": [
                                    {
                                      "minimum": 0.1,
                                      "type": "number"
                                    },
                                    {
                                      "type": "null"
                                    }
                                  ],
                                  "description": "CPU cores allocated to this container.",
                                  "title": "Cpu"
                                },
                                "gpu": {
                                  "anyOf": [
                                    {
                                      "minimum": 0,
                                      "type": "number"
                                    },
                                    {
                                      "type": "null"
                                    }
                                  ],
                                  "description": "GPUs allocated to this container.",
                                  "title": "Gpu"
                                },
                                "memory": {
                                  "anyOf": [
                                    {
                                      "pattern": "^\\s*(\\d*\\.?\\d+)\\s*(\\w+)?",
                                      "type": "string"
                                    },
                                    {
                                      "minimum": 0,
                                      "type": "integer"
                                    },
                                    {
                                      "type": "null"
                                    }
                                  ],
                                  "description": "RAM allocated to this container. Accepts a human-readable string with one of: B, KB, MB, GB (1000-based), e.g. '8GB', '512MB'. Also accepts raw byte integers.",
                                  "examples": [
                                    "8GB",
                                    "512MB"
                                  ],
                                  "title": "Memory"
                                }
                              },
                              "title": "ResourceAllocation",
                              "type": "object"
                            },
                            {
                              "type": "null"
                            }
                          ],
                          "description": "Resource allocation for this container. Required for multi-container groups."
                        }
                      },
                      "required": [
                        "name"
                      ],
                      "title": "ContainerOverride",
                      "type": "object"
                    },
                    "title": "Containers",
                    "type": "array"
                  },
                  "name": {
                    "default": "default",
                    "description": "Group name. Must match a container group name declared in the artifact.",
                    "title": "Name",
                    "type": "string"
                  },
                  "replicaCount": {
                    "anyOf": [
                      {
                        "minimum": 1,
                        "type": "integer"
                      },
                      {
                        "type": "null"
                      }
                    ],
                    "default": 1,
                    "description": "Number of replicas. Cannot be set alongside autoscaling.enabled=true.",
                    "title": "Replicacount"
                  },
                  "resolvedBundle": {
                    "anyOf": [
                      {
                        "description": "Bundle details returned in the runtime response after scheduling.",
                        "properties": {
                          "cpuCount": {
                            "description": "Number of CPU cores.",
                            "title": "CPU Count",
                            "type": "number"
                          },
                          "gpuCount": {
                            "default": 0,
                            "description": "Number of GPU units.",
                            "title": "GPU Count",
                            "type": "integer"
                          },
                          "gpuMaker": {
                            "anyOf": [
                              {
                                "type": "string"
                              },
                              {
                                "type": "null"
                              }
                            ],
                            "description": "GPU manufacturer.",
                            "title": "GPU Maker"
                          },
                          "gpuTypeLabel": {
                            "anyOf": [
                              {
                                "type": "string"
                              },
                              {
                                "type": "null"
                              }
                            ],
                            "description": "GPU type label.",
                            "title": "GPU Type Label"
                          },
                          "id": {
                            "description": "Bundle identifier that was selected.",
                            "title": "Id",
                            "type": "string"
                          },
                          "memoryBytes": {
                            "description": "Memory size in bytes.",
                            "title": "Memory Bytes",
                            "type": "integer"
                          }
                        },
                        "required": [
                          "id",
                          "cpuCount",
                          "memoryBytes"
                        ],
                        "title": "ResolvedBundle",
                        "type": "object"
                      },
                      {
                        "type": "null"
                      }
                    ],
                    "description": "Full details of the bundle selected at scheduling time. Read-only.",
                    "readOnly": true
                  },
                  "resourceBundles": {
                    "description": "Ordered list of bundle IDs. One is selected at scheduling time.",
                    "items": {
                      "type": "string"
                    },
                    "title": "Resourcebundles",
                    "type": "array"
                  }
                },
                "title": "GroupRuntime",
                "type": "object"
              },
              "title": "Containergroups",
              "type": "array"
            }
          },
          "title": "WorkloadRuntime",
          "type": "object"
        },
        {
          "type": "null"
        }
      ],
      "description": "Runtime for the workload; if omitted, the current runtime is reused."
    },
    "strategy": {
      "description": "Types of replacement strategies. `rolling`: the new proton is deployed alongside the old one, and traffic is switched to the new proton once it is ready. The old proton is then decommissioned.",
      "enum": [
        "rolling"
      ],
      "title": "ReplacementStrategy",
      "type": "string"
    }
  },
  "required": [
    "artifactId",
    "strategy"
  ],
  "title": "StartReplacementRequest",
  "type": "object"
}

StartReplacementRequest

Properties

Name Type Required Restrictions Description
artifactId string true Existing artifact ID to deploy.
config ReplacementConfig false Configuration for the replacement process, including warmup duration and old version retention time.
runtime any false Runtime for the workload; if omitted, the current runtime is reused.

anyOf

Name Type Required Restrictions Description
» anonymous WorkloadRuntime false Runtime configuration for a workload. For service and NIM artifacts, all configuration is scoped inside containerGroups, each identified by a name matching the artifact topology.

or

Name Type Required Restrictions Description
» anonymous null false none

continued

Name Type Required Restrictions Description
strategy ReplacementStrategy true Replacement strategy.
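
The required and optional fields above can be illustrated with a small request builder. This is a minimal sketch, not the platform's client: `build_start_replacement` is a hypothetical helper, `artifact-123` is a placeholder ID, and the optional `config` field is left out because its shape is defined elsewhere.

```python
import json
from typing import Any, Optional

# Only value listed in the ReplacementStrategy enum above.
ALLOWED_STRATEGIES = {"rolling"}


def build_start_replacement(
    artifact_id: str,
    strategy: str = "rolling",
    runtime: Optional[dict[str, Any]] = None,
) -> dict[str, Any]:
    """Build a StartReplacementRequest body (hypothetical helper)."""
    if not artifact_id:
        raise ValueError("artifactId is required")
    if strategy not in ALLOWED_STRATEGIES:
        raise ValueError(f"unsupported strategy: {strategy!r}")
    body: dict[str, Any] = {"artifactId": artifact_id, "strategy": strategy}
    # Leaving 'runtime' out asks the service to reuse the current runtime.
    if runtime is not None:
        body["runtime"] = runtime
    return body


if __name__ == "__main__":
    # Placeholder values for illustration only.
    runtime = {"containerGroups": [{"name": "default", "replicaCount": 2}]}
    print(json.dumps(build_start_replacement("artifact-123", runtime=runtime), indent=2))
```

Omitting the `runtime` key, rather than sending `null`, mirrors the wording "if omitted, the current runtime is reused"; whether an explicit `null` behaves the same way is not stated in the schema.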

StringEnvironmentVariable

{
  "properties": {
    "name": {
      "description": "Name of the environment variable.",
      "title": "Name",
      "type": "string"
    },
    "source": {
      "const": "string",
      "default": "string",
      "title": "Source",
      "type": "string"
    },
    "value": {
      "description": "Value of the environment variable.",
      "title": "Value",
      "type": "string"
    }
  },
  "required": [
    "name",
    "value"
  ],
  "title": "StringEnvironmentVariable",
  "type": "object"
}

StringEnvironmentVariable

Properties

Name Type Required Restrictions Description
name string true Name of the environment variable.
source string false none
value string true Value of the environment variable.
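
As an illustration of the shape above, the following sketch turns a plain name-to-value mapping into a list of StringEnvironmentVariable objects. The helper name `to_string_env_vars` is an assumption made for this example; only the field names and the `"string"` constant come from the schema.

```python
from typing import Mapping


def to_string_env_vars(env: Mapping[str, str]) -> list[dict[str, str]]:
    """Convert a name -> value mapping into StringEnvironmentVariable objects."""
    variables = []
    for name, value in env.items():
        if not isinstance(value, str):
            raise TypeError(f"value for {name!r} must be a string")
        # 'source' defaults to "string" in the schema; it is sent explicitly here.
        variables.append({"name": name, "source": "string", "value": value})
    return variables
```

Because `value` is typed as a string, numeric settings such as a port number would need to be converted with `str()` before being passed in.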

SubjectType

{
  "description": "Enum of possible subject types.",
  "enum": [
    "user",
    "group",
    "organization",
    "role"
  ],
  "title": "SubjectType",
  "type": "string"
}

SubjectType

Properties

Name Type Required Restrictions Description
SubjectType string false Enum of possible subject types.

Enumerated Values

Property Value
SubjectType [user, group, organization, role]

Summary

{
  "additionalProperties": false,
  "description": "Summary information for proton statistics.",
  "properties": {
    "period": {
      "additionalProperties": false,
      "description": "Time period definition.",
      "properties": {
        "end": {
          "anyOf": [
            {
              "format": "date-time",
              "type": "string"
            },
            {
              "type": "null"
            }
          ],
          "description": "Period end time.",
          "title": "End"
        },
        "start": {
          "anyOf": [
            {
              "format": "date-time",
              "type": "string"
            },
            {
              "type": "null"
            }
          ],
          "description": "Period start time.",
          "title": "Start"
        }
      },
      "title": "Period",
      "type": "object"
    },
    "protonId": {
      "description": "Proton ID.",
      "title": "Protonid",
      "type": "string"
    }
  },
  "required": [
    "protonId",
    "period"
  ],
  "title": "Summary",
  "type": "object"
}

Summary

Properties

Name Type Required Restrictions Description
period Period true Time period for the summary.
protonId string true Proton ID.
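
Since both `period.start` and `period.end` may be null, client code has to handle missing bounds. The sketch below shows one way to parse a Summary payload in Python; the `Summary` dataclass and `parse_summary` function are hypothetical names, and the code assumes the timestamps are ISO 8601 strings that `datetime.fromisoformat` can read once a trailing `Z` is normalized.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Any, Optional


def _parse_time(value: Optional[str]) -> Optional[datetime]:
    """Parse a date-time string; a null bound stays None."""
    if value is None:
        return None
    # datetime.fromisoformat rejects a trailing 'Z' before Python 3.11.
    if value.endswith("Z"):
        value = value[:-1] + "+00:00"
    return datetime.fromisoformat(value)


@dataclass
class Summary:
    proton_id: str
    start: Optional[datetime]
    end: Optional[datetime]


def parse_summary(payload: dict[str, Any]) -> Summary:
    """Build a Summary from a decoded JSON object."""
    period = payload["period"]  # required by the schema
    return Summary(
        proton_id=payload["protonId"],  # required by the schema
        start=_parse_time(period.get("start")),
        end=_parse_time(period.get("end")),
    )
```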

TagInfo

{
  "additionalProperties": false,
  "properties": {
    "id": {
      "description": "Unique identifier of the tag.",
      "title": "Id",
      "type": "string"
    },
    "name": {
      "description": "Name of the tag.",
      "title": "Name",
      "type": "string"
    },
    "value": {
      "description": "Value of the tag.",
      "title": "Value",
      "type": "string"
    }
  },
  "required": [
    "id",
    "name",
    "value"
  ],
  "title": "TagInfo",
  "type": "object"
}

TagInfo

Properties

Name Type Required Restrictions Description
id string true Unique identifier of the tag.
name string true Name of the tag.
value string true Value of the tag.

Tags

{
  "items": {
    "additionalProperties": false,
    "properties": {
      "id": {
        "description": "Unique identifier of the tag.",
        "title": "Id",
        "type": "string"
      },
      "name": {
        "description": "Name of the tag.",
        "title": "Name",
        "type": "string"
      },
      "value": {
        "description": "Value of the tag.",
        "title": "Value",
        "type": "string"
      }
    },
    "required": [
      "id",
      "name",
      "value"
    ],
    "title": "TagInfo",
    "type": "object"
  },
  "type": "array"
}

Properties

Name Type Required Restrictions Description
anonymous [TagInfo] false none
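
A Tags array is often easier to work with once grouped by tag name. The following sketch assumes that several tags may share a name with different values, which the schema neither confirms nor rules out; `group_tags` is a hypothetical helper.

```python
from collections import defaultdict


def group_tags(tags: list[dict[str, str]]) -> dict[str, list[str]]:
    """Group TagInfo objects by tag name, keeping values in input order."""
    grouped: dict[str, list[str]] = defaultdict(list)
    for tag in tags:
        grouped[tag["name"]].append(tag["value"])
    return dict(grouped)
```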

UpdateSettingsRequest

{
  "additionalProperties": false,
  "description": "Request to update runtime settings for a workload.",
  "properties": {
    "runtime": {
      "additionalProperties": false,
      "description": "Runtime configuration for a workload. For service and NIM artifacts, all configuration is scoped inside `containerGroups`, each identified by a name matching the artifact topology.",
      "properties": {
        "containerGroups": {
          "description": "Per-group runtime configuration. Each entry's name must match a group in the artifact.",
          "items": {
            "additionalProperties": false,
            "description": "Runtime configuration for a single container group.",
            "properties": {
              "autoscaling": {
                "anyOf": [
                  {
                    "additionalProperties": false,
                    "description": "Autoscaling configuration for a proton.",
                    "properties": {
                      "enabled": {
                        "default": true,
                        "description": "Whether autoscaling is enabled.",
                        "title": "Enabled",
                        "type": "boolean"
                      },
                      "policies": {
                        "items": {
                          "additionalProperties": false,
                          "description": "Base class for autoscaling policies.",
                          "properties": {
                            "maxCount": {
                              "description": "Maximum number of replicas.",
                              "minimum": 0,
                              "title": "Max Count",
                              "type": "integer"
                            },
                            "minCount": {
                              "description": "Minimum number of replicas.",
                              "minimum": 0,
                              "title": "Min Count",
                              "type": "integer"
                            },
                            "priority": {
                              "anyOf": [
                                {
                                  "type": "integer"
                                },
                                {
                                  "type": "null"
                                }
                              ],
                              "description": "Policy priority when multiple policies are defined.",
                              "title": "Priority"
                            },
                            "scalingMetric": {
                              "anyOf": [
                                {
                                  "oneOf": [
                                    {
                                      "const": "cpuAverageUtilization",
                                      "description": "Scale replicas to maintain a target average CPU utilization across pods.",
                                      "title": "CPU Average Utilization"
                                    },
                                    {
                                      "const": "httpRequestsConcurrency",
                                      "description": "Scale replicas based on HTTP request concurrency using an external HTTP-aware autoscaler. The platform manages the underlying autoscaling resources on your behalf. This scaling option will scale to zero replicas when the proton is idle.",
                                      "title": "HTTP Requests Concurrency"
                                    },
                                    {
                                      "const": "gpuCacheUtilization",
                                      "description": "Scales replicas based on model-specific GPU memory cache utilization. This signal reflects how the model's KV cache is used during inference, when such metrics are exposed by the serving runtime. High cache utilization may indicate memory pressure and can be used to trigger scale-out to maintain throughput. Applicable to NIM Artifacts only.",
                                      "title": "GPU Cache Utilization"
                                    },
                                    {
                                      "const": "gpuRequestQueueDepth",
                                      "description": "Scales replicas based on the depth of the inference request queue. This metric represents the number of incoming requests waiting to be processed by the inference service. Increasing queue depth may indicate insufficient capacity and can be used to trigger additional replicas to reduce latency. Applicable to NIM Artifacts only.",
                                      "title": "GPU Request Queue Depth"
                                    }
                                  ],
                                  "title": "ScalingMetricType",
                                  "type": "string"
                                },
                                {
                                  "type": "string"
                                }
                              ],
                              "description": "Metric used for scaling decisions. Use one of the predefined values for standard autoscaling, or provide a custom metric name for NIM 2.0 workloads (e.g. 'vllm:kv_cache_usage_perc'). Custom metric names are only supported for NIM artifacts.",
                              "title": "Scaling Metric"
                            },
                            "target": {
                              "description": "Target value for the scaling metric.",
                              "minimum": 0,
                              "title": "Target",
                              "type": "number"
                            }
                          },
                          "required": [
                            "scalingMetric",
                            "target",
                            "minCount",
                            "maxCount"
                          ],
                          "title": "AutoscalingPolicy",
                          "type": "object"
                        },
                        "title": "Policies",
                        "type": "array"
                      }
                    },
                    "required": [
                      "policies"
                    ],
                    "title": "AutoscalingProperties",
                    "type": "object"
                  },
                  {
                    "type": "null"
                  }
                ],
                "description": "Autoscaling configuration for this group. Takes precedence over replicaCount."
              },
              "bundleSelectionPolicy": {
                "enum": [
                  "availability"
                ],
                "title": "BundleSelectionPolicy",
                "type": "string"
              },
              "containers": {
                "description": "Per-container overrides for this group.",
                "items": {
                  "additionalProperties": false,
                  "description": "Runtime diff targeting a single named container within a group.",
                  "properties": {
                    "name": {
                      "description": "Container name. Must match a container declared in the artifact group.",
                      "title": "Name",
                      "type": "string"
                    },
                    "resourceAllocation": {
                      "anyOf": [
                        {
                          "additionalProperties": false,
                          "description": "Per-container resource allocation declared at runtime.",
                          "properties": {
                            "cpu": {
                              "anyOf": [
                                {
                                  "minimum": 0.1,
                                  "type": "number"
                                },
                                {
                                  "type": "null"
                                }
                              ],
                              "description": "CPU cores allocated to this container.",
                              "title": "Cpu"
                            },
                            "gpu": {
                              "anyOf": [
                                {
                                  "minimum": 0,
                                  "type": "number"
                                },
                                {
                                  "type": "null"
                                }
                              ],
                              "description": "GPUs allocated to this container.",
                              "title": "Gpu"
                            },
                            "memory": {
                              "anyOf": [
                                {
                                  "pattern": "^\\s*(\\d*\\.?\\d+)\\s*(\\w+)?",
                                  "type": "string"
                                },
                                {
                                  "minimum": 0,
                                  "type": "integer"
                                },
                                {
                                  "type": "null"
                                }
                              ],
                              "description": "RAM allocated to this container. Accepts a human-readable string with one of: B, KB, MB, GB (1000-based), e.g. '8GB', '512MB'. Also accepts raw byte integers.",
                              "examples": [
                                "8GB",
                                "512MB"
                              ],
                              "title": "Memory"
                            }
                          },
                          "title": "ResourceAllocation",
                          "type": "object"
                        },
                        {
                          "type": "null"
                        }
                      ],
                      "description": "Resource allocation for this container. Required for multi-container groups."
                    }
                  },
                  "required": [
                    "name"
                  ],
                  "title": "ContainerOverride",
                  "type": "object"
                },
                "title": "Containers",
                "type": "array"
              },
              "name": {
                "default": "default",
                "description": "Group name. Must match a container group name declared in the artifact.",
                "title": "Name",
                "type": "string"
              },
              "replicaCount": {
                "anyOf": [
                  {
                    "minimum": 1,
                    "type": "integer"
                  },
                  {
                    "type": "null"
                  }
                ],
                "default": 1,
                "description": "Number of replicas. Cannot be set alongside autoscaling.enabled=true.",
                "title": "Replicacount"
              },
              "resolvedBundle": {
                "anyOf": [
                  {
                    "description": "Bundle details returned in the runtime response after scheduling.",
                    "properties": {
                      "cpuCount": {
                        "description": "Number of CPU cores.",
                        "title": "CPU Count",
                        "type": "number"
                      },
                      "gpuCount": {
                        "default": 0,
                        "description": "Number of GPU units.",
                        "title": "GPU Count",
                        "type": "integer"
                      },
                      "gpuMaker": {
                        "anyOf": [
                          {
                            "type": "string"
                          },
                          {
                            "type": "null"
                          }
                        ],
                        "description": "GPU manufacturer.",
                        "title": "GPU Maker"
                      },
                      "gpuTypeLabel": {
                        "anyOf": [
                          {
                            "type": "string"
                          },
                          {
                            "type": "null"
                          }
                        ],
                        "description": "GPU type label.",
                        "title": "GPU Type Label"
                      },
                      "id": {
                        "description": "Bundle identifier that was selected.",
                        "title": "Id",
                        "type": "string"
                      },
                      "memoryBytes": {
                        "description": "Memory size in bytes.",
                        "title": "Memory Bytes",
                        "type": "integer"
                      }
                    },
                    "required": [
                      "id",
                      "cpuCount",
                      "memoryBytes"
                    ],
                    "title": "ResolvedBundle",
                    "type": "object"
                  },
                  {
                    "type": "null"
                  }
                ],
                "description": "Full details of the bundle selected at scheduling time. Read-only.",
                "readOnly": true
              },
              "resourceBundles": {
                "description": "Ordered list of bundle ids. one is selected at scheduling time.",
                "items": {
                  "type": "string"
                },
                "title": "Resourcebundles",
                "type": "array"
              }
            },
            "title": "GroupRuntime",
            "type": "object"
          },
          "title": "Containergroups",
          "type": "array"
        }
      },
      "title": "WorkloadRuntime",
      "type": "object"
    }
  },
  "required": [
    "runtime"
  ],
  "title": "UpdateSettingsRequest",
  "type": "object"
}

UpdateSettingsRequest

Properties

Name Type Required Restrictions Description
runtime WorkloadRuntime true Runtime configuration to apply to the workload.
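
As a rough illustration, the following Python sketch assembles a request body of this shape. It assumes the group runtime accepts the `autoscaling` and `resourceBundles` fields, with the policy keys `scalingMetric`, `target`, `minCount` and `maxCount` described elsewhere in this reference. The `name` key, the group name `main`, and the bundle IDs are hypothetical placeholders, and the `minCount <= maxCount` check is an extra client-side sanity check rather than a documented rule.

```python
import json


def make_policy(scaling_metric, target, min_count, max_count, priority=None):
    """Build one autoscaling policy dict containing the four required keys."""
    if target < 0 or min_count < 0 or max_count < 0:
        raise ValueError("target, minCount and maxCount must be >= 0")
    if min_count > max_count:
        # Assumption: not stated by the schema, only a client-side sanity check.
        raise ValueError("minCount must not exceed maxCount")
    policy = {
        "scalingMetric": scaling_metric,
        "target": target,
        "minCount": min_count,
        "maxCount": max_count,
    }
    if priority is not None:
        policy["priority"] = priority
    return policy


def make_update_settings_request(container_groups):
    """Wrap per-group settings in the required top-level `runtime` key."""
    return {"runtime": {"containerGroups": list(container_groups)}}


group = {
    "name": "main",  # hypothetical key and group name
    "resourceBundles": ["bundle-small", "bundle-large"],  # hypothetical bundle IDs
    "autoscaling": {
        "enabled": True,
        "policies": [make_policy("cpuAverageUtilization", 70, 1, 4)],
    },
}

body = make_update_settings_request([group])
print(json.dumps(body, indent=2))
```

The resulting `body` would be sent as the JSON payload of the settings update call; how the request is authenticated and which path it targets are outside the scope of this sketch.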

UpdateWorkloadRequest

{
  "additionalProperties": false,
  "description": "Request to update an existing workload.",
  "properties": {
    "description": {
      "anyOf": [
        {
          "type": "string"
        },
        {
          "type": "null"
        }
      ],
      "description": "Updated workload description.",
      "title": "Description"
    },
    "importance": {
      "anyOf": [
        {
          "description": "Importance level for workloads.",
          "enum": [
            "critical",
            "high",
            "moderate",
            "low"
          ],
          "title": "WorkloadImportance",
          "type": "string"
        },
        {
          "type": "null"
        }
      ],
      "description": "Updated workload importance level."
    },
    "name": {
      "anyOf": [
        {
          "maxLength": 5000,
          "minLength": 1,
          "type": "string"
        },
        {
          "type": "null"
        }
      ],
      "description": "Updated workload name.",
      "title": "Name"
    }
  },
  "title": "UpdateWorkloadRequest",
  "type": "object"
}

UpdateWorkloadRequest

Properties

Name Type Required Restrictions Description
description any false Updated workload description.

anyOf

Name Type Required Restrictions Description
» anonymous string false none

or

Name Type Required Restrictions Description
» anonymous null false none

continued

Name Type Required Restrictions Description
importance any false Updated workload importance level.

anyOf

Name Type Required Restrictions Description
» anonymous WorkloadImportance false Importance level for workloads.

or

Name Type Required Restrictions Description
» anonymous null false none

continued

Name Type Required Restrictions Description
name any false Updated workload name.

anyOf

Name Type Required Restrictions Description
» anonymous string false maxLength: 5000, minLength: 1 none

or

Name Type Required Restrictions Description
» anonymous null false none
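
The constraints in this schema (no additional properties, a name of 1 to 5000 characters, and the four importance levels) can be checked on the client before sending a request. The sketch below is one possible way to do that in Python; the function name `build_update_workload_request` is a hypothetical helper and not part of the API.

```python
WORKLOAD_IMPORTANCE = {"critical", "high", "moderate", "low"}
ALLOWED_FIELDS = {"name", "description", "importance"}


def build_update_workload_request(**fields):
    """Return a dict suitable as an UpdateWorkloadRequest body, or raise ValueError."""
    unknown = set(fields) - ALLOWED_FIELDS
    if unknown:
        # The schema sets additionalProperties to false.
        raise ValueError(f"unsupported fields: {sorted(unknown)}")

    name = fields.get("name")
    if name is not None and not 1 <= len(name) <= 5000:
        raise ValueError("name must be between 1 and 5000 characters")

    importance = fields.get("importance")
    if importance is not None and importance not in WORKLOAD_IMPORTANCE:
        raise ValueError(f"importance must be one of {sorted(WORKLOAD_IMPORTANCE)}")

    return dict(fields)


payload = build_update_workload_request(name="fraud-scoring", importance="high")
print(payload)
```

All three properties are optional, so an empty call returns an empty body.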

UserData

{
  "additionalProperties": false,
  "description": "User information embedded in API responses.",
  "properties": {
    "email": {
      "anyOf": [
        {
          "type": "string"
        },
        {
          "type": "null"
        }
      ],
      "description": "User email address.",
      "title": "Email"
    },
    "fullName": {
      "anyOf": [
        {
          "type": "string"
        },
        {
          "type": "null"
        }
      ],
      "description": "User's full name.",
      "title": "Full Name"
    },
    "id": {
      "description": "User id associated with this resource.",
      "title": "User ID",
      "type": "string"
    },
    "userhash": {
      "anyOf": [
        {
          "type": "string"
        },
        {
          "type": "null"
        }
      ],
      "description": "User's gravatar hash.",
      "title": "Userhash"
    },
    "username": {
      "anyOf": [
        {
          "type": "string"
        },
        {
          "type": "null"
        }
      ],
      "description": "Username.",
      "title": "Username"
    }
  },
  "required": [
    "id"
  ],
  "title": "UserData",
  "type": "object"
}

UserData

Properties

Name Type Required Restrictions Description
email any false User email address.

anyOf

Name Type Required Restrictions Description
» anonymous string false none

or

Name Type Required Restrictions Description
» anonymous null false none

continued

Name Type Required Restrictions Description
fullName any false User's full name.

anyOf

Name Type Required Restrictions Description
» anonymous string false none

or

Name Type Required Restrictions Description
» anonymous null false none

continued

Name Type Required Restrictions Description
id string true User id associated with this resource.
userhash any false User's gravatar hash.

anyOf

Name Type Required Restrictions Description
» anonymous string false none

or

Name Type Required Restrictions Description
» anonymous null false none

continued

Name Type Required Restrictions Description
username any false Username.

anyOf

Name Type Required Restrictions Description
» anonymous string false none

or

Name Type Required Restrictions Description
» anonymous null false none
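
Because only `id` is required and every other field may be null, client code usually needs a fallback when showing a user. The Python sketch below picks the first non-empty value; the fallback order (`fullName`, `username`, `email`, `id`) is an assumption made for illustration, not something this reference prescribes.

```python
def display_name(user):
    """Pick a human-readable label from a UserData dict."""
    for key in ("fullName", "username", "email"):
        value = user.get(key)
        if value:  # skips both missing keys and null/empty values
            return value
    return user["id"]  # `id` is the only required field


print(display_name({"id": "u-1", "fullName": None, "username": "jdoe"}))
```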

ValidationError

{
  "properties": {
    "ctx": {
      "title": "Context",
      "type": "object"
    },
    "input": {
      "title": "Input"
    },
    "loc": {
      "items": {
        "anyOf": [
          {
            "type": "string"
          },
          {
            "type": "integer"
          }
        ]
      },
      "title": "Location",
      "type": "array"
    },
    "msg": {
      "title": "Message",
      "type": "string"
    },
    "type": {
      "title": "Error Type",
      "type": "string"
    }
  },
  "required": [
    "loc",
    "msg",
    "type"
  ],
  "title": "ValidationError",
  "type": "object"
}

ValidationError

Properties

Name Type Required Restrictions Description
ctx object false none
input any false none
loc [anyOf] true none

anyOf

Name Type Required Restrictions Description
» anonymous string false none

or

Name Type Required Restrictions Description
» anonymous integer false none

continued

Name Type Required Restrictions Description
msg string true none
type string true none
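
Since `loc` mixes strings (field names) and integers (array indexes), it can be convenient to flatten it into a single path when reporting errors. The following Python sketch shows one way to do so; the sample error values are hypothetical and only illustrate the shape.

```python
def format_validation_error(error):
    """Render a ValidationError dict as 'path: message (type)'."""
    path = ""
    for part in error["loc"]:
        if isinstance(part, int):
            path += f"[{part}]"  # array index
        elif path:
            path += f".{part}"  # nested field
        else:
            path = part  # first segment
    return f"{path}: {error['msg']} ({error['type']})"


sample = {
    "loc": ["body", "runtime", "containerGroups", 0, "name"],
    "msg": "Field required",
    "type": "missing",
}
print(format_validation_error(sample))
```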

WorkloadEvent

{
  "additionalProperties": false,
  "description": "A single workload event record. note: full event schema will be defined once event storage is implemented (p7). this placeholder documents the known shape.",
  "properties": {
    "actorId": {
      "anyOf": [
        {
          "type": "string"
        },
        {
          "type": "null"
        }
      ],
      "description": "Id of the user or system that triggered the event.",
      "title": "Actor ID"
    },
    "details": {
      "anyOf": [
        {
          "additionalProperties": true,
          "type": "object"
        },
        {
          "type": "null"
        }
      ],
      "description": "Additional event-specific details.",
      "title": "Details"
    },
    "eventType": {
      "description": "Type of event.",
      "title": "Event Type",
      "type": "string"
    },
    "id": {
      "description": "Event id.",
      "title": "ID",
      "type": "string"
    },
    "timestamp": {
      "description": "When the event occurred.",
      "format": "date-time",
      "title": "Timestamp",
      "type": "string"
    },
    "workloadId": {
      "description": "Id of the workload this event belongs to.",
      "title": "Workload ID",
      "type": "string"
    }
  },
  "required": [
    "id",
    "workloadId",
    "timestamp",
    "eventType"
  ],
  "title": "WorkloadEvent",
  "type": "object"
}

WorkloadEvent

Properties

Name Type Required Restrictions Description
actorId any false Id of the user or system that triggered the event.

anyOf

Name Type Required Restrictions Description
» anonymous string false none

or

Name Type Required Restrictions Description
» anonymous null false none

continued

Name Type Required Restrictions Description
details any false Additional event-specific details.

anyOf

Name Type Required Restrictions Description
» anonymous object false none

or

Name Type Required Restrictions Description
» anonymous null false none

continued

Name Type Required Restrictions Description
eventType string true Type of event.
id string true Event id.
timestamp string(date-time) true When the event occurred.
workloadId string true Id of the workload this event belongs to.
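
A client might map this record onto a typed object and parse `timestamp` into a timezone-aware value. The Python sketch below assumes the timestamp is an ISO 8601 string, possibly ending in `Z`; the class and function names are hypothetical, and the sample values are invented for illustration.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Any, Dict, Optional

REQUIRED_KEYS = ("id", "workloadId", "timestamp", "eventType")


@dataclass(frozen=True)
class WorkloadEvent:
    id: str
    workload_id: str
    timestamp: datetime
    event_type: str
    actor_id: Optional[str] = None
    details: Optional[Dict[str, Any]] = None


def parse_workload_event(raw):
    """Convert a raw event dict into a WorkloadEvent, checking required keys."""
    missing = [key for key in REQUIRED_KEYS if key not in raw]
    if missing:
        raise ValueError(f"missing required keys: {missing}")
    text = raw["timestamp"]
    if text.endswith("Z"):
        # Older Python versions do not accept a trailing 'Z' in fromisoformat.
        text = text[:-1] + "+00:00"
    return WorkloadEvent(
        id=raw["id"],
        workload_id=raw["workloadId"],
        timestamp=datetime.fromisoformat(text),
        event_type=raw["eventType"],
        actor_id=raw.get("actorId"),
        details=raw.get("details"),
    )


event = parse_workload_event({
    "id": "evt-1",
    "workloadId": "wl-1",
    "timestamp": "2024-01-02T03:04:05Z",
    "eventType": "created",  # hypothetical event type
})
print(event.timestamp.isoformat())
```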

WorkloadEventsResponse

{
  "additionalProperties": false,
  "description": "Response containing workload events.",
  "properties": {
    "count": {
      "description": "The number of records on this page.",
      "title": "Count",
      "type": "integer"
    },
    "data": {
      "description": "The list of records.",
      "items": {
        "additionalProperties": false,
        "description": "A single workload event record. note: full event schema will be defined once event storage is implemented (p7). this placeholder documents the known shape.",
        "properties": {
          "actorId": {
            "anyOf": [
              {
                "type": "string"
              },
              {
                "type": "null"
              }
            ],
            "description": "Id of the user or system that triggered the event.",
            "title": "Actor ID"
          },
          "details": {
            "anyOf": [
              {
                "additionalProperties": true,
                "type": "object"
              },
              {
                "type": "null"
              }
            ],
            "description": "Additional event-specific details.",
            "title": "Details"
          },
          "eventType": {
            "description": "Type of event.",
            "title": "Event Type",
            "type": "string"
          },
          "id": {
            "description": "Event id.",
            "title": "ID",
            "type": "string"
          },
          "timestamp": {
            "description": "When the event occurred.",
            "format": "date-time",
            "title": "Timestamp",
            "type": "string"
          },
          "workloadId": {
            "description": "Id of the workload this event belongs to.",
            "title": "Workload ID",
            "type": "string"
          }
        },
        "required": [
          "id",
          "workloadId",
          "timestamp",
          "eventType"
        ],
        "title": "WorkloadEvent",
        "type": "object"
      },
      "title": "Data",
      "type": "array"
    },
    "next": {
      "anyOf": [
        {
          "type": "string"
        },
        {
          "type": "null"
        }
      ],
      "description": "The url to the next page, or `null` if there is no such page.",
      "title": "Next"
    },
    "previous": {
      "anyOf": [
        {
          "type": "string"
        },
        {
          "type": "null"
        }
      ],
      "description": "The url to the previous page, or `null` if there is no such page.",
      "title": "Previous"
    },
    "totalCount": {
      "description": "The total number of records.",
      "title": "Totalcount",
      "type": "integer"
    }
  },
  "required": [
    "totalCount",
    "count",
    "next",
    "previous",
    "data"
  ],
  "title": "WorkloadEventsResponse",
  "type": "object"
}

WorkloadEventsResponse

Properties

Name Type Required Restrictions Description
count integer true The number of records on this page.
data [WorkloadEvent] true The list of records.
next any true The URL to the next page, or null if there is no such page.

anyOf

Name Type Required Restrictions Description
» anonymous string false none

or

Name Type Required Restrictions Description
» anonymous null false none

continued

Name Type Required Restrictions Description
previous any true The URL to the previous page, or null if there is no such page.

anyOf

Name Type Required Restrictions Description
» anonymous string false none

or

Name Type Required Restrictions Description
» anonymous null false none

continued

Name Type Required Restrictions Description
totalCount integer true The total number of records.
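
The `next` field makes it possible to walk all pages without computing offsets. The Python sketch below follows `next` until it is null; `fetch_page` is a hypothetical callable supplied by the caller (for example, a thin wrapper around an authenticated HTTP client that returns the decoded JSON body), and the URLs in the usage example are placeholders.

```python
def iter_workload_events(first_url, fetch_page):
    """Yield every event across pages by following the `next` links."""
    url = first_url
    while url is not None:
        page = fetch_page(url)
        yield from page["data"]
        url = page["next"]  # null (None) on the last page


# Usage with canned pages standing in for real HTTP responses.
pages = {
    "https://example.invalid/events?offset=0": {
        "totalCount": 3, "count": 2, "previous": None,
        "next": "https://example.invalid/events?offset=2",
        "data": [{"id": "evt-1"}, {"id": "evt-2"}],
    },
    "https://example.invalid/events?offset=2": {
        "totalCount": 3, "count": 1, "next": None,
        "previous": "https://example.invalid/events?offset=0",
        "data": [{"id": "evt-3"}],
    },
}

event_ids = [
    item["id"]
    for item in iter_workload_events("https://example.invalid/events?offset=0", pages.__getitem__)
]
print(event_ids)
```

Passing the fetch function in keeps the paging logic independent of any particular HTTP library.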

WorkloadFormatted

{
  "additionalProperties": false,
  "description": "API representation of a workload. this is the formatted version returned to clients, excluding internal fields and including computed properties like permissions and statistics.",
  "properties": {
    "artifact": {
      "anyOf": [
        {
          "description": "Artifact basic information.",
          "properties": {
            "artifactRepositoryId": {
              "anyOf": [
                {
                  "type": "string"
                },
                {
                  "type": "null"
                }
              ],
              "description": "Id of the artifact repository this artifact belongs to (for versioning).",
              "title": "Artifactrepositoryid"
            },
            "id": {
              "description": "Unique identifier of the entity.",
              "title": "Id",
              "type": "string"
            },
            "name": {
              "anyOf": [
                {
                  "type": "string"
                },
                {
                  "type": "null"
                }
              ],
              "description": "Name of the entity.",
              "title": "Name"
            },
            "status": {
              "anyOf": [
                {
                  "enum": [
                    "draft",
                    "locked"
                  ],
                  "title": "ArtifactStatus",
                  "type": "string"
                },
                {
                  "type": "null"
                }
              ],
              "description": "Artifact status."
            },
            "templateId": {
              "anyOf": [
                {
                  "type": "string"
                },
                {
                  "type": "null"
                }
              ],
              "description": "Id of the template used to create this artifact.",
              "title": "Templateid"
            },
            "type": {
              "anyOf": [
                {
                  "description": "Discriminator for the artifact spec variant. used to label the workload, which may be used to prioritize the best matching operator available in the cluster for scheduling. defaults to ``service`` when omitted. - ``service``: generic service artifact. - ``nim``: nvidia nim model artifact.",
                  "enum": [
                    "service",
                    "nim"
                  ],
                  "title": "ArtifactType",
                  "type": "string"
                },
                {
                  "type": "null"
                }
              ],
              "description": "Artifact type."
            },
            "version": {
              "anyOf": [
                {
                  "type": "integer"
                },
                {
                  "type": "null"
                }
              ],
              "description": "Version number of the artifact (set only for locked artifacts).",
              "title": "Version"
            }
          },
          "required": [
            "id"
          ],
          "title": "ArtifactInfoFormatted",
          "type": "object"
        },
        {
          "type": "null"
        }
      ],
      "description": "Basic information about the currently active artifact for this workload.",
      "title": "Artifact"
    },
    "artifactId": {
      "anyOf": [
        {
          "type": "string"
        },
        {
          "type": "null"
        }
      ],
      "description": "Id of the currently active artifact for this workload.",
      "title": "Artifact ID"
    },
    "createdAt": {
      "description": "Timestamp of when the entity was created.",
      "format": "date-time",
      "title": "Created At",
      "type": "string"
    },
    "creator": {
      "anyOf": [
        {
          "additionalProperties": false,
          "description": "User information embedded in API responses.",
          "properties": {
            "email": {
              "anyOf": [
                {
                  "type": "string"
                },
                {
                  "type": "null"
                }
              ],
              "description": "User email address.",
              "title": "Email"
            },
            "fullName": {
              "anyOf": [
                {
                  "type": "string"
                },
                {
                  "type": "null"
                }
              ],
              "description": "User's full name.",
              "title": "Full Name"
            },
            "id": {
              "description": "User id associated with this resource.",
              "title": "User ID",
              "type": "string"
            },
            "userhash": {
              "anyOf": [
                {
                  "type": "string"
                },
                {
                  "type": "null"
                }
              ],
              "description": "User's gravatar hash.",
              "title": "Userhash"
            },
            "username": {
              "anyOf": [
                {
                  "type": "string"
                },
                {
                  "type": "null"
                }
              ],
              "description": "Username.",
              "title": "Username"
            }
          },
          "required": [
            "id"
          ],
          "title": "UserData",
          "type": "object"
        },
        {
          "type": "null"
        }
      ],
      "description": "Owner user details including id, username and email.",
      "title": "Creator"
    },
    "description": {
      "anyOf": [
        {
          "type": "string"
        },
        {
          "type": "null"
        }
      ],
      "default": "",
      "description": "Workload description.",
      "title": "Description"
    },
    "endpoint": {
      "anyOf": [
        {
          "type": "string"
        },
        {
          "type": "null"
        }
      ],
      "description": "Workload endpoint url.",
      "title": "Endpoint"
    },
    "id": {
      "description": "Unique identifier of the entity.",
      "title": "ID",
      "type": "string"
    },
    "importance": {
      "description": "Importance level for workloads.",
      "enum": [
        "critical",
        "high",
        "moderate",
        "low"
      ],
      "title": "WorkloadImportance",
      "type": "string"
    },
    "lastResponse": {
      "anyOf": [
        {
          "format": "date-time",
          "type": "string"
        },
        {
          "type": "null"
        }
      ],
      "description": "Timestamp of the last response received from this workload.",
      "title": "Last Response Time"
    },
    "name": {
      "description": "Name of the entity.",
      "title": "Name",
      "type": "string"
    },
    "owners": {
      "description": "List of workload owners.",
      "items": {
        "additionalProperties": false,
        "description": "User information embedded in API responses.",
        "properties": {
          "email": {
            "anyOf": [
              {
                "type": "string"
              },
              {
                "type": "null"
              }
            ],
            "description": "User email address.",
            "title": "Email"
          },
          "fullName": {
            "anyOf": [
              {
                "type": "string"
              },
              {
                "type": "null"
              }
            ],
            "description": "User's full name.",
            "title": "Full Name"
          },
          "id": {
            "description": "User id associated with this resource.",
            "title": "User ID",
            "type": "string"
          },
          "userhash": {
            "anyOf": [
              {
                "type": "string"
              },
              {
                "type": "null"
              }
            ],
            "description": "User's gravatar hash.",
            "title": "Userhash"
          },
          "username": {
            "anyOf": [
              {
                "type": "string"
              },
              {
                "type": "null"
              }
            ],
            "description": "Username.",
            "title": "Username"
          }
        },
        "required": [
          "id"
        ],
        "title": "UserData",
        "type": "object"
      },
      "title": "Owners",
      "type": "array"
    },
    "permissions": {
      "anyOf": [
        {
          "items": {
            "description": "Represents the particular role a user, group or organization holds on an entity.",
            "enum": [
              "CAN_VIEW",
              "CAN_UPDATE",
              "CAN_DELETE",
              "CAN_SHARE",
              "CAN_MAKE_PREDICTIONS",
              "CAN_SHARE_ROLE_OWNER",
              "CAN_SHARE_ROLE_READ_WRITE",
              "CAN_SHARE_ROLE_READ_ONLY"
            ],
            "title": "ResourcePermission",
            "type": "string"
          },
          "type": "array"
        },
        {
          "items": {
            "const": "*",
            "type": "string"
          },
          "type": "array"
        }
      ]
    },
    "protonId": {
      "anyOf": [
        {
          "type": "string"
        },
        {
          "type": "null"
        }
      ],
      "description": "Id of the currently active proton for this workload.",
      "title": "Proton ID"
    },
    "replacement": {
      "anyOf": [
        {
          "description": "Formatted replacement information for API responses.",
          "properties": {
            "candidateProtonIds": {
              "description": "Ids of protons pending promotion during artifact replacement.",
              "items": {
                "type": "string"
              },
              "title": "Candidateprotonids",
              "type": "array"
            },
            "status": {
              "description": "Statuses for workload replacement process.",
              "enum": [
                "unknown",
                "submitted",
                "initializing",
                "awaiting_promotion",
                "switching",
                "deleting",
                "completed",
                "errored",
                "cleaning_up"
              ],
              "title": "ReplacementStatus",
              "type": "string"
            },
            "strategy": {
              "description": "Types of replacement strategies. `rolling` - the new proton is deployed alongside the old one, and trafic is switched to the new proton once it is ready. the old proton is then decommissioned.",
              "enum": [
                "rolling"
              ],
              "title": "ReplacementStrategy",
              "type": "string"
            }
          },
          "title": "WorkloadReplacementFormatted",
          "type": "object"
        },
        {
          "type": "null"
        }
      ],
      "description": "Information about an active replacement process for this workload, if any.",
      "title": "Replacement"
    },
    "requestStats": {
      "anyOf": [
        {
          "additionalProperties": false,
          "description": "Request statistics summary.",
          "properties": {
            "concurrentRequests": {
              "default": 0,
              "description": "Number of concurrent requests.",
              "title": "Concurrentrequests",
              "type": "integer"
            },
            "errorRate": {
              "default": 0,
              "description": "Error rate percentage.",
              "title": "Errorrate",
              "type": "number"
            },
            "errorRates": {
              "description": "Error rates over the last 7 time periods.",
              "items": {
                "type": "integer"
              },
              "title": "Errorrates",
              "type": "array"
            },
            "lastRequestAt": {
              "anyOf": [
                {
                  "format": "date-time",
                  "type": "string"
                },
                {
                  "type": "null"
                }
              ],
              "description": "Timestamp of the last request.",
              "title": "Lastrequestat"
            },
            "requestRates": {
              "description": "Request rates over the last 7 time periods.",
              "items": {
                "type": "integer"
              },
              "title": "Requestrates",
              "type": "array"
            },
            "responseTime": {
              "default": 0,
              "description": "Average response time in milliseconds.",
              "title": "Responsetime",
              "type": "integer"
            },
            "totalRequests": {
              "default": 0,
              "description": "Total number of requests.",
              "title": "Totalrequests",
              "type": "integer"
            }
          },
          "title": "RequestStats",
          "type": "object"
        },
        {
          "type": "null"
        }
      ],
      "description": "Request statistics for this workload.",
      "title": "Request Stats"
    },
    "runtime": {
      "additionalProperties": false,
      "description": "Runtime configuration for a workload. for service and nim artifacts, all configuration is scoped inside ``container_groups``, each identified by name matching the artifact topology.",
      "properties": {
        "containerGroups": {
          "description": "Per-group runtime configuration. each entry's name must match a group in the artifact.",
          "items": {
            "additionalProperties": false,
            "description": "Runtime configuration for a single container group.",
            "properties": {
              "autoscaling": {
                "anyOf": [
                  {
                    "additionalProperties": false,
                    "description": "Autoscaling configuration for a proton.",
                    "properties": {
                      "enabled": {
                        "default": true,
                        "description": "Whether autoscaling is enabled.",
                        "title": "Enabled",
                        "type": "boolean"
                      },
                      "policies": {
                        "items": {
                          "additionalProperties": false,
                          "description": "Base class for autoscaling policies.",
                          "properties": {
                            "maxCount": {
                              "description": "Maximum number of replicas.",
                              "minimum": 0,
                              "title": "Max Count",
                              "type": "integer"
                            },
                            "minCount": {
                              "description": "Minimum number of replicas.",
                              "minimum": 0,
                              "title": "Min Count",
                              "type": "integer"
                            },
                            "priority": {
                              "anyOf": [
                                {
                                  "type": "integer"
                                },
                                {
                                  "type": "null"
                                }
                              ],
                              "description": "Policy priority when multiple policies are defined.",
                              "title": "Priority"
                            },
                            "scalingMetric": {
                              "anyOf": [
                                {
                                  "oneOf": [
                                    {
                                      "const": "cpuAverageUtilization",
                                      "description": "Scale replicas to maintain a target average CPU utilization across pods.",
                                      "title": "CPU Average Utilization"
                                    },
                                    {
                                      "const": "httpRequestsConcurrency",
                                      "description": "Scale replicas based on HTTP request concurrency using an external HTTP-aware autoscaler. The platform manages the underlying autoscaling resources on your behalf. This scaling option will scale to zero replicas when the proton is idle.",
                                      "title": "HTTP Requests Concurrency"
                                    },
                                    {
                                      "const": "gpuCacheUtilization",
                                      "description": "Scales replicas based on model-specific GPU memory cache utilization. This signal reflects how the model's KV cache is used during inference, when such metrics are exposed by the serving runtime. High cache utilization may indicate memory pressure and can be used to trigger scale-out to maintain throughput. Applicable to NIM Artifacts only.",
                                      "title": "GPU Cache Utilization"
                                    },
                                    {
                                      "const": "gpuRequestQueueDepth",
                                      "description": "Scales replicas based on the depth of the inference request queue. This metric represents the number of incoming requests waiting to be processed by the inference service. Increasing queue depth may indicate insufficient capacity and can be used to trigger additional replicas to reduce latency. Applicable to NIM Artifacts only.",
                                      "title": "GPU Request Queue Depth"
                                    }
                                  ],
                                  "title": "ScalingMetricType",
                                  "type": "string"
                                },
                                {
                                  "type": "string"
                                }
                              ],
                              "description": "Metric used for scaling decisions. use one of the predefined values for standard autoscaling, or provide a custom metric name for nim 2.0 workloads (e.g. 'vllm:kv_cache_usage_perc'). custom metric names are only supported for nim artifacts.",
                              "title": "Scaling Metric"
                            },
                            "target": {
                              "description": "Target value for the scaling metric.",
                              "minimum": 0,
                              "title": "Target",
                              "type": "number"
                            }
                          },
                          "required": [
                            "scalingMetric",
                            "target",
                            "minCount",
                            "maxCount"
                          ],
                          "title": "AutoscalingPolicy",
                          "type": "object"
                        },
                        "title": "Policies",
                        "type": "array"
                      }
                    },
                    "required": [
                      "policies"
                    ],
                    "title": "AutoscalingProperties",
                    "type": "object"
                  },
                  {
                    "type": "null"
                  }
                ],
                "description": "Autoscaling configuration for this group. Takes precedence over replicaCount."
              },
              "bundleSelectionPolicy": {
                "enum": [
                  "availability"
                ],
                "title": "BundleSelectionPolicy",
                "type": "string"
              },
              "containers": {
                "description": "Per-container overrides for this group.",
                "items": {
                  "additionalProperties": false,
                  "description": "Runtime diff targeting a single named container within a group.",
                  "properties": {
                    "name": {
                      "description": "Container name. Must match a container declared in the artifact group.",
                      "title": "Name",
                      "type": "string"
                    },
                    "resourceAllocation": {
                      "anyOf": [
                        {
                          "additionalProperties": false,
                          "description": "Per-container resource allocation declared at runtime.",
                          "properties": {
                            "cpu": {
                              "anyOf": [
                                {
                                  "minimum": 0.1,
                                  "type": "number"
                                },
                                {
                                  "type": "null"
                                }
                              ],
                              "description": "CPU cores allocated to this container.",
                              "title": "Cpu"
                            },
                            "gpu": {
                              "anyOf": [
                                {
                                  "minimum": 0,
                                  "type": "number"
                                },
                                {
                                  "type": "null"
                                }
                              ],
                              "description": "GPUs allocated to this container.",
                              "title": "Gpu"
                            },
                            "memory": {
                              "anyOf": [
                                {
                                  "pattern": "^\\s*(\\d*\\.?\\d+)\\s*(\\w+)?",
                                  "type": "string"
                                },
                                {
                                  "minimum": 0,
                                  "type": "integer"
                                },
                                {
                                  "type": "null"
                                }
                              ],
                              "description": "RAM allocated to this container. Accepts a human-readable string with one of: B, KB, MB, GB (1000-based), e.g. '8GB', '512MB'. Also accepts raw byte integers.",
                              "examples": [
                                "8GB",
                                "512MB"
                              ],
                              "title": "Memory"
                            }
                          },
                          "title": "ResourceAllocation",
                          "type": "object"
                        },
                        {
                          "type": "null"
                        }
                      ],
                      "description": "Resource allocation for this container. Required for multi-container groups."
                    }
                  },
                  "required": [
                    "name"
                  ],
                  "title": "ContainerOverride",
                  "type": "object"
                },
                "title": "Containers",
                "type": "array"
              },
              "name": {
                "default": "default",
                "description": "Group name. Must match a container group name declared in the artifact.",
                "title": "Name",
                "type": "string"
              },
              "replicaCount": {
                "anyOf": [
                  {
                    "minimum": 1,
                    "type": "integer"
                  },
                  {
                    "type": "null"
                  }
                ],
                "default": 1,
                "description": "Number of replicas. Cannot be set alongside autoscaling.enabled=true.",
                "title": "Replicacount"
              },
              "resolvedBundle": {
                "anyOf": [
                  {
                    "description": "Bundle details returned in the runtime response after scheduling.",
                    "properties": {
                      "cpuCount": {
                        "description": "Number of CPU cores.",
                        "title": "CPU Count",
                        "type": "number"
                      },
                      "gpuCount": {
                        "default": 0,
                        "description": "Number of GPU units.",
                        "title": "GPU Count",
                        "type": "integer"
                      },
                      "gpuMaker": {
                        "anyOf": [
                          {
                            "type": "string"
                          },
                          {
                            "type": "null"
                          }
                        ],
                        "description": "GPU manufacturer.",
                        "title": "GPU Maker"
                      },
                      "gpuTypeLabel": {
                        "anyOf": [
                          {
                            "type": "string"
                          },
                          {
                            "type": "null"
                          }
                        ],
                        "description": "GPU type label.",
                        "title": "GPU Type Label"
                      },
                      "id": {
                        "description": "Bundle identifier that was selected.",
                        "title": "Id",
                        "type": "string"
                      },
                      "memoryBytes": {
                        "description": "Memory size in bytes.",
                        "title": "Memory Bytes",
                        "type": "integer"
                      }
                    },
                    "required": [
                      "id",
                      "cpuCount",
                      "memoryBytes"
                    ],
                    "title": "ResolvedBundle",
                    "type": "object"
                  },
                  {
                    "type": "null"
                  }
                ],
                "description": "Full details of the bundle selected at scheduling time. Read-only.",
                "readOnly": true
              },
              "resourceBundles": {
                "description": "Ordered list of bundle IDs. One is selected at scheduling time.",
                "items": {
                  "type": "string"
                },
                "title": "Resourcebundles",
                "type": "array"
              }
            },
            "title": "GroupRuntime",
            "type": "object"
          },
          "title": "Containergroups",
          "type": "array"
        }
      },
      "title": "WorkloadRuntime",
      "type": "object"
    },
    "status": {
      "description": "User-facing workload status. A subset of :class:`protonstatus` that excludes internal proton-lifecycle states (warming, draining, restarting), which should never be surfaced as a workload status.",
      "enum": [
        "unknown",
        "submitted",
        "provisioning",
        "launching",
        "running",
        "suspended",
        "interrupted",
        "stopping",
        "stopped",
        "errored",
        "terminated"
      ],
      "title": "WorkloadStatus",
      "type": "string"
    },
    "tags": {
      "items": {
        "additionalProperties": false,
        "properties": {
          "id": {
            "description": "Unique identifier of the tag.",
            "title": "Id",
            "type": "string"
          },
          "name": {
            "description": "Name of the tag.",
            "title": "Name",
            "type": "string"
          },
          "value": {
            "description": "Value of the tag.",
            "title": "Value",
            "type": "string"
          }
        },
        "required": [
          "id",
          "name",
          "value"
        ],
        "title": "TagInfo",
        "type": "object"
      },
      "type": "array"
    },
    "type": {
      "description": "Discriminator for the artifact spec variant. Used to label the workload, which may be used to prioritize the best matching operator available in the cluster for scheduling. Defaults to ``service`` when omitted. - ``service``: generic service artifact. - ``nim``: NVIDIA NIM model artifact.",
      "enum": [
        "service",
        "nim"
      ],
      "title": "ArtifactType",
      "type": "string"
    },
    "updatedAt": {
      "description": "Timestamp of when the entity was last updated.",
      "format": "date-time",
      "title": "Updated At",
      "type": "string"
    }
  },
  "required": [
    "id",
    "name",
    "createdAt",
    "updatedAt"
  ],
  "title": "WorkloadFormatted",
  "type": "object"
}
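
The `memory` field of `ResourceAllocation` accepts either a raw byte integer or a human-readable string with 1000-based units. The following is a minimal client-side sketch of how such values might be normalized to bytes, for example before comparing them with `memoryBytes` in `resolvedBundle`. The helper name `memory_to_bytes` is hypothetical, and treating a bare number without a unit as bytes is an assumption rather than documented behavior.

```python
import re
from typing import Optional, Union

# Pattern copied from the schema: a number followed by an optional unit.
_MEMORY_PATTERN = re.compile(r"^\s*(\d*\.?\d+)\s*(\w+)?")

# 1000-based multipliers, as stated in the field description.
_UNITS = {"b": 1, "kb": 1000, "mb": 1000**2, "gb": 1000**3}


def memory_to_bytes(value: Union[str, int, None]) -> Optional[int]:
    """Convert a `memory` value to bytes (hypothetical client-side helper)."""
    if value is None:
        return None
    if isinstance(value, int):
        if value < 0:
            raise ValueError("memory must be >= 0")
        return value
    match = _MEMORY_PATTERN.match(value)
    if match is None:
        raise ValueError(f"unrecognized memory value: {value!r}")
    number, unit = match.groups()
    # Assumption: a bare number without a unit is interpreted as bytes.
    multiplier = _UNITS.get((unit or "b").lower())
    if multiplier is None:
        raise ValueError(f"unsupported unit: {unit!r}")
    return round(float(number) * multiplier)
```

Units outside the four listed in the description (for example `GiB`) are rejected here instead of being guessed at.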

WorkloadFormatted

Properties

Name Type Required Restrictions Description
artifact any false Basic information about the currently active artifact for this workload.

anyOf

Name Type Required Restrictions Description
» anonymous ArtifactInfoFormatted false Artifact basic information.

or

Name Type Required Restrictions Description
» anonymous null false none

continued

Name Type Required Restrictions Description
artifactId any false ID of the currently active artifact for this workload.

anyOf

Name Type Required Restrictions Description
» anonymous string false none

or

Name Type Required Restrictions Description
» anonymous null false none

continued

Name Type Required Restrictions Description
createdAt string(date-time) true Timestamp of when the entity was created.
creator any false Owner user details including id, username and email.

anyOf

Name Type Required Restrictions Description
» anonymous UserData false User information embedded in API responses.

or

Name Type Required Restrictions Description
» anonymous null false none

continued

Name Type Required Restrictions Description
description any false Workload description.

anyOf

Name Type Required Restrictions Description
» anonymous string false none

or

Name Type Required Restrictions Description
» anonymous null false none

continued

Name Type Required Restrictions Description
endpoint any false Workload endpoint URL.

anyOf

Name Type Required Restrictions Description
» anonymous string false none

or

Name Type Required Restrictions Description
» anonymous null false none

continued

Name Type Required Restrictions Description
id string true Unique identifier of the entity.
importance WorkloadImportance false Workload importance level.
lastResponse any false Timestamp of the last response received from this workload.

anyOf

Name Type Required Restrictions Description
» anonymous string(date-time) false none

or

Name Type Required Restrictions Description
» anonymous null false none

continued

Name Type Required Restrictions Description
name string true Name of the entity.
owners [UserData] false List of workload owners.
permissions ResourcePermissions false User permissions for this workload.
protonId any false ID of the currently active proton for this workload.

anyOf

Name Type Required Restrictions Description
» anonymous string false none

or

Name Type Required Restrictions Description
» anonymous null false none

continued

Name Type Required Restrictions Description
replacement any false Information about an active replacement process for this workload, if any.

anyOf

Name Type Required Restrictions Description
» anonymous WorkloadReplacementFormatted false Formatted replacement information for API responses.

or

Name Type Required Restrictions Description
» anonymous null false none

continued

Name Type Required Restrictions Description
requestStats any false Request statistics for this workload.

anyOf

Name Type Required Restrictions Description
» anonymous RequestStats false Request statistics summary.

or

Name Type Required Restrictions Description
» anonymous null false none

continued

Name Type Required Restrictions Description
runtime WorkloadRuntime false Runtime configuration sourced from the current proton.
status WorkloadStatus false Current workload status.
tags Tags false Tags associated with this workload.
type ArtifactType false Workload artifact type.
updatedAt string(date-time) true Timestamp of when the entity was last updated.
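
Only `id`, `name`, `createdAt`, and `updatedAt` are required in the table above, and several other properties may be `null`. The sketch below shows one way a client might read a `WorkloadFormatted` record defensively. The function names `parse_timestamp` and `summarize_workload` are hypothetical, and the choice of which fields to summarize is illustrative only.

```python
from datetime import datetime
from typing import Any, Dict

REQUIRED_FIELDS = ("id", "name", "createdAt", "updatedAt")


def parse_timestamp(value: str) -> datetime:
    # fromisoformat() rejects a trailing "Z" before Python 3.11, so normalize it.
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def summarize_workload(record: Dict[str, Any]) -> Dict[str, Any]:
    """Extract a few fields from a workload record, tolerating nulls."""
    missing = [field for field in REQUIRED_FIELDS if field not in record]
    if missing:
        raise KeyError(f"missing required fields: {missing}")
    # `artifact` may be absent or null; fall back to an empty mapping.
    artifact = record.get("artifact") or {}
    return {
        "id": record["id"],
        "name": record["name"],
        "createdAt": parse_timestamp(record["createdAt"]),
        "updatedAt": parse_timestamp(record["updatedAt"]),
        "status": record.get("status"),
        "endpoint": record.get("endpoint"),
        "artifactName": artifact.get("name"),
    }
```

Optional fields come back as `None` when they are missing or `null`, so callers can treat both cases the same way.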

WorkloadImportance

{
  "description": "Importance level for workloads.",
  "enum": [
    "critical",
    "high",
    "moderate",
    "low"
  ],
  "title": "WorkloadImportance",
  "type": "string"
}

WorkloadImportance

Properties

Name Type Required Restrictions Description
WorkloadImportance string false Importance level for workloads.

Enumerated Values

Property Value
WorkloadImportance [critical, high, moderate, low]
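
As a small illustration, a client could sort workload records by `importance`. The sketch below assumes that the order in which the schema lists the enum values (critical, high, moderate, low) runs from most to least important; the schema does not state this explicitly, and `sort_by_importance` is a hypothetical helper.

```python
from typing import Any, Dict, List

# Enum values in the order the schema lists them.
IMPORTANCE_ORDER = ["critical", "high", "moderate", "low"]
_RANK = {level: index for index, level in enumerate(IMPORTANCE_ORDER)}


def sort_by_importance(workloads: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Return workloads ordered from most to least important."""
    # Records with a missing or unrecognized importance sort last.
    return sorted(
        workloads,
        key=lambda w: _RANK.get(w.get("importance"), len(_RANK)),
    )
```

Because `sorted` is stable, records with the same importance keep their original relative order.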

WorkloadListResponse

{
  "additionalProperties": false,
  "properties": {
    "count": {
      "description": "The number of records on this page.",
      "title": "Count",
      "type": "integer"
    },
    "data": {
      "description": "The list of records.",
      "items": {
        "additionalProperties": false,
        "description": "API representation of a workload. This is the formatted version returned to clients, excluding internal fields and including computed properties like permissions and statistics.",
        "properties": {
          "artifact": {
            "anyOf": [
              {
                "description": "Artifact basic information.",
                "properties": {
                  "artifactRepositoryId": {
                    "anyOf": [
                      {
                        "type": "string"
                      },
                      {
                        "type": "null"
                      }
                    ],
                    "description": "ID of the artifact repository this artifact belongs to (for versioning).",
                    "title": "Artifactrepositoryid"
                  },
                  "id": {
                    "description": "Unique identifier of the entity.",
                    "title": "Id",
                    "type": "string"
                  },
                  "name": {
                    "anyOf": [
                      {
                        "type": "string"
                      },
                      {
                        "type": "null"
                      }
                    ],
                    "description": "Name of the entity.",
                    "title": "Name"
                  },
                  "status": {
                    "anyOf": [
                      {
                        "enum": [
                          "draft",
                          "locked"
                        ],
                        "title": "ArtifactStatus",
                        "type": "string"
                      },
                      {
                        "type": "null"
                      }
                    ],
                    "description": "Artifact status."
                  },
                  "templateId": {
                    "anyOf": [
                      {
                        "type": "string"
                      },
                      {
                        "type": "null"
                      }
                    ],
                    "description": "ID of the template used to create this artifact.",
                    "title": "Templateid"
                  },
                  "type": {
                    "anyOf": [
                      {
                        "description": "Discriminator for the artifact spec variant. Used to label the workload, which may be used to prioritize the best matching operator available in the cluster for scheduling. Defaults to ``service`` when omitted. - ``service``: generic service artifact. - ``nim``: NVIDIA NIM model artifact.",
                        "enum": [
                          "service",
                          "nim"
                        ],
                        "title": "ArtifactType",
                        "type": "string"
                      },
                      {
                        "type": "null"
                      }
                    ],
                    "description": "Artifact type."
                  },
                  "version": {
                    "anyOf": [
                      {
                        "type": "integer"
                      },
                      {
                        "type": "null"
                      }
                    ],
                    "description": "Version number of the artifact (set only for locked artifacts).",
                    "title": "Version"
                  }
                },
                "required": [
                  "id"
                ],
                "title": "ArtifactInfoFormatted",
                "type": "object"
              },
              {
                "type": "null"
              }
            ],
            "description": "Basic information about the currently active artifact for this workload.",
            "title": "Artifact"
          },
          "artifactId": {
            "anyOf": [
              {
                "type": "string"
              },
              {
                "type": "null"
              }
            ],
            "description": "ID of the currently active artifact for this workload.",
            "title": "Artifact ID"
          },
          "createdAt": {
            "description": "Timestamp of when the entity was created.",
            "format": "date-time",
            "title": "Created At",
            "type": "string"
          },
          "creator": {
            "anyOf": [
              {
                "additionalProperties": false,
                "description": "User information embedded in API responses.",
                "properties": {
                  "email": {
                    "anyOf": [
                      {
                        "type": "string"
                      },
                      {
                        "type": "null"
                      }
                    ],
                    "description": "User email address.",
                    "title": "Email"
                  },
                  "fullName": {
                    "anyOf": [
                      {
                        "type": "string"
                      },
                      {
                        "type": "null"
                      }
                    ],
                    "description": "User's full name.",
                    "title": "Full Name"
                  },
                  "id": {
                    "description": "User ID associated with this resource.",
                    "title": "User ID",
                    "type": "string"
                  },
                  "userhash": {
                    "anyOf": [
                      {
                        "type": "string"
                      },
                      {
                        "type": "null"
                      }
                    ],
                    "description": "User's Gravatar hash.",
                    "title": "Userhash"
                  },
                  "username": {
                    "anyOf": [
                      {
                        "type": "string"
                      },
                      {
                        "type": "null"
                      }
                    ],
                    "description": "Username.",
                    "title": "Username"
                  }
                },
                "required": [
                  "id"
                ],
                "title": "UserData",
                "type": "object"
              },
              {
                "type": "null"
              }
            ],
            "description": "Owner user details including id, username and email.",
            "title": "Creator"
          },
          "description": {
            "anyOf": [
              {
                "type": "string"
              },
              {
                "type": "null"
              }
            ],
            "default": "",
            "description": "Workload description.",
            "title": "Description"
          },
          "endpoint": {
            "anyOf": [
              {
                "type": "string"
              },
              {
                "type": "null"
              }
            ],
            "description": "Workload endpoint URL.",
            "title": "Endpoint"
          },
          "id": {
            "description": "Unique identifier of the entity.",
            "title": "ID",
            "type": "string"
          },
          "importance": {
            "description": "Importance level for workloads.",
            "enum": [
              "critical",
              "high",
              "moderate",
              "low"
            ],
            "title": "WorkloadImportance",
            "type": "string"
          },
          "lastResponse": {
            "anyOf": [
              {
                "format": "date-time",
                "type": "string"
              },
              {
                "type": "null"
              }
            ],
            "description": "Timestamp of the last response received from this workload.",
            "title": "Last Response Time"
          },
          "name": {
            "description": "Name of the entity.",
            "title": "Name",
            "type": "string"
          },
          "owners": {
            "description": "List of workload owners.",
            "items": {
              "additionalProperties": false,
              "description": "User information embedded in API responses.",
              "properties": {
                "email": {
                  "anyOf": [
                    {
                      "type": "string"
                    },
                    {
                      "type": "null"
                    }
                  ],
                  "description": "User email address.",
                  "title": "Email"
                },
                "fullName": {
                  "anyOf": [
                    {
                      "type": "string"
                    },
                    {
                      "type": "null"
                    }
                  ],
                  "description": "User's full name.",
                  "title": "Full Name"
                },
                "id": {
                  "description": "User ID associated with this resource.",
                  "title": "User ID",
                  "type": "string"
                },
                "userhash": {
                  "anyOf": [
                    {
                      "type": "string"
                    },
                    {
                      "type": "null"
                    }
                  ],
                  "description": "User's Gravatar hash.",
                  "title": "Userhash"
                },
                "username": {
                  "anyOf": [
                    {
                      "type": "string"
                    },
                    {
                      "type": "null"
                    }
                  ],
                  "description": "Username.",
                  "title": "Username"
                }
              },
              "required": [
                "id"
              ],
              "title": "UserData",
              "type": "object"
            },
            "title": "Owners",
            "type": "array"
          },
          "permissions": {
            "anyOf": [
              {
                "items": {
                  "description": "Represents the particular role a user, group or organization holds on an entity.",
                  "enum": [
                    "CAN_VIEW",
                    "CAN_UPDATE",
                    "CAN_DELETE",
                    "CAN_SHARE",
                    "CAN_MAKE_PREDICTIONS",
                    "CAN_SHARE_ROLE_OWNER",
                    "CAN_SHARE_ROLE_READ_WRITE",
                    "CAN_SHARE_ROLE_READ_ONLY"
                  ],
                  "title": "ResourcePermission",
                  "type": "string"
                },
                "type": "array"
              },
              {
                "items": {
                  "const": "*",
                  "type": "string"
                },
                "type": "array"
              }
            ]
          },
          "protonId": {
            "anyOf": [
              {
                "type": "string"
              },
              {
                "type": "null"
              }
            ],
            "description": "ID of the currently active proton for this workload.",
            "title": "Proton ID"
          },
          "replacement": {
            "anyOf": [
              {
                "description": "Formatted replacement information for API responses.",
                "properties": {
                  "candidateProtonIds": {
                    "description": "IDs of protons pending promotion during artifact replacement.",
                    "items": {
                      "type": "string"
                    },
                    "title": "Candidateprotonids",
                    "type": "array"
                  },
                  "status": {
                    "description": "Statuses for workload replacement process.",
                    "enum": [
                      "unknown",
                      "submitted",
                      "initializing",
                      "awaiting_promotion",
                      "switching",
                      "deleting",
                      "completed",
                      "errored",
                      "cleaning_up"
                    ],
                    "title": "ReplacementStatus",
                    "type": "string"
                  },
                  "strategy": {
                    "description": "Types of replacement strategies. `rolling` - the new proton is deployed alongside the old one, and traffic is switched to the new proton once it is ready. The old proton is then decommissioned.",
                    "enum": [
                      "rolling"
                    ],
                    "title": "ReplacementStrategy",
                    "type": "string"
                  }
                },
                "title": "WorkloadReplacementFormatted",
                "type": "object"
              },
              {
                "type": "null"
              }
            ],
            "description": "Information about an active replacement process for this workload, if any.",
            "title": "Replacement"
          },
          "requestStats": {
            "anyOf": [
              {
                "additionalProperties": false,
                "description": "Request statistics summary.",
                "properties": {
                  "concurrentRequests": {
                    "default": 0,
                    "description": "Number of concurrent requests.",
                    "title": "Concurrentrequests",
                    "type": "integer"
                  },
                  "errorRate": {
                    "default": 0,
                    "description": "Error rate percentage.",
                    "title": "Errorrate",
                    "type": "number"
                  },
                  "errorRates": {
                    "description": "Error rates over the last 7 time periods.",
                    "items": {
                      "type": "integer"
                    },
                    "title": "Errorrates",
                    "type": "array"
                  },
                  "lastRequestAt": {
                    "anyOf": [
                      {
                        "format": "date-time",
                        "type": "string"
                      },
                      {
                        "type": "null"
                      }
                    ],
                    "description": "Timestamp of the last request.",
                    "title": "Lastrequestat"
                  },
                  "requestRates": {
                    "description": "Request rates over the last 7 time periods.",
                    "items": {
                      "type": "integer"
                    },
                    "title": "Requestrates",
                    "type": "array"
                  },
                  "responseTime": {
                    "default": 0,
                    "description": "Average response time in milliseconds.",
                    "title": "Responsetime",
                    "type": "integer"
                  },
                  "totalRequests": {
                    "default": 0,
                    "description": "Total number of requests.",
                    "title": "Totalrequests",
                    "type": "integer"
                  }
                },
                "title": "RequestStats",
                "type": "object"
              },
              {
                "type": "null"
              }
            ],
            "description": "Request statistics for this workload.",
            "title": "Request Stats"
          },
          "runtime": {
            "additionalProperties": false,
            "description": "Runtime configuration for a workload. For service and NIM artifacts, all configuration is scoped inside ``container_groups``, each identified by name matching the artifact topology.",
            "properties": {
              "containerGroups": {
                "description": "Per-group runtime configuration. Each entry's name must match a group in the artifact.",
                "items": {
                  "additionalProperties": false,
                  "description": "Runtime configuration for a single container group.",
                  "properties": {
                    "autoscaling": {
                      "anyOf": [
                        {
                          "additionalProperties": false,
                          "description": "Autoscaling configuration for a proton.",
                          "properties": {
                            "enabled": {
                              "default": true,
                              "description": "Whether autoscaling is enabled.",
                              "title": "Enabled",
                              "type": "boolean"
                            },
                            "policies": {
                              "items": {
                                "additionalProperties": false,
                                "description": "Base class for autoscaling policies.",
                                "properties": {
                                  "maxCount": {
                                    "description": "Maximum number of replicas.",
                                    "minimum": 0,
                                    "title": "Max Count",
                                    "type": "integer"
                                  },
                                  "minCount": {
                                    "description": "Minimum number of replicas.",
                                    "minimum": 0,
                                    "title": "Min Count",
                                    "type": "integer"
                                  },
                                  "priority": {
                                    "anyOf": [
                                      {
                                        "type": "integer"
                                      },
                                      {
                                        "type": "null"
                                      }
                                    ],
                                    "description": "Policy priority when multiple policies are defined.",
                                    "title": "Priority"
                                  },
                                  "scalingMetric": {
                                    "anyOf": [
                                      {
                                        "oneOf": [
                                          {
                                            "const": "cpuAverageUtilization",
                                            "description": "Scale replicas to maintain a target average CPU utilization across pods.",
                                            "title": "CPU Average Utilization"
                                          },
                                          {
                                            "const": "httpRequestsConcurrency",
                                            "description": "Scale replicas based on HTTP request concurrency using an external HTTP-aware autoscaler. The platform manages the underlying autoscaling resources on your behalf. This scaling option will scale to zero replicas when the proton is idle.",
                                            "title": "HTTP Requests Concurrency"
                                          },
                                          {
                                            "const": "gpuCacheUtilization",
                                            "description": "Scales replicas based on model-specific GPU memory cache utilization. This signal reflects how the model's KV cache is used during inference, when such metrics are exposed by the serving runtime. High cache utilization may indicate memory pressure and can be used to trigger scale-out to maintain throughput. Applicable to NIM Artifacts only.",
                                            "title": "GPU Cache Utilization"
                                          },
                                          {
                                            "const": "gpuRequestQueueDepth",
                                            "description": "Scales replicas based on the depth of the inference request queue. This metric represents the number of incoming requests waiting to be processed by the inference service. Increasing queue depth may indicate insufficient capacity and can be used to trigger additional replicas to reduce latency. Applicable to NIM Artifacts only.",
                                            "title": "GPU Request Queue Depth"
                                          }
                                        ],
                                        "title": "ScalingMetricType",
                                        "type": "string"
                                      },
                                      {
                                        "type": "string"
                                      }
                                    ],
                                    "description": "Metric used for scaling decisions. use one of the predefined values for standard autoscaling, or provide a custom metric name for nim 2.0 workloads (e.g. 'vllm:kv_cache_usage_perc'). custom metric names are only supported for nim artifacts.",
                                    "title": "Scaling Metric"
                                  },
                                  "target": {
                                    "description": "Target value for the scaling metric.",
                                    "minimum": 0,
                                    "title": "Target",
                                    "type": "number"
                                  }
                                },
                                "required": [
                                  "scalingMetric",
                                  "target",
                                  "minCount",
                                  "maxCount"
                                ],
                                "title": "AutoscalingPolicy",
                                "type": "object"
                              },
                              "title": "Policies",
                              "type": "array"
                            }
                          },
                          "required": [
                            "policies"
                          ],
                          "title": "AutoscalingProperties",
                          "type": "object"
                        },
                        {
                          "type": "null"
                        }
                      ],
                      "description": "Autoscaling configuration for this group. takes precedence over replicacount."
                    },
                    "bundleSelectionPolicy": {
                      "enum": [
                        "availability"
                      ],
                      "title": "BundleSelectionPolicy",
                      "type": "string"
                    },
                    "containers": {
                      "description": "Per-container overrides for this group.",
                      "items": {
                        "additionalProperties": false,
                        "description": "Runtime diff targeting a single named container within a group.",
                        "properties": {
                          "name": {
                            "description": "Container name. must match a container declared in the artifact group.",
                            "title": "Name",
                            "type": "string"
                          },
                          "resourceAllocation": {
                            "anyOf": [
                              {
                                "additionalProperties": false,
                                "description": "Per-container resource allocation declared at runtime.",
                                "properties": {
                                  "cpu": {
                                    "anyOf": [
                                      {
                                        "minimum": 0.1,
                                        "type": "number"
                                      },
                                      {
                                        "type": "null"
                                      }
                                    ],
                                    "description": "Cpu cores allocated to this container.",
                                    "title": "Cpu"
                                  },
                                  "gpu": {
                                    "anyOf": [
                                      {
                                        "minimum": 0,
                                        "type": "number"
                                      },
                                      {
                                        "type": "null"
                                      }
                                    ],
                                    "description": "Gpus allocated to this container.",
                                    "title": "Gpu"
                                  },
                                  "memory": {
                                    "anyOf": [
                                      {
                                        "pattern": "^\\s*(\\d*\\.?\\d+)\\s*(\\w+)?",
                                        "type": "string"
                                      },
                                      {
                                        "minimum": 0,
                                        "type": "integer"
                                      },
                                      {
                                        "type": "null"
                                      }
                                    ],
                                    "description": "Ram allocated to this container. accepts a human-readable string with one of: b, kb, mb, gb (1000-based) — e.g. '8gb', '512mb'. also accepts raw byte integers.",
                                    "examples": [
                                      "8GB",
                                      "512MB"
                                    ],
                                    "title": "Memory"
                                  }
                                },
                                "title": "ResourceAllocation",
                                "type": "object"
                              },
                              {
                                "type": "null"
                              }
                            ],
                            "description": "Resource allocation for this container. required for multi-container groups."
                          }
                        },
                        "required": [
                          "name"
                        ],
                        "title": "ContainerOverride",
                        "type": "object"
                      },
                      "title": "Containers",
                      "type": "array"
                    },
                    "name": {
                      "default": "default",
                      "description": "Group name. must match a container group name declared in the artifact.",
                      "title": "Name",
                      "type": "string"
                    },
                    "replicaCount": {
                      "anyOf": [
                        {
                          "minimum": 1,
                          "type": "integer"
                        },
                        {
                          "type": "null"
                        }
                      ],
                      "default": 1,
                      "description": "Number of replicas. cannot be set alongside autoscaling.enabled=true.",
                      "title": "Replicacount"
                    },
                    "resolvedBundle": {
                      "anyOf": [
                        {
                          "description": "Bundle details returned in the runtime response after scheduling.",
                          "properties": {
                            "cpuCount": {
                              "description": "Number of cpu cores.",
                              "title": "CPU Count",
                              "type": "number"
                            },
                            "gpuCount": {
                              "default": 0,
                              "description": "Number of gpu units.",
                              "title": "GPU Count",
                              "type": "integer"
                            },
                            "gpuMaker": {
                              "anyOf": [
                                {
                                  "type": "string"
                                },
                                {
                                  "type": "null"
                                }
                              ],
                              "description": "Gpu manufacturer.",
                              "title": "GPU Maker"
                            },
                            "gpuTypeLabel": {
                              "anyOf": [
                                {
                                  "type": "string"
                                },
                                {
                                  "type": "null"
                                }
                              ],
                              "description": "Gpu type label.",
                              "title": "GPU Type Label"
                            },
                            "id": {
                              "description": "Bundle identifier that was selected.",
                              "title": "Id",
                              "type": "string"
                            },
                            "memoryBytes": {
                              "description": "Memory size in bytes.",
                              "title": "Memory Bytes",
                              "type": "integer"
                            }
                          },
                          "required": [
                            "id",
                            "cpuCount",
                            "memoryBytes"
                          ],
                          "title": "ResolvedBundle",
                          "type": "object"
                        },
                        {
                          "type": "null"
                        }
                      ],
                      "description": "Full details of the bundle selected at scheduling time. read-only.",
                      "readOnly": true
                    },
                    "resourceBundles": {
                      "description": "Ordered list of bundle ids. one is selected at scheduling time.",
                      "items": {
                        "type": "string"
                      },
                      "title": "Resourcebundles",
                      "type": "array"
                    }
                  },
                  "title": "GroupRuntime",
                  "type": "object"
                },
                "title": "Containergroups",
                "type": "array"
              }
            },
            "title": "WorkloadRuntime",
            "type": "object"
          },
          "status": {
            "description": "User-facing workload status. a subset of :class:`protonstatus` — excludes internal proton-lifecycle states (warming, draining, restarting) that should never be surfaced as a workload status.",
            "enum": [
              "unknown",
              "submitted",
              "provisioning",
              "launching",
              "running",
              "suspended",
              "interrupted",
              "stopping",
              "stopped",
              "errored",
              "terminated"
            ],
            "title": "WorkloadStatus",
            "type": "string"
          },
          "tags": {
            "items": {
              "additionalProperties": false,
              "properties": {
                "id": {
                  "description": "Unique identifier of the tag.",
                  "title": "Id",
                  "type": "string"
                },
                "name": {
                  "description": "Name of the tag.",
                  "title": "Name",
                  "type": "string"
                },
                "value": {
                  "description": "Value of the tag.",
                  "title": "Value",
                  "type": "string"
                }
              },
              "required": [
                "id",
                "name",
                "value"
              ],
              "title": "TagInfo",
              "type": "object"
            },
            "type": "array"
          },
          "type": {
            "description": "Discriminator for the artifact spec variant. used to label the workload, which may be used to prioritize the best matching operator available in the cluster for scheduling. defaults to ``service`` when omitted. - ``service``: generic service artifact. - ``nim``: nvidia nim model artifact.",
            "enum": [
              "service",
              "nim"
            ],
            "title": "ArtifactType",
            "type": "string"
          },
          "updatedAt": {
            "description": "Timestamp of when the entity was last updated.",
            "format": "date-time",
            "title": "Updated At",
            "type": "string"
          }
        },
        "required": [
          "id",
          "name",
          "createdAt",
          "updatedAt"
        ],
        "title": "WorkloadFormatted",
        "type": "object"
      },
      "title": "Data",
      "type": "array"
    },
    "next": {
      "anyOf": [
        {
          "type": "string"
        },
        {
          "type": "null"
        }
      ],
      "description": "The url to the next page, or `null` if there is no such page.",
      "title": "Next"
    },
    "previous": {
      "anyOf": [
        {
          "type": "string"
        },
        {
          "type": "null"
        }
      ],
      "description": "The url to the previous page, or `null` if there is no such page.",
      "title": "Previous"
    },
    "totalCount": {
      "description": "The total number of records.",
      "title": "Totalcount",
      "type": "integer"
    }
  },
  "required": [
    "totalCount",
    "count",
    "next",
    "previous",
    "data"
  ],
  "title": "WorkloadListResponse",
  "type": "object"
}
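
The `memory` field of `resourceAllocation` in the schema above accepts either a raw byte integer or a string such as '8GB' with 1000-based units, while `resolvedBundle` reports `memoryBytes` as an integer. The Python sketch below shows one way a client might normalize `memory` values to bytes before comparing the two. The helper name `memory_to_bytes` is hypothetical, and treating a string without a unit as bytes is an assumption rather than documented behavior.

```python
import re

# Pattern copied from the string variant of the `memory` field in the schema.
MEMORY_PATTERN = re.compile(r"^\s*(\d*\.?\d+)\s*(\w+)?")

# 1000-based multipliers, as stated in the field description.
UNIT_MULTIPLIERS = {"b": 1, "kb": 1000, "mb": 1000**2, "gb": 1000**3}


def memory_to_bytes(value):
    """Convert a `memory` value (string, integer, or None) to a byte count."""
    if value is None:
        return None
    if isinstance(value, bool):
        raise TypeError("memory must be a string, an integer, or null")
    if isinstance(value, int):
        if value < 0:
            raise ValueError("memory must not be negative")
        return value
    match = MEMORY_PATTERN.match(value)
    if match is None:
        raise ValueError(f"unrecognized memory value: {value!r}")
    number, unit = match.groups()
    # Assumption: a string without a unit is interpreted as bytes.
    multiplier = UNIT_MULTIPLIERS.get((unit or "b").lower())
    if multiplier is None:
        raise ValueError(f"unsupported memory unit: {unit!r}")
    return round(float(number) * multiplier)


print(memory_to_bytes("8GB"))    # 8000000000
print(memory_to_bytes("512MB"))  # 512000000
```

Unknown units raise an error here instead of being passed through, so that a typo in a unit is caught on the client side.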

WorkloadListResponse

Properties

Name Type Required Restrictions Description
count integer true The number of records on this page.
data [WorkloadFormatted] true The list of records.
next any true The URL to the next page, or null if there is no such page.

anyOf

Name Type Required Restrictions Description
» anonymous string false none

or

Name Type Required Restrictions Description
» anonymous null false none

continued

Name Type Required Restrictions Description
previous any true The URL to the previous page, or null if there is no such page.

anyOf

Name Type Required Restrictions Description
» anonymous string false none

or

Name Type Required Restrictions Description
» anonymous null false none

continued

Name Type Required Restrictions Description
totalCount integer true The total number of records.
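
Because `next` is always present and is null on the last page, a client can collect all records by following it until it is null. The Python sketch below illustrates that loop. `fetch_page` is a hypothetical callable that returns the decoded JSON body for a URL, and the in-memory pages (including the shape of their URLs) are invented for the demonstration only.

```python
from typing import Any, Callable, Dict, Iterator, Optional

Page = Dict[str, Any]


def iter_workloads(first_url: str, fetch_page: Callable[[str], Page]) -> Iterator[Dict[str, Any]]:
    """Yield every record across pages by following `next` until it is null."""
    url: Optional[str] = first_url
    while url is not None:
        page = fetch_page(url)
        yield from page["data"]
        url = page["next"]  # None (JSON null) on the last page


# In-memory stand-in for the API, used only to demonstrate the loop.
_FAKE_PAGES: Dict[str, Page] = {
    "/workloads?offset=0&limit=2": {
        "count": 2,
        "totalCount": 3,
        "previous": None,
        "next": "/workloads?offset=2&limit=2",
        "data": [{"id": "a"}, {"id": "b"}],
    },
    "/workloads?offset=2&limit=2": {
        "count": 1,
        "totalCount": 3,
        "previous": "/workloads?offset=0&limit=2",
        "next": None,
        "data": [{"id": "c"}],
    },
}

ids = [w["id"] for w in iter_workloads("/workloads?offset=0&limit=2", _FAKE_PAGES.__getitem__)]
print(ids)  # ['a', 'b', 'c']
```

Passing the fetch function in keeps the loop independent of any particular HTTP client or authentication scheme.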

WorkloadMonitorOverallStatus

{
  "additionalProperties": false,
  "description": "Overall status as reported by the workload-monitor service.",
  "properties": {
    "lastUpdated": {
      "description": "Rfc3339 timestamp of the last state transition.",
      "title": "Lastupdated",
      "type": "string"
    },
    "state": {
      "enum": [
        "unknown",
        "submitted",
        "initializing",
        "provisioning",
        "launching",
        "running",
        "suspended",
        "warming",
        "draining",
        "interrupted",
        "restarting",
        "stopping",
        "stopped",
        "errored",
        "terminated"
      ],
      "title": "ProtonStatus",
      "type": "string"
    },
    "summary": {
      "description": "Human-readable description of the current state.",
      "title": "Summary",
      "type": "string"
    }
  },
  "required": [
    "state",
    "summary",
    "lastUpdated"
  ],
  "title": "WorkloadMonitorOverallStatus",
  "type": "object"
}

WorkloadMonitorOverallStatus

Properties

Name Type Required Restrictions Description
lastUpdated string true RFC 3339 timestamp of the last state transition.
state ProtonStatus true Proton state mapped to team-owned ProtonStatus.
summary string true Human-readable description of the current state.
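
The WorkloadStatus description names warming, draining, and restarting as internal states that are not surfaced to users. Comparing the two enums as listed in this reference, `initializing` also appears only in ProtonStatus. The Python sketch below derives that difference and offers a membership check. How the service translates internal states into user-facing ones is not described here, so the sketch does not attempt a mapping, and the function name `is_user_facing` is hypothetical.

```python
# Values copied from the ProtonStatus enum of WorkloadMonitorOverallStatus.state.
PROTON_STATUSES = (
    "unknown", "submitted", "initializing", "provisioning", "launching",
    "running", "suspended", "warming", "draining", "interrupted",
    "restarting", "stopping", "stopped", "errored", "terminated",
)

# Values copied from the WorkloadStatus enum of the workload `status` field.
WORKLOAD_STATUSES = (
    "unknown", "submitted", "provisioning", "launching", "running",
    "suspended", "interrupted", "stopping", "stopped", "errored", "terminated",
)

# States that exist only on the proton side, in ProtonStatus order.
INTERNAL_ONLY = tuple(s for s in PROTON_STATUSES if s not in WORKLOAD_STATUSES)


def is_user_facing(state: str) -> bool:
    """Return True if `state` is also a valid WorkloadStatus value."""
    return state in WORKLOAD_STATUSES


print(INTERNAL_ONLY)  # ('initializing', 'warming', 'draining', 'restarting')
```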

WorkloadOperationResponse

{
  "additionalProperties": false,
  "description": "Acknowledgement returned by asynchronous workload lifecycle operations (start/stop). the operation has been accepted and queued. poll ``get /workloads/{workloadid}`` to observe the resulting status transition.",
  "properties": {
    "status": {
      "description": "Human-readable description of the operation outcome.",
      "title": "Status",
      "type": "string"
    },
    "trackVia": {
      "description": "Url to poll in order to observe the status transition.",
      "title": "Track Via",
      "type": "string"
    },
    "workloadId": {
      "description": "Id of the workload on which the operation was requested.",
      "title": "Workload ID",
      "type": "string"
    }
  },
  "required": [
    "status",
    "workloadId",
    "trackVia"
  ],
  "title": "WorkloadOperationResponse",
  "type": "object"
}

WorkloadOperationResponse

Properties

Name Type Required Restrictions Description
status string true Human-readable description of the operation outcome.
trackVia string true URL to poll in order to observe the status transition.
workloadId string true ID of the workload on which the operation was requested.
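
Since start and stop are acknowledged before they finish, a client has to poll `trackVia` to learn the outcome. The Python sketch below shows one possible polling loop with a timeout. It assumes the polled resource carries a `status` field holding a WorkloadStatus value. `fetch_workload` is a hypothetical callable that returns the decoded JSON body for a URL, and the default timeout and interval are arbitrary.

```python
import time
from typing import Any, Callable, Dict, Iterable


def wait_for_status(
    track_via: str,
    fetch_workload: Callable[[str], Dict[str, Any]],
    wanted: Iterable[str],
    timeout_s: float = 300.0,
    interval_s: float = 5.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> str:
    """Poll `track_via` until the workload status is one of `wanted`."""
    wanted_set = set(wanted)
    deadline = clock() + timeout_s
    while True:
        # Assumption: the polled resource exposes the workload `status` field.
        status = fetch_workload(track_via).get("status")
        if status in wanted_set:
            return status
        if clock() >= deadline:
            raise TimeoutError(
                f"workload did not reach {sorted(wanted_set)}; last status: {status!r}"
            )
        sleep(interval_s)
```

After a stop request, for example, a caller might wait for `{"stopped", "errored"}` so that a failed operation ends the loop instead of running into the timeout. `sleep` and `clock` are parameters so the loop can be exercised without real delays.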

WorkloadReplacementFormatted

{
  "description": "Formatted replacement information for API responses.",
  "properties": {
    "candidateProtonIds": {
      "description": "Ids of protons pending promotion during artifact replacement.",
      "items": {
        "type": "string"
      },
      "title": "Candidateprotonids",
      "type": "array"
    },
    "status": {
      "description": "Statuses for workload replacement process.",
      "enum": [
        "unknown",
        "submitted",
        "initializing",
        "awaiting_promotion",
        "switching",
        "deleting",
        "completed",
        "errored",
        "cleaning_up"
      ],
      "title": "ReplacementStatus",
      "type": "string"
    },
    "strategy": {
      "description": "Types of replacement strategies. `rolling` - the new proton is deployed alongside the old one, and trafic is switched to the new proton once it is ready. the old proton is then decommissioned.",
      "enum": [
        "rolling"
      ],
      "title": "ReplacementStrategy",
      "type": "string"
    }
  },
  "title": "WorkloadReplacementFormatted",
  "type": "object"
}

WorkloadReplacementFormatted

Properties

Name Type Required Restrictions Description
candidateProtonIds [string] false IDs of protons pending promotion during artifact replacement.
status ReplacementStatus false Replacement status.
strategy ReplacementStrategy false Replacement strategy.

WorkloadRuntime

{
  "additionalProperties": false,
  "description": "Runtime configuration for a workload. for service and nim artifacts, all configuration is scoped inside ``container_groups``, each identified by name matching the artifact topology.",
  "properties": {
    "containerGroups": {
      "description": "Per-group runtime configuration. each entry's name must match a group in the artifact.",
      "items": {
        "additionalProperties": false,
        "description": "Runtime configuration for a single container group.",
        "properties": {
          "autoscaling": {
            "anyOf": [
              {
                "additionalProperties": false,
                "description": "Autoscaling configuration for a proton.",
                "properties": {
                  "enabled": {
                    "default": true,
                    "description": "Whether autoscaling is enabled.",
                    "title": "Enabled",
                    "type": "boolean"
                  },
                  "policies": {
                    "items": {
                      "additionalProperties": false,
                      "description": "Base class for autoscaling policies.",
                      "properties": {
                        "maxCount": {
                          "description": "Maximum number of replicas.",
                          "minimum": 0,
                          "title": "Max Count",
                          "type": "integer"
                        },
                        "minCount": {
                          "description": "Minimum number of replicas.",
                          "minimum": 0,
                          "title": "Min Count",
                          "type": "integer"
                        },
                        "priority": {
                          "anyOf": [
                            {
                              "type": "integer"
                            },
                            {
                              "type": "null"
                            }
                          ],
                          "description": "Policy priority when multiple policies are defined.",
                          "title": "Priority"
                        },
                        "scalingMetric": {
                          "anyOf": [
                            {
                              "oneOf": [
                                {
                                  "const": "cpuAverageUtilization",
                                  "description": "Scale replicas to maintain a target average CPU utilization across pods.",
                                  "title": "CPU Average Utilization"
                                },
                                {
                                  "const": "httpRequestsConcurrency",
                                  "description": "Scale replicas based on HTTP request concurrency using an external HTTP-aware autoscaler. The platform manages the underlying autoscaling resources on your behalf. This scaling option will scale to zero replicas when the proton is idle.",
                                  "title": "HTTP Requests Concurrency"
                                },
                                {
                                  "const": "gpuCacheUtilization",
                                  "description": "Scales replicas based on model-specific GPU memory cache utilization. This signal reflects how the model's KV cache is used during inference, when such metrics are exposed by the serving runtime. High cache utilization may indicate memory pressure and can be used to trigger scale-out to maintain throughput. Applicable to NIM Artifacts only.",
                                  "title": "GPU Cache Utilization"
                                },
                                {
                                  "const": "gpuRequestQueueDepth",
                                  "description": "Scales replicas based on the depth of the inference request queue. This metric represents the number of incoming requests waiting to be processed by the inference service. Increasing queue depth may indicate insufficient capacity and can be used to trigger additional replicas to reduce latency. Applicable to NIM Artifacts only.",
                                  "title": "GPU Request Queue Depth"
                                }
                              ],
                              "title": "ScalingMetricType",
                              "type": "string"
                            },
                            {
                              "type": "string"
                            }
                          ],
                          "description": "Metric used for scaling decisions. use one of the predefined values for standard autoscaling, or provide a custom metric name for nim 2.0 workloads (e.g. 'vllm:kv_cache_usage_perc'). custom metric names are only supported for nim artifacts.",
                          "title": "Scaling Metric"
                        },
                        "target": {
                          "description": "Target value for the scaling metric.",
                          "minimum": 0,
                          "title": "Target",
                          "type": "number"
                        }
                      },
                      "required": [
                        "scalingMetric",
                        "target",
                        "minCount",
                        "maxCount"
                      ],
                      "title": "AutoscalingPolicy",
                      "type": "object"
                    },
                    "title": "Policies",
                    "type": "array"
                  }
                },
                "required": [
                  "policies"
                ],
                "title": "AutoscalingProperties",
                "type": "object"
              },
              {
                "type": "null"
              }
            ],
            "description": "Autoscaling configuration for this group. takes precedence over replicacount."
          },
          "bundleSelectionPolicy": {
            "enum": [
              "availability"
            ],
            "title": "BundleSelectionPolicy",
            "type": "string"
          },
          "containers": {
            "description": "Per-container overrides for this group.",
            "items": {
              "additionalProperties": false,
              "description": "Runtime diff targeting a single named container within a group.",
              "properties": {
                "name": {
                  "description": "Container name. must match a container declared in the artifact group.",
                  "title": "Name",
                  "type": "string"
                },
                "resourceAllocation": {
                  "anyOf": [
                    {
                      "additionalProperties": false,
                      "description": "Per-container resource allocation declared at runtime.",
                      "properties": {
                        "cpu": {
                          "anyOf": [
                            {
                              "minimum": 0.1,
                              "type": "number"
                            },
                            {
                              "type": "null"
                            }
                          ],
                          "description": "Cpu cores allocated to this container.",
                          "title": "Cpu"
                        },
                        "gpu": {
                          "anyOf": [
                            {
                              "minimum": 0,
                              "type": "number"
                            },
                            {
                              "type": "null"
                            }
                          ],
                          "description": "Gpus allocated to this container.",
                          "title": "Gpu"
                        },
                        "memory": {
                          "anyOf": [
                            {
                              "pattern": "^\\s*(\\d*\\.?\\d+)\\s*(\\w+)?",
                              "type": "string"
                            },
                            {
                              "minimum": 0,
                              "type": "integer"
                            },
                            {
                              "type": "null"
                            }
                          ],
                          "description": "Ram allocated to this container. accepts a human-readable string with one of: b, kb, mb, gb (1000-based) — e.g. '8gb', '512mb'. also accepts raw byte integers.",
                          "examples": [
                            "8GB",
                            "512MB"
                          ],
                          "title": "Memory"
                        }
                      },
                      "title": "ResourceAllocation",
                      "type": "object"
                    },
                    {
                      "type": "null"
                    }
                  ],
                  "description": "Resource allocation for this container. required for multi-container groups."
                }
              },
              "required": [
                "name"
              ],
              "title": "ContainerOverride",
              "type": "object"
            },
            "title": "Containers",
            "type": "array"
          },
          "name": {
            "default": "default",
            "description": "Group name. must match a container group name declared in the artifact.",
            "title": "Name",
            "type": "string"
          },
          "replicaCount": {
            "anyOf": [
              {
                "minimum": 1,
                "type": "integer"
              },
              {
                "type": "null"
              }
            ],
            "default": 1,
            "description": "Number of replicas. cannot be set alongside autoscaling.enabled=true.",
            "title": "Replicacount"
          },
          "resolvedBundle": {
            "anyOf": [
              {
                "description": "Bundle details returned in the runtime response after scheduling.",
                "properties": {
                  "cpuCount": {
                    "description": "Number of cpu cores.",
                    "title": "CPU Count",
                    "type": "number"
                  },
                  "gpuCount": {
                    "default": 0,
                    "description": "Number of gpu units.",
                    "title": "GPU Count",
                    "type": "integer"
                  },
                  "gpuMaker": {
                    "anyOf": [
                      {
                        "type": "string"
                      },
                      {
                        "type": "null"
                      }
                    ],
                    "description": "Gpu manufacturer.",
                    "title": "GPU Maker"
                  },
                  "gpuTypeLabel": {
                    "anyOf": [
                      {
                        "type": "string"
                      },
                      {
                        "type": "null"
                      }
                    ],
                    "description": "Gpu type label.",
                    "title": "GPU Type Label"
                  },
                  "id": {
                    "description": "Bundle identifier that was selected.",
                    "title": "Id",
                    "type": "string"
                  },
                  "memoryBytes": {
                    "description": "Memory size in bytes.",
                    "title": "Memory Bytes",
                    "type": "integer"
                  }
                },
                "required": [
                  "id",
                  "cpuCount",
                  "memoryBytes"
                ],
                "title": "ResolvedBundle",
                "type": "object"
              },
              {
                "type": "null"
              }
            ],
            "description": "Full details of the bundle selected at scheduling time. read-only.",
            "readOnly": true
          },
          "resourceBundles": {
            "description": "Ordered list of bundle ids. one is selected at scheduling time.",
            "items": {
              "type": "string"
            },
            "title": "Resourcebundles",
            "type": "array"
          }
        },
        "title": "GroupRuntime",
        "type": "object"
      },
      "title": "Containergroups",
      "type": "array"
    }
  },
  "title": "WorkloadRuntime",
  "type": "object"
}

WorkloadRuntime

Properties

Name Type Required Restrictions Description
containerGroups [GroupRuntime] false Per-group runtime configuration. Each entry's name must match a group in the artifact.
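
The memory field of ResourceAllocation accepts either a unit-suffixed string or a raw byte integer. The following is a minimal client-side sketch of how such a value might be normalized to bytes before sending it. The regular expression is copied from the schema; the helper name memory_to_bytes, the case-insensitive unit handling, and the treatment of a unitless string as bytes are assumptions made for illustration, not documented server behavior.

```python
import re

# Pattern copied from the ResourceAllocation.memory schema.
MEMORY_PATTERN = re.compile(r"^\s*(\d*\.?\d+)\s*(\w+)?")

# 1000-based multipliers, as stated in the field description.
UNIT_FACTORS = {"b": 1, "kb": 1000, "mb": 1000**2, "gb": 1000**3}


def memory_to_bytes(value):
    """Convert a memory value (string, integer, or None) to a byte count.

    Hypothetical helper: unit matching is case-insensitive and a string
    without a unit is read as bytes. Both choices are assumptions.
    """
    if value is None:
        return None
    if isinstance(value, bool):
        raise TypeError("memory must be a string, an integer, or None")
    if isinstance(value, int):
        if value < 0:
            raise ValueError("memory must be >= 0")
        return value
    match = MEMORY_PATTERN.match(value)
    if match is None:
        raise ValueError(f"unrecognized memory value: {value!r}")
    number, unit = match.groups()
    unit = (unit or "b").lower()
    if unit not in UNIT_FACTORS:
        raise ValueError(f"unsupported memory unit: {unit!r}")
    # round() avoids off-by-one results from float multiplication.
    return round(float(number) * UNIT_FACTORS[unit])
```

With these assumptions, memory_to_bytes("8GB") yields 8000000000 and memory_to_bytes("512MB") yields 512000000, matching the 1000-based units in the description.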

WorkloadSettingsResponse

{
  "additionalProperties": false,
  "description": "Response containing workload settings.",
  "properties": {
    "runtime": {
      "additionalProperties": false,
      "description": "Runtime configuration for a workload. for service and nim artifacts, all configuration is scoped inside ``container_groups``, each identified by name matching the artifact topology.",
      "properties": {
        "containerGroups": {
          "description": "Per-group runtime configuration. each entry's name must match a group in the artifact.",
          "items": {
            "additionalProperties": false,
            "description": "Runtime configuration for a single container group.",
            "properties": {
              "autoscaling": {
                "anyOf": [
                  {
                    "additionalProperties": false,
                    "description": "Autoscaling configuration for a proton.",
                    "properties": {
                      "enabled": {
                        "default": true,
                        "description": "Whether autoscaling is enabled.",
                        "title": "Enabled",
                        "type": "boolean"
                      },
                      "policies": {
                        "items": {
                          "additionalProperties": false,
                          "description": "Base class for autoscaling policies.",
                          "properties": {
                            "maxCount": {
                              "description": "Maximum number of replicas.",
                              "minimum": 0,
                              "title": "Max Count",
                              "type": "integer"
                            },
                            "minCount": {
                              "description": "Minimum number of replicas.",
                              "minimum": 0,
                              "title": "Min Count",
                              "type": "integer"
                            },
                            "priority": {
                              "anyOf": [
                                {
                                  "type": "integer"
                                },
                                {
                                  "type": "null"
                                }
                              ],
                              "description": "Policy priority when multiple policies are defined.",
                              "title": "Priority"
                            },
                            "scalingMetric": {
                              "anyOf": [
                                {
                                  "oneOf": [
                                    {
                                      "const": "cpuAverageUtilization",
                                      "description": "Scale replicas to maintain a target average CPU utilization across pods.",
                                      "title": "CPU Average Utilization"
                                    },
                                    {
                                      "const": "httpRequestsConcurrency",
                                      "description": "Scale replicas based on HTTP request concurrency using an external HTTP-aware autoscaler. The platform manages the underlying autoscaling resources on your behalf. This scaling option will scale to zero replicas when the proton is idle.",
                                      "title": "HTTP Requests Concurrency"
                                    },
                                    {
                                      "const": "gpuCacheUtilization",
                                      "description": "Scales replicas based on model-specific GPU memory cache utilization. This signal reflects how the model's KV cache is used during inference, when such metrics are exposed by the serving runtime. High cache utilization may indicate memory pressure and can be used to trigger scale-out to maintain throughput. Applicable to NIM Artifacts only.",
                                      "title": "GPU Cache Utilization"
                                    },
                                    {
                                      "const": "gpuRequestQueueDepth",
                                      "description": "Scales replicas based on the depth of the inference request queue. This metric represents the number of incoming requests waiting to be processed by the inference service. Increasing queue depth may indicate insufficient capacity and can be used to trigger additional replicas to reduce latency. Applicable to NIM Artifacts only.",
                                      "title": "GPU Request Queue Depth"
                                    }
                                  ],
                                  "title": "ScalingMetricType",
                                  "type": "string"
                                },
                                {
                                  "type": "string"
                                }
                              ],
                              "description": "Metric used for scaling decisions. use one of the predefined values for standard autoscaling, or provide a custom metric name for nim 2.0 workloads (e.g. 'vllm:kv_cache_usage_perc'). custom metric names are only supported for nim artifacts.",
                              "title": "Scaling Metric"
                            },
                            "target": {
                              "description": "Target value for the scaling metric.",
                              "minimum": 0,
                              "title": "Target",
                              "type": "number"
                            }
                          },
                          "required": [
                            "scalingMetric",
                            "target",
                            "minCount",
                            "maxCount"
                          ],
                          "title": "AutoscalingPolicy",
                          "type": "object"
                        },
                        "title": "Policies",
                        "type": "array"
                      }
                    },
                    "required": [
                      "policies"
                    ],
                    "title": "AutoscalingProperties",
                    "type": "object"
                  },
                  {
                    "type": "null"
                  }
                ],
                "description": "Autoscaling configuration for this group. takes precedence over replicacount."
              },
              "bundleSelectionPolicy": {
                "enum": [
                  "availability"
                ],
                "title": "BundleSelectionPolicy",
                "type": "string"
              },
              "containers": {
                "description": "Per-container overrides for this group.",
                "items": {
                  "additionalProperties": false,
                  "description": "Runtime diff targeting a single named container within a group.",
                  "properties": {
                    "name": {
                      "description": "Container name. must match a container declared in the artifact group.",
                      "title": "Name",
                      "type": "string"
                    },
                    "resourceAllocation": {
                      "anyOf": [
                        {
                          "additionalProperties": false,
                          "description": "Per-container resource allocation declared at runtime.",
                          "properties": {
                            "cpu": {
                              "anyOf": [
                                {
                                  "minimum": 0.1,
                                  "type": "number"
                                },
                                {
                                  "type": "null"
                                }
                              ],
                              "description": "Cpu cores allocated to this container.",
                              "title": "Cpu"
                            },
                            "gpu": {
                              "anyOf": [
                                {
                                  "minimum": 0,
                                  "type": "number"
                                },
                                {
                                  "type": "null"
                                }
                              ],
                              "description": "Gpus allocated to this container.",
                              "title": "Gpu"
                            },
                            "memory": {
                              "anyOf": [
                                {
                                  "pattern": "^\\s*(\\d*\\.?\\d+)\\s*(\\w+)?",
                                  "type": "string"
                                },
                                {
                                  "minimum": 0,
                                  "type": "integer"
                                },
                                {
                                  "type": "null"
                                }
                              ],
                              "description": "Ram allocated to this container. accepts a human-readable string with one of: b, kb, mb, gb (1000-based) — e.g. '8gb', '512mb'. also accepts raw byte integers.",
                              "examples": [
                                "8GB",
                                "512MB"
                              ],
                              "title": "Memory"
                            }
                          },
                          "title": "ResourceAllocation",
                          "type": "object"
                        },
                        {
                          "type": "null"
                        }
                      ],
                      "description": "Resource allocation for this container. required for multi-container groups."
                    }
                  },
                  "required": [
                    "name"
                  ],
                  "title": "ContainerOverride",
                  "type": "object"
                },
                "title": "Containers",
                "type": "array"
              },
              "name": {
                "default": "default",
                "description": "Group name. must match a container group name declared in the artifact.",
                "title": "Name",
                "type": "string"
              },
              "replicaCount": {
                "anyOf": [
                  {
                    "minimum": 1,
                    "type": "integer"
                  },
                  {
                    "type": "null"
                  }
                ],
                "default": 1,
                "description": "Number of replicas. cannot be set alongside autoscaling.enabled=true.",
                "title": "Replicacount"
              },
              "resolvedBundle": {
                "anyOf": [
                  {
                    "description": "Bundle details returned in the runtime response after scheduling.",
                    "properties": {
                      "cpuCount": {
                        "description": "Number of cpu cores.",
                        "title": "CPU Count",
                        "type": "number"
                      },
                      "gpuCount": {
                        "default": 0,
                        "description": "Number of gpu units.",
                        "title": "GPU Count",
                        "type": "integer"
                      },
                      "gpuMaker": {
                        "anyOf": [
                          {
                            "type": "string"
                          },
                          {
                            "type": "null"
                          }
                        ],
                        "description": "Gpu manufacturer.",
                        "title": "GPU Maker"
                      },
                      "gpuTypeLabel": {
                        "anyOf": [
                          {
                            "type": "string"
                          },
                          {
                            "type": "null"
                          }
                        ],
                        "description": "Gpu type label.",
                        "title": "GPU Type Label"
                      },
                      "id": {
                        "description": "Bundle identifier that was selected.",
                        "title": "Id",
                        "type": "string"
                      },
                      "memoryBytes": {
                        "description": "Memory size in bytes.",
                        "title": "Memory Bytes",
                        "type": "integer"
                      }
                    },
                    "required": [
                      "id",
                      "cpuCount",
                      "memoryBytes"
                    ],
                    "title": "ResolvedBundle",
                    "type": "object"
                  },
                  {
                    "type": "null"
                  }
                ],
                "description": "Full details of the bundle selected at scheduling time. read-only.",
                "readOnly": true
              },
              "resourceBundles": {
                "description": "Ordered list of bundle ids. one is selected at scheduling time.",
                "items": {
                  "type": "string"
                },
                "title": "Resourcebundles",
                "type": "array"
              }
            },
            "title": "GroupRuntime",
            "type": "object"
          },
          "title": "Containergroups",
          "type": "array"
        }
      },
      "title": "WorkloadRuntime",
      "type": "object"
    }
  },
  "title": "WorkloadSettingsResponse",
  "type": "object"
}

WorkloadSettingsResponse

Properties

Name Type Required Restrictions Description
runtime WorkloadRuntime false Runtime configuration sourced from the current proton.
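
The GroupRuntime schema states two constraints that are easy to miss when assembling a payload: replicaCount cannot be set alongside autoscaling.enabled=true, and each autoscaling policy requires scalingMetric, target, minCount, and maxCount. The sketch below shows one way a client could check a group dictionary before submitting it. The function name check_group_runtime is hypothetical, and reading "set" as "present and not null" is an assumption; the server remains the authority on validation.

```python
REQUIRED_POLICY_FIELDS = ("scalingMetric", "target", "minCount", "maxCount")


def check_group_runtime(group):
    """Return a list of problems found in a GroupRuntime-shaped dict.

    Hypothetical client-side pre-check covering only constraints that
    the schema states explicitly.
    """
    problems = []
    autoscaling = group.get("autoscaling")
    replica_count = group.get("replicaCount")

    if autoscaling is not None:
        # "enabled" defaults to true in the schema.
        enabled = autoscaling.get("enabled", True)
        if enabled and replica_count is not None:
            problems.append(
                "replicaCount cannot be set alongside autoscaling.enabled=true"
            )
        if "policies" not in autoscaling:
            problems.append("autoscaling.policies is required")
        for index, policy in enumerate(autoscaling.get("policies", [])):
            for field in REQUIRED_POLICY_FIELDS:
                if field not in policy:
                    problems.append(f"policies[{index}].{field} is required")

    # The schema gives replicaCount a minimum of 1.
    if replica_count is not None and replica_count < 1:
        problems.append("replicaCount must be >= 1")
    return problems
```

An empty list means none of the checked constraints were violated; it does not guarantee that the API will accept the payload.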

WorkloadSortQueryParam

{
  "anyOf": [
    {
      "anyOf": [
        {
          "const": "name",
          "description": "Sort by name in ascending order (A-Z)",
          "title": "Name Ascending",
          "type": "string"
        },
        {
          "const": "-name",
          "description": "Sort by name in descending order (Z-A)",
          "title": "Name Descending",
          "type": "string"
        },
        {
          "const": "createdAt",
          "description": "Sort by creation date in ascending order (oldest first)",
          "title": "Creation Date Ascending",
          "type": "string"
        },
        {
          "const": "-createdAt",
          "description": "Sort by creation date in descending order (newest first)",
          "title": "Creation Date Descending",
          "type": "string"
        },
        {
          "const": "updatedAt",
          "description": "Sort by update date in ascending order (oldest first)",
          "title": "Update Date Ascending",
          "type": "string"
        },
        {
          "const": "-updatedAt",
          "description": "Sort by update date in descending order (newest first)",
          "title": "Update Date Descending",
          "type": "string"
        }
      ]
    },
    {
      "const": "status",
      "description": "Sort by status in ascending order (e.g., STOPPED before RUNNING)",
      "title": "Status Ascending",
      "type": "string"
    },
    {
      "const": "-status",
      "description": "Sort by status in descending order (e.g., RUNNING before STOPPED)",
      "title": "Status Descending",
      "type": "string"
    },
    {
      "const": "importance",
      "description": "Sort by importance in ascending order (least important first)",
      "title": "Importance Ascending",
      "type": "string"
    },
    {
      "const": "-importance",
      "description": "Sort by importance in descending order (most important first)",
      "title": "Importance Descending",
      "type": "string"
    }
  ]
}

Properties

anyOf

Name Type Required Restrictions Description
anonymous CommonSortQueryParams false none

or

Name Type Required Restrictions Description
anonymous string false Sort by status in ascending order (e.g., STOPPED before RUNNING)

or

Name Type Required Restrictions Description
anonymous string false Sort by status in descending order (e.g., RUNNING before STOPPED)

or

Name Type Required Restrictions Description
anonymous string false Sort by importance in ascending order (least important first)

or

Name Type Required Restrictions Description
anonymous string false Sort by importance in descending order (most important first)
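
The sort values follow a simple convention: the bare field name sorts ascending and a leading "-" sorts descending. As a rough illustration, the sketch below builds the query string for GET /workloads from a field name and a direction. The helper name workloads_query is hypothetical, and only the orderBy, limit, and offset parameters from the List Workloads operation are used.

```python
from urllib.parse import urlencode

# Fields listed in the WorkloadSortQueryParam schema.
COMMON_SORT_FIELDS = ("name", "createdAt", "updatedAt")
WORKLOAD_SORT_FIELDS = COMMON_SORT_FIELDS + ("status", "importance")


def workloads_query(order_by, descending=False, limit=None, offset=None):
    """Build a relative URL for GET /workloads with an orderBy value.

    Hypothetical helper: rejects fields that the schema does not list.
    """
    if order_by not in WORKLOAD_SORT_FIELDS:
        raise ValueError(f"unsupported sort field: {order_by!r}")
    # A leading "-" selects descending order.
    params = {"orderBy": f"-{order_by}" if descending else order_by}
    if limit is not None:
        params["limit"] = limit
    if offset is not None:
        params["offset"] = offset
    return "/workloads?" + urlencode(params)
```

For example, workloads_query("createdAt", descending=True, limit=20) returns "/workloads?orderBy=-createdAt&limit=20", which lists the newest workloads first.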

WorkloadStatsMetricName

{
  "description": "Metric names for workload statistics.",
  "enum": [
    "totalRequests",
    "requestsOverN",
    "requestsPerMinute",
    "concurrentRequests",
    "responseTime",
    "totalErrorRate"
  ],
  "title": "WorkloadStatsMetricName",
  "type": "string"
}

WorkloadStatsMetricName

Properties

Name Type Required Restrictions Description
WorkloadStatsMetricName string false Metric names for workload statistics.

Enumerated Values

Property Value
WorkloadStatsMetricName [totalRequests, requestsOverN, requestsPerMinute, concurrentRequests, responseTime, totalErrorRate]

WorkloadStatsResponse

{
  "additionalProperties": false,
  "description": "Response containing workload statistics.",
  "properties": {
    "concurrentRequests": {
      "additionalProperties": false,
      "description": "Workload concurrent requests statistics.",
      "properties": {
        "count": {
          "default": 0,
          "description": "Number of concurrent requests.",
          "title": "Count",
          "type": "integer"
        },
        "trend": {
          "default": 0,
          "description": "Trend indicator (positive = increasing).",
          "title": "Trend",
          "type": "number"
        }
      },
      "title": "WorkloadsConcurrentRequestsStats",
      "type": "object"
    },
    "errorRate": {
      "additionalProperties": false,
      "description": "Workload error rate statistics.",
      "properties": {
        "rate": {
          "default": 0,
          "description": "Error rate percentage.",
          "title": "Rate",
          "type": "number"
        },
        "trend": {
          "default": 0,
          "description": "Trend indicator (positive = increasing).",
          "title": "Trend",
          "type": "number"
        }
      },
      "title": "WorkloadsErrorRateStats",
      "type": "object"
    },
    "requests": {
      "additionalProperties": false,
      "description": "Workload request statistics.",
      "properties": {
        "failed": {
          "default": 0,
          "description": "Number of failed requests.",
          "title": "Failed",
          "type": "integer"
        },
        "succeeded": {
          "default": 0,
          "description": "Number of successful requests.",
          "title": "Succeeded",
          "type": "integer"
        },
        "total": {
          "default": 0,
          "description": "Total number of requests.",
          "title": "Total",
          "type": "integer"
        },
        "trend": {
          "default": 0,
          "description": "Trend indicator (positive = increasing).",
          "title": "Trend",
          "type": "number"
        }
      },
      "title": "WorkloadsRequestsStats",
      "type": "object"
    },
    "responseTime": {
      "additionalProperties": false,
      "description": "Workload response time statistics.",
      "properties": {
        "millis": {
          "default": 0,
          "description": "Response time in milliseconds.",
          "title": "Millis",
          "type": "integer"
        },
        "trend": {
          "default": 0,
          "description": "Trend indicator (positive = increasing).",
          "title": "Trend",
          "type": "number"
        }
      },
      "title": "WorkloadsResponseTimeStats",
      "type": "object"
    },
    "workloads": {
      "additionalProperties": false,
      "description": "Workload count statistics.",
      "properties": {
        "active": {
          "default": 0,
          "description": "Number of active workloads.",
          "title": "Active",
          "type": "integer"
        },
        "total": {
          "default": 0,
          "description": "Total number of workloads.",
          "title": "Total",
          "type": "integer"
        }
      },
      "title": "WorkloadsCountStats",
      "type": "object"
    }
  },
  "title": "WorkloadStatsResponse",
  "type": "object"
}

WorkloadStatsResponse

Properties

Name Type Required Restrictions Description
concurrentRequests WorkloadsConcurrentRequestsStats false Concurrent requests statistics.
errorRate WorkloadsErrorRateStats false Error rate statistics.
requests WorkloadsRequestsStats false Request statistics across all workloads.
responseTime WorkloadsResponseTimeStats false Response time statistics.
workloads WorkloadsCountStats false Workload count statistics.
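
Every field in WorkloadStatsResponse is optional and every nested counter defaults to 0, so a client should tolerate missing sections. The sketch below reads a decoded response body defensively and derives a few summary values. The function name summarize_stats and the keys of the returned dictionary are made up for this example; only the input field names come from the schema.

```python
def summarize_stats(stats):
    """Derive summary values from a WorkloadStatsResponse-shaped dict.

    Hypothetical helper: missing sections fall back to the schema
    defaults (0) instead of raising KeyError.
    """
    requests = stats.get("requests", {})
    total = requests.get("total", 0)
    succeeded = requests.get("succeeded", 0)
    workloads = stats.get("workloads", {})
    error_rate = stats.get("errorRate", {})
    return {
        # None when there were no requests, to avoid dividing by zero.
        "successRatio": succeeded / total if total else None,
        "activeWorkloads": workloads.get("active", 0),
        # A positive trend value means the metric is increasing.
        "errorRateRising": error_rate.get("trend", 0) > 0,
    }
```

Returning None for the ratio when total is 0 keeps "no traffic" distinguishable from "all requests failed".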

WorkloadStatus

{
  "description": "User-facing workload status. a subset of :class:`protonstatus` — excludes internal proton-lifecycle states (warming, draining, restarting) that should never be surfaced as a workload status.",
  "enum": [
    "unknown",
    "submitted",
    "provisioning",
    "launching",
    "running",
    "suspended",
    "interrupted",
    "stopping",
    "stopped",
    "errored",
    "terminated"
  ],
  "title": "WorkloadStatus",
  "type": "string"
}

WorkloadStatus

Properties

Name Type Required Restrictions Description
WorkloadStatus string false User-facing workload status. A subset of :class:protonstatus that excludes internal proton-lifecycle states (warming, draining, restarting), which should never be surfaced as a workload status.

Enumerated Values

Property Value
WorkloadStatus [unknown, submitted, provisioning, launching, running, suspended, interrupted, stopping, stopped, errored, terminated]
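
Because the status is a closed set of strings, a client can mirror it as an enumeration. The sketch below is one possible mapping; the member values are taken from the list above, while the parse_status helper and its fallback to unknown for unrecognized strings are client-side design choices assumed here, not behavior described by the API.

```python
from enum import Enum


class WorkloadStatus(str, Enum):
    """Client-side mirror of the WorkloadStatus enumerated values."""

    UNKNOWN = "unknown"
    SUBMITTED = "submitted"
    PROVISIONING = "provisioning"
    LAUNCHING = "launching"
    RUNNING = "running"
    SUSPENDED = "suspended"
    INTERRUPTED = "interrupted"
    STOPPING = "stopping"
    STOPPED = "stopped"
    ERRORED = "errored"
    TERMINATED = "terminated"


def parse_status(raw):
    """Map a raw status string to WorkloadStatus.

    Hypothetical helper: values outside the enumeration (for example an
    internal state such as "warming") are mapped to UNKNOWN.
    """
    try:
        return WorkloadStatus(raw)
    except ValueError:
        return WorkloadStatus.UNKNOWN
```

Falling back to UNKNOWN lets a client keep working if it ever receives a value it does not recognize, at the cost of hiding that value from the caller.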

WorkloadsConcurrentRequestsStats

{
  "additionalProperties": false,
  "description": "Workload concurrent requests statistics.",
  "properties": {
    "count": {
      "default": 0,
      "description": "Number of concurrent requests.",
      "title": "Count",
      "type": "integer"
    },
    "trend": {
      "default": 0,
      "description": "Trend indicator (positive = increasing).",
      "title": "Trend",
      "type": "number"
    }
  },
  "title": "WorkloadsConcurrentRequestsStats",
  "type": "object"
}

WorkloadsConcurrentRequestsStats

Properties

Name Type Required Restrictions Description
count integer false Number of concurrent requests.
trend number false Trend indicator (positive = increasing).

WorkloadsCountStats

{
  "additionalProperties": false,
  "description": "Workload count statistics.",
  "properties": {
    "active": {
      "default": 0,
      "description": "Number of active workloads.",
      "title": "Active",
      "type": "integer"
    },
    "total": {
      "default": 0,
      "description": "Total number of workloads.",
      "title": "Total",
      "type": "integer"
    }
  },
  "title": "WorkloadsCountStats",
  "type": "object"
}

WorkloadsCountStats

Properties

Name Type Required Restrictions Description
active integer false Number of active workloads.
total integer false Total number of workloads.

WorkloadsErrorRateStats

{
  "additionalProperties": false,
  "description": "Workload error rate statistics.",
  "properties": {
    "rate": {
      "default": 0,
      "description": "Error rate percentage.",
      "title": "Rate",
      "type": "number"
    },
    "trend": {
      "default": 0,
      "description": "Trend indicator (positive = increasing).",
      "title": "Trend",
      "type": "number"
    }
  },
  "title": "WorkloadsErrorRateStats",
  "type": "object"
}

WorkloadsErrorRateStats

Properties

Name Type Required Restrictions Description
rate number false Error rate percentage.
trend number false Trend indicator (positive = increasing).

WorkloadsRequestsStats

{
  "additionalProperties": false,
  "description": "Workload request statistics.",
  "properties": {
    "failed": {
      "default": 0,
      "description": "Number of failed requests.",
      "title": "Failed",
      "type": "integer"
    },
    "succeeded": {
      "default": 0,
      "description": "Number of successful requests.",
      "title": "Succeeded",
      "type": "integer"
    },
    "total": {
      "default": 0,
      "description": "Total number of requests.",
      "title": "Total",
      "type": "integer"
    },
    "trend": {
      "default": 0,
      "description": "Trend indicator (positive = increasing).",
      "title": "Trend",
      "type": "number"
    }
  },
  "title": "WorkloadsRequestsStats",
  "type": "object"
}

WorkloadsRequestsStats

Properties

Name Type Required Restrictions Description
failed integer false Number of failed requests.
succeeded integer false Number of successful requests.
total integer false Total number of requests.
trend number false Trend indicator (positive = increasing).

WorkloadsResponseTimeStats

{
  "additionalProperties": false,
  "description": "Workload response time statistics.",
  "properties": {
    "millis": {
      "default": 0,
      "description": "Response time in milliseconds.",
      "title": "Millis",
      "type": "integer"
    },
    "trend": {
      "default": 0,
      "description": "Trend indicator (positive = increasing).",
      "title": "Trend",
      "type": "number"
    }
  },
  "title": "WorkloadsResponseTimeStats",
  "type": "object"
}

WorkloadsResponseTimeStats

Properties

Name Type Required Restrictions Description
millis integer false Response time in milliseconds.
trend number false Trend indicator (positive = increasing).