NVIDIA NIM Air‑Gap Deployment Guide¶
This document describes how to deploy NVIDIA Inference Microservices (NIM) in environments without direct internet connectivity ("air‑gapped" clusters).
Overview¶
Air‑gapped deployment involves two independent workstreams, which may be performed in any order:
- Mirror NIM assets into the secured network:
    - Transfer container images to the internal container registry.
    - Transfer model profiles (weights) to S3‑compatible object storage.
- Configure DataRobot to access these assets:
    - Set Helm values using the appropriate environment variables.
    - Create a Secure Configuration of type AWS Credentials to enable access to object storage.
Prerequisites¶
Limited platform support¶
Only S3‑compatible object storage solutions (such as AWS S3 or MinIO) are supported for storing NIM model weights. Azure Blob Storage and Google Cloud Storage are not supported. Note: The object storage used for NIM model weights does not need to be the same as the storage used by DataRobot.
General requirements for the NIM container¶
- Feature flags are set; see NIM Generic cluster configuration.
- GPU resource bundles and the LRS Operator are configured; see Custom Models with GPU Inference.
- Trust Manager is installed in the Kubernetes cluster.
- Trust Manager is configured with a public CA (or private CA bundle); see Configuring Custom CA.
- S3-compatible HTTPS API endpoint with a valid TLS certificate.
- S3‑compatible object storage that supports virtual‑hosted-style addressing (<bucket>.<domain>).
- A dedicated bucket must be provisioned for NIM model profiles, with all profiles stored at the root level (no prefixes or "subdirectories").
- The bucket must be secured using permanent credentials (e.g., an AWS IAM user with appropriate permissions). Temporary credentials, such as those provided by AWS STS, are not supported.
- Python 3.8 or later and the boto3 library must be installed on the host system used to upload NIM model profiles to object storage.
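To confirm the upload host meets the Python requirement, a quick check (a sketch):
python3 --version   # should report 3.8 or later
python3 -c 'import boto3; print(boto3.__version__)'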
MinIO‑specific requirements¶
- Configure the MinIO server with the MINIO_DOMAIN=<minio-domain> environment variable to enable virtual‑hosted–style addressing. Refer to the MinIO documentation for details.
- Ensure the TLS certificate's Subject Alternative Name (SAN) entries include all relevant subdomains (i.e., bucket names in the URL) in the format *.minio.<domain>.
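A quick way to sanity-check virtual-hosted-style addressing and the certificate SAN from a host that can reach the endpoint (a sketch; <bucket> and <minio-domain> are placeholders for your bucket and domain):
# An HTTP status (e.g., 403 for an unauthenticated request) with no TLS error
# suggests the certificate covers the bucket subdomain; a certificate error
# suggests the SAN is missing the wildcard entry.
curl -sS -o /dev/null -w '%{http_code}\n' "https://<bucket>.<minio-domain>/"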
1. Mirror NIM Container Images¶
To transfer images between registries, follow these steps:
Prerequisites¶
- Install the crane tool. See crane installation instructions.
- Ensure access to both NVIDIA NGC and your internal container registry.
- For fully isolated environments, see 2.5 Offline Transfer.
- Before copying images, create a repository in your internal registry with a name that exactly matches the source image name (e.g., nim/meta/llama-3.2-1b-instruct).
- DataRobot is distributed with image tags as specified in the NIM GPU Support Matrix table. To use a different supported container tag, refer to 4. Updating NIM Container Tags.
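To check which tags NGC currently offers for a given NIM image before mirroring (a sketch; assumes a prior docker login to nvcr.io):
crane ls nvcr.io/nim/meta/llama-3.2-1b-instruct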
Copy Image: AWS ECR Registry¶
# Log in to the source registry (NVIDIA NGC)
docker login nvcr.io -u '$oauthtoken' -p "${NGC_API_KEY}"
# Log in to the target registry
aws ecr get-login-password --region us-east-1 | docker login \
--username AWS \
--password-stdin 123456789012.dkr.ecr.us-east-1.amazonaws.com
# Create the target repository
aws ecr create-repository --repository-name nim/meta/llama-3.2-1b-instruct
# Copy the image
crane cp nvcr.io/nim/meta/llama-3.2-1b-instruct:1.8.5 \
123456789012.dkr.ecr.us-east-1.amazonaws.com/nim/meta/llama-3.2-1b-instruct:1.8.5
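To verify the copy, compare the source and target digests; they should match (a quick check):
crane digest nvcr.io/nim/meta/llama-3.2-1b-instruct:1.8.5
crane digest 123456789012.dkr.ecr.us-east-1.amazonaws.com/nim/meta/llama-3.2-1b-instruct:1.8.5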
Copy Image: Azure Container Registry (ACR)¶
# Log in to the source registry (NVIDIA NGC)
docker login nvcr.io -u '$oauthtoken' -p "${NGC_API_KEY}"
# Log in to the target registry
az acr login --name <registry-name> --username '<your-username>' --password '<your-password>'
# Copy the image
crane cp nvcr.io/nim/meta/llama-3.2-1b-instruct:1.8.5 \
myacrregistry.azurecr.io/nim/meta/llama-3.2-1b-instruct:1.8.5
Copy Image: Google Artifact Registry (GAR)¶
# Log in to the source registry (NVIDIA NGC)
docker login nvcr.io -u '$oauthtoken' -p "${NGC_API_KEY}"
# Log in to the target registry
gcloud auth login --no-browser
gcloud config set project <project-id>
gcloud auth configure-docker us-east1-docker.pkg.dev
# Create the target repository
gcloud artifacts repositories create <repository-name> --repository-format=docker --location=us-east1 --description="Docker repo for NIM images"
# Copy the image
crane cp nvcr.io/nim/meta/llama-3.2-1b-instruct:1.8.5 \
us-east1-docker.pkg.dev/<project-id>/<repository-name>/nim/meta/llama-3.2-1b-instruct:1.8.5
Copy Image: OpenShift Container Platform (OCP)¶
# Log in to the source registry (NVIDIA NGC)
docker login nvcr.io -u '$oauthtoken' -p "${NGC_API_KEY}"
# Log in to the target registry
export REGISTRY=<your-openshift-registry>
export NAMESPACE=nim-containers
oc login -u "$USER" -p "$PASS" --server="https://api.${CLUSTER}:6443"
oc registry login --registry="${REGISTRY}" --to="$HOME/.docker/config.json"
# Create a new project if needed and grant permissions to push images
oc new-project "$NAMESPACE"
oc policy add-role-to-user system:image-pusher "$USER" -n "$NAMESPACE"
# Copy the image
crane cp \
nvcr.io/nim/meta/llama-3.2-1b-instruct:1.8.5 \
"$REGISTRY/$NAMESPACE/nim-meta-llama-3.2-1b-instruct:1.8.5"
NOTE: The OCP internal registry requires image references in the format <project>/<name> and does not permit additional slashes in the image name. Therefore, nested NGC image names must be flattened, e.g., nim-meta-llama-3.2-1b-instruct:1.8.5.
Failure to do so will result in an error similar to the following:
Error: GET ...
unexpected status code 401 Unauthorized:
{"details":"repository name \"nim-containers/nim/meta/llama-3.2-1b-instruct\" invalid:
it must be of the format \u003cproject\u003e/\u003cname\u003e"}
NOTE: You may encounter log messages similar to the following, which can be safely disregarded:
retrying without mount: POST https://registry-openshift-image-registry.apps.rosa.cluster.openshiftapps.com/v2/nim-containers/...:
unexpected status code 400 Bad Request
Copy Image: Docker Hub¶
export NAMESPACE=nim-containers
# Log in to the source registry (NVIDIA NGC)
docker login nvcr.io -u '$oauthtoken' -p "${NGC_API_KEY}"
docker login docker.io
crane cp nvcr.io/nim/meta/llama-3.2-1b-instruct:1.8.5 \
"docker.io/${NAMESPACE}/nim-meta-llama-3.2-1b-instruct:1.8.5"
NOTE: Docker Hub only permits image names in the format <namespace>/<name>, with no additional slashes. Therefore, image names must be flattened to the format nim-meta-llama-3.2-1b-instruct:1.8.5.
Failure to do so will result in an error similar to the following:
crane cp nvcr.io/nim/meta/llama-3.2-1b-instruct:1.8.5 "$REGISTRY/nim-containers/nim/meta/llama-3.2-1b-instruct:1.8.5"
Copying from nvcr.io/nim/meta/llama-3.2-1b-instruct:1.8.5 to docker.io/nim-containers/nim/meta/llama-3.2-1b-instruct:1.8.5
...
unexpected status code 401 Unauthorized (HEAD responses have no body, use GET for details)
2. Mirror NIM Model Profiles¶
2.1 Discover Available Profiles¶
To enumerate available model profiles, launch the NIM container as follows:
export NGC_API_KEY=<your-ngc-api-key>
export NIM_CACHE_DIR=$HOME/nim_model_cache
mkdir -p "$NIM_CACHE_DIR"
docker run --rm -it \
-e NGC_API_KEY=$NGC_API_KEY \
-v "$NIM_CACHE_DIR":/opt/nim/.cache \
nvcr.io/nim/meta/llama-3.2-1b-instruct:1.8.5 \
/bin/bash
Within the container, execute the list-model-profiles command. Example output:
INFO 2025-06-23 11:48:15.81 info.py:23] Unable to get hardware specifications, retrieving every profile's info.
4f904d571fe60ff24695b5ee2aa42da58cb460787a968f1e8a09f5a7e862728d: vllm-bf16-tp1-pp1
ac34857f8dcbd174ad524974248f2faf271bd2a0355643b2cf1490d0fe7787c2: tensorrt_llm-trtllm_buildable-bf16-tp1-pp1
...
ac5071bbd91efcc71dc486fcd5210779570868b3b8328b4abf7a408a58b5e57c: tensorrt_llm-l40s-bf16-tp1-pp1-throughput
ad17776f4619854fccd50354f31132a558a1ca619930698fd184d6ccf5fe3c99: tensorrt_llm-l40s-fp8-tp1-pp1-throughput
af876a179190d1832143f8b4f4a71f640f3df07b0503259cedee3e3a8363aa96: tensorrt_llm-h200-fp8-tp1-pp1-throughput
0782f55dcd12ec36d6126d9768fd364182986eecd25526eb206553df388057b7: tensorrt_llm-l40s-fp8-tp1-pp1-throughput-lora-lora
bae7cf0b51c21f0e6f697593ee58dc8a555559b4b81903502a9e0ffbdc1b67a9: tensorrt_llm-b200-fp8-tp1-pp1-throughput-lora-lora
c6821c013c559912c37e61d7b954c5ca8fe07dda76d8bea0f4a52320e0a54427: tensorrt_llm-a100_sxm4_40gb-bf16-tp1-pp1-throughput
e7dbd9a8ce6270d2ec649a0fecbcae9b5336566113525f20aee3809ba5e63856: tensorrt_llm-h100-bf16-tp1-pp1-throughput
f749ba07aade1d9e1c36ca1b4d0b67949122bd825e8aa6a52909115888a34b95: vllm-bf16-tp1-pp1-lora
2.2 Select Appropriate Profiles¶
Always include the generic (hardware-agnostic) profiles:
vllm-bf16-tp1-pp1
tensorrt_llm-trtllm_buildable-bf16-tp1-pp1
Next, add all profiles that reference each GPU model present in your cluster. NIM will automatically select the optimal profile at runtime.
For example, if your cluster contains:
- 1 GPU of type L40S
- 1 GPU of type T4
You should mirror the following profiles (a quick filtering sketch follows the list). If a specific profile for a GPU (e.g., the T4) is unavailable, ensure the generic profiles are included:
vllm-bf16-tp1-pp1
tensorrt_llm-trtllm_buildable-bf16-tp1-pp1
tensorrt_llm-l40s-bf16-tp1-pp1-throughput
tensorrt_llm-l40s-fp8-tp1-pp1-throughput
tensorrt_llm-l40s-fp8-tp1-pp1-throughput-lora-lora
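Inside the container, a rough way to shortlist candidate profiles for your GPUs (a sketch; adjust the grep pattern to your hardware):
list-model-profiles | grep -E 'vllm-bf16|trtllm_buildable|l40s'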
2.3 Download the Selected Profiles¶
Use the following command to download the required profiles:
download-to-cache --profiles \
vllm-bf16-tp1-pp1 \
tensorrt_llm-trtllm_buildable-bf16-tp1-pp1 \
tensorrt_llm-l40s-bf16-tp1-pp1-throughput \
tensorrt_llm-l40s-fp8-tp1-pp1-throughput
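After the download completes, the profiles persist in the cache directory mounted from the host ($NIM_CACHE_DIR maps to /opt/nim/.cache in the container); a quick sanity check on the host (a sketch):
du -sh "$NIM_CACHE_DIR"
find "$NIM_CACHE_DIR" -maxdepth 2 -type d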
2.4 Uploading Model Profiles to Object Storage¶
a. Online Upload (Direct Network Access Available)¶
To upload the processed model profiles to S3-compatible object storage, execute the command below.
Refer to the code example in process_nim_cache.py (see NIM Containers - Code Snippet Appendix).
This script standardizes profile filenames to the format required by the NIM Container and uploads them to the designated object storage bucket.
export NIM_CACHE_DIR=${HOME}/nim_model_cache
export NIM_BUCKET_NAME=<bucket>
export AWS_ACCESS_KEY_ID=<aws-access-key-id>
export AWS_SECRET_ACCESS_KEY=<aws-secret-access-key>
export AWS_ENDPOINT_URL=https://<minio-domain>/ # Omit for AWS S3
python3 process_nim_cache.py --process-and-upload \
--dest-dir ./nim-model-profiles \
--insecure # Optional: skips TLS verification; use only if absolutely necessary
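To confirm the profiles landed at the bucket root (no prefixes), list the bucket with the aws CLI (a sketch, assuming the same credentials are configured; omit --endpoint-url for AWS S3):
aws --endpoint-url "$AWS_ENDPOINT_URL" s3 ls "s3://${NIM_BUCKET_NAME}/"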
b. Offline Preparation (No Network Access)¶
Prepare a directory containing the processed profiles, which can be transferred via removable media and uploaded from within the secure environment:
export NIM_CACHE_DIR=${HOME}/nim_model_cache
python3 process_nim_cache.py --process --dest-dir ./nim-model-profiles
Copy the ./nim-model-profiles directory to removable media. Once inside the air-gapped network, use a native CLI
(e.g., aws s3 cp ... --recursive) to upload the files to the internal object storage.
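For example, with the aws CLI pointed at the internal object storage endpoint (a sketch; placeholders for your domain and bucket):
aws --endpoint-url https://<minio-domain>/ \
s3 cp ./nim-model-profiles/ s3://<bucket>/ --recursive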
2.5 Offline Transfer (Fully Disconnected Environments)¶
For clusters with no routable network path (no VPN, no bastion), transfer assets using removable media:
On an Internet-Connected Staging Host (Outside the Secure Zone)¶
Export container images to a tarball:
crane pull --format=oci nvcr.io/nim/meta/<image>:<tag> /tmp/nim-image && \
tar -czf nim-images.tgz -C /tmp nim-image
# Alternatively, using Docker (the image must exist locally first)
docker pull nvcr.io/nim/meta/<image>:<tag>
docker save -o nim-images.tgz nvcr.io/nim/meta/<image>:<tag>
Archive the model profile cache directory:
tar -czf nim-profiles.tgz "$NIM_CACHE_DIR"
# Generate checksums (and sign if required):
sha256sum nim-*.tgz > checksums.txt
Physically transfer the encrypted removable media into the air-gapped network¶
Within the Secure Zone:
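Before importing, verify the checksums generated on the staging host:
sha256sum -c checksums.txt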
Import images into the internal registry:
tar -xzf nim-images.tgz -C /tmp && \
crane push /tmp/nim-image <internal-registry>/nim/meta/<image>:<tag>
# Alternatively, using Docker (retag after loading)
docker load -i nim-images.tgz && \
docker tag nvcr.io/nim/meta/<image>:<tag> <internal-registry>/nim/meta/<image>:<tag> && \
docker push <internal-registry>/nim/meta/<image>:<tag>
Upload model profiles to object storage:
aws --endpoint-url https://<minio-domain>/ \
s3 cp nim-profiles/ s3://<bucket>/ --recursive
3. DataRobot Configuration¶
3.1 Helm Chart Values¶
The custom-templates CronJob executes hourly to refresh metadata for NIM containers.
This metadata informs DataRobot of available containers and the locations from which to pull both
container images and their associated model profiles.
Example configuration for Docker Hub:
custom-templates:
  nim:
    registry_host: docker.io
    # Optional for EKS and ACR; required for GKE, OCP, and Docker Hub
    registry_repo: datarobot
    # Skip for EKS, ACR, and GKE; required for OCP and Docker Hub
    registry_image_name_hyphenate: "true"
    s3_bucket: nim-model-templates
    aws_endpoint_url: https://s3.amazonaws.com
    aws_region: us-east-1
    # Restricts the DataRobot UI to the listed images; tags are not enforced
    image_url_allowlist:
      - nvcr.io/nim/meta/llama-3.2-1b-instruct:1.8.5
      - nvcr.io/nim/meta/llama-3.2-3b-instruct:1.8.4
Note: The registry_host, registry_repo, and registry_image_name_hyphenate: "true" settings transform Docker image URLs in templates as follows:
- From: nvcr.io/nim/meta/llama-3.2-1b-instruct:1.8.5
- To: docker.io/datarobot/nim-meta-llama-3.2-1b-instruct:1.8.5
Example configuration for AWS ECR:
custom-templates:
  nim:
    registry_host: 123456789012.dkr.ecr.us-east-1.amazonaws.com
    s3_bucket: nim-model-templates
    aws_endpoint_url: https://s3.amazonaws.com
    aws_region: us-east-1
    image_url_allowlist:
      - nvcr.io/nim/meta/llama-3.2-1b-instruct:1.8.5
      - nvcr.io/nim/meta/llama-3.2-3b-instruct:1.8.4
Note: With registry_host set and registry_repo and registry_image_name_hyphenate left unset, Docker image URLs in templates are transformed as follows:
- From: nvcr.io/nim/meta/llama-3.2-1b-instruct:1.8.5
- To: 123456789012.dkr.ecr.us-east-1.amazonaws.com/nim/meta/llama-3.2-1b-instruct:1.8.5
To apply these changes immediately, manually trigger the job as follows:
export NS=<namespace>
export JOB_NAME=custom-templates-manual-$(date +%s)
kubectl create job --from=cronjob/custom-templates-cronjob $JOB_NAME -n $NS
kubectl logs job/$JOB_NAME -n $NS -f
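To block until the run completes before proceeding (a sketch):
kubectl wait --for=condition=complete "job/$JOB_NAME" -n "$NS" --timeout=300s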
3.2 ImagePullSecrets¶
Create an ImagePullSecret in the global namespace containing credentials for your internal container registry.
This secret is required by the LRS Operator to pull NIM container images.
To create the secret, run:
kubectl create secret docker-registry private-image-credentials \
--namespace <global-namespace> \
--docker-server=<internal-registry-fqdn> \
--docker-username=<username> \
--docker-password=<password>
Next, configure the LRS Operator to use this secret by adding the following to its configuration:
lrs-operator:
  operator:
    pullSecrets:
      - name: private-image-credentials
3.3 Secure Configuration (AWS Credentials)¶
Create a Secure Configuration of type AWS Credentials and assign at least Consumer permissions.
Refer to the Secure Configuration (AWS Credentials) documentation to create the entry in the DataRobot UI.
Note: The bucket must be secured using permanent credentials (e.g., an AWS IAM user with appropriate permissions). Temporary credentials, such as those provided by AWS STS, are not supported.
4. Updating NIM Container Tags¶
4.1 Updating the Tag of a Supported Container¶
- Mirror the container with the new image tag.
- Mirror the corresponding model profiles.
- In the DataRobot UI, update the NIM Version under Image Registry ▸ Import from NVIDIA NGC. Note: Failure to update this value will result in pod startup errors (ImagePullBackOff).
4.2 Using Unlisted Containers¶
Unlisted NIM containers are not supported in release 11.1 due to missing template metadata. Support for unlisted containers is planned for a future release.