
NVIDIA NIM Air‑Gap Deployment Guide

This document describes how to deploy NVIDIA Inference Microservice (NIM) in environments without direct internet connectivity ("air‑gapped" clusters).

For generic, non‑air‑gapped deployments see NIM Generic cluster configuration.

Overview

Air‑gapped deployment consists of two independent workstreams that you can perform in any order:

  1. Mirror NIM assets into the secured network:
     • Copy container images to an internal container registry.
     • Copy model profiles (weights) to S3‑compatible object storage.
  2. Tell DataRobot where to find those assets:
     • Configure Helm values with environment variables.
     • Set a Secure Configuration of type AWS Credentials to access the object storage.

1. Prerequisites

1.1 Generic

  • S3‑compatible object storage that supports virtual‑hosted style addressing (<bucket>.<domain>).
  • S3-compatible HTTPS API endpoint with a valid certificate chain (public or internal CA).

1.2 Optional: MinIO specifics

  • Set MINIO_DOMAIN=<minio-domain> in the MinIO server configuration to enable virtual‑hosted–style addressing – see the MinIO documentation.
  • Ensure the TLS certificate's SAN covers bucket subdomains (bucket names appear in the URL), i.e. *.minio.<domain>.
  • Mount a custom CA bundle into all DataRobot K8s workloads – see Configuring Custom CA. A quick connectivity check follows this list.
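
As a sanity check of virtual‑hosted–style addressing (the bucket name and domain below are hypothetical – substitute your own), a HEAD request against a bucket subdomain should resolve and present a valid certificate. Even a 403 response is fine here; only a DNS or TLS error indicates a misconfiguration:

# Hypothetical bucket "nim-profiles" on MinIO at minio.example.com
curl -sI https://nim-profiles.minio.example.com/ \
  --cacert /path/to/internal-ca.pem   # omit if the CA is publicly trusted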

1.3 NIM Container Image versions

DataRobot ships with the image tags described in Generic NIM GPU Recommendations. To run a newer or older tag of a supported container, see Section 5 – Updating NIM Container Tags.

2. Mirror NIM Container Images

Requires access to NVIDIA NGC and an internal container registry. For fully disconnected sites, see §3.5 Offline transfer.

# 2.1  Create the target repository (once). Example for ECR, where the
#      repository name must match the full image path:
export TARGET_REGISTRY=123456789012.dkr.ecr.us-east-1.amazonaws.com
aws ecr create-repository --repository-name nim/meta/llama-3.2-1b-instruct

# 2.2  Log in to registries ('$oauthtoken' is the literal NGC username)
docker login nvcr.io -u '$oauthtoken' -p "${NGC_API_KEY}"
aws ecr get-login-password --region us-east-1 | \
  docker login --username AWS --password-stdin "$TARGET_REGISTRY"

# 2.3  Copy the image
crane cp nvcr.io/nim/meta/llama-3.2-1b-instruct:1.8.5 \
  $TARGET_REGISTRY/nim/meta/llama-3.2-1b-instruct:1.8.5
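
Optionally, verify the copy by comparing digests; crane digest (part of the same crane tool used above) prints the manifest digest, and the two outputs should be identical:

crane digest nvcr.io/nim/meta/llama-3.2-1b-instruct:1.8.5
crane digest $TARGET_REGISTRY/nim/meta/llama-3.2-1b-instruct:1.8.5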

3. Mirror NIM Model Profiles

3.1 Discover available profiles

Run the NIM container:

export NGC_API_KEY=<your-ngc-api-key>
export NIM_CACHE_DIR=$HOME/nim_model_cache
mkdir -p "$NIM_CACHE_DIR"

docker run --rm -it \
  -e NGC_API_KEY=$NGC_API_KEY \
  -v "$NIM_CACHE_DIR":/opt/nim/.cache \
  nvcr.io/nim/meta/llama-3.2-1b-instruct:1.8.5 \
  /bin/bash
Inside the container, run list-model-profiles. Example output:

INFO 2025-06-23 11:48:15.81 info.py:23] Unable to get hardware specifications, retrieving every profile's info.
4f904d571fe60ff24695b5ee2aa42da58cb460787a968f1e8a09f5a7e862728d: vllm-bf16-tp1-pp1
ac34857f8dcbd174ad524974248f2faf271bd2a0355643b2cf1490d0fe7787c2: tensorrt_llm-trtllm_buildable-bf16-tp1-pp1
...
ac5071bbd91efcc71dc486fcd5210779570868b3b8328b4abf7a408a58b5e57c: tensorrt_llm-l40s-bf16-tp1-pp1-throughput
ad17776f4619854fccd50354f31132a558a1ca619930698fd184d6ccf5fe3c99: tensorrt_llm-l40s-fp8-tp1-pp1-throughput
af876a179190d1832143f8b4f4a71f640f3df07b0503259cedee3e3a8363aa96: tensorrt_llm-h200-fp8-tp1-pp1-throughput
0782f55dcd12ec36d6126d9768fd364182986eecd25526eb206553df388057b7: tensorrt_llm-l40s-fp8-tp1-pp1-throughput-lora-lora
bae7cf0b51c21f0e6f697593ee58dc8a555559b4b81903502a9e0ffbdc1b67a9: tensorrt_llm-b200-fp8-tp1-pp1-throughput-lora-lora
c6821c013c559912c37e61d7b954c5ca8fe07dda76d8bea0f4a52320e0a54427: tensorrt_llm-a100_sxm4_40gb-bf16-tp1-pp1-throughput
e7dbd9a8ce6270d2ec649a0fecbcae9b5336566113525f20aee3809ba5e63856: tensorrt_llm-h100-bf16-tp1-pp1-throughput
f749ba07aade1d9e1c36ca1b4d0b67949122bd825e8aa6a52909115888a34b95: vllm-bf16-tp1-pp1-lora

3.2 Select profiles

  1. Always include the generic (hardware‑agnostic) profiles, e.g.:
     • vllm-bf16-tp1-pp1
     • tensorrt_llm-trtllm_buildable-bf16-tp1-pp1
  2. Add every profile that mentions a GPU model present in the cluster. Rule of thumb: include all profiles that mention a given GPU model; NIM selects the optimal one automatically.

For example, consider a cluster with:

  • 1 GPU of type L40S
  • 1 GPU of type A10G

The following profiles should be copied. Note that there is no A10G‑specific profile, so the generic profiles cover that GPU (a filter sketch follows the list):

vllm-bf16-tp1-pp1 
tensorrt_llm-trtllm_buildable-bf16-tp1-pp1
tensorrt_llm-l40s-bf16-tp1-pp1-throughput
tensorrt_llm-l40s-fp8-tp1-pp1-throughput
tensorrt_llm-l40s-fp8-tp1-pp1-throughput-lora-lora
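
Assuming the list-model-profiles output format shown in §3.1 (hash: profile-name), a minimal shell sketch for pulling out the generic and L40S profiles; the pattern is illustrative, so adjust it to your GPU models:

# Run inside the NIM container: save the listing, then filter it
list-model-profiles > /tmp/profiles.txt
grep -E 'vllm-bf16-tp1-pp1$|trtllm_buildable|l40s' /tmp/profiles.txt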

3.3 Download the selected profiles

download-to-cache --profiles \
  vllm-bf16-tp1-pp1 \
  tensorrt_llm-trtllm_buildable-bf16-tp1-pp1 \
  tensorrt_llm-l40s-bf16-tp1-pp1-throughput \
  tensorrt_llm-l40s-fp8-tp1-pp1-throughput
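
download-to-cache writes to /opt/nim/.cache inside the container, which is the volume mounted from $NIM_CACHE_DIR in §3.1, so you can confirm the download from the host:

# On the host: the cache should now contain the selected profiles
du -sh "$NIM_CACHE_DIR"
ls "$NIM_CACHE_DIR"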

3.4 Upload to object storage

a. Online upload (network path available)

export NIM_CACHE_DIR=${HOME}/nim_model_cache
export NIM_BUCKET_NAME=<bucket>
export AWS_ACCESS_KEY_ID=<aws-access-key-id>
export AWS_SECRET_ACCESS_KEY=<aws-secret-access-key>
export AWS_ENDPOINT_URL=https://<minio-domain>/  # omit for AWS S3

python process_nim_cache.py --process-and-upload \
      --dest-dir ./nim-model-profiles \
      --workers 10 \
      --insecure  # skip TLS verification only if absolutely required

This command normalises profile filenames and uploads them to ${NIM_BUCKET_NAME}.
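
To spot‑check the upload (assuming the AWS CLI is available and picks up the credentials exported above):

# Drop --endpoint-url for AWS S3
aws s3 ls "s3://${NIM_BUCKET_NAME}/" --recursive --endpoint-url "${AWS_ENDPOINT_URL}" | head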

b. Offline preparation (no network path)

Prepare a ready‑to‑copy directory that can be moved by USB drive and uploaded later from inside the secure zone:

export NIM_CACHE_DIR=${HOME}/nim_model_cache
python process_nim_cache.py --process --dest-dir ./nim-model-profiles

Copy ./nim-model-profiles to the removable media. Once inside the air‑gapped network, use a native CLI (e.g. aws s3 cp ... --recursive) to upload the files to the internal object store.

3.5 Offline transfer (fully disconnected sites)

If the cluster has no routable path whatsoever (no VPN, no bastion), transfer assets by removable drive:

On an internet‑enabled staging host (outside the secure zone):

  1. Export images to a tarball:

    crane cp nvcr.io/nim/meta/<image>:<tag> oci:/tmp/nim-image && \
    tar -czf nim-images.tgz -C /tmp nim-image
    # or with Docker
    docker save -o nim-images.tgz nvcr.io/nim/meta/<image>:<tag>

  2. Archive the model‑profile cache directory:

    tar -czf nim-profiles.tgz -C "$NIM_CACHE_DIR" .

  3. Generate checksums (and sign, if required); a verification sketch for the receiving side follows this list:

    sha256sum nim-*.tgz > checksums.txt

  4. Physically move the encrypted SSD / USB drive into the air‑gapped network.

Inside the secure zone:

  5. Import images into the internal registry:

    tar -xzf nim-images.tgz -C /tmp && \
    crane cp oci:/tmp/nim-image <internal-registry>/nim/meta/<image>:<tag>
    # or
    docker load -i nim-images.tgz && \
    docker tag nvcr.io/nim/meta/<image>:<tag> <internal-registry>/nim/meta/<image>:<tag> && \
    docker push <internal-registry>/nim/meta/<image>:<tag>

  6. Upload model profiles to object storage:

    mkdir -p nim-profiles && tar -xzf nim-profiles.tgz -C nim-profiles
    aws --endpoint-url https://<minio.domain>/ s3 cp nim-profiles/ s3://<bucket>/ --recursive

4. Configure DataRobot

4.1 Helm chart values

The custom-templates CronJob runs every hour to refresh the metadata for NIM containers. This metadata tells DataRobot which containers are available and where to pull both the container images and their model profiles.

core:
  config_env_vars:
    SERVE_NIM_MODEL_FROM_LOCAL_ASSETS: true                    # used by DataRobot UI
...
custom-templates:
  custom_model_nim_image_registry: <internal-registry>          # Must match the value of CUSTOM_MODEL_NIM_IMAGE_REGISTRY environment variable
  custom_model_nim_repository_override: s3://<bucket-name>/
  custom_model_nim_aws_endpoint_url: "https://<domain>"         # skip for AWS, required for MinIO
  custom_model_nim_aws_region: "us-east-1"
  custom_model_nim_image_url_allowlist:
    - "nvcr.io/nim/meta/llama-3.2-1b-instruct:1.8.5"
    - "nvcr.io/nim/meta/llama-3.2-8b-instruct:1.8.4" 
The custom_model_nim_image_url_allowlist value limits the Templates UI to the images listed; tags are ignored.

To apply changes immediately, trigger the job manually:

export NS=<namespace>
export JOB_NAME=custom-templates-manual-$(date +%s)
kubectl create job --from=cronjob/custom-templates-cronjob $JOB_NAME -n $NS
kubectl logs job/$JOB_NAME -n $NS -f

4.2 Secure Configuration (AWS Credentials)

Create a Secure Configuration (AWS Credentials) entry in the DataRobot UI and grant at least Consumer access to the users who will deploy NIMs.
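
These credentials typically only need read access to the profile bucket. A minimal policy sketch (the bucket name is a placeholder; adapt it to your IAM or MinIO policy tooling):

cat > nim-read-policy.json <<'EOF'
{
  "Version": "2012-10-17",
  "Statement": [
    { "Effect": "Allow", "Action": ["s3:ListBucket"], "Resource": "arn:aws:s3:::<bucket>" },
    { "Effect": "Allow", "Action": ["s3:GetObject"], "Resource": "arn:aws:s3:::<bucket>/*" }
  ]
}
EOF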

5. Updating NIM Container Tags

5.1 Switching to a different tag of a supported container

  1. Mirror the new image tag (see the sketch after this list).
  2. Mirror the matching model profiles.
  3. In the DataRobot UI, update the NIM Version in Image Registry ▸ Import from NVIDIA NGC. If you skip this step, the pod fails with ImagePullBackOff.
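
For example, to mirror a different tag of an already‑supported container (the tag below is a placeholder, not a confirmed release):

crane cp nvcr.io/nim/meta/llama-3.2-1b-instruct:<new-tag> \
  $TARGET_REGISTRY/nim/meta/llama-3.2-1b-instruct:<new-tag>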

5.2 Using an unlisted container

Unlisted NIM containers are not supported in release 11.1 because their template metadata is missing. Support is planned for a future release.