
NIM Containers – Troubleshooting Appendix

See the Air‑Gap configuration guide.

Profile Not Found

Symptom

The container terminates during startup with a NoSuchKey error:

Selected profile: 74bfd8b2df5eafe452a9887637eef4820779fb4e1edb72a4a7a2a1a2d1e6480b (tensorrt_llm-a10g-bf16-tp1-pp1-throughput)
...
Exception: S3 GetObject failed: service error: NoSuchKey: The specified key does not exist.

Root Cause

The specified model profile is not present in the object storage bucket.

Resolution

  1. Copy the Selected profile name from the log output, e.g., tensorrt_llm-a10g-bf16-tp1-pp1-throughput.
  2. Follow the documented procedures to download and upload the required profile:
     • Download profiles
     • Upload profiles
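
If the startup log is long, the profile name from step 1 can be pulled out programmatically. The following is a minimal sketch, assuming the log line has the same shape as the symptom above (a hex profile ID followed by the profile name in parentheses); the function name find_selected_profile is hypothetical and not part of NIM.

```python
import re

# Assumed log format, taken from the symptom above:
# "Selected profile: <hex id> (<profile name>)"
PROFILE_LINE = re.compile(r"Selected profile:\s+([0-9a-f]+)\s+\(([^)]+)\)")


def find_selected_profile(log_text):
    """Return (profile_id, profile_name) from startup logs, or None if absent."""
    match = PROFILE_LINE.search(log_text)
    if match is None:
        return None
    return match.group(1), match.group(2)


sample = (
    "Selected profile: "
    "74bfd8b2df5eafe452a9887637eef4820779fb4e1edb72a4a7a2a1a2d1e6480b "
    "(tensorrt_llm-a10g-bf16-tp1-pp1-throughput)\n"
    "Exception: S3 GetObject failed: service error: NoSuchKey: "
    "The specified key does not exist.\n"
)

result = find_selected_profile(sample)
if result is not None:
    print(result[1])
```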

Lazy Instance Previously Poisoned

Symptom

pyo3_runtime.PanicException: Lazy instance has previously been poisoned

Root Cause / Resolution

  • Endpoint is HTTP: NIM Containers require all communication to occur over HTTPS.
  • Container Cannot Verify HTTPS Certificate: Ensure your Public CA or Private CA bundle is mounted to all DataRobot Kubernetes workloads. Refer to the Public CA configuration example.
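
The first cause can be ruled out before starting the container by checking the scheme of the configured endpoint. This is a minimal sketch using only the Python standard library; the helper name endpoint_uses_https is hypothetical, and the check does not validate the certificate chain.

```python
from urllib.parse import urlparse


def endpoint_uses_https(endpoint_url):
    """Return True only if the URL has an https scheme and a host component."""
    parsed = urlparse(endpoint_url)
    return parsed.scheme == "https" and bool(parsed.netloc)


# Example values based on the hostnames used in this appendix.
print(endpoint_uses_https("https://minio.internal-example.net/"))
print(endpoint_uses_https("http://minio.internal-example.net/"))
```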

MinIO Connection Error

Symptom

Exception: S3 GetObject failed: dispatch failure: io error: error trying to connect: 
dns error: failed to lookup address information: Name or service not known: dns error

Root Cause

NIM supports only the S3 virtual-hosted style (<bucket>.<domain>). As a result, MinIO must be configured to accept wildcard hosts.
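
To see which hostname has to resolve, it can help to derive the virtual-hosted name from the endpoint and bucket. The sketch below only illustrates the <bucket>.<domain> rule stated above; the function name virtual_hosted_host and the bucket name nim-bucket are assumptions for illustration, and any port in the endpoint is ignored.

```python
from urllib.parse import urlparse


def virtual_hosted_host(endpoint_url, bucket):
    """Build the <bucket>.<domain> hostname used by virtual-hosted style requests."""
    domain = urlparse(endpoint_url).hostname
    if not domain:
        raise ValueError(f"no hostname in endpoint URL: {endpoint_url!r}")
    return f"{bucket}.{domain}"


# This is the name that DNS must be able to resolve.
print(virtual_hosted_host("https://minio.internal-example.net/", "nim-bucket"))
```

If DNS only has a record for the bare domain, the lookup for the derived name fails with the error shown in the symptom.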

Resolution

Option A – Wildcard Configuration (Recommended)

  1. Configure DNS to support *.minio.internal-example.net.
  2. Ensure the TLS certificate includes the Subject Alternative Name (SAN) for *.minio.internal-example.net.
  3. Update ingress rules to allow wildcard hosts.
  4. Set the domain name on the MinIO server using the following environment variable:

env:
  - name: MINIO_DOMAIN
    value: minio.internal-example.net
For further details, refer to the MinIO official documentation.
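
When checking steps 1 and 2, note that a wildcard entry covers exactly one leftmost label. The sketch below illustrates that matching rule for a SAN such as *.minio.internal-example.net; it is a simplified illustration, not the TLS library's actual verification logic, and the function name matches_wildcard_san is hypothetical.

```python
def matches_wildcard_san(hostname, san):
    """Return True if hostname is covered by san, where '*' stands for one label."""
    host_labels = hostname.lower().split(".")
    san_labels = san.lower().split(".")
    # A wildcard never spans multiple labels, so the label counts must agree.
    if len(host_labels) != len(san_labels):
        return False
    if san_labels[0] == "*":
        # The wildcard needs a non-empty leftmost label; the rest must be identical.
        return host_labels[0] != "" and host_labels[1:] == san_labels[1:]
    return host_labels == san_labels


san = "*.minio.internal-example.net"
print(matches_wildcard_san("nim-bucket.minio.internal-example.net", san))
print(matches_wildcard_san("minio.internal-example.net", san))
```

The second call shows why the certificate usually also needs an entry for the bare domain: the wildcard alone does not cover minio.internal-example.net itself.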

Option B – Path‑style fallback

  1. Create a bucket whose name matches the subdomain part of the MinIO hostname (minio).
  2. Configure NIM:

NIM_REPOSITORY_OVERRIDE=s3://minio/
AWS_ENDPOINT_URL=https://internal-example.net/

  3. In MinIO, set the MINIO_DOMAIN environment variable to internal-example.net.

Errors During Model Profile Uploads

Double Bucket Name in URL

Symptom:

botocore.exceptions.EndpointConnectionError: Could not connect to the endpoint URL: 
"https://nim-bucket.nim-bucket.minio.internal-example.net/nim%252Fmeta%252Fllama-3.2-1b-instruct%253Ahf-e9f8eff-nim1.5%252B%253Ffile%253DLICENSE.txt"

Resolution:
Ensure that the AWS_ENDPOINT_URL environment variable does not include the bucket name.

# Incorrect
export AWS_ENDPOINT_URL=https://nim-bucket.minio.internal-example.net/

# Correct
export AWS_ENDPOINT_URL=https://minio.internal-example.net/
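
This mistake can be caught before running an upload by comparing the first label of the endpoint hostname with the bucket name. The following is a minimal sketch; the helper name endpoint_includes_bucket is hypothetical, and it assumes the bucket name is known separately (nim-bucket in the example above).

```python
from urllib.parse import urlparse


def endpoint_includes_bucket(endpoint_url, bucket):
    """Return True if the endpoint hostname already starts with the bucket name."""
    host = urlparse(endpoint_url).hostname or ""
    return host.split(".")[0] == bucket


# The incorrect and correct values from the example above.
print(endpoint_includes_bucket("https://nim-bucket.minio.internal-example.net/", "nim-bucket"))
print(endpoint_includes_bucket("https://minio.internal-example.net/", "nim-bucket"))
```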

Signature Does Not Match

Symptom:

boto3.exceptions.S3UploadFailedError: Failed to upload ...
An error occurred (SignatureDoesNotMatch) when calling the PutObject operation: 
The request signature we calculated does not match the signature you provided. 
Check your key and signing method.

Root cause: This error is typically caused by an incorrect access key or secret. Review the logs for HTTP 403 errors to identify authentication issues.
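
One way credentials end up incorrect is an empty value or stray whitespace (for example, a trailing newline copied along with a secret). The sketch below checks the standard AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY environment variables for those two problems only; this is an assumption about a likely cause, it cannot tell whether the values are actually valid, and the function name find_credential_problems is hypothetical.

```python
import os

# Standard environment variables read by boto3 for static credentials.
CREDENTIAL_VARS = ("AWS_ACCESS_KEY_ID", "AWS_SECRET_ACCESS_KEY")


def find_credential_problems(env):
    """Return human-readable problems found in the credential variables of env."""
    problems = []
    for name in CREDENTIAL_VARS:
        value = env.get(name)
        if not value:
            problems.append(f"{name} is unset or empty")
        elif value != value.strip():
            problems.append(f"{name} has leading or trailing whitespace")
    return problems


# Check the current process environment and report anything suspicious.
for problem in find_credential_problems(os.environ):
    print(problem)
```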