Image Build Service¶
This service supports DataRobot's requirements for container image building. Its main function is to accept incoming REST API calls for building Docker images from a given set of artifacts, run those builds on Kubernetes infrastructure, and publish the resulting image to a target repository.
Known internal consumers (DataRobot components) of Image Build Service:
- Custom Models
- Custom Apps
- Custom Jobs
- GenAI (Buzok)
- Notebooks
For additional information on object storage, please refer to object-storage-configuration
Configurations for onprem¶
IBS runs BuildKit behind the scenes. For known limitations, a troubleshooting guide, and instructions for running the BuildKit daemon as a non-root user, refer to https://github.com/moby/buildkit/blob/master/docs/rootless.md
For best performance, the node group running IBS should use Linux kernel 5.11 or later, and buildService.envApp.secret.BUILDKIT_OCI_WORKER_SNAPSHOTTER should be set to "overlayfs" in the IBS charts. Example:
build-service:
  buildService:
    envApp:
      secret:
        BUILDKIT_OCI_WORKER_SNAPSHOTTER: "overlayfs"
Example AWS AMIs with Linux kernel support:
- Amazon Linux 2023 (AL2023)
- ubuntu-eks/k8s_1.28/images/hvm-ssd/ubuntu-focal-20.04-amd64-server-20240301
To check the kernel version, run the following on the node:
uname -r
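The same check can be scripted when many nodes need to be verified. The following is a minimal sketch, assuming the release string starts with the usual `major.minor` prefix printed by `uname -r`; the helper name `kernel_at_least` and the sample release strings are hypothetical and not part of IBS.

```python
import re


def kernel_at_least(release: str, minimum: tuple = (5, 11)) -> bool:
    """Return True if a `uname -r` style release string meets the minimum version."""
    match = re.match(r"(\d+)\.(\d+)", release)
    if match is None:
        raise ValueError(f"unrecognized kernel release: {release!r}")
    # Tuples compare element by element, so (5, 10) < (5, 11) < (6, 1).
    return (int(match.group(1)), int(match.group(2))) >= minimum


# Illustrative release strings; substitute the output of `uname -r` from the node.
for release in ("5.10.0-28-amd64", "6.1.0-18-amd64"):
    verdict = "meets 5.11+" if kernel_at_least(release) else "older than 5.11"
    print(f"{release}: {verdict}")
```

On a Linux node, `platform.release()` from the Python standard library returns the same string as `uname -r`, so it can be passed to the helper directly.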
Image signing¶
In 10.2.3 we added the ability to sign images using the Notary project and AWS Signer.
A few things are needed to utilize this feature.
Follow the AWS Signer guide to set up the image signing prerequisites on AWS:
- Create a signing profile: https://docs.aws.amazon.com/signer/latest/developerguide/signing-profiles.html
- Make sure the IAM identity has the right permissions: https://docs.aws.amazon.com/signer/latest/developerguide/image-signing-prerequisites.html#signer-iam-policy
Once the AWS configuration is done, provide Build Service with the required values. An example is provided below; note that NOTATION_PROFILE and NOTATION_REGION must match the previously created signing profile.
build-service:
  buildService:
    envApp:
      secret:
        IMAGE_SIGNER_TYPE: "notation"
        NOTATION_PROFILE: "arn:aws:signer:us-east-1:1234567890:/signing-profiles/your-signing-profiles"
        NOTATION_PLUGIN: "com.amazonaws.signer.notation.plugin"
        NOTATION_REGION: "us-east-1"
Build Service pushes the signature to ECR after an image has been built, so its identity must also be granted push access to the target ECR repository.
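Because a mismatch between NOTATION_PROFILE and NOTATION_REGION is easy to introduce, a pre-flight check of the values can help. This is a minimal sketch, assuming the profile is given as a standard ARN (`arn:partition:service:region:account:resource`); the function names are hypothetical and IBS is not known to perform this check itself.

```python
def signer_region(profile_arn: str) -> str:
    """Extract the region field from an ARN."""
    parts = profile_arn.split(":", 5)
    if len(parts) != 6 or parts[0] != "arn":
        raise ValueError(f"not an ARN: {profile_arn!r}")
    return parts[3]


def check_notation_settings(secret: dict) -> None:
    """Raise ValueError if NOTATION_REGION differs from the region in NOTATION_PROFILE."""
    arn_region = signer_region(secret["NOTATION_PROFILE"])
    if arn_region != secret["NOTATION_REGION"]:
        raise ValueError(
            f"NOTATION_REGION is {secret['NOTATION_REGION']!r} "
            f"but the signing profile ARN is in {arn_region!r}"
        )


# Values copied from the example configuration above.
check_notation_settings(
    {
        "NOTATION_PROFILE": "arn:aws:signer:us-east-1:1234567890:/signing-profiles/your-signing-profiles",
        "NOTATION_REGION": "us-east-1",
    }
)
```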
Object Storage and other options in springProfile¶
The springProfile values in the values.yaml file support various configurations, including different object storage options. The format is <cluster_type>,<registry_type>,<storage_type>. The supported configurations are:
- rhos,private_registry,{s3,azure,gcs,minio}
- aws,private_registry,{s3,azure,gcs,minio}
- rhos,ecr,{s3,azure,gcs,minio}
- aws,ecr,{s3,azure,gcs,minio}
- azure,acr,{s3,azure,gcs,minio}
- gcp,gcr,{s3,azure,gcs,minio}
If not specified, springProfile is generated automatically by checking registry and storage-specific environment variables.
The first value (cluster type) is always set to aws, unless the service is installed in an OpenShift cluster, in which case it is set to rhos.
The second value (registry type) is always set to private_registry unless it should use ECR and authenticate to it with IRSA (IAM roles for ServiceAccounts).
The third value (storage type) is chosen between s3, azure, gcs, and minio.
Example for GKE installation:
build-service:
  buildService:
    springProfile: aws,private_registry,gcs
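The composition rules above can be sketched as follows. This is an illustration only: the chart derives the values from registry and storage-specific environment variables, whereas this sketch takes them as explicit arguments, and the function names are hypothetical.

```python
SUPPORTED_STORAGE = {"s3", "azure", "gcs", "minio"}

# (cluster_type, registry_type) pairs from the list of supported configurations.
SUPPORTED_PAIRS = {
    ("rhos", "private_registry"),
    ("aws", "private_registry"),
    ("rhos", "ecr"),
    ("aws", "ecr"),
    ("azure", "acr"),
    ("gcp", "gcr"),
}


def default_spring_profile(openshift: bool, ecr_with_irsa: bool, storage: str) -> str:
    """Compose <cluster_type>,<registry_type>,<storage_type> using the default rules."""
    if storage not in SUPPORTED_STORAGE:
        raise ValueError(f"unsupported storage type: {storage!r}")
    cluster = "rhos" if openshift else "aws"
    registry = "ecr" if ecr_with_irsa else "private_registry"
    return ",".join((cluster, registry, storage))


def is_supported(profile: str) -> bool:
    """Check an explicit springProfile value against the supported combinations."""
    parts = profile.split(",")
    return (
        len(parts) == 3
        and (parts[0], parts[1]) in SUPPORTED_PAIRS
        and parts[2] in SUPPORTED_STORAGE
    )


# A non-OpenShift cluster with a private registry and GCS storage.
print(default_spring_profile(openshift=False, ecr_with_irsa=False, storage="gcs"))
# prints "aws,private_registry,gcs"
```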
OAuth2 Registry Authentication (ACR / GCR)¶
Build Service supports OAuth2 Bearer token authentication for Private Registry, ACR and GCR, with automatic fallback to Basic authentication.
Azure Container Registry¶
build-service:
  buildService:
    springProfile: "azure,acr,s3"
    envApp:
      secret:
        DOCKERHUB_USERNAME: "<service_principal_appId>"
        DOCKERHUB_PASSWORD: "<service_principal_password>"
        AZURE_TENANT_ID: "<your_tenant_id>"
AZURE_TENANT_ID is required. The service principal must have the AcrPush role on the target registry.
Google Container Registry / Artifact Registry¶
build-service:
  buildService:
    springProfile: "google,gcr,s3"
    envApp:
      secret:
        DOCKERHUB_USERNAME: "_json_key"
        DOCKERHUB_PASSWORD: "<raw_service_account_json>"
DOCKERHUB_PASSWORD must be the raw JSON from the service account key file (not base64-encoded). The service account must have roles/artifactregistry.writer.
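A quick way to catch an accidentally base64-encoded key before installation is to check that the value parses as JSON. This is a minimal sketch; the function name is hypothetical, and the check relies only on the fact that Google service account key files are JSON objects with `"type": "service_account"`.

```python
import base64
import json


def is_raw_service_account_json(value: str) -> bool:
    """Return True if the value is raw service account JSON rather than an encoded blob."""
    try:
        parsed = json.loads(value)
    except ValueError:
        # Base64 text (or any other non-JSON string) fails to parse.
        return False
    return isinstance(parsed, dict) and parsed.get("type") == "service_account"


# Placeholder key content for illustration; a real key file has more fields.
raw_key = json.dumps({"type": "service_account", "project_id": "example-project"})
encoded_key = base64.b64encode(raw_key.encode()).decode()

print(is_raw_service_account_json(raw_key))      # prints "True"
print(is_raw_service_account_json(encoded_key))  # prints "False"
```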
Extra image pull credentials¶
By default, IBS is configured to mount global image pull credentials into Image Builder pods, so it can pull base images from the same registry during custom image builds. In cases where custom images (models and/or apps) aren't trusted, an administrator might want to disable this mount during installation by setting the following Helm values:
build-service:
  imagePullSecrets:
    mount: false
In such cases, or when multiple different image registries are going to be used for pulling base images, additional image pull credentials can be provided for Image Builder:
build-service:
  imageBuilder:
    # -- Image pull credentials for the image builder
    imagePullCredentials:
      # -- Plain text credentials for OCI (docker) registries
      # Each entry should have host, user, and password fields
      plain:
        - host: docker.io
          user: dockerhub_user1
          password: dockerhub_password
        - host: private-registry.example
          user: user1
          password:
      # -- Secrets of type kubernetes.io/dockerconfigjson for OCI (docker) registries
      # Each entry is an existing k8s secret name
      secret:
        - name: dockerhub-image-pull-secret-read-only
      # -- External secrets for OCI (docker) registries
      # Works only if buildService.secretManager.enabled is true
      # Each entry is an external secret reference to create a k8s secret from
      # Each entry should have secretName, remoteRefKey, and remoteRefProperty fields
      externalSecret:
        - secretName: dockerhub-image-pull-secret-external
          remoteRefKey: /ibs/image-builder-pull-credentials
          remoteRefProperty: dockerconfigjson
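For the secret and externalSecret options, the referenced value has to be in the kubernetes.io/dockerconfigjson format. The sketch below shows how such a payload is built from host, user, and password entries. How the chart itself renders the plain entries is not documented here, so treat this as an illustration of the standard format only; the function name and the sample password are hypothetical.

```python
import base64
import json


def build_dockerconfigjson(entries: list) -> str:
    """Build the JSON payload stored under `.dockerconfigjson` in a pull secret."""
    auths = {}
    for entry in entries:
        credentials = f"{entry['user']}:{entry['password']}"
        auths[entry["host"]] = {
            "username": entry["user"],
            "password": entry["password"],
            # `auth` is base64("user:password"), as written by `docker login`.
            "auth": base64.b64encode(credentials.encode()).decode(),
        }
    return json.dumps({"auths": auths}, indent=2)


print(
    build_dockerconfigjson(
        [{"host": "private-registry.example", "user": "user1", "password": "example-password"}]
    )
)
```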
Resources requests and limits¶
To support building very large custom images (usually 4 GB or more), Image Builder might need to be configured with increased CPU, memory, and storage requests and limits:
build-service:
  imageBuilder:
    resources:
      requests:
        cpu: "1"
        memory: "4G"
      limits:
        cpu: "2"
        memory: "4G"
The default requests/limits are: CPU 1/1, RAM 1GB/1GB, Storage 1GB/100GB.
Environment Variables under envApp¶
The envApp section in the values.yaml file defines the environment variables required for different profiles. Here are the supported environment variables and their explanations:
Secrets¶
- LOGS_BUCKET: Specifies the bucket name for storing logs (data by default).
- ALLOW_SELF_SIGNED_CERTS: Controls whether TLS verification is enabled for requests to external resources like Minio, Container Registry, and Ingress (false by default).
- DISABLE_HTTPS: Determines whether HTTP or HTTPS is used for the Private Container Registry (false by default).
Database Parameters¶
- POSTGRES_USER: The username for the PostgreSQL database.
- POSTGRES_PASSWORD: The password for the PostgreSQL database.
- POSTGRES_HOST: The host address of the PostgreSQL database.
- POSTGRES_PORT: The port number for the PostgreSQL database (5432 by default).
- POSTGRES_DB: The database name for PostgreSQL.
Parameters Required by minio Profile¶
- MINIO_SERVER_HOST: The endpoint URL for the Minio server (e.g., http://<minio_endpoint>:9000).
- MINIO_SERVER_ROOT_USER: The root username for the Minio server.
- MINIO_SERVER_ROOT_PASSWORD: The root password for the Minio server.
Parameters Required by private_registry Profile¶
- DOCKERHUB_USERNAME: The username for DockerHub or other private OCI registry.
- DOCKERHUB_PASSWORD: The password for DockerHub or other private OCI registry.
External Secrets¶
The externalsecret section allows defining environment variables that are retrieved from an external secret store. These values can override any keys defined in the secret section. Example:
build-service:
  buildService:
    envApp:
      externalsecret:
        POSTGRES_USER:
          name: external-postgres-secret
          key: username
        POSTGRES_PASSWORD:
          name: external-postgres-secret
          key: password
This configuration retrieves the POSTGRES_USER and POSTGRES_PASSWORD from an external secret named external-postgres-secret.
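The override behavior can be pictured as a simple merge in which externalsecret entries win over secret entries with the same key. This is a sketch of the precedence only, not of the chart's templating; the function name and the sample values are hypothetical.

```python
def effective_env_sources(secret: dict, externalsecret: dict) -> dict:
    """Map each environment variable to the source that supplies its value."""
    sources = {key: ("secret", value) for key, value in secret.items()}
    for key, ref in externalsecret.items():
        # An external reference replaces any inline value with the same key.
        sources[key] = ("externalsecret", f"{ref['name']}/{ref['key']}")
    return sources


resolved = effective_env_sources(
    secret={"POSTGRES_USER": "inline-user", "POSTGRES_PORT": "5432"},
    externalsecret={"POSTGRES_USER": {"name": "external-postgres-secret", "key": "username"}},
)
for name, (origin, value) in sorted(resolved.items()):
    print(f"{name}: {origin} -> {value}")
```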
QEMU Image Builder for a rootless mode without capabilities (security)¶
In order to comply with pod security policies, Image Builder pods might be required to run with the following security context applied:
build-service:
  imageBuilder:
    securityContext:
      allowPrivilegeEscalation: false
      seccompProfileType: "RuntimeDefault"
      seLinuxType: "" # Leave empty for non-SELinux clusters; use "container_t" for SELinux-enabled environments
      capabilities:
        drop:
          - ALL
        add: []
Without SETUID and SETGID capabilities added, BuildKit is unable to build images (in rootless mode), so it needs to be run inside a QEMU virtual machine. To make that possible, we ship a separate Docker image (tag) of ImageBuilder, which must be used when all capabilities are dropped as shown above:
build-service:
  imageBuilder:
    tag: "11.1.25-qemu-image"
    resources:
      requests:
        cpu: "2"
        memory: "4G"
      limits:
        cpu: "2"
        memory: "8G"
Note: Ensure the LOGS_BUCKET in values.yaml is set to your actual S3 bucket name (default is "data"):
buildService:
  envApp:
    secret:
      LOGS_BUCKET: your-s3-bucket-name
Note that running Image Builder within QEMU slightly degrades its performance and requires more compute resources (CPU and RAM) compared to the default mode.
Image scanning¶
Image Build Service supports pre-push security scanning of container images. This feature is opt-in and disabled by default. To enable it, configure the following in values.yaml:
build-service:
  imageBuilder:
    imageScanner:
      enabled: true
      image: "customer-registry.com/customer-scanner:v1.0.0" # Custom scanner image (must include curl)
      command: ["/bin/sh", "-c"] # Required for report upload functionality
      args:
        - |
          snyk container test --file=/shared/image.tar \
            --severity-threshold=medium \
            --json-file-output=/shared/scan-report.json \
            || exit 1
      env:
        SNYK_TOKEN: "customer-token-here" # Scanner-specific credentials
      reportUploadPath: "s3://customer-bucket/scan-reports" # Optional: cloud storage path
      resources: # Optional: defaults to 512Mi/256Mi memory, 500m/250m CPU
        limits:
          memory: "1Gi"
          cpu: "1000m"
        requests:
          memory: "512Mi"
          cpu: "500m"
Requirements:
- Custom scanner image: You must provide a custom scanner container image that includes both your scanning tool and curl. Base scanner images (e.g., aquasec/trivy:latest) cannot be used directly.
- Scanner interface: The scanner must read from /shared/image.tar and write a JSON report to /shared/scan-report.json. Exit code 0 allows the build to continue; non-zero stops the build.
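The scanner interface can be illustrated with a small stand-in for the final step of a custom scanner: write the JSON report, then return the exit code that decides whether the build continues. This is a sketch under assumptions: the report layout, the severity names, and the function name are hypothetical, and a real scanner such as the one in the example above produces its own report format.

```python
import json
from pathlib import Path

# Hypothetical severity scale, ordered from least to most severe.
SEVERITY_ORDER = ["low", "medium", "high", "critical"]


def finish_scan(findings: list, report_path: str, threshold: str = "medium") -> int:
    """Write the JSON report and return 0 (continue the build) or 1 (stop the build)."""
    Path(report_path).write_text(json.dumps({"findings": findings}, indent=2))
    limit = SEVERITY_ORDER.index(threshold)
    blocking = [f for f in findings if SEVERITY_ORDER.index(f["severity"]) >= limit]
    return 1 if blocking else 0
```

Inside the scanner container, the report path would be /shared/scan-report.json and the returned value would be passed to `sys.exit`, so that any finding at or above the threshold stops the build.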