
Image Build Service

This service aims to support all of DataRobot's (DR's) requirements for container image building. Its main function is to accept REST API calls to build Docker images from a given set of artifacts, run those builds on Kubernetes (k8s) infrastructure, and publish the resulting image to a target repository.

Known internal consumers (DataRobot components) of Image Build Service:

  • Custom Models
  • Custom Apps
  • Custom Jobs
  • GenAI (Buzok)
  • Notebooks

For additional information on object storage, refer to object-storage-configuration.

Configurations for onprem

IBS runs BuildKit behind the scenes. For known limitations, a troubleshooting guide, and instructions for running the BuildKit daemon as a non-root user, refer to https://github.com/moby/buildkit/blob/master/docs/rootless.md

For best performance, the node group running IBS should run Linux kernel 5.11 or later, and buildService.envApp.secret.BUILDKIT_OCI_WORKER_SNAPSHOTTER should be set to "overlayfs" in the IBS charts. Example:

build-service:
  buildService:
    envApp:
      secret:
        BUILDKIT_OCI_WORKER_SNAPSHOTTER: "overlayfs" 

Example AWS AMIs with a supported Linux kernel:

  • Amazon Linux 2023 (AL2023)
  • ubuntu-eks/k8s_1.28/images/hvm-ssd/ubuntu-focal-20.04-amd64-server-20240301

To check the kernel version, run the following on the node:

    uname -r 
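The kernel requirement above can also be checked programmatically. The following is a minimal sketch, assuming the release string has the usual major.minor prefix reported by uname -r; the function names parse_kernel_release and kernel_meets_minimum are hypothetical and not part of IBS.

```python
import platform
import re


def parse_kernel_release(release: str) -> tuple[int, int]:
    """Extract (major, minor) from a release string such as '5.15.0-1051-aws'."""
    match = re.match(r"(\d+)\.(\d+)", release)
    if match is None:
        raise ValueError(f"Unrecognized kernel release: {release!r}")
    return int(match.group(1)), int(match.group(2))


def kernel_meets_minimum(release: str, minimum: tuple[int, int] = (5, 11)) -> bool:
    """Return True when the release is at least the recommended 5.11."""
    return parse_kernel_release(release) >= minimum


if __name__ == "__main__":
    # platform.release() returns the same string as `uname -r` on Linux.
    current = platform.release()
    print(current, "OK" if kernel_meets_minimum(current) else "older than 5.11")
```

Run it on the node itself (or in a pod scheduled onto the node group) so that the reported release reflects the host kernel.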

Image signing

In 10.2.3 we added the ability to sign images using the Notary project and AWS Signer. A few things are needed to utilize this feature.

Follow the AWS Signer guide to set up the image signing prerequisites on AWS:

  • Create a signing profile: https://docs.aws.amazon.com/signer/latest/developerguide/signing-profiles.html
  • Make sure the IAM identity has the right permissions: https://docs.aws.amazon.com/signer/latest/developerguide/image-signing-prerequisites.html#signer-iam-policy

Once the AWS configuration is done, provide Build Service with the required values. An example is provided below; note that NOTATION_PROFILE and NOTATION_REGION must match the previously created signing profile.

build-service:
  buildService:
    envApp:
      secret:
        IMAGE_SIGNER_TYPE: "notation"
        NOTATION_PROFILE: "arn:aws:signer:us-east-1:1234567890:/signing-profiles/your-signing-profiles"
        NOTATION_PLUGIN: "com.amazonaws.signer.notation.plugin"
        NOTATION_REGION: "us-east-1" 
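Because NOTATION_REGION has to agree with the signing profile, a quick consistency check can catch mismatches before deployment. The sketch below relies only on the standard ARN layout (arn:partition:service:region:account:resource); the function names region_from_arn and regions_match are hypothetical helpers, not part of Build Service.

```python
def region_from_arn(arn: str) -> str:
    """Return the region field of an ARN (the fourth colon-separated part)."""
    parts = arn.split(":", 5)
    if len(parts) != 6 or parts[0] != "arn":
        raise ValueError(f"Not a valid ARN: {arn!r}")
    return parts[3]


def regions_match(notation_profile: str, notation_region: str) -> bool:
    """Check that NOTATION_REGION equals the region embedded in NOTATION_PROFILE."""
    return region_from_arn(notation_profile) == notation_region


if __name__ == "__main__":
    profile = "arn:aws:signer:us-east-1:1234567890:/signing-profiles/your-signing-profiles"
    print(regions_match(profile, "us-east-1"))
```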

Build Service pushes the signature to ECR after an image has been built, so it must be granted push access to the target repository.

Object Storage and other options in springProfile

The springProfile values in the values.yaml file support various configurations, including different object storage options. The format is <cluster_type>,<registry_type>,<storage_type>. The supported configurations are:

  • rhos,private_registry,{s3,azure,gcs,minio}
  • aws,private_registry,{s3,azure,gcs,minio}
  • rhos,ecr,{s3,azure,gcs,minio}
  • aws,ecr,{s3,azure,gcs,minio}
  • azure,acr,{s3,azure,gcs,minio}
  • gcp,gcr,{s3,azure,gcs,minio}

If not specified, springProfile is generated automatically by checking registry and storage-specific environment variables.

When generated automatically, the first value (cluster type) is set to aws, unless the service is installed in an OpenShift cluster, in which case it is set to rhos.

The second value (registry type) is set to private_registry, unless the service should use ECR and authenticate to it with IRSA (IAM roles for ServiceAccounts).

The third value (storage type) is chosen between s3, azure, gcs, and minio.

Example for GKE installation:

build-service:
  buildService:
    springProfile: aws,private_registry,gcs 
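The <cluster_type>,<registry_type>,<storage_type> format can be validated before it is written to values.yaml. The following is a minimal sketch based only on the combinations listed in this section; the actual service may accept other profile names, and parse_spring_profile is a hypothetical helper rather than part of IBS.

```python
# Cluster/registry pairs copied from the list of supported configurations.
SUPPORTED_PAIRS = {
    ("rhos", "private_registry"),
    ("aws", "private_registry"),
    ("rhos", "ecr"),
    ("aws", "ecr"),
    ("azure", "acr"),
    ("gcp", "gcr"),
}
STORAGE_TYPES = {"s3", "azure", "gcs", "minio"}


def parse_spring_profile(profile: str) -> tuple[str, str, str]:
    """Split a springProfile string and check it against the listed combinations."""
    parts = [part.strip() for part in profile.split(",")]
    if len(parts) != 3:
        raise ValueError(f"Expected three comma-separated values, got {len(parts)}")
    cluster, registry, storage = parts
    if (cluster, registry) not in SUPPORTED_PAIRS:
        raise ValueError(f"Unlisted cluster/registry pair: {cluster},{registry}")
    if storage not in STORAGE_TYPES:
        raise ValueError(f"Unlisted storage type: {storage}")
    return cluster, registry, storage


if __name__ == "__main__":
    print(parse_spring_profile("aws,private_registry,gcs"))
```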

OAuth2 Registry Authentication (ACR / GCR)

Build Service supports OAuth2 Bearer token authentication for Private Registry, ACR and GCR, with automatic fallback to Basic authentication.

Azure Container Registry

build-service:
  buildService:
    springProfile: "azure,acr,s3"
    envApp:
      secret:
        DOCKERHUB_USERNAME: "<service_principal_appId>"
        DOCKERHUB_PASSWORD: "<service_principal_password>"
        AZURE_TENANT_ID: "<your_tenant_id>" 

AZURE_TENANT_ID is required. The service principal must have the AcrPush role on the target registry.

Google Container Registry / Artifact Registry

build-service:
  buildService:
    springProfile: "google,gcr,s3"
    envApp:
      secret:
        DOCKERHUB_USERNAME: "_json_key"
        DOCKERHUB_PASSWORD: "<raw_service_account_json>" 

DOCKERHUB_PASSWORD must be the raw JSON from the service account key file (not base64-encoded). The service account must have the roles/artifactregistry.writer role.
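A common mistake is to paste a base64-encoded key where raw JSON is expected. The following sketch shows one way to tell the two apart before setting DOCKERHUB_PASSWORD; classify_registry_password is a hypothetical helper, and it only checks that the value parses as a JSON object, not that it is a valid service account key.

```python
import base64
import binascii
import json


def _is_json_object(text: str) -> bool:
    """Return True when text parses as a JSON object."""
    try:
        return isinstance(json.loads(text), dict)
    except json.JSONDecodeError:
        return False


def classify_registry_password(value: str) -> str:
    """Return 'raw-json', 'base64-json', or 'other' for a candidate password value."""
    if _is_json_object(value):
        return "raw-json"
    try:
        decoded = base64.b64decode(value, validate=True).decode("utf-8")
    except (binascii.Error, UnicodeDecodeError, ValueError):
        return "other"
    return "base64-json" if _is_json_object(decoded) else "other"


if __name__ == "__main__":
    sample = json.dumps({"type": "service_account", "project_id": "demo-project"})
    print(classify_registry_password(sample))
```

Only a value classified as raw-json matches the requirement stated above; a base64-json value should be decoded first.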

Extra image pull credentials

By default, IBS mounts global image pull credentials into Image Builder pods so that they can pull base images from the same registry during custom image builds. If custom images (models and/or apps) aren't trusted, an administrator might want to disable this mount during installation by setting the following Helm values:

build-service:
  imagePullSecrets:
    mount: false 

In such cases, or when multiple different image registries are going to be used for pulling base images, additional image pull credentials can be provided for Image Builder:

build-service:
  imageBuilder:
    # -- Image pull credentials for the image builder
    imagePullCredentials:
      # -- Plain text credentials for OCI (docker) registries
      # Each entry should have host, user, and password fields
      plain:
        - host: docker.io
          user: dockerhub_user1
          password: dockerhub_password
        - host: private-registry.example
          user: user1
          password:

      # -- Secrets of type kubernetes.io/dockerconfigjson for OCI (docker) registries
      # Each entry is an existing k8s secret name
      secret:
        - name: dockerhub-image-pull-secret-read-only

      # -- External secrets for OCI (docker) registries
      # Works only if buildService.secretManager.enabled is true
      # Each entry is an external secret reference to create a k8s secret from
      # Each entry should have secretName, remoteRefKey, and remoteRefProperty fields
      externalSecret:
        - secretName: dockerhub-image-pull-secret-external
          remoteRefKey: /ibs/image-builder-pull-credentials
          remoteRefProperty: dockerconfigjson 
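Entries under secret and externalSecret refer to credentials in the kubernetes.io/dockerconfigjson format. As a rough illustration, the sketch below builds that payload from host, user, and password values using the standard Docker config layout; build_dockerconfigjson is a hypothetical helper, and the registry host and credentials shown are placeholders.

```python
import base64
import json


def build_dockerconfigjson(credentials: list[dict[str, str]]) -> str:
    """Build a .dockerconfigjson payload from host/user/password entries."""
    auths = {}
    for entry in credentials:
        # The "auth" field is base64("user:password"), as in ~/.docker/config.json.
        token = f"{entry['user']}:{entry['password']}".encode("utf-8")
        auths[entry["host"]] = {
            "username": entry["user"],
            "password": entry["password"],
            "auth": base64.b64encode(token).decode("ascii"),
        }
    return json.dumps({"auths": auths})


if __name__ == "__main__":
    payload = build_dockerconfigjson(
        [{"host": "private-registry.example", "user": "user1", "password": "example-password"}]
    )
    print(payload)
```

The resulting string is what a secret of type kubernetes.io/dockerconfigjson stores under its .dockerconfigjson key.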

Resource requests and limits

To build very large custom images (typically 4 GB or more), Image Builder might need to be configured with increased CPU, memory, and storage requests and limits:

build-service:
  imageBuilder:
    resources:
      requests:
        cpu: "1"
        memory: "4G"
      limits:
        cpu: "2"
        memory: "4G" 

The default requests/limits are: CPU 1/1, RAM 1GB/1GB, Storage 1GB/100GB.

Environment Variables under envApp

The envApp section in the values.yaml file defines the environment variables required for different profiles. Here are the supported environment variables and their explanations:

Secrets

  • LOGS_BUCKET: Specifies the bucket name for storing logs (data by default).
  • ALLOW_SELF_SIGNED_CERTS: Controls whether TLS verification is enabled for requests to external resources like Minio, Container Registry, and Ingress (false by default).
  • DISABLE_HTTPS: Determines whether HTTP or HTTPS is used for the Private Container Registry (false by default).

Database Parameters

  • POSTGRES_USER: The username for the PostgreSQL database.
  • POSTGRES_PASSWORD: The password for the PostgreSQL database.
  • POSTGRES_HOST: The host address of the PostgreSQL database.
  • POSTGRES_PORT: The port number for the PostgreSQL database (5432 by default).
  • POSTGRES_DB: The database name for PostgreSQL.

Parameters Required by minio Profile

  • MINIO_SERVER_HOST: The endpoint URL for the Minio server (e.g., http://<minio_endpoint>:9000).
  • MINIO_SERVER_ROOT_USER: The root username for the Minio server.
  • MINIO_SERVER_ROOT_PASSWORD: The root password for the Minio server.

Parameters Required by private_registry Profile

  • DOCKERHUB_USERNAME: The username for DockerHub or other private OCI registry.
  • DOCKERHUB_PASSWORD: The password for DockerHub or other private OCI registry.

External Secrets

The externalsecret section allows defining environment variables that are retrieved from an external secret store. These values can override any keys defined in the secret section. Example:

build-service:
  buildService:
    envApp:
      externalsecret:
        POSTGRES_USER:
          name: external-postgres-secret
          key: username
        POSTGRES_PASSWORD:
          name: external-postgres-secret
          key: password 

This configuration retrieves the POSTGRES_USER and POSTGRES_PASSWORD from an external secret named external-postgres-secret.

QEMU Image Builder for a rootless mode without capabilities (security)

In order to comply with pod security policies, Image Builder pods might be required to run with the following security context applied:

build-service:
  imageBuilder:
    securityContext:
      allowPrivilegeEscalation: false
      seccompProfileType: "RuntimeDefault"
      seLinuxType: ""  # Leave empty for non-SELinux clusters; use "container_t" for SELinux-enabled environments
      capabilities:
        drop:
        - ALL
        add: [] 

Without SETUID and SETGID capabilities added, BuildKit is unable to build images (in rootless mode), so it needs to be run inside a QEMU virtual machine. To make that possible, we ship a separate Docker image (tag) of ImageBuilder, which must be used when all capabilities are dropped as shown above:

build-service:
  imageBuilder:
    tag: "11.1.25-qemu-image"
    resources:
      requests:
        cpu: "2"
        memory: "4G"
      limits:
        cpu: "2"
        memory: "8G" 

Note: Ensure the LOGS_BUCKET in values.yaml is set to your actual S3 bucket name (default is "data"):

buildService:
  envApp:
    secret:
      LOGS_BUCKET: your-s3-bucket-name 

Note that running Image Builder within QEMU slightly degrades its performance and requires more compute resources (CPU and RAM) compared to the default mode.

Image scanning

Image Build Service supports pre-push security scanning of container images. This feature is opt-in and disabled by default. To enable it, configure the following in values.yaml:

build-service:
  imageBuilder:
    imageScanner:
      enabled: true
      image: "customer-registry.com/customer-scanner:v1.0.0"  # Custom scanner image (must include curl)
      command: ["/bin/sh", "-c"]  # Required for report upload functionality
      args:
        - |
          snyk container test --file=/shared/image.tar \
            --severity-threshold=medium \
            --json-file-output=/shared/scan-report.json \
            || exit 1
      env:
        SNYK_TOKEN: "customer-token-here"  # Scanner-specific credentials
      reportUploadPath: "s3://customer-bucket/scan-reports"  # Optional: cloud storage path
      resources:  # Optional: defaults to 512Mi/256Mi memory, 500m/250m CPU
        limits:
          memory: "1Gi"
          cpu: "1000m"
        requests:
          memory: "512Mi"
          cpu: "500m" 

Requirements:

  • Custom scanner image: You must provide a custom scanner container image that includes both your scanning tool and curl. Base scanner images (e.g., aquasec/trivy:latest) cannot be used directly.
  • Scanner interface: The scanner must read from /shared/image.tar and write a JSON report to /shared/scan-report.json. Exit code 0 allows the build to continue; a non-zero exit code stops the build.
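To illustrate the scanner interface, the following is a minimal sketch of an entrypoint that honors the contract: it reads the image tar, writes a JSON report, and signals the result through its exit code. The scan_image function and the report fields are hypothetical, and the "scan" itself is a placeholder that only counts tar entries; a real scanner would inspect the image for vulnerabilities.

```python
import json
import sys
import tarfile
from pathlib import Path

# Paths taken from the scanner interface requirements.
IMAGE_TAR = Path("/shared/image.tar")
REPORT_PATH = Path("/shared/scan-report.json")


def scan_image(image_tar: Path, report_path: Path) -> int:
    """Write a JSON report for image_tar and return a process exit code."""
    if not image_tar.is_file() or not tarfile.is_tarfile(image_tar):
        report = {"status": "error", "reason": f"{image_tar} is not a readable tar archive"}
        exit_code = 2  # Non-zero: the build is stopped.
    else:
        with tarfile.open(image_tar) as archive:
            entries = archive.getnames()
        # Placeholder result: no real vulnerability analysis is performed here.
        report = {"status": "ok", "entries": len(entries), "findings": []}
        exit_code = 0  # Zero: the build is allowed to continue.
    report_path.write_text(json.dumps(report, indent=2))
    return exit_code


if __name__ == "__main__":
    sys.exit(scan_image(IMAGE_TAR, REPORT_PATH))
```

Taking the paths as parameters keeps the function testable outside the Image Builder pod, while the entrypoint still uses the fixed /shared locations.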