Prepare DataRobot Helm chart values

This page describes how to prepare and customize the DataRobot Helm chart's values_dr.yaml file to define your specific installation parameters, including cluster resources, persistent storage configurations, networking details, and application-level settings.

Minimal values files

For platform-specific configuration examples, refer to the minimal values files located within the datarobot-prime/override Helm chart artifact directory.

tar xzf datarobot-prime-X.X.X.tgz
cd datarobot-prime/override

Note

Replace X.X.X with the latest release chart version.

You can use one of the following files as a reference when creating your values_dr.yaml file:

  • override/minimal_datarobot-google_values.yaml
  • override/minimal_datarobot-google-ext-pcs_values.yaml
  • override/minimal_datarobot-google_workload_identity.yaml

Variables

The following variables are used throughout the installation process. You will use these variables in the .yaml templates and example commands found in the installation sections.

Kubernetes

DATAROBOT_NAMESPACE : The primary namespace in which DataRobot will operate.

DR_APP_HELM_RELEASE_NAME : The name of the Helm release used for the main application chart. The recommended value is dr.

Google GKE

GCP_REGION : Google region used to deploy the resources.

GCP_BUCKET_NAME : Google Storage bucket name.

GCP_PROJECT_NAME : Google Project name.

GCP_PROJECT_ID : Google Project ID.

GCP_REPO_NAME : Google Artifact Registry repository name.

GCP_SERVICE_ACCOUNT_NAME : Your Service Account name.

GCP_SERVICE_ACCOUNT_EMAIL : Your Service Account email.

GCP_BASE64_SERVICE_ACCOUNT_KEY : Your Service Key encoded as base64.

Note

To avoid a multiline issue in the Helm configuration, you must always encode the service account key as base64. To encode the key, use the following command: cat ACCOUNT_KEY_FILE.json | base64.
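
One way to produce a guaranteed single-line value is to strip the line wrapping that some base64 implementations add. The snippet below is a sketch that uses an inline JSON document in place of the real key file; in practice you would read from your ACCOUNT_KEY_FILE.json instead, as noted in the comment:

```shell
# Encode the service account key as a single base64 line. GNU base64
# wraps its output at 76 characters by default, which causes the
# multiline issue described above; tr removes the inserted newlines.
# For a real key, replace the printf with: base64 < ACCOUNT_KEY_FILE.json
GCP_BASE64_SERVICE_ACCOUNT_KEY="$(printf '%s' '{"type":"service_account"}' | base64 | tr -d '\n')"
echo "$GCP_BASE64_SERVICE_ACCOUNT_KEY"
```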

Web portal

DR_WEBPAGE_FQDN : The fully qualified domain name (FQDN) of the web portal where users will log in (e.g., datarobot-app.company-name.com).

ADMIN_USER_EMAIL : The email address for the initial administrative user in the web portal (e.g., admin@datarobot.com).

ADMIN_USER_PASSWORD : The password for the initial administrative user.

DR_LICENSE_CONTENT : The encrypted content of the DataRobot license file.

Build-Service and Google Artifact Registry

To enable the build-service to write images to the Artifact Registry, a dedicated service account key is required. Configure the build-service values section of the DataRobot Helm chart as follows:

build-service:
  buildService:
    envApp:
      secret:
        DOCKERHUB_USERNAME: _json_key_base64
        DOCKERHUB_PASSWORD: GCP_BASE64_SERVICE_ACCOUNT_KEY

The service account must have the roles/artifactregistry.writer permission in your previously-created Google Artifact Registry.
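
As a sketch of granting that role, the snippet below constructs the IAM member identity for the service account and echoes it for review; the project and service account names are hypothetical placeholders, and the actual gcloud binding command (shown commented) requires an authenticated session:

```shell
# Build the IAM member string for the build-service account.
# PROJECT_ID and SA_NAME are hypothetical; substitute your own values.
PROJECT_ID="my-project"
SA_NAME="datarobot-build-sa"
MEMBER="serviceAccount:${SA_NAME}@${PROJECT_ID}.iam.gserviceaccount.com"
echo "$MEMBER"
# With gcloud authenticated against your project, the grant itself is:
#   gcloud projects add-iam-policy-binding "$PROJECT_ID" \
#       --member="$MEMBER" --role="roles/artifactregistry.writer"
```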

Note

The service account key must always be encoded as base64 to avoid a multiline issue in the Helm configuration. To encode the key, use the following command: cat ACCOUNT_KEY_FILE.json | base64.

Custom Models Build-Service and Google Artifact Registry

To use custom models and custom tasks in the Google Artifact Registry, the following configuration options are required:

core:
  config_env_vars:
    IMAGE_BUILDER_CUSTOM_MODELS_REGISTRY_HOST: [GCP_REGION]-docker.pkg.dev
    IMAGE_BUILDER_CUSTOM_MODELS_ENVIRONMENT_REGISTRY_REPO: GCP_PROJECT_NAME/GCP_REPO_NAME/base-image
    IMAGE_BUILDER_CUSTOM_MODELS_REGISTRY_REPO: GCP_PROJECT_NAME/GCP_REPO_NAME/managed-image
    IMAGE_BUILDER_EPHEMERAL_CUSTOM_MODELS_REGISTRY_REPO: GCP_PROJECT_NAME/GCP_REPO_NAME/ephemeral-image

Note

  • Replace GCP_REGION with the Google region used to deploy the resources.
  • Replace GCP_PROJECT_NAME with your Google Cloud Project name.
  • Replace GCP_REPO_NAME with your Google Artifact Registry repository name.
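
With hypothetical substitutions (region us-east1, project my-project, repository dr-repo), the configuration above would read:

```yaml
core:
  config_env_vars:
    IMAGE_BUILDER_CUSTOM_MODELS_REGISTRY_HOST: us-east1-docker.pkg.dev
    IMAGE_BUILDER_CUSTOM_MODELS_ENVIRONMENT_REGISTRY_REPO: my-project/dr-repo/base-image
    IMAGE_BUILDER_CUSTOM_MODELS_REGISTRY_REPO: my-project/dr-repo/managed-image
    IMAGE_BUILDER_EPHEMERAL_CUSTOM_MODELS_REGISTRY_REPO: my-project/dr-repo/ephemeral-image
```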

Object storage

DataRobot primarily employs GKE Workload Identity to access object storage containers in Google Cloud, as this method provides a secure and efficient way to authenticate Kubernetes workloads without relying on hard-coded credentials. However, for legacy reasons, a service account key is also supported.

Workload identity

As part of the DataRobot installation process, you must create a dedicated service account to ensure secure access to Google Cloud resources.

The service account requires the following roles:

  • roles/storage.objectUser
  • roles/storage.insightsCollectorService

Use the following gcloud commands to configure the service account. Replace placeholders with your project details.

To store the configuration values, set the following shell variables so that the commands below run with the correct context:

export PROJECT_ID="GCP_PROJECT_ID"
export NAMESPACE="DATAROBOT_NAMESPACE"
export SA_NAME="GCP_SERVICE_ACCOUNT_NAME"

Note

  • Replace GCP_PROJECT_ID with your actual Google Cloud Project id.
  • Replace DATAROBOT_NAMESPACE with your DataRobot namespace.
  • Replace GCP_SERVICE_ACCOUNT_NAME with the desired Service Account name (for example, datarobot-storage-sa).

gcloud iam service-accounts create "$SA_NAME" --project "$PROJECT_ID"
gcloud projects add-iam-policy-binding "$PROJECT_ID" --member="serviceAccount:$SA_NAME@$PROJECT_ID.iam.gserviceaccount.com" --role="roles/storage.objectUser"
gcloud projects add-iam-policy-binding "$PROJECT_ID" --member="serviceAccount:$SA_NAME@$PROJECT_ID.iam.gserviceaccount.com" --role="roles/storage.insightsCollectorService"

SALIST=('datarobot-storage-sa' 'dynamic-worker' 'prediction-server-sa' 'internal-api-sa' 'build-service' 'tileservergl-sa' 'nbx-notebook-revisions-account' 'buzok-account' 'exec-manager-qw' 'exec-manager-wrangling' 'lrs-job-manager' 'blob-view-service')
for sa in "${SALIST[@]}"
do
    gcloud iam service-accounts add-iam-policy-binding "$SA_NAME@$PROJECT_ID.iam.gserviceaccount.com" \
    --role "roles/iam.workloadIdentityUser" \
    --member "serviceAccount:$PROJECT_ID.svc.id.goog[$NAMESPACE/$sa]"
done
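
To sanity-check what the loop above binds, the Workload Identity member identities it generates can be previewed without touching IAM. This sketch uses hypothetical project and namespace values and a shortened service account list:

```shell
# Print the workloadIdentityUser members the loop binds, one per
# Kubernetes service account. No gcloud calls are made here.
PROJECT_ID="my-project"
NAMESPACE="datarobot"
SALIST=('datarobot-storage-sa' 'dynamic-worker' 'build-service')
for sa in "${SALIST[@]}"; do
    echo "serviceAccount:${PROJECT_ID}.svc.id.goog[${NAMESPACE}/${sa}]"
done
```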

Once completed, make sure the values_dr.yaml section follows the minimal_datarobot-google_workload_identity.yaml example file.
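
The chart-specific keys live in that example file; the underlying GKE mechanism is a Kubernetes service account annotated with the email of the Google service account it impersonates. As a generic sketch (names and namespace are hypothetical), the resulting Kubernetes object looks like:

```yaml
apiVersion: v1
kind: ServiceAccount
metadata:
  name: datarobot-storage-sa
  namespace: datarobot            # your DATAROBOT_NAMESPACE
  annotations:
    iam.gke.io/gcp-service-account: datarobot-storage-sa@my-project.iam.gserviceaccount.com
```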

Note

For additional information, see the GKE Workload Identity documentation.

Using a service account key

As part of the DataRobot installation process, a dedicated service account and key can be created. To give the service account access to your bucket, grant the following roles as described in the official Google documentation:

  • roles/storage.objectUser
  • roles/storage.insightsCollectorService
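
A sketch of exporting the key follows; the account email and key file name are hypothetical placeholders, and the gcloud command is echoed for review rather than executed, since it requires an authenticated session:

```shell
# Dry-run preview: print the command that exports a JSON key for the
# storage service account. Remove the leading echo to actually run it.
SA_EMAIL="datarobot-storage-sa@my-project.iam.gserviceaccount.com"
KEYFILE="account_key.json"
echo gcloud iam service-accounts keys create "$KEYFILE" --iam-account "$SA_EMAIL"
# Once the key file exists, encode it on a single line for the values file:
#   GCP_BASE64_SERVICE_ACCOUNT_KEY="$(base64 < "$KEYFILE" | tr -d '\n')"
```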

Once the information is available, the values_dr.yaml section looks like this:

global:
  filestore:
    type: google
    environment:
      GOOGLE_STORAGE_CREDENTIALS_SOURCE: content
      GOOGLE_STORAGE_BUCKET: GCP_BUCKET_NAME
      GOOGLE_STORAGE_KEYFILE_CONTENTS: GCP_BASE64_SERVICE_ACCOUNT_KEY

Note

For additional information, see the minimal_datarobot-google_values.yaml example file.

Feature flags

To control the availability of various functionalities and newly-released features, the DataRobot platform utilizes feature flags as an administrative mechanism.

Feature enablement requires assistance. Contact your DataRobot Representative or submit a request through the DataRobot Support Portal or by emailing support@datarobot.com.

core:
  config_env_vars:
    # Feature flag environment variables provided by DataRobot Support are set here.

TLS configuration

DataRobot provides specific configuration options to encrypt data-in-flight for different parts of the platform. See:

Configure StorageClass

DataRobot is deployed using the default StorageClass configured for your cluster. To specify a non-default StorageClass name globally for all persistent volumes requested by the main DataRobot platform chart, adjust the following setting in the chart's values_dr.yaml file:

global:
  storageClassName: DESIGNATED_STORAGE_CLASS_NAME

Note

Replace DESIGNATED_STORAGE_CLASS_NAME with the name of your StorageClass.

Persistent Critical Services (PCS)

This guide helps you customize the relevant PCS component for:

  • MongoDB
  • PostgreSQL
  • RabbitMQ
  • Redis
  • Elasticsearch

Review notebook configuration

DataRobot has updated the naming of notebook chart values; this change may affect your upgrade process. For detailed instructions, see the Notebooks upgrade guide.

Generative AI service

When you install the generative AI (GenAI) service in a restricted network environment, two migrations need to be disabled and performed manually. For instructions, see the following pages:

Tile server

For information on updating the tile server in a restricted network, see the following page:

Custom tasks

For information on configuring custom tasks in a restricted network, see the following pages: