Prepare DataRobot Helm chart values

This page describes how to prepare and customize the DataRobot Helm chart's values_dr.yaml file to define your specific installation parameters, including cluster resources, persistent storage configurations, networking details, and application-level settings.

Minimal values files

For platform-specific configuration examples, refer to the minimal values files located within the datarobot-prime/override Helm chart artifact directory.

tar xzf datarobot-prime-X.X.X.tgz
cd datarobot-prime/override 

Note

Replace X.X.X with the latest release chart version.

You can use one of the following files as a reference when creating your values_dr.yaml file:

  • override/minimal_datarobot-aws_values.yaml
  • override/minimal_datarobot-aws-ext-pcs_values.yaml

Variables

The following variables are used throughout the installation process. You will use these variables in the YAML templates and example commands found in the installation sections.

Kubernetes

DATAROBOT_NAMESPACE : The primary namespace in which DataRobot will operate.

DR_APP_HELM_RELEASE_NAME : The name of the Helm release used for the main application chart. The recommended value is dr.

AWS ECR

AWS_ECR_URL : A private Amazon Elastic Container Registry (ECR) URL used both for retrieving images and as storage for the Image Build Service.

AWS EKS

DATAROBOT_SERVICE_ACCOUNT : The global service account name that will be used by the majority of the deployments to communicate with AWS.

AWS_IRSA_ROLE_NAME : The IAM role set in the EKS-specific annotation that maps an IAM role to service accounts for access to AWS resources.

AWS_REGION : The AWS region where the cluster is deployed.

AWS_S3_REGION : The AWS region of the DATAROBOT_S3_BUCKET.

Object storage

DATAROBOT_S3_BUCKET : An S3-compatible bucket used by DataRobot for object storage.

AWS_S3_HOST : The IP address or hostname of the S3 appliance (e.g., s3.us-east-1.amazonaws.com). You can additionally set the AWS_S3_REGION variable to explicitly specify the region you run in, or if you are using a storage provider that offers an S3-compatible API.

S3_IS_SECURE : Whether the service is using HTTPS. The True value, which is the default, has only been tested in AWS S3.

FILE_STORAGE_PREFIX : Represents the prefix applied to all paths in the file storage medium after the root path.

global:
  filestore:
    type: s3
    environment:
      S3_HOST: AWS_S3_HOST
      S3_BUCKET: DATAROBOT_S3_BUCKET
      S3_IS_SECURE: "True"
      S3_VALIDATE_CERTS: "True"
      S3_REGION: AWS_S3_REGION
      S3_PORT: "443"
      S3_SERVER_SIDE_ENCRYPTION: DISABLED

Note

If you encounter upload issues, you can disable multipart file uploads by specifying MULTI_PART_S3_UPLOAD: false. Multipart uploads are well tested and support much larger files, so you will likely not need to change the default.
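A minimal sketch of that override follows. It assumes MULTI_PART_S3_UPLOAD sits alongside the other S3 settings under global.filestore.environment; that placement is an assumption, so confirm it against the minimal values file for your platform.

```yaml
global:
  filestore:
    type: s3
    environment:
      # Assumed placement: next to the other S3 settings in this block.
      # Disables multipart uploads; omit this key to keep the default behavior.
      MULTI_PART_S3_UPLOAD: false
```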

S3 ingestion

To enable data ingestion from private objects stored in S3, see the AWS S3 Ingest guide.

Disabling TLS verification

If your environment is configured to use self-signed or unverified TLS certificates for its internal object storage connection, the following configuration options are necessary:

global:
  filestore:
    type: s3
    environment:
      S3_VALIDATE_CERTS: false 

Server-side encryption settings

You can configure DataRobot to enable server-side encryption (SSE) for data at rest when it stores new files to S3 (this does not affect existing files). You can use either an S3-managed key or a customer-managed key (CMK) for encryption.

The following configuration settings are available to configure server-side encryption:

S3_SERVER_SIDE_ENCRYPTION: With the default value AES256, data is encrypted using S3-managed keys. Alternatively, set to aws:kms to use server-side encryption with KMS-managed keys. Set to DISABLED to completely disable server-side encryption.

AWS_S3_SSE_KMS_KEY_ID: Encrypts data using a particular KMS key. Set to the identity of a specific customer-managed key, or leave blank to let AWS create a key on your behalf (see AWS managed CMK). This setting only applies when S3_SERVER_SIDE_ENCRYPTION is set to aws:kms.
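As an illustration, the two settings could be combined as follows to encrypt new objects with a customer-managed KMS key. This is a sketch that assumes the settings belong in the same global.filestore.environment block used for the other S3 options on this page; YOUR_KMS_KEY_ID is a hypothetical placeholder, not a real key.

```yaml
global:
  filestore:
    type: s3
    environment:
      # Use KMS-managed keys instead of the default AES256 (S3-managed keys).
      S3_SERVER_SIDE_ENCRYPTION: "aws:kms"
      # Hypothetical placeholder: replace with the identity of your customer-managed key.
      # Only applies when S3_SERVER_SIDE_ENCRYPTION is set to aws:kms.
      AWS_S3_SSE_KMS_KEY_ID: "YOUR_KMS_KEY_ID"
```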

Note

  • Server-side encryption means the encryption keys are independently obtained by the S3 service and are hidden from the DataRobot application. If the keys are deleted or access is lost, DataRobot cannot help decrypt the data.
  • S3 will make a billable call to the AWS KMS service every time DataRobot makes a read or write request against an encrypted object. Refer to the AWS documentation on reducing the costs of AWS KMS resource usage with SSE.

Web portal

DR_WEBPAGE_FQDN : The fully qualified domain name (FQDN) of the web portal where users will log in (e.g., datarobot-app.company-name.com).

ADMIN_USER_EMAIL : The email address for the initial administrative user in the web portal (e.g., admin@datarobot.com).

ADMIN_USER_PASSWORD : The password for the initial administrative user.

DR_LICENSE_CONTENT : The encrypted content of the DataRobot license file.

Feature flags

The DataRobot platform uses feature flags as an administrative mechanism to control the availability of various functionalities and newly released features.

Feature enablement requires assistance. Contact your DataRobot Representative or submit a request through the DataRobot Support Portal or by emailing support@datarobot.com.

core:
  config_env_vars: 

IRSA role configuration

Once the AWS IRSA role is configured, you can use it to deploy in AWS with the following YAML override configuration:

global:
  serviceAccount:
    name: DATAROBOT_SERVICE_ACCOUNT
    annotations:
      eks.amazonaws.com/role-arn: AWS_IRSA_ROLE_NAME 

Note

  • Replace DATAROBOT_SERVICE_ACCOUNT with your AWS service account name.
  • Replace AWS_IRSA_ROLE_NAME with your AWS IRSA role name.
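On EKS, the eks.amazonaws.com/role-arn annotation takes the full ARN of the IAM role. As a sketch with hypothetical values (the service account name, the account ID 111122223333, and the role name are placeholders and do not come from this guide), a filled-in override might look like this:

```yaml
global:
  serviceAccount:
    # Hypothetical service account name.
    name: datarobot-sa
    annotations:
      # Full IAM role ARN, in the form arn:aws:iam::<account-id>:role/<role-name>.
      eks.amazonaws.com/role-arn: "arn:aws:iam::111122223333:role/datarobot-irsa-role"
```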

TLS configuration

DataRobot provides specific configuration options to encrypt data-in-flight for different parts of the platform. See:

Configure StorageClass

DataRobot is deployed using the default StorageClass configured for your cluster. To specify a non-default StorageClass name globally for all persistent volumes requested by the main DataRobot platform chart, adjust the following setting in the chart's values_dr.yaml file:

global:
  storageClassName: DESIGNATED_STORAGE_CLASS_NAME 

Note

Replace DESIGNATED_STORAGE_CLASS_NAME with the name of your StorageClass.

Persistent Critical Services (PCS)

This guide helps you customize the relevant PCS component for:

  • MongoDB
  • PostgreSQL
  • RabbitMQ
  • Redis
  • Elasticsearch

Review notebook configuration

DataRobot updated the naming for notebook chart values; this change may affect your process. For detailed instructions, see the Notebooks upgrade guide.

Generative AI service

When you install the Generative AI (GenAI) service in a restricted network environment, two migrations need to be disabled and performed manually. For instructions, see the following pages:

Tile server

For information on updating the tile server in a restricted network, see the following page:

Custom tasks

For information on configuring custom tasks in a restricted network, see the following pages: