Custom tasks configuration

Required installation values:

Data transmission image pull secrets

The data transmission image is packaged with the application and should already be available in your local Docker repository. To use it, configure the following settings.

  • CUSTOM_TASK_DATA_TRANSMISSION_IMAGE_NAME
  • description: The docker location for the custom-task-data-transmission image which handles upload and download during a CustomTask's fit action.
  • default: docker.io/datarobot/custom-task-data-transmission:11.0.0-12d6cc08996ebcafc0f107e78d8fa4b270a40501-71
  • CUSTOM_TASK_DATA_TRANSMISSION_IMAGE_PULL_SECRET
  • description: The name of an existing V1Secret holding the image pull credentials that CustomTaskExecutions use to pull the static data transmission image from the repository that hosts it.
  • default: "datarobot-image-pullsecret"

These values are required and should already be set to correct defaults. If they are correct, the test pod described under "Testing connectivity" should be able to pull the image.
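If the pull secret does not exist yet, it can be created as a standard Kubernetes image pull secret. The manifest below is a minimal sketch, not the manifest shipped with the product: the namespace, registry host, and credentials are placeholders, and the secret name simply mirrors the one used in the test pod under "Testing connectivity".

```yaml
apiVersion: v1
kind: Secret
metadata:
  name: datarobot-image-pullsecret      # must match CUSTOM_TASK_DATA_TRANSMISSION_IMAGE_PULL_SECRET
  namespace: awesome-datarobot          # placeholder: use the namespace where custom task pods run
type: kubernetes.io/dockerconfigjson
stringData:
  # Placeholder registry and credentials; replace with your own.
  .dockerconfigjson: |
    {"auths": {"registry.example.com": {"username": "REPLACE_ME", "password": "REPLACE_ME"}}}
```

Using stringData lets Kubernetes handle the base64 encoding; the secret must live in the same namespace as the pods that reference it.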

Important installation values

These are the values you are most likely to set at installation. The shared namespace defaults to the Helm chart's namespace. In addition, you almost certainly need to set up network policies for custom tasks to run successfully. See "Setting up network policies" for details.

  • CUSTOM_TASK_EXECUTION_SHARED_IMAGE_PULL_SECRET
  • description: The name of an existing V1Secret holding the image pull credentials that CustomTaskExecutions use to configure V1Pods. It is used for execution images hosted on Image Build Service. If the value is empty, a secret name and V1Secret are created per execution at runtime.
  • default: ''
  • CUSTOM_TASK_EXECUTION_SHARED_NAMESPACE
  • description: The namespace to which all CustomTask fit jobs are assigned. If the value is empty, the default behavior of one namespace per fit job applies.
  • default: "{{ .Release.Namespace }}"
  • CUSTOM_TASK_EXECUTION_NODE_SELECTOR_KEY
  • description: Run CustomTask fit on nodes with the label {this_value: CUSTOM_TASK_EXECUTION_NODE_SELECTOR_VALUE}. If either CUSTOM_TASK_EXECUTION_NODE_SELECTOR_KEY or CUSTOM_TASK_EXECUTION_NODE_SELECTOR_VALUE is empty, no node selector is added. If no node matches the selector, jobs hang indefinitely because their pods can never be scheduled.
  • default: ''
  • CUSTOM_TASK_EXECUTION_NODE_SELECTOR_VALUE
  • description: Run CustomTask fit on nodes with the label {CUSTOM_TASK_EXECUTION_NODE_SELECTOR_KEY: this_value}. If either CUSTOM_TASK_EXECUTION_NODE_SELECTOR_KEY or CUSTOM_TASK_EXECUTION_NODE_SELECTOR_VALUE is empty, no node selector is added. If no node matches the selector, jobs hang indefinitely because their pods can never be scheduled.
  • default: ''
  • CUSTOM_TASK_EXECUTION_NODE_TOLERATION
  • description: Run CustomTask fit on nodes with specific taints. If the value is empty, no node toleration is added. Only one toleration is allowed.
  • default: ''
  • CUSTOM_TASK_EXECUTION_SERVICE_ACCOUNT_NAME
  • description: When set, the name of the service account used to run pods during a custom task execution.
  • default: ''
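For orientation, the scheduling settings above correspond to standard Kubernetes pod fields. The fragment below is only a sketch of the pod-spec fields they would plausibly map to, not the manifest DataRobot actually generates. The label key and value, the taint, and the service account name are all hypothetical, and the string format expected by CUSTOM_TASK_EXECUTION_NODE_TOLERATION is not described in this section, so the toleration is shown in plain Kubernetes form.

```yaml
# Sketch of a pod spec fragment (hypothetical values throughout).
spec:
  serviceAccountName: custom-task-runner   # CUSTOM_TASK_EXECUTION_SERVICE_ACCOUNT_NAME
  nodeSelector:
    workload: custom-task-fit              # NODE_SELECTOR_KEY: NODE_SELECTOR_VALUE
  tolerations:                             # a single toleration, as the setting allows only one
  - key: dedicated
    operator: Equal
    value: custom-task
    effect: NoSchedule
```

At least one schedulable node must carry the selected label; otherwise the fit pods stay unscheduled, which is the hang described above.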

All other settable values

Below are all the other values that can be set to control custom task settings. Most have sensible defaults and rarely need to be changed.

| Name | Description | Default |
| --- | --- | --- |
| CUSTOM_TASKS_MAX_NUMBER_OF_TASKS_IN_BP | The maximum number of CustomTasks that can be in a single UserBlueprint. | 3 |
| CUSTOM_TASK_EPHEMERAL_PREDICT_REPLICAS | The number of replicas for each LRS during scoring for Custom(Training)Tasks. | 1 |
| CUSTOM_TASK_EXECUTION_IMAGE_BUILD_TIMEOUT | (no description) | 1200 |
| CUSTOM_TASK_EXECUTION_MAX_ARTIFACT_SIZE | The maximum size for artifacts created by a user's CustomTask fit code. | 10737418240 |
| CUSTOM_TASK_EXECUTION_POD_FALLBACK_MEMORY_BYTES | The fallback used when no resource bundle is present. Important for local development. | 4294967296 |
| CUSTOM_TASK_EXECUTION_POD_FALLBACK_MILLI_CPUS | The fallback used when no resource bundle is present. Important for local development. | 1000 |
| CUSTOM_TASK_EXECUTION_RESOURCE_BUNDLE_ID | The resource bundle for determining custom task fit limits/requests. | None |
| CUSTOM_TASK_EXECUTION_STORAGE_SIZE | The maximum allowed disk space usage for all artifacts generated by a user's CustomTask fit code. This does not include storage of the training data, which is handled separately by DataSet code. | 16106127360 |
| CUSTOM_TASK_EXECUTION_SUPPORT_CONTAINER_MEMORY_BYTES | The memory requirements for containers that transfer user data to/from DataRobot. Whatever is left in the resource bundle is assigned to the fit container. | 104857600 |
| CUSTOM_TASK_EXECUTION_SUPPORT_CONTAINER_MILLI_CPUS | The CPU requirements for containers that transfer user data to/from DataRobot. Whatever is left in the resource bundle is assigned to the fit container. | 100 |
| CUSTOM_TASK_EXECUTION_TIMEOUT | The maximum time, in seconds, for the container running a user's CustomTask fit code. | 7200 |
| CUSTOM_TASK_FIT_DYNAMIC_MEMORY_LIMIT_BUFFER | If memory usage is measured during fit to optimize an LRS, the amount of extra memory added beyond the measured value when creating the LRS. | 1073741824 |
| CUSTOM_TASK_FIT_START_TIMEOUT | The maximum time, in seconds, to wait for a CustomTask fit job to start. This includes creating the request, updating mongo, building all images (including possibly creating new ExecutionEnvironmentVersions), and starting the K8s V1Job. | 10800 |
| CUSTOM_TASK_LRS_MAX_RATE_LIMIT_EXCEEDED_WAIT | The maximum time, in seconds, to wait for an LRS to become available for the user. Each user has a maximum number of LRSes that can be up at any one time. During training, these LRSes are spun up, used once, and then deleted. If the wait to create a new LRS would be longer than this value, the job fails. | 10800 |
| CUSTOM_TASK_PREDICT_MEM_LIMIT | The memory limit for an LRS for a Custom(Training)Task, as opposed to a Custom(Inference)Model. | 4294967296 |
| CUSTOM_TASK_PREDICT_REPLICAS | The number of replicas for each LRS in a deployment containing Custom(Training)Tasks. | 3 |
| CUSTOM_TASK_VERSION_MAX_FILES | The maximum number of files that can be uploaded to a custom task version. The minimum of 10 reflects support for metadata files, including input/output schema and hyperparameters, so it should not be 1. | 100 |
| ENABLE_CUSTOM_TASK_HYPERPARAMETERS | (no description) | False |
| EPHEMERAL_LRS_CPU_REQUEST | The amount of CPU resource for an LRS that is spun up when training a CustomTask, or for any other ephemeral LRS. Replaces: CUSTOM_MODEL_EPHEMERAL_PREDICT_CPU_REQUEST | 0.5 |
| EPHEMERAL_LRS_MAXIMUM_LIFETIME | The maximum lifetime, in seconds, for an ephemeral LRS. When an LRS is created for any kind of scoring during any kind of fit job, this is the amount of time until that LRS is automatically cleaned up. | 86400 |
| EPHEMERAL_LRS_MEM_REQUEST | The amount of memory resource for an LRS that is spun up when training a CustomTask, or for any other ephemeral LRS. Replaces: CUSTOM_TASK_EPHEMERAL_MEM_REQUEST | 134217728 |
| IMAGE_BUILDER_CUSTOM_TASK_EXECUTION_REGISTRY_REPO | The repo where CustomTask docker images are stored. If unset, defaults to IMAGE_BUILDER_CUSTOM_MODELS_REGISTRY_REPO. | 'managed-image' |

Setting up network policies

A default network policy that lets custom tasks interact with DataRobot is configured automatically. In some situations, additional policies might be desired, such as a deny-all policy. Examples are provided below.

All custom tasks have the label task-type=custom-task-fit and all tests and network policies need this label.

Testing connectivity

You may not need to configure any network policies at all. To check, create the test pod below, shell into it, and run curl http://datarobot-nginx:80/config, replacing the URL with your load balancer (LB) URL where applicable.

apiVersion: v1
kind: Pod
metadata:
  labels:
    datarobot-version: 10.2.0
    task-type: custom-task-fit  # MUST BE SET LIKE THIS!
  name: connectivity-test
  namespace: dr-app-charts-r10p2-air-gap-daily  # SET TO CORRECT NAMESPACE
spec:
  containers:
  - command:
    - sh
    - -c
    - sleep 500
    image: docker.io/datarobot/custom-task-data-transmission:11.0.0-12d6cc08996ebcafc0f107e78d8fa4b270a40501-71  # set to the value of CUSTOM_TASK_DATA_TRANSMISSION_IMAGE_NAME
    imagePullPolicy: IfNotPresent
    name: generic-container
    securityContext:
      allowPrivilegeEscalation: false
      privileged: false
      runAsGroup: 2501
      runAsNonRoot: true
      runAsUser: 2500
    volumeMounts:
    - mountPath: /mnt
      name: data-store
  dnsPolicy: ClusterFirst
  enableServiceLinks: true
  imagePullSecrets:
  - name: datarobot-image-pullsecret
  - name: datarobot-image-pull-secret
  securityContext:
    fsGroup: 4000
  serviceAccountName: default
  volumes:
  - emptyDir: {}
    name: data-store

Example network policies

All custom tasks have the label task-type=custom-task-fit. You may want to set up a default deny-all policy or grant explicit access to public IPs. Always use a pod selector that includes task-type=custom-task-fit.

  • A deny all policy that blocks all traffic to and from the custom task:
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
   name: custom-task-deny-all
   namespace: awesome-datarobot
spec:
   podSelector:
      matchLabels:
         task-type: custom-task-fit
   policyTypes:
      - Ingress
      - Egress
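  • A public policy that allows traffic to the internet, but not to private IP ranges, for custom task pods that also carry the label egress-network-access: public: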
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: custom-task-public-access
  namespace: awesome-datarobot
spec:
  egress:
  - to:
    - ipBlock:
        cidr: 0.0.0.0/0
        except:
        - 10.0.0.0/8
        - 172.16.0.0/12
        - 192.168.0.0/16
  podSelector:
    matchLabels:
      egress-network-access: public
      task-type: custom-task-fit
  policyTypes:
  - Egress
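
For explicit access to specific public IPs rather than the whole internet, a narrower egress rule can be used. The policy below is a sketch under assumptions: the policy name is hypothetical, and 203.0.113.0/24 is a documentation-only address range standing in for the real destination.

```yaml
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: custom-task-allow-example-cidr   # hypothetical name
  namespace: awesome-datarobot
spec:
  podSelector:
    matchLabels:
      task-type: custom-task-fit
  policyTypes:
  - Egress
  egress:
  - to:
    - ipBlock:
        cidr: 203.0.113.0/24             # placeholder range; replace with the real destination
    ports:
    - protocol: TCP
      port: 443                          # HTTPS only
```

Network policies are additive, so this rule can coexist with the deny-all policy: the pod may reach whatever any matching policy allows. If the destination is addressed by hostname, DNS egress may also need to be allowed.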

Restricted network installation guide

Setting up in a restricted network requires: