Global Models and Agentic Tools

Global models serve as guard models for RAG and agentic workflows. Tools are intended specifically for use in agentic workflows.

List of global models used as moderation guardrails we support

The following registered global models are supported as moderation guardrails:

  1. [Hugging Face] Toxicity Classifier
  2. [Microsoft] Presidio PII Detection
  3. [Hugging Face] Prompt Injection Classifier
  4. [Hugging Face] Emotions Classifier
  5. [Hugging Face] Zero-shot Classifier
  6. [DataRobot] LLM Refusal Score
  7. [Hugging Face] Sentiment Classifier
  8. [DataRobot] Dummy Binary Classification

The following registered global models are supported as agentic tools:

  1. [Tool] Make Time Series Predictions
  2. [Tool] Make AutoML Predictions
  3. [Tool] Summarize DataFrame
  4. [Tool] Search Data Registry
  5. [Tool] Render Vega-Lite Chart
  6. [Tool] Render Plotly Chart
  7. [Tool] Make Text Generation Predictions
  8. [Tool] Get Data Registry Dataset

Requirements

  1. On-prem docker registry
  2. Ability to pull the docker image from public DataRobot dockerhub repository
  3. Feature Flags:
    • ADMIN_API_ACCESS
    • ENABLE_CUSTOM_MODEL_PREDICT_RESPONSE_EXTRA_MODEL_OUTPUT
    • ENABLE_MLOPS_RESOURCE_REQUEST_BUNDLES
    • ENABLE_CUSTOM_MODEL_GPU_INFERENCE (If any GPU models are installed)
  4. The following GA_PREMIUM flags need to be enabled via EngConfig:
    • ENABLE_MMM_GLOBAL_MODELS_IN_MODEL_REGISTRY
    • ENABLE_MLOPS_TEXT_GENERATION_TARGET_TYPE
    • ENABLE_GENAI_EXPERIMENTATION
    • ENABLE_CUSTOM_INFERENCE_MODEL
    • ENABLE_CUSTOM_MODEL_GITHUB_CI_CD
    • ENABLE_PUBLIC_NETWORK_ACCESS_FOR_ALL_CUSTOM_MODELS

Global Models Environment Installation

The global models execution environment image is part of the on-prem installation tarball. It is built, packaged, and distributed the same way as the other public drop-in environments (through a single Helm chart for the execution environments). Global models and agentic tools share the same execution environment.

If the global models environment is missing from your setup, install it manually as described under "Install global models environment manually".

Global Models Installation

Starting with DataRobot 11.2, global models and agentic tools can be installed automatically using the global-envs-models Helm sub-chart. In DataRobot 11.1, the image that installs global models and tools is not yet part of the on-prem installation tarball, so global models must still be installed manually.

Automatic Installation

To enable automatic installation of global models and agentic tools, edit the values.yaml file and enable the global-envs-models sub-chart:

global-envs-models:
  enabled: true

To apply the change to the cluster, run the helm upgrade command to upgrade the release.
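
The upgrade could look roughly like the sketch below. The release name `dr`, the namespace `dr-app`, and the chart path `./datarobot-prime` are placeholders chosen for illustration, not values taken from this guide. Substitute the ones used for your installation. By default the function only prints the command so it can be reviewed first.

```shell
# Hypothetical placeholders: replace with the values used for your installation.
RELEASE_NAME="${RELEASE_NAME:-dr}"
NAMESPACE="${NAMESPACE:-dr-app}"
CHART_PATH="${CHART_PATH:-./datarobot-prime}"
VALUES_FILE="${VALUES_FILE:-values.yaml}"

helm_upgrade() {
  # With APPLY=1 the upgrade runs; otherwise the command is only printed.
  if [ "${APPLY:-0}" = "1" ]; then
    helm upgrade "$RELEASE_NAME" "$CHART_PATH" \
      --namespace "$NAMESPACE" \
      --values "$VALUES_FILE"
  else
    echo "helm upgrade $RELEASE_NAME $CHART_PATH --namespace $NAMESPACE --values $VALUES_FILE"
  fi
}

helm_upgrade
```

Run the script with `APPLY=1` once the printed command matches your environment.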

Note: Ensure that the global models environment "[DataRobot] Python 3.11 Global Models and Tools" is already present on the setup before enabling this chart.

After the installation completes, verify that the global models and tools are registered in the DataRobot Model Registry.

⚠️ Important: Manual installation is only required if automatic installation has failed for some reason. If the automatic installation is successful, the following steps do not need to be executed.

Air-Gapped Installations: The automatic installation method works for air-gapped environments as well. All required images are included in the DataRobot installation tarball. Enable the global-envs-models sub-chart as described above. Manual installation should only be used as a fallback if the automatic installation fails or is not available.

Extracting Versions from the Helm Chart

The DataRobot Helm chart is the source of truth for all version information. Before performing manual installation, extract the required versions from the chart.

  1. If you don't have the chart already, download and unpack it:

    # Set the DataRobot version (e.g., 11.2.0)
    export DR_VERSION=<version>
    
    # Download the chart
    helm pull oci://registry-1.docker.io/datarobot/datarobot-prime --version $DR_VERSION
    
    # Unpack the chart
    tar -xzf datarobot-prime-$DR_VERSION.tgz
    cd datarobot-prime
    

  2. Extract the global-envs-models chart version (used as the image tag):

    export GLOBAL_MODELS_VERSION=$(helm dependency list . | awk '$1=="global-envs-models" {print $2}')
    echo "Global models version: $GLOBAL_MODELS_VERSION"
    

  3. Extract the environment IDs from the execution-environments chart:

    export GLOBAL_MODELS_ENVIRONMENT_ID=$(grep -A10 "env-global-models:" charts/execution-environments/values.yaml | grep "envid:" | awk '{print $2}' | tr -d '"')
    export GLOBAL_MODELS_ENVIRONMENT_VERSION_ID=$(grep -A10 "env-global-models:" charts/execution-environments/values.yaml | grep "envverid:" | awk '{print $2}' | tr -d '"')
    echo "Environment ID: $GLOBAL_MODELS_ENVIRONMENT_ID"
    echo "Environment Version ID: $GLOBAL_MODELS_ENVIRONMENT_VERSION_ID"
    

    Alternatively, using yq (if available):

    export GLOBAL_MODELS_ENVIRONMENT_VERSION_ID=$(yq -r '.environments[] | select(.imagename == "env-global-models") | .envverid' < charts/execution-environments/values.yaml)
    

Note: The extracted GLOBAL_MODELS_VERSION is used as the Docker image tag for both global-envs-models and env-global-models images.
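
Because the grep and awk pipelines above produce an empty string when a key is not found, it may be worth confirming that all three variables are set before continuing. The helper below is a minimal sketch of such a check. The function name `require_vars` is made up for this example.

```shell
# Return non-zero if any of the named variables is unset or empty.
require_vars() {
  missing=0
  for name in "$@"; do
    # Look up the value of the variable whose name is stored in $name.
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
      echo "missing: $name" >&2
      missing=1
    fi
  done
  return "$missing"
}

if require_vars GLOBAL_MODELS_VERSION GLOBAL_MODELS_ENVIRONMENT_ID GLOBAL_MODELS_ENVIRONMENT_VERSION_ID; then
  echo "All versions extracted"
else
  echo "Re-run the extraction steps from the chart directory" >&2
fi
```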

Manual Installation

If the automatic installation via helm chart is not available or if you need to install global models manually, follow the steps below.

Steps:

  1. Ensure that the global models environment "[DataRobot] Python 3.11 Global Models and Tools" is already present on the setup.
  2. Extract the versions from the Helm chart to get GLOBAL_MODELS_VERSION.
  3. Pull the global models installation image from DataRobot's public dockerhub repository. The image size is approximately 5-6 GB:
    docker pull datarobot/global-envs-models:${GLOBAL_MODELS_VERSION}-image
    
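    For an air-gapped installation, pull the image through a bastion host.
  4. With DATAROBOT_ENDPOINT and DATAROBOT_API_TOKEN set for your DataRobot instance, install the global models with the following command: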
    docker run -it datarobot/global-envs-models:${GLOBAL_MODELS_VERSION}-image --webserver ${DATAROBOT_ENDPOINT} --api-token ${DATAROBOT_API_TOKEN}
    

Copying images via Bastion Host

For an air-gapped installation, it is recommended to use a bastion host to handle the images. Typical steps are as follows:

  1. Make sure the bastion host has access to the DataRobot public dockerhub repository.
  2. Pull the relevant image from dockerhub on the bastion host.
  3. Save the image on the bastion host.
  4. Move the image to the internal air-gapped environment.
  5. Load the image into the local registry in the air-gapped environment.
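
The steps above can be sketched with standard Docker commands. The function below only prints a command plan for one image. It does not run anything. The registry host `registry.internal.example:5000` and the function name `bastion_transfer_plan` are hypothetical, and how the archive is moved into the air-gapped network depends on your site.

```shell
# Print the commands for moving one image through a bastion host.
# $1: image reference on dockerhub, $2: internal registry host (hypothetical).
bastion_transfer_plan() {
  image="$1"
  registry="$2"
  # Derive an archive name from the image reference, e.g. a/b:1 -> a_b_1.tar
  archive="$(printf '%s' "$image" | tr '/:' '__').tar"
  echo "# On the bastion host (has dockerhub access):"
  echo "docker pull $image"
  echo "docker save -o $archive $image"
  echo "# Move $archive into the air-gapped network, then on an internal host:"
  echo "docker load -i $archive"
  echo "docker tag $image $registry/$image"
  echo "docker push $registry/$image"
}

bastion_transfer_plan "datarobot/env-global-models:${GLOBAL_MODELS_VERSION:-VERSION}" "registry.internal.example:5000"
```

Printing the plan first makes it easier to review image names and tags before anything is copied across the network boundary.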

Install global models environment manually

  1. Extract the versions from the Helm chart to get GLOBAL_MODELS_VERSION, GLOBAL_MODELS_ENVIRONMENT_ID, and GLOBAL_MODELS_ENVIRONMENT_VERSION_ID.

  2. Pull the environments installer image from DataRobot's public dockerhub repository. For an air-gapped installation, use a bastion host:

    docker pull datarobot/environmentscli:latest
    

  3. Pull the global models environment image using the extracted version. For an air-gapped installation, use a bastion host:

    docker pull datarobot/env-global-models:${GLOBAL_MODELS_VERSION}
    

  4. Save the global models environment image locally on disk. The image is fairly small; the resulting tarball should be around 1-2 GB:

    cd /tmp
    mkdir -p provisioning
    cd provisioning
    docker save datarobot/env-global-models:${GLOBAL_MODELS_VERSION} | gzip > global-models-environment-image-${GLOBAL_MODELS_VERSION}.tar.gz
    

  5. Export the environment variables. Make sure that the DATAROBOT_ENDPOINT URL includes /api/v2 at the end:

    export DATAROBOT_ENDPOINT="<datarobot-base-url-with-api-v2-at-end>"
    export DATAROBOT_API_TOKEN="<datarobot-api-token>"
    

  6. If the environment does not exist already, create one with the given id:

    docker run -it \
      -e DATAROBOT_API_TOKEN="${DATAROBOT_API_TOKEN}" \
      -e DATAROBOT_API="${DATAROBOT_ENDPOINT}" \
      datarobot/environmentscli:latest create \
      -i ${GLOBAL_MODELS_ENVIRONMENT_ID} \
      -n "[DataRobot] Python 3.11 GenAI" \
      -d "Environment for global models" \
      -u "customModel" \
      -l "python" \
      --public True
    

  7. Run the environmentscli utility to install the global models environment on the required DataRobot setup:

    docker run -it \
      -v /tmp/provisioning:/work \
      -e DATAROBOT_API_TOKEN="${DATAROBOT_API_TOKEN}" \
      -e DATAROBOT_API="${DATAROBOT_ENDPOINT}" \
      datarobot/environmentscli publish \
      -i ${GLOBAL_MODELS_ENVIRONMENT_ID} \
      -v ${GLOBAL_MODELS_ENVIRONMENT_VERSION_ID} \
      -d "Global Models environment created using environmentscli" \
      --wait 1200 \
      --image /work/global-models-environment-image-${GLOBAL_MODELS_VERSION}.tar.gz
    

  8. Verify that the environment image shows a green status in the DataRobot UI.
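
Step 5 requires DATAROBOT_ENDPOINT to end in /api/v2. A small check along these lines could be run before steps 6 and 7 to catch a missing suffix early. The function name `endpoint_has_api_v2` and the host `datarobot.example.com` are illustrative only.

```shell
# Return 0 only if the given URL ends in /api/v2.
endpoint_has_api_v2() {
  case "$1" in
    */api/v2) return 0 ;;
    *) return 1 ;;
  esac
}

if endpoint_has_api_v2 "${DATAROBOT_ENDPOINT:-}"; then
  echo "DATAROBOT_ENDPOINT looks correct"
else
  echo "DATAROBOT_ENDPOINT should end with /api/v2, e.g. https://datarobot.example.com/api/v2" >&2
fi
```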