Global Models and Agentic Tools¶
Global models are used as guard models for RAG and agentic workflows. Tools are intended specifically for use in agentic workflows.
List of global models used as moderation guardrails we support¶
These are the names of the registered global models that we support as moderation guardrails:
1. [Hugging Face] Toxicity Classifier
2. [Microsoft] Presidio PII Detection
3. [Hugging Face] Prompt Injection Classifier
4. [Hugging Face] Emotions Classifier
5. [Hugging Face] Zero-shot Classifier
6. [DataRobot] LLM Refusal Score
7. [Hugging Face] Sentiment Classifier
8. [DataRobot] Dummy Binary Classification
These are the names of the registered global models that we support as agentic tools:
1. [Tool] Make Time Series Predictions
2. [Tool] Make AutoML Predictions
3. [Tool] Summarize DataFrame
4. [Tool] Search Data Registry
5. [Tool] Render Vega-Lite Chart
6. [Tool] Render Plotly Chart
7. [Tool] Make Text Generation Predictions
8. [Tool] Get Data Registry Dataset
Requirements¶
- On-prem docker registry
- Ability to pull the docker image from public DataRobot dockerhub repository
- Feature Flags:
  - ADMIN_API_ACCESS
  - ENABLE_CUSTOM_MODEL_PREDICT_RESPONSE_EXTRA_MODEL_OUTPUT
  - ENABLE_MLOPS_RESOURCE_REQUEST_BUNDLES
  - ENABLE_CUSTOM_MODEL_GPU_INFERENCE (if any GPU models are installed)
- The following GA_PREMIUM flags need to be enabled via EngConfig:
  - ENABLE_MMM_GLOBAL_MODELS_IN_MODEL_REGISTRY
  - ENABLE_MLOPS_TEXT_GENERATION_TARGET_TYPE
  - ENABLE_GENAI_EXPERIMENTATION
  - ENABLE_CUSTOM_INFERENCE_MODEL
  - ENABLE_CUSTOM_MODEL_GITHUB_CI_CD
  - ENABLE_PUBLIC_NETWORK_ACCESS_FOR_ALL_CUSTOM_MODELS
Global Models Environment Installation¶
The global models execution environment image is part of the on-prem installation tarball. It is built, packaged, and distributed in the same way as the other public drop-in environments (via a single helm chart for the execution environments). Both global models and agentic tools use the same execution environment.
However, if the global models environment is not present on a setup, you will need to install it manually, as described in the section "Install global models environment manually".
Global Models Installation¶
Starting from DataRobot 11.2, global models and agentic tools can be installed automatically using the global-envs-models helm sub-chart.
For DataRobot 11.1, the image used to install global models and tools is not yet part of the on-prem installation tarball, so global models must still be installed manually.
Automatic Installation¶
To enable automatic installation of global models and agentic tools, edit the values.yaml file and enable the global-envs-models sub-chart:
    global-envs-models:
      enabled: true
To apply the change to the cluster, run the helm upgrade command to upgrade the release.
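As a rough sketch of that upgrade step, the command can be wrapped so it is printed for review before it touches the cluster. The release name, namespace, chart path, and the `APPLY` switch below are placeholders chosen for illustration, not values from this guide; substitute the ones used by your installation.

```shell
#!/bin/sh
# Hypothetical placeholders: adjust to your own installation.
RELEASE="datarobot"          # assumed helm release name
NAMESPACE="datarobot"        # assumed Kubernetes namespace
CHART="./datarobot-prime"    # assumed path to the unpacked chart
VALUES="values.yaml"         # the edited values file

# Print the upgrade command; execute it only when APPLY=1 is set.
helm_upgrade() {
    set -- helm upgrade "$RELEASE" "$CHART" --namespace "$NAMESPACE" --values "$VALUES"
    echo "$*"
    if [ "${APPLY:-0}" = "1" ]; then
        "$@"
    fi
}
```

Calling `helm_upgrade` on its own only prints the command; `APPLY=1 helm_upgrade` runs it.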
Note: Ensure that the global models environment "[DataRobot] Python 3.11 Global Models and Tools" is already present on the setup before enabling this chart.
After the installation completes, verify that the global models and tools are registered in the DataRobot Model Registry.
⚠️ Important: Manual installation is only required if automatic installation has failed for some reason. If the automatic installation is successful, the following steps do not need to be executed.
Air-Gapped Installations: The automatic installation method works for air-gapped environments as well. All required images are included in the DataRobot installation tarball. Enable the global-envs-models sub-chart as described above. Manual installation should only be used as a fallback if the automatic installation fails or is not available.
Extracting Versions from the Helm Chart¶
The DataRobot Helm chart is the source of truth for all version information. Before performing manual installation, extract the required versions from the chart.
- If you don't have the chart already, download and unpack it:

      # Set the DataRobot version (e.g., 11.2.0)
      export DR_VERSION=<version>
      # Download the chart
      helm pull oci://registry-1.docker.io/datarobot/datarobot-prime --version $DR_VERSION
      # Unpack the chart
      tar -xzf datarobot-prime-$DR_VERSION.tgz
      cd datarobot-prime

- Extract the global-envs-models chart version (used as the image tag):

      export GLOBAL_MODELS_VERSION=$(helm dependency list . | awk '$1=="global-envs-models" {print $2}')
      echo "Global models version: $GLOBAL_MODELS_VERSION"

- Extract the environment IDs from the execution-environments chart:

      export GLOBAL_MODELS_ENVIRONMENT_ID=$(grep -A10 "env-global-models:" charts/execution-environments/values.yaml | grep "envid:" | awk '{print $2}' | tr -d '"')
      export GLOBAL_MODELS_ENVIRONMENT_VERSION_ID=$(grep -A10 "env-global-models:" charts/execution-environments/values.yaml | grep "envverid:" | awk '{print $2}' | tr -d '"')
      echo "Environment ID: $GLOBAL_MODELS_ENVIRONMENT_ID"
      echo "Environment Version ID: $GLOBAL_MODELS_ENVIRONMENT_VERSION_ID"

  Alternatively, using yq (if available):

      export GLOBAL_MODELS_ENVIRONMENT_VERSION_ID=$(yq -r '.environments[] | select(.imagename == "env-global-models") | .envverid' < charts/execution-environments/values.yaml)

Note: The extracted GLOBAL_MODELS_VERSION is used as the Docker image tag for both the global-envs-models and env-global-models images.
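Because the grep and awk pipelines used for extraction print nothing when no line matches, a variable can end up empty without any error. A small guard along the following lines can catch that before the values are used as image tags or IDs. The helper name `require_vars` is hypothetical and not part of any DataRobot tooling.

```shell
#!/bin/sh
# Return non-zero if any of the named variables is unset or empty.
require_vars() {
    missing=0
    for name in "$@"; do
        # Indirect lookup of the variable whose name is stored in $name.
        eval "value=\${$name:-}"
        if [ -z "$value" ]; then
            echo "missing: $name" >&2
            missing=1
        fi
    done
    return "$missing"
}

# Usage, after running the extraction commands:
# require_vars GLOBAL_MODELS_VERSION GLOBAL_MODELS_ENVIRONMENT_ID \
#     GLOBAL_MODELS_ENVIRONMENT_VERSION_ID || exit 1
```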
Manual Installation¶
If the automatic installation via helm chart is not available or if you need to install global models manually, follow the steps below.
Steps:¶
- Ensure that the global models environment "[DataRobot] Python 3.11 Global Models and Tools" is already present on the setup.
- Extract the versions from the Helm chart to get GLOBAL_MODELS_VERSION.
- Pull the global models installation image from DataRobot's public dockerhub repository. The image size is approximately 5-6 GB. In case of an air-gapped installation, use a bastion host:

      docker pull datarobot/global-envs-models:${GLOBAL_MODELS_VERSION}-image

- Install the global models using the following command:

      docker run -it datarobot/global-envs-models:${GLOBAL_MODELS_VERSION}-image --webserver ${DATAROBOT_ENDPOINT} --api-token ${DATAROBOT_API_TOKEN}

Copying images via Bastion Host¶
In case of an air-gapped installation, it is recommended to use a bastion host to handle the images. Typical steps are as follows:
1. Ensure that the bastion host has access to the DataRobot public dockerhub repository.
2. Pull the relevant image from dockerhub on the bastion host.
3. Save the image on the bastion host.
4. Move the image to the internal air-gapped environment.
5. Load the image into the local registry in the air-gapped environment.
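The steps above could be scripted roughly as follows. This is a sketch, not a DataRobot-provided tool: the function names, the `APPLY` switch, and the registry host in the usage note are hypothetical. The transfer in step 4 is left out because it depends on how files reach your air-gapped network. By default the functions only print the docker commands so they can be reviewed first.

```shell
#!/bin/sh
# Print a command, and execute it only when APPLY=1 is set.
run() {
    echo "+ $*"
    if [ "${APPLY:-0}" = "1" ]; then
        "$@"
    fi
}

# On the bastion host: pull the image and save it to an archive.
# usage: export_image IMAGE ARCHIVE
export_image() {
    run docker pull "$1" &&
        run docker save -o "$2" "$1"
}

# Inside the air-gapped network: load the archive, retag the image
# for the local registry, and push it there.
# usage: import_image IMAGE ARCHIVE REGISTRY
import_image() {
    run docker load -i "$2" &&
        run docker tag "$1" "$3/$1" &&
        run docker push "$3/$1"
}
```

For example, `export_image "datarobot/env-global-models:${GLOBAL_MODELS_VERSION}" /tmp/env-global-models.tar` on the bastion host, followed (after copying the archive) by `import_image "datarobot/env-global-models:${GLOBAL_MODELS_VERSION}" /tmp/env-global-models.tar registry.example.internal:5000` inside the air-gapped network. Prefix either call with `APPLY=1` to actually execute it.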
Install global models environment manually¶
- Extract the versions from the Helm chart to get GLOBAL_MODELS_VERSION, GLOBAL_MODELS_ENVIRONMENT_ID, and GLOBAL_MODELS_ENVIRONMENT_VERSION_ID.
- Pull the environments installer image from DataRobot's public dockerhub repository. In case of an air-gapped installation, use a bastion host:

      docker pull datarobot/environmentscli:latest

- Pull the global models environment image using the extracted version. In case of an air-gapped installation, use a bastion host:

      docker pull datarobot/env-global-models:${GLOBAL_MODELS_VERSION}

- Save the global models environment image locally on disk. The image is not large; the tarball should be around 1-2 GB:

      cd /tmp
      mkdir -p provisioning
      cd provisioning
      docker save datarobot/env-global-models:${GLOBAL_MODELS_VERSION} | gzip > global-models-environment-image-${GLOBAL_MODELS_VERSION}.tar.gz

- Export the environment variables. Make sure that the DATAROBOT_ENDPOINT URL includes /api/v2 at the end:

      export DATAROBOT_ENDPOINT="<datarobot-base-url-with-api-v2-at-end>"
      export DATAROBOT_API_TOKEN="<datarobot-api-token>"

- If the environment does not exist already, create one with the given ID:

      docker run -it \
        -e DATAROBOT_API_TOKEN="${DATAROBOT_API_TOKEN}" \
        -e DATAROBOT_API="${DATAROBOT_ENDPOINT}" \
        datarobot/environmentscli:latest create \
        -i ${GLOBAL_MODELS_ENVIRONMENT_ID} \
        -n "[DataRobot] Python 3.11 GenAI" \
        -d "Environment for global models" \
        -u "customModel" \
        -l "python" \
        --public True

- Run the environmentscli utility to install the global models environment on the required DataRobot setup:

      docker run -it \
        -v /tmp/provisioning:/work \
        -e DATAROBOT_API_TOKEN="${DATAROBOT_API_TOKEN}" \
        -e DATAROBOT_API="${DATAROBOT_ENDPOINT}" \
        datarobot/environmentscli publish \
        -i ${GLOBAL_MODELS_ENVIRONMENT_ID} \
        -v ${GLOBAL_MODELS_ENVIRONMENT_VERSION_ID} \
        -d "Global Models environment created using environmentscli" \
        --wait 1200 \
        --image /work/global-models-environment-image-${GLOBAL_MODELS_VERSION}.tar.gz

- Ensure that the environment image status is green in the DataRobot UI.
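The export step requires DATAROBOT_ENDPOINT to end with /api/v2, which is easy to get wrong when copying a base URL. A minimal guard, assuming a strict suffix match is what you want, could look like this. The function name `check_endpoint` and the example host are hypothetical.

```shell
#!/bin/sh
# Return 0 if the given URL ends with /api/v2, otherwise print a
# message to stderr and return 1.
check_endpoint() {
    case "$1" in
        */api/v2)
            return 0
            ;;
        *)
            echo "DATAROBOT_ENDPOINT must end with /api/v2: $1" >&2
            return 1
            ;;
    esac
}

# Usage, before running the environmentscli commands:
# check_endpoint "${DATAROBOT_ENDPOINT}" || exit 1
```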