Use NVIDIA NeMo Guardrails with DataRobot moderation¶
Premium
The use of NVIDIA Inference Microservices (NIM) in DataRobot requires access to premium features for GenAI experimentation and GPU inference. NVIDIA NeMo Guardrails are a premium feature. Contact your DataRobot representative or administrator for information on enabling this feature.
Additional feature flags: Enable Moderation Guardrails (Premium), Enable Global Models in the Model Registry (Premium), Enable Additional Custom Model Output in Prediction Responses
DataRobot provides out-of-the-box guardrails and lets you customize your applications with simple rules, code, or models. Use NVIDIA Inference Microservices (NIM) to connect NVIDIA NeMo Guardrails to text generation models in DataRobot, allowing you to guard against off-topic discussions, unsafe content, and jailbreaking attempts. The following NVIDIA NeMo Guardrails are available and can be implemented using the associated evaluation metric type:
Model name | Evaluation metric type |
---|---|
llama-3.1-nemoguard-8b-content-safety |
Custom deployment |
llama-3.1-nemoguard-8b-topic-control |
Stay on topic for input / Stay on topic for output |
nemoguard-jailbreak-detect |
Custom deployment |
Use a deployed NIM with NVIDIA NeMo guardrails¶
To use a deployed llama-3.1-nemoguard-8b-topic-control
NVIDIA NIM with the Stay on topic evaluation metrics, register and deploy the NVIDIA NeMo Guardrail. Once you have created a custom model with the text generation target type, configure the Stay on topic evaluation metric.
To select and configure NVIDIA NeMo Guardrails for topic control:
-
In the Model workshop, open the Assemble tab of a custom model with the Text Generation target type and assemble a model, either manually from a custom model you created outside DataRobot or automatically from a model built in a Use Case's LLM playground.
When you assemble a text generation model with moderations, ensure you configure any required runtime parameters (for example, credentials) or resource settings (for example, public network access). Finally, set the Base environment to a moderation-compatible environment, such as [GenAI] Python 3.12 with Moderations:
Resource settings
DataRobot recommends creating the LLM custom model using larger resource bundles with more memory and CPU resources.
-
After you've configured the custom model's required settings, navigate to the Evaluation and moderation section and click Configure:
-
In the Configure evaluation and moderation panel, locate the metrics tagged with NVIDIA NeMo guardrail, and then click Stay on topic for input or Stay on topic for output.
Evaluation metric Requires Description Stay on topic for inputs NVIDIA NeMo guardrails configuration Use NVIDIA NeMo Guardrails to provide topic boundaries, ensuring prompts are topic-relevant and do not use blocked terms. Stay on topic for output NVIDIA NeMo guardrails configuration Use NVIDIA NeMo Guardrails to provide topic boundaries, ensuring responses are topic-relevant and do not use blocked terms. -
On the Configure evaluation and moderation page for the Stay on topic for input/ouput metric, in the LLM Type list, select NIM. Then, set the following:
Field Description Base URL Enter the base URL for the NVIDIA NIM deployment, for example: https://app.datarobot.com/api/v2/deployments/<deploymentId>/
.Credentials Select a DataRobot API key from the list. Credentials are defined on the Credentials management page. Files (Optional) Configure the NeMo files. Next to a file, click to modify the NeMo guardrails configuration files. In particular, update prompts.yml
with allowed and blocked topics andblocked_terms.txt
with the blocked terms, providing rules for NeMo guardrails to enforce. Theblocked_terms.txt
file is shared between the input and output stay on topic metrics; therefore, modifyingblocked_terms.txt
in the input metric modifies it for the output metric and vice versa. Only two NeMo stay on topic metrics can exist in a custom model, one for input and one for output. -
In the Moderation section, with Configure and apply moderation enabled, for each evaluation metric, set the following:
Field Description Moderation method Select Report or Report and block. Moderation message If you select Report and block, you can optionally modify the default message. -
After configuring the required fields, click Add to save the evaluation and return to the evaluation selection page. Then, select and configure another metric, or click Save configuration.
The guardrails you selected appear in the Evaluation and moderation section of the Assemble tab.
Use a deployed NIM as a custom model guardrail¶
To use a deployed llama-3.1-nemoguard-8b-content-safety
or nemoguard-jailbreak-detect
NVIDIA NIM with the Custom Deployment evaluation metric, first, register and deploy the NVIDIA NeMo Guardrails, then, when you create a custom model with the text generation target type, configure the Custom Deployment evaluation metric.
To select and configure NVIDIA NeMo Guardrails for content safety and jailbreaking detection:
-
In the Model workshop, open the Assemble tab of a custom model with the Text Generation target type and assemble a model, either manually from a custom model you created outside DataRobot or automatically from a model built in a Use Case's LLM playground.
When you assemble a text generation model with moderations, ensure you configure any required runtime parameters (for example, credentials) or resource settings (for example, public network access). Finally, set the Base environment to a moderation-compatible environment; for example, [GenAI] Python 3.12 with Moderations:
Resource settings
DataRobot recommends creating the LLM custom model using larger resource bundles with more memory and CPU resources.
-
After you've configured the custom model's required settings, navigate to the Evaluation and moderation section and click Configure:
-
In the Configure evaluation and moderation panel, click Custom Deployment.
-
On the Configure Custom Deployment page, configure the settings depending on the guard model you're connecting to the LLM:
llama-3.1-nemoguard-8b-content-safety
ornemoguard-jailbreak-detect
.For
llama-3.1-nemoguard-8b-content-safety
, configure the custom deployment as follows:Field Description Name Enter a descriptive name for the custom deployment metric you're creating. Deployment name In the list, locate the name of the llama-3.1-nemoguard-8b-content-safety model registered and deployed in DataRobot and click the deployment name. Input column name Enter text as the input column name. Output column name Enter content_PREDICTION as the output column name. For
nemoguard-jailbreak-detect
, configure the custom deployment as follows:Field Description Name Enter a descriptive name for the custom deployment metric you're creating. Deployment name In the list, locate the name of the nemoguard-jailbreak-detect model registered and deployed in DataRobot and click the deployment name. Input column name Enter text as the input column name. Output column name Enter jailbreak_True_PREDICTION as the output column name. -
In the Moderation section, with Configure and apply moderation enabled, for each evaluation metric, set the following:
Field Description Moderation method Select Report or Report and block. Moderation message If you select Report and block, you can optionally modify the default message. -
After configuring the required fields, click Add to save the evaluation and return to the evaluation selection page.
-
Select and configure another metric, or click Save configuration.
The guardrails you selected appear in the Evaluation and moderation section of the Assemble tab.
After you add guardrails to a text generation custom model, you can test, register, and deploy the model to make predictions in production. After making predictions, you can view the evaluation metrics on the Custom metrics tab and prompts, responses, and feedback (if configured) on the Data exploration tab.