# Moderations guardrails

> Moderations guardrails - Reference for guard configuration YAML, guard types, LLM backends, the
> Python API, and environment variables.

This Markdown file sits beside the HTML page at the same path (with a `.md` suffix). It summarizes the topic and lists links for tools and LLM context.

Companion generated at `2026-06-10T05:26:01.506463+00:00` (UTC).

## Primary page

- [Moderations guardrails](https://docs.datarobot.com/en/docs/api/code-first-tools/moderations-library/moderations-guardrails.html.md): Full documentation for this topic (Markdown sidecar).

## Sections on this page

- [File structure](https://docs.datarobot.com/en/docs/api/code-first-tools/moderations-library/moderations-guardrails.html.md#file-structure): In-page section heading.
- [Top-level options](https://docs.datarobot.com/en/docs/api/code-first-tools/moderations-library/moderations-guardrails.html.md#top-level-options): In-page section heading.
- [Common guard fields](https://docs.datarobot.com/en/docs/api/code-first-tools/moderations-library/moderations-guardrails.html.md#common-guard-fields): In-page section heading.
- [Intervention block](https://docs.datarobot.com/en/docs/api/code-first-tools/moderations-library/moderations-guardrails.html.md#intervention-block): In-page section heading.
- [Actions](https://docs.datarobot.com/en/docs/api/code-first-tools/moderations-library/moderations-guardrails.html.md#actions): In-page section heading.
- [Comparators](https://docs.datarobot.com/en/docs/api/code-first-tools/moderations-library/moderations-guardrails.html.md#comparators): In-page section heading.
- [Guard types](https://docs.datarobot.com/en/docs/api/code-first-tools/moderations-library/moderations-guardrails.html.md#guard-types): In-page section heading.
- [Out-of-the-Box (ootb)](https://docs.datarobot.com/en/docs/api/code-first-tools/moderations-library/moderations-guardrails.html.md#out-of-the-box-ootb): In-page section heading.
- [Model guard](https://docs.datarobot.com/en/docs/api/code-first-tools/moderations-library/moderations-guardrails.html.md#model-guard): In-page section heading.
- [NeMo Guardrails](https://docs.datarobot.com/en/docs/api/code-first-tools/moderations-library/moderations-guardrails.html.md#nemo-guardrails): In-page section heading.
- [NeMo Evaluator](https://docs.datarobot.com/en/docs/api/code-first-tools/moderations-library/moderations-guardrails.html.md#nemo-evaluator): In-page section heading.
- [LLM back-end options](https://docs.datarobot.com/en/docs/api/code-first-tools/moderations-library/moderations-guardrails.html.md#llm-back-end-options): In-page section heading.
- [Supported llm_type values](https://docs.datarobot.com/en/docs/api/code-first-tools/moderations-library/moderations-guardrails.html.md#supported-llm_type-values): In-page section heading.
- [Available models (Google / AWS)](https://docs.datarobot.com/en/docs/api/code-first-tools/moderations-library/moderations-guardrails.html.md#available-models-google-aws): In-page section heading.
- [Full annotated example](https://docs.datarobot.com/en/docs/api/code-first-tools/moderations-library/moderations-guardrails.html.md#full-annotated-example): In-page section heading.
- [Using the config in Python](https://docs.datarobot.com/en/docs/api/code-first-tools/moderations-library/moderations-guardrails.html.md#using-the-config-in-python): In-page section heading.
- [From a YAML file](https://docs.datarobot.com/en/docs/api/code-first-tools/moderations-library/moderations-guardrails.html.md#from-a-yaml-file): In-page section heading.
- [Return types](https://docs.datarobot.com/en/docs/api/code-first-tools/moderations-library/moderations-guardrails.html.md#return-types): In-page section heading.
- [evaluate_prompt / evaluate_prompt_async parameters](https://docs.datarobot.com/en/docs/api/code-first-tools/moderations-library/moderations-guardrails.html.md#evaluate_prompt-evaluate_prompt_async-parameters): In-page section heading.
- [evaluate_response / evaluate_response_async parameters](https://docs.datarobot.com/en/docs/api/code-first-tools/moderations-library/moderations-guardrails.html.md#evaluate_response-evaluate_response_async-parameters): In-page section heading.
- [evaluate_full_pipeline / evaluate_full_pipeline_async parameters](https://docs.datarobot.com/en/docs/api/code-first-tools/moderations-library/moderations-guardrails.html.md#evaluate_full_pipeline-evaluate_full_pipeline_async-parameters): In-page section heading.
- [EvaluationResult fields](https://docs.datarobot.com/en/docs/api/code-first-tools/moderations-library/moderations-guardrails.html.md#evaluationresult-fields): In-page section heading.
- [PipelineResult fields](https://docs.datarobot.com/en/docs/api/code-first-tools/moderations-library/moderations-guardrails.html.md#pipelineresult-fields): In-page section heading.
- [What prescore_df contains](https://docs.datarobot.com/en/docs/api/code-first-tools/moderations-library/moderations-guardrails.html.md#what-prescore_df-contains): In-page section heading.
- [What postscore_df contains](https://docs.datarobot.com/en/docs/api/code-first-tools/moderations-library/moderations-guardrails.html.md#what-postscore_df-contains): In-page section heading.
- [Agentic workflow example](https://docs.datarobot.com/en/docs/api/code-first-tools/moderations-library/moderations-guardrails.html.md#agentic-workflow-example): In-page section heading.
- [From a plain Python dict](https://docs.datarobot.com/en/docs/api/code-first-tools/moderations-library/moderations-guardrails.html.md#from-a-plain-python-dict): In-page section heading.
- [Parameters](https://docs.datarobot.com/en/docs/api/code-first-tools/moderations-library/moderations-guardrails.html.md#parameters): In-page section heading.
- [From a Pydantic config object](https://docs.datarobot.com/en/docs/api/code-first-tools/moderations-library/moderations-guardrails.html.md#from-a-pydantic-config-object): In-page section heading.
- [Parameters](https://docs.datarobot.com/en/docs/api/code-first-tools/moderations-library/moderations-guardrails.html.md#parameters_1): In-page section heading.
- [Schema type → guard type mapping](https://docs.datarobot.com/en/docs/api/code-first-tools/moderations-library/moderations-guardrails.html.md#schema-type-guard-type-mapping): In-page section heading.
- [LLM Gateway example — hate speech / guideline adherence](https://docs.datarobot.com/en/docs/api/code-first-tools/moderations-library/moderations-guardrails.html.md#llm-gateway-example-hate-speech-guideline-adherence): In-page section heading.
- [Model guard example](https://docs.datarobot.com/en/docs/api/code-first-tools/moderations-library/moderations-guardrails.html.md#model-guard-example): In-page section heading.
- [Streaming pipeline](https://docs.datarobot.com/en/docs/api/code-first-tools/moderations-library/moderations-guardrails.html.md#streaming-pipeline): In-page section heading.
- [Method signatures](https://docs.datarobot.com/en/docs/api/code-first-tools/moderations-library/moderations-guardrails.html.md#method-signatures): In-page section heading.
- [evaluate_full_pipeline_stream_async parameters](https://docs.datarobot.com/en/docs/api/code-first-tools/moderations-library/moderations-guardrails.html.md#evaluate_full_pipeline_stream_async-parameters): In-page section heading.
- [Chunk signals](https://docs.datarobot.com/en/docs/api/code-first-tools/moderations-library/moderations-guardrails.html.md#chunk-signals): In-page section heading.
- [Example](https://docs.datarobot.com/en/docs/api/code-first-tools/moderations-library/moderations-guardrails.html.md#example): In-page section heading.
- [Advanced: stream_response_async](https://docs.datarobot.com/en/docs/api/code-first-tools/moderations-library/moderations-guardrails.html.md#advanced-stream_response_async): In-page section heading.
- [With DRUM](https://docs.datarobot.com/en/docs/api/code-first-tools/moderations-library/moderations-guardrails.html.md#with-drum): In-page section heading.
- [Testing guide](https://docs.datarobot.com/en/docs/api/code-first-tools/moderations-library/moderations-guardrails.html.md#testing-guide): In-page section heading.
- [Environment variables](https://docs.datarobot.com/en/docs/api/code-first-tools/moderations-library/moderations-guardrails.html.md#environment-variables): In-page section heading.
- [Always required](https://docs.datarobot.com/en/docs/api/code-first-tools/moderations-library/moderations-guardrails.html.md#always-required): In-page section heading.
- [OTel tracing (optional)](https://docs.datarobot.com/en/docs/api/code-first-tools/moderations-library/moderations-guardrails.html.md#otel-tracing-optional): In-page section heading.
- [deepeval telemetry](https://docs.datarobot.com/en/docs/api/code-first-tools/moderations-library/moderations-guardrails.html.md#deepeval-telemetry): In-page section heading.
- [Credentials for LLM-eval guards using external providers](https://docs.datarobot.com/en/docs/api/code-first-tools/moderations-library/moderations-guardrails.html.md#credentials-for-llm-eval-guards-using-external-providers): In-page section heading.

## Related documentation

- [Developer documentation](https://docs.datarobot.com/en/docs/api/index.html.md): Linked from this page.
- [Code-first tools](https://docs.datarobot.com/en/docs/api/code-first-tools/index.html.md): Linked from this page.
- [DataRobot Moderations library](https://docs.datarobot.com/en/docs/api/code-first-tools/moderations-library/index.html.md): Linked from this page.

## Documentation content

Guards evaluate prompts (prescore) and/or responses (postscore) and can block, report, or replace content based on configurable conditions.

## File structure

Guard configuration lives in a YAML file that the library loads. The file has a few top-level options followed by a `guards` list:

```
timeout_sec: 10
timeout_action: score
nemo_evaluator_deployment_id: "<your-nemo-evaluator-id>"

guards:
  - name: My Guard
    type: ootb
    stage: prompt
    # ...
```

## Top-level options

| Field | Type | Default | Description |
| --- | --- | --- | --- |
| timeout_sec | int | 10 | Seconds to wait per guard |
| timeout_action | string | score | score (allow) or block on timeout |
| nemo_evaluator_deployment_id | string | — | DataRobot deployment ID of the NeMo Evaluator microservice; required when any guard uses type: nemo_evaluator |
| enable_deepeval_telemetry | bool | false | Opt in to deepeval usage telemetry and local .deepeval/ artifacts. See Environment variables. |
| prompt_column_name | string | "promptText" | Name of the DataFrame column that holds the input text. Used in standalone Python when no DRUM deployment is active. Ignored when a DRUM deployment context is active. |
| response_column_name | string | "completion" | Name of the DataFrame column that holds the LLM response text. Used in standalone Python as a fallback when TARGET_NAME is not set. Lower priority than TARGET_NAME — if both are provided, TARGET_NAME wins. Ignored when a DRUM deployment context is active. |
| guards | list | required | List of guard definitions |
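
The precedence described for `response_column_name` can be sketched in a few lines. This only illustrates the documented order and is not the library's code: it assumes `TARGET_NAME` is read from the process environment, and `resolve_response_column` is a hypothetical helper name. The DRUM deployment case, where both options are ignored, is left out.

```python
import os
from typing import Mapping, Optional

DEFAULT_RESPONSE_COLUMN = "completion"  # documented default


def resolve_response_column(
    config: Mapping[str, str],
    env: Optional[Mapping[str, str]] = None,
) -> str:
    """Pick the response column: TARGET_NAME, then the config value, then the default."""
    # Assumption: TARGET_NAME is an environment variable.
    env = os.environ if env is None else env
    return (
        env.get("TARGET_NAME")
        or config.get("response_column_name")
        or DEFAULT_RESPONSE_COLUMN
    )
```

Passing `env` explicitly keeps the helper easy to test without touching the real environment.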

## Common guard fields

| Field | Required | Description |
| --- | --- | --- |
| name | Yes | Unique label; used as the key in result.metrics and as the DataRobot custom metric name |
| type | Yes | ootb · model · nemo_guardrails · nemo_evaluator |
| stage | Yes | prompt · response · [prompt, response] (list runs the guard at both stages) |
| description | No | Free-text label, ignored by the library |
| intervention | No | What to do when the condition fires (see Intervention block). Omit entirely to measure only — nothing is ever blocked |
| copy_citations | No | Boolean (true/false, default false). Passes retrieved RAG context to this guard. Required for rouge_1 and faithfulness to produce meaningful scores |
| is_agentic | No | Marks an agentic-workflow guard (default false). Required by agent_goal_accuracy |

```
# stage as a list — guard runs independently at both prompt and response stages
- name: Token Count Both
  type: ootb
  ootb_type: token_count
  stage: [prompt, response]
  intervention:
    action: block
    message: "Input or output exceeds the token limit."
    conditions:
      - comparator: greaterThan
        comparand: 100
```

## Intervention block

```
intervention:
  action: block               # "block" | "report" | "replace"
  message: "Blocked."         # returned to caller
  send_notification: false
  conditions:
    - comparand: 0.5
      comparator: greaterThan
```

> [!NOTE] One condition per intervention
> The `conditions` list accepts exactly one entry for `block` and `replace`; zero entries (`conditions: []`) is valid for `report`. To combine conditions (e.g. block if score < 0.2 or > 0.9), use two separate guards.
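
The one-condition rule can be checked before a config is loaded. The sketch below is a hypothetical pre-flight helper, not part of the library; it assumes `report` accepts at most one condition, which the note implies but does not state. The guard names in `range_guards` are made up, and only the fields relevant to the rule are shown.

```python
from typing import Any


def validate_intervention(intervention: dict[str, Any]) -> list[str]:
    """Return a list of problems; an empty list means the block looks valid."""
    action = intervention.get("action")
    count = len(intervention.get("conditions", []))
    problems = []
    if action not in ("block", "report", "replace"):
        problems.append(f"unknown action: {action!r}")
    elif action in ("block", "replace") and count != 1:
        problems.append(f"{action} needs exactly one condition, found {count}")
    elif action == "report" and count > 1:
        # Assumption: report takes zero or one condition.
        problems.append(f"report accepts at most one condition, found {count}")
    return problems


# Blocking when score < 0.2 OR score > 0.9 takes two guards, one per condition.
range_guards = [
    {
        "name": "Score Too Low",
        "intervention": {
            "action": "block",
            "conditions": [{"comparator": "lessThan", "comparand": 0.2}],
        },
    },
    {
        "name": "Score Too High",
        "intervention": {
            "action": "block",
            "conditions": [{"comparator": "greaterThan", "comparand": 0.9}],
        },
    },
]
```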

### Actions

| Action | Effect |
| --- | --- |
| block | Reject and return message to the caller. message is optional in the schema but omitting it returns an empty string — always set it. |
| report | Record the metric and allow content through unchanged. Behaviorally identical to omitting the intervention block entirely; useful when you want the metric tracked but never want to block. |
| replace | Swap the text with the sanitized version returned by the deployment. Only valid for type: model guards. The deployment must return the replacement text in the field specified by model_info.replacement_text_column_name; if that field is absent a ValueError is raised. |

### Comparators

| Comparator | Comparand type | Description |
| --- | --- | --- |
| greaterThan / lessThan | number | Numeric threshold |
| equals / notEquals | number \| string | Exact equality. Use comparand: "TRUE" with NeMo Guardrails guards, whose score is the string "TRUE" or "FALSE" |
| is / isNot | boolean | Boolean equality |
| matches / doesNotMatch | list of strings | Class membership. matches fires if the prediction is in the list; doesNotMatch fires if it is not. |
| contains / doesNotContain | list of strings | Substring check against a list. contains fires if all items in the list are found as substrings of the prediction; doesNotContain fires if not all items are found. |
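
The comparator semantics in the table can be written out as plain Python. This is a sketch of the documented behavior only; the library may coerce types or handle edge cases differently, and `COMPARATORS` and `condition_fires` are hypothetical names.

```python
from typing import Any, Callable

# Each entry takes (guard value, comparand) and returns True when the condition fires.
COMPARATORS: dict[str, Callable[[Any, Any], bool]] = {
    "greaterThan": lambda value, comparand: value > comparand,
    "lessThan": lambda value, comparand: value < comparand,
    "equals": lambda value, comparand: value == comparand,
    "notEquals": lambda value, comparand: value != comparand,
    "is": lambda value, comparand: bool(value) == comparand,
    "isNot": lambda value, comparand: bool(value) != comparand,
    # Class membership: the prediction is one of the listed classes.
    "matches": lambda value, comparand: value in comparand,
    "doesNotMatch": lambda value, comparand: value not in comparand,
    # Substring check: ALL listed items must appear in the prediction text.
    "contains": lambda value, comparand: all(item in value for item in comparand),
    "doesNotContain": lambda value, comparand: not all(item in value for item in comparand),
}


def condition_fires(value: Any, condition: dict[str, Any]) -> bool:
    """Evaluate one condition entry against a guard's score or prediction."""
    return COMPARATORS[condition["comparator"]](value, condition["comparand"])
```

Note that `doesNotContain` fires as soon as any single item is missing, because it is the negation of "all items found".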

## Guard types

### Out-of-the-Box (ootb)

Set `type: ootb` and `ootb_type`. Install the required libraries for your use case:

```
pip install datarobot-moderations                          # base — token_count, rouge_1, cost, custom_metric
pip install 'datarobot-moderations[llm-eval]'              # + faithfulness, task_adherence, agent_guideline_adherence, agent_goal_accuracy
pip install 'datarobot-moderations[llm-eval,vertex]'       # + Google Vertex AI as LLM judge
pip install 'datarobot-moderations[llm-eval,bedrock]'      # + AWS Bedrock as LLM judge
pip install 'datarobot-moderations[llm-eval,nvidia]'       # + NVIDIA NIM as LLM judge
pip install 'datarobot-moderations[nemo]'                  # + NeMo Guardrails colang flow guard (type: nemo_guardrails)
pip install 'datarobot-moderations[nemo-evaluator]'        # + NeMo Evaluator microservice guard (type: nemo_evaluator)
pip install 'datarobot-moderations[datarobot-sdk]'         # required for type: model and llm_type: datarobot
pip install 'datarobot-moderations[all]'                   # everything
```

| ootb_type | Stage | Install extra | Description |
| --- | --- | --- | --- |
| token_count | prompt / response | (base) | Token count |
| rouge_1 | response | (base) | ROUGE-1 overlap with citations |
| faithfulness | response | llm-eval | LLM-judged hallucination detection |
| task_adherence | response | llm-eval | Task-completion score |
| agent_guideline_adherence | response | llm-eval | Guideline adherence |
| agent_goal_accuracy | response | llm-eval | Agentic goal-accuracy |
| cost | response | (base) | Estimated cost. Counts both prompt tokens (input_price/input_unit) and response tokens (output_price/output_unit). Must be at the response stage because both token counts are only available after the LLM responds. Currently only currency: USD is supported. |
| custom_metric | prompt / response | (base) | User-defined numeric metric |

```
# Token count — report only
- name: Prompt Token Count
  type: ootb
  ootb_type: token_count
  stage: prompt

# Token count — block on length
- name: Response Token Count
  type: ootb
  ootb_type: token_count
  stage: response
  intervention:
    action: block
    message: "Response too long."
    conditions:
      - comparand: 1000
        comparator: greaterThan

# ROUGE-1 (requires citations)
- name: Rouge 1
  type: ootb
  ootb_type: rouge_1
  stage: response
  copy_citations: true
  intervention:
    action: report
    conditions: []

# Faithfulness
- name: Faithfulness
  type: ootb
  ootb_type: faithfulness
  stage: response
  copy_citations: true
  llm_type: datarobot
  deployment_id: "<your-llm-id>"   # 24-char DataRobot deployment ID
  intervention:
    action: block
    message: "Hallucination detected."
    conditions:
      - comparand: 0.0
        comparator: equals

# Task Adherence
- name: Task Adherence
  type: ootb
  ootb_type: task_adherence
  stage: response
  llm_type: datarobot
  deployment_id: "<your-llm-id>"
  intervention:
    action: block
    message: "LLM did not complete the requested task."
    conditions:
      - comparator: lessThan
        comparand: 0.5

# Guideline Adherence
- name: Guideline Adherence
  type: ootb
  ootb_type: agent_guideline_adherence
  stage: response
  llm_type: datarobot
  deployment_id: "<your-llm-id>"
  additional_guard_config:
    agent_guideline: "Response must be polite and on-topic."   # free-text criterion for the LLM judge
  intervention:
    action: block
    message: "Response violates guidelines."
    conditions:
      - comparand: 0.0
        comparator: equals

# Agent Goal Accuracy
- name: Agent Goal Accuracy
  type: ootb
  ootb_type: agent_goal_accuracy
  stage: response
  is_agentic: true
  llm_type: datarobot
  deployment_id: "<your-llm-id>"
  intervention:
    action: report
    conditions: []

# Cost tracking
- name: Cost
  type: ootb
  ootb_type: cost
  stage: response
  additional_guard_config:
    cost:
      currency: USD
      input_price: 0.01
      input_unit: 1000
      output_price: 0.03
      output_unit: 1000
  intervention:
    action: report
    conditions: []
```
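
The arithmetic behind the `cost` guard can be sketched as follows. This assumes each price applies per `*_unit` tokens, which is how the `input_price/input_unit` and `output_price/output_unit` ratios in the table read; `estimate_cost` is a hypothetical helper, and the library's exact rounding is not documented here.

```python
def estimate_cost(prompt_tokens: int, response_tokens: int, cost_cfg: dict) -> float:
    """Estimate cost in the configured currency from both token counts."""
    # Assumption: price is charged per `unit` tokens on each side.
    input_cost = prompt_tokens * cost_cfg["input_price"] / cost_cfg["input_unit"]
    output_cost = response_tokens * cost_cfg["output_price"] / cost_cfg["output_unit"]
    return input_cost + output_cost


# Same values as the `cost` block in the YAML example.
cost_cfg = {
    "currency": "USD",
    "input_price": 0.01,
    "input_unit": 1000,
    "output_price": 0.03,
    "output_unit": 1000,
}

# 1200 prompt tokens and 400 response tokens come to roughly 0.024 USD.
example_cost = estimate_cost(1200, 400, cost_cfg)
```

Both token counts feed the total, which is why the guard has to run at the response stage.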

### Model guard

Wraps any DataRobot deployment you have already created (binary classifier, regression, multiclass, or text-generation). The library sends the text to that deployment and uses the prediction it returns to decide whether to block, report, or replace content.

```
# Binary classifier (e.g. toxicity, prompt injection)
# Works with any DataRobot binary classification deployment.
- name: Toxicity
  type: model
  stage: prompt
  deployment_id: "<your-deployment-id>"   # 24-char DataRobot deployment ID
  model_info:
    input_column_name: text               # field your deployment reads as input
    target_name: toxicity_toxic_PREDICTION  # prediction field returned by the deployment
    target_type: Binary        # Binary | Regression | Multiclass | TextGeneration
    class_names: []            # leave empty for Binary/Regression
  intervention:
    action: block
    message: "Toxic content blocked."
    conditions:
      - comparand: 0.5
        comparator: greaterThan

# PII detection with text replacement
# The deployment must return BOTH the score field (`target_name`)
# AND a sanitized-text field (`replacement_text_column_name`).
- name: PII Detector
  type: model
  stage: prompt
  deployment_id: "<your-pii-deployment-id>"
  model_info:
    input_column_name: text
    target_name: contains_pii_true_PREDICTION
    target_type: TextGeneration
    replacement_text_column_name: anonymized_text_OUTPUT
    class_names: []
  intervention:
    action: replace
    message: "PII removed from prompt."
    conditions:
      - comparand: 0.5
        comparator: greaterThan

# Multi-label / emotion classifier
- name: Emotion Classifier
  type: model
  stage: prompt
  deployment_id: "<your-emotion-deployment-id>"
  model_info:
    input_column_name: text
    target_name: target_PREDICTION
    target_type: TextGeneration
    class_names: [anger, fear, sadness, disgust, joy, neutral]
  intervention:
    action: block
    message: "Negative emotion detected."
    conditions:
      - comparand: [anger, fear, sadness, disgust]
        comparator: matches
```

### NeMo Guardrails

Flow-based content filtering. Requires `pip install 'datarobot-moderations[nemo]'`. Supported `llm_type` values include `openAi`, `azureOpenAi`, `nim`, and `llmGateway`.

Colang flow files must live in stage-specific subdirectories of `nemo_guardrails/`:

```
nemo_guardrails/
  prompt/      # config.yml + *.co files for stage: prompt
  response/    # config.yml + *.co files for stage: response
```

```
- name: Stay on topic
  type: nemo_guardrails
  stage: prompt
  llm_type: azureOpenAi
  openai_api_base: "https://<resource>.openai.azure.com/"
  openai_deployment_id: gpt-4o-mini
  intervention:
    action: block
    message: "This topic is outside the allowed scope."
    conditions:
      - comparand: "TRUE"
        comparator: equals
```

### NeMo Evaluator

Calls a DataRobot-hosted NeMo Evaluator microservice. Requires `pip install 'datarobot-moderations[nemo-evaluator]'`.

Two deployment IDs — what's the difference?

| Field | What it points to |
| --- | --- |
| nemo_evaluator_deployment_id (top-level) | Your NeMo Evaluator microservice deployment in DataRobot |
| deployment_id (per-guard) | The LLM deployment the evaluator uses to do the judging |

Both values must be valid 24-character DataRobot deployment IDs. Using a placeholder longer than 24 characters (e.g. `"<your-nemo-evaluator-id>"`) causes a load-time validation error: `String is longer than 24 characters`.

> `llm_type` must be `datarobot` for all `nemo_evaluator` guards.

| nemo_evaluator_type | Stage | Description |
| --- | --- | --- |
| llm_judge | prompt / response | Custom LLM-as-judge with your own prompts. score_parsing_regex is a regular expression applied to the LLM's raw text reply to extract a single numeric score — e.g. "([1-5])" picks the first digit 1–5 from any surrounding text. |
| context_relevance | response | Relevance of retrieved context to the question |
| response_groundedness | response | Groundedness in retrieved context |
| topic_adherence | response | Adherence to allowed topics |
| response_relevancy | response | Relevance of response to question |
| faithfulness | response | NeMo microservice faithfulness score |
| agent_goal_accuracy | response | Agentic goal-accuracy via NeMo |

```
nemo_evaluator_deployment_id: "<your-nemo-evaluator-id>"

guards:
  - name: Safety Judge
    type: nemo_evaluator
    stage: response
    nemo_evaluator_type: llm_judge
    llm_type: datarobot
    deployment_id: "<your-llm-id>"
    nemo_llm_judge_config:
      system_prompt: "Rate safety 1-5. Output ONLY the integer."
      user_prompt: "Response: {response}"
      score_parsing_regex: "([1-5])"   # regex to extract the numeric score from the LLM's text output
      custom_metric_directionality: higherIsBetter   # "higherIsBetter" | "lowerIsBetter"
    intervention:
      action: block
      message: "Response failed safety evaluation."
      conditions:
        - comparand: 2
          comparator: lessThan

  - name: Topic Adherence
    type: nemo_evaluator
    stage: response
    nemo_evaluator_type: topic_adherence
    llm_type: datarobot
    deployment_id: "<your-llm-id>"
    nemo_topic_adherence_config:
      metric_mode: f1          # "f1" | "precision" | "recall"
      reference_topics: [DataRobot, machine learning, AI platforms]
    intervention:
      action: report
      conditions: []

  - name: Response Relevancy
    type: nemo_evaluator
    stage: response
    nemo_evaluator_type: response_relevancy
    llm_type: datarobot
    deployment_id: "<your-llm-id>"
    nemo_response_relevancy_config:
      embedding_deployment_id: "<your-embedding-id>"
    intervention:
      action: report
      conditions: []
```
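
To see what `score_parsing_regex` does to a judge reply, the sketch below applies the pattern with Python's `re` module. It assumes the first match of the first capture group is used and that a reply with no match yields no score; neither detail is confirmed here, and `parse_score` is a hypothetical helper.

```python
import re
from typing import Optional


def parse_score(raw_reply: str, pattern: str = r"([1-5])") -> Optional[float]:
    """Extract a numeric score from the judge's raw text, or None if nothing matches."""
    match = re.search(pattern, raw_reply)
    if match is None:
        return None
    return float(match.group(1))


clean = parse_score("4")                               # 4.0
wrapped = parse_score("Safety rating: 4 out of 5")     # 4.0, first digit in range wins
ambiguous = parse_score("I'd say 10 out of 10")        # 1.0, the "1" of "10" matches
missing = parse_score("I cannot rate this response.")  # None
```

The `ambiguous` case is why the system prompt in the example asks the judge to output only the integer: a loose pattern picks up the first digit it can, even inside a larger number.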

## LLM back-end options

Some `ootb` guards (e.g. `faithfulness`, `task_adherence`) call an LLM to judge the text. You choose which LLM provider to use via `llm_type`.

> DataRobot credentials (`DATAROBOT_ENDPOINT` + `DATAROBOT_API_TOKEN`) are always required.

### Supported llm_type values

| llm_type | LLM provider | Extra YAML fields | Extra install |
| --- | --- | --- | --- |
| datarobot | DataRobot-hosted LLM deployment | deployment_id | datarobot-sdk |
| openAi | OpenAI API | (none) | llm-eval |
| azureOpenAi | Azure OpenAI | openai_api_base, openai_deployment_id | llm-eval |
| google | Google Vertex AI | google_region, google_model | llm-eval,vertex |
| amazon | AWS Bedrock | aws_region, aws_model | llm-eval,bedrock |
| nim | NVIDIA NIM | openai_api_base | llm-eval,nvidia |
| llmGateway | DataRobot LLM Gateway | llm_gateway_model_id | datarobot-sdk |

- `nemo_guardrails` supports `openAi`, `azureOpenAi`, `nim`, and `llmGateway` only.
- `nemo_evaluator` supports `datarobot` only.
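
The table and the two restrictions above can be turned into a small pre-flight check on a guard definition. This is a sketch built only from the values listed in this section; `check_llm_backend` and the two lookup tables are hypothetical names, and the library performs its own validation when the config is loaded.

```python
from typing import Any

# Extra YAML fields per llm_type, copied from the table above.
REQUIRED_FIELDS: dict[str, tuple[str, ...]] = {
    "datarobot": ("deployment_id",),
    "openAi": (),
    "azureOpenAi": ("openai_api_base", "openai_deployment_id"),
    "google": ("google_region", "google_model"),
    "amazon": ("aws_region", "aws_model"),
    "nim": ("openai_api_base",),
    "llmGateway": ("llm_gateway_model_id",),
}

# Guard types that accept only a subset of llm_type values.
ALLOWED_LLM_TYPES: dict[str, set[str]] = {
    "nemo_guardrails": {"openAi", "azureOpenAi", "nim", "llmGateway"},
    "nemo_evaluator": {"datarobot"},
}


def check_llm_backend(guard: dict[str, Any]) -> list[str]:
    """Return problems with a guard's llm_type and its companion fields."""
    llm_type = guard.get("llm_type")
    if llm_type is None:
        return []  # guards that do not call an LLM have nothing to check
    if llm_type not in REQUIRED_FIELDS:
        return [f"unsupported llm_type: {llm_type!r}"]
    problems = []
    allowed = ALLOWED_LLM_TYPES.get(guard.get("type"))
    if allowed is not None and llm_type not in allowed:
        problems.append(f"{guard.get('type')} does not support llm_type {llm_type!r}")
    for name in REQUIRED_FIELDS[llm_type]:
        if name not in guard:
            problems.append(f"llm_type {llm_type!r} needs {name}")
    return problems
```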

### Available models (Google / AWS)

The library maps a fixed set of model names to their provider API identifiers. Models not in this list are not supported.

| Provider | llm_type | google_model / aws_model |
| --- | --- | --- |
| Google Vertex AI | google | google-gemini-1.5-flash, google-gemini-1.5-pro, chat-bison |
| AWS Bedrock | amazon | amazon-titan, anthropic-claude-2, anthropic-claude-3-haiku, anthropic-claude-3-sonnet, anthropic-claude-3-opus, anthropic-claude-3.5-sonnet-v1, anthropic-claude-3.5-sonnet-v2, amazon-nova-lite, amazon-nova-micro, amazon-nova-pro |

## Full annotated example

> Replace every `<...>` placeholder with a real value before use.
> DataRobot deployment IDs are exactly 24 hexadecimal characters.

```
timeout_sec: 15
timeout_action: score

guards:
  # -- Prescore (prompt) --------------------------------------------------

  - name: Prompt Injection
    type: model
    stage: prompt
    deployment_id: "<prompt-injection-id>"
    model_info:
      input_column_name: text
      target_name: injection_injection_PREDICTION
      target_type: Binary
      class_names: []
    intervention:
      action: block
      message: "Prompt injection attempt detected and blocked."
      conditions:
        - comparand: 0.80
          comparator: greaterThan

  - name: Toxicity
    type: model
    stage: prompt
    deployment_id: "<toxicity-id>"
    model_info:
      input_column_name: text
      target_name: toxicity_toxic_PREDICTION
      target_type: Binary
      class_names: []
    intervention:
      action: block
      message: "Toxic content is not allowed."
      conditions:
        - comparand: 0.5
          comparator: greaterThan

  - name: PII Detector
    type: model
    stage: prompt
    deployment_id: "<pii-id>"
    model_info:
      input_column_name: text
      target_name: contains_pii_true_PREDICTION
      target_type: TextGeneration
      replacement_text_column_name: anonymized_text_OUTPUT
      class_names: []
    intervention:
      action: replace
      message: "PII detected and removed."
      conditions:
        - comparand: 0.5
          comparator: greaterThan

  - name: Topic Guardrail
    type: nemo_guardrails
    stage: prompt
    llm_type: azureOpenAi
    openai_api_base: "https://<resource>.openai.azure.com/"
    openai_deployment_id: gpt-4o-mini
    intervention:
      action: block
      message: "This topic is outside the allowed scope."
      conditions:
        - comparand: "TRUE"
          comparator: equals

  # -- Postscore (response) -----------------------------------------------

  - name: Response Token Count
    type: ootb
    ootb_type: token_count
    stage: response

  - name: Faithfulness
    type: ootb
    ootb_type: faithfulness
    stage: response
    copy_citations: true
    llm_type: datarobot
    deployment_id: "<llm-id>"
    intervention:
      action: block
      message: "The response appears to be hallucinated."
      conditions:
        - comparand: 0.0
          comparator: equals

  - name: Task Adherence
    type: ootb
    ootb_type: task_adherence
    stage: response
    llm_type: datarobot
    deployment_id: "<llm-id>"
    intervention:
      action: block
      message: "LLM did not complete the requested task."
      conditions:
        - comparator: lessThan
          comparand: 0.5

  - name: Cost
    type: ootb
    ootb_type: cost
    stage: response
    additional_guard_config:
      cost:
        currency: USD
        input_price: 0.01
        input_unit: 1000
        output_price: 0.03
        output_unit: 1000
    intervention:
      action: report
      conditions: []
```
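
Before loading a config like the one above, a quick scan can catch leftover `<...>` placeholders and malformed deployment IDs. The sketch below works on the config as a plain Python dict. It is a hypothetical helper, not a library feature, and it assumes IDs may use upper- or lower-case hex digits. Only `deployment_id` and `nemo_evaluator_deployment_id` are checked as DataRobot IDs, since fields such as `openai_deployment_id` hold provider model names instead.

```python
import re
from typing import Any

DEPLOYMENT_ID = re.compile(r"[0-9a-fA-F]{24}")  # assumption: case-insensitive hex
PLACEHOLDER = re.compile(r"<[^<>]+>")
DATAROBOT_ID_KEYS = {"deployment_id", "nemo_evaluator_deployment_id"}


def find_config_problems(node: Any, path: str = "config") -> list[str]:
    """Walk a config dict and report placeholders and malformed deployment IDs."""
    problems: list[str] = []
    if isinstance(node, dict):
        for key, value in node.items():
            child = f"{path}.{key}"
            if key in DATAROBOT_ID_KEYS and not (
                isinstance(value, str) and DEPLOYMENT_ID.fullmatch(value)
            ):
                problems.append(f"{child}: not a 24-character hex ID")
            problems.extend(find_config_problems(value, child))
    elif isinstance(node, list):
        for index, item in enumerate(node):
            problems.extend(find_config_problems(item, f"{path}[{index}]"))
    elif isinstance(node, str) and PLACEHOLDER.search(node):
        problems.append(f"{path}: unreplaced placeholder {node!r}")
    return problems
```

An unreplaced `deployment_id` placeholder is reported twice, once as a malformed ID and once as a placeholder, which makes the cause easy to spot.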

## Using the config in Python

Guards can be configured from a YAML file, a plain Python dict, or a Pydantic object built entirely in Python. All approaches are fully equivalent — choose whichever fits your workflow.

### From a YAML file

#### Return types

| Method | Returns |
| --- | --- |
| evaluate_prompt(prompt) | (EvaluationResult, latency_seconds, prescore_df) |
| evaluate_response(response, prompt=None) | (EvaluationResult, latency_seconds, postscore_df) |
| evaluate_full_pipeline(prompt, llm_callable) | (PipelineResult, prescore_df, postscore_df) — postscore_df is None when the prompt was blocked; per-stage latency is not returned — use evaluate_prompt / evaluate_response directly when you need it |
| evaluate_prompt_async(prompt) | same as evaluate_prompt but non-blocking |
| evaluate_response_async(response, prompt=None) | same as evaluate_response but non-blocking |
| evaluate_full_pipeline_async(prompt, llm_callable) | same as evaluate_full_pipeline but non-blocking; llm_callable must be an async coroutine |
| evaluate_full_pipeline_stream_async(prompt, llm_callable) | AsyncGenerator[ChatCompletionChunk, None] — see Streaming pipeline |
| stream_response_async(completion, *, prompt, prescore_df, prescore_latency) | AsyncGenerator[ChatCompletionChunk, None] — lower-level; see Streaming pipeline |

`EvaluationResult.metrics` holds the guard scores keyed by guard name.

#### evaluate_prompt / evaluate_prompt_async parameters

| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| prompt | str | Yes | The user prompt text to evaluate against prescore guards |

#### evaluate_response / evaluate_response_async parameters

| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| response | str | Yes | The LLM response text to evaluate against postscore guards |
| prompt | str \| None | No | The original user prompt. Required for guards that compare prompt and response (e.g. faithfulness, task_adherence, rouge_1). Omit only when no such guards are configured |
| pipeline_interactions | str \| None | No | JSON-serialized MultiTurnSample dict from the DataRobot agentic pipeline. Enables agent_goal_accuracy to evaluate the full interaction trace instead of just the final response. |

#### evaluate_full_pipeline / evaluate_full_pipeline_async parameters

| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| prompt | str | Yes | The user prompt to evaluate |
| llm_callable | Callable[[str], str] (sync) or Callable[[str], Awaitable[str]] (async) | Yes | Callable that receives the (possibly sanitized) effective prompt and returns the LLM response. For the async variant this must be an async coroutine |

#### EvaluationResult fields

| Field | Type | Description |
| --- | --- | --- |
| blocked | bool | True if any guard blocked the text |
| blocked_message | str \| None | The block message configured on the guard |
| replaced | bool | True if a replace-action guard fired |
| replacement | str \| None | The sanitized replacement text (PII-scrubbed prompt, etc.) |
| metrics | dict[str, Any] | Guard scores keyed by guard name (e.g. {"Toxicity": 0.87}) |

#### PipelineResult fields

| Field | Type | Description |
| --- | --- | --- |
| prompt_evaluation | EvaluationResult | Prescore evaluation result |
| response | str \| None | Final (possibly replaced) LLM response; None when blocked |
| response_evaluation | EvaluationResult \| None | Postscore evaluation result; None when prompt was blocked |
| blocked (computed) | bool | True if either stage was blocked |
| replaced (computed) | bool | True if either stage was replaced |
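
One way to read the two computed fields is as an "any stage" check that skips a missing response evaluation. The following sketch mirrors the table with stand-in classes; `StageResult` and `PipelineResultSketch` are hypothetical and are not the library's implementation.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class StageResult:
    """Stand-in for EvaluationResult, reduced to the two flags used here."""
    blocked: bool = False
    replaced: bool = False


@dataclass
class PipelineResultSketch:
    """Illustrative only; mirrors the table above, not the library's source."""
    prompt_evaluation: StageResult
    response: Optional[str] = None
    response_evaluation: Optional[StageResult] = None

    def _stages(self):
        # response_evaluation is None when the prompt was blocked.
        stages = (self.prompt_evaluation, self.response_evaluation)
        return [stage for stage in stages if stage is not None]

    @property
    def blocked(self) -> bool:
        return any(stage.blocked for stage in self._stages())

    @property
    def replaced(self) -> bool:
        return any(stage.replaced for stage in self._stages())
```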

#### What prescore_df contains

`prescore_df` is the raw pandas DataFrame produced by running all prescore (prompt-stage) guards on the input. It starts as a copy of the input and gains one set of columns per guard after execution.

| Column | Description |
| --- | --- |
| {prompt_column_name} | Original prompt text |
| {guard.metric_column_name} | Guard score (one column per guard, e.g. Toxicity_toxicity_toxic_PREDICTION) |
| {guard_name}_latency | Wall-clock seconds this guard took |
| blocked_{prompt_col} | True if any guard blocked the prompt |
| blocked_message_{prompt_col} | Block reason / message returned to the caller |
| replaced_{prompt_col} | True if a replace-action guard fired |
| replaced_message_{prompt_col} | Replacement text (sanitized prompt from PII guard, etc.) |
| reported_{prompt_col} | True when a report-action guard fired |
| Noneed_{prompt_col} | Internal sentinel for no-action guards |
| action_{prompt_col} | Comma-joined string of actions taken (e.g. "block", "report,block") |
| (per-guard enforced column) | Internal per-guard enforcement flag used by format_result_df |
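
The column naming above lends itself to small helpers that read a single row. The sketch below works on a plain dict (for example `prescore_df.iloc[0].to_dict()`) so it runs without pandas; `guard_latencies` and `actions_taken` are hypothetical helpers, and the prompt column name `promptText` used in the usage line is an assumption.

```python
from typing import Any, Dict, List


def guard_latencies(row: Dict[str, Any]) -> Dict[str, float]:
    """Map guard name -> seconds, taken from the `{guard_name}_latency` columns."""
    suffix = "_latency"
    return {
        col[: -len(suffix)]: float(value)
        for col, value in row.items()
        if col.endswith(suffix)
    }


def actions_taken(row: Dict[str, Any], prompt_col: str) -> List[str]:
    """Split the comma-joined `action_{prompt_col}` value into a list."""
    raw = row.get(f"action_{prompt_col}")
    # Assumption: the cell is a string such as "report,block"; anything else means no action.
    if not isinstance(raw, str) or not raw:
        return []
    return [part.strip() for part in raw.split(",") if part.strip()]
```

With a real DataFrame, usage would look like `actions_taken(prescore_df.iloc[0].to_dict(), "promptText")`.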

#### What postscore_df contains

`postscore_df` is the raw pandas DataFrame produced by running all postscore (response-stage) guards on the LLM output. It starts with the predictions DataFrame (which includes the LLM response plus any pass-through columns) and gains guard result columns after execution.

| Column | Description |
| --- | --- |
| {response_column_name} | LLM's response text |
| {prompt_column_name} | User prompt (forwarded for faithfulness / task-adherence calculation) |
| CITATION_CONTENT_{N} | Retrieved RAG context chunks (when citations are enabled) |
| PROMPT_TOKEN_COUNT_from_usage | Prompt token count (when usage is provided by the LLM) |
| RESPONSE_TOKEN_COUNT_from_usage | Response token count (when usage is provided by the LLM) |
| agentic_pipeline_interactions | Agentic workflow interaction trace (for agent_goal_accuracy / task_adherence) |
| {association_id_column_name} | Association ID (if the deployment has one configured) |
| {guard.metric_column_name} | Guard score (one column per postscore guard, e.g. Response_Faithfulness_score) |
| {guard_name}_latency | Wall-clock seconds this guard took |
| blocked_{response_col} | True if any guard blocked the response |
| blocked_message_{response_col} | Block message returned to the caller |
| replaced_{response_col} | True if a replace-action guard fired on the response |
| replaced_message_{response_col} | Replacement text |
| reported_{response_col} | True when a report-action guard fired |
| Noneed_{response_col} | Internal sentinel for no-action guards |
| action_{response_col} | Comma-joined string of actions taken |
| (per-guard enforced column) | Internal per-guard enforcement flag |

> Note: `prescore_df` and `postscore_df` are the raw executor outputs. In the DRUM pipeline, `format_result_df` merges them into a single `result_df` that also adds `unmoderated_{response_col}`, `moderated_{prompt_col}`, `datarobot_latency`, `datarobot_token_count`,
> and `datarobot_confidence_score`. Those derived columns are not present in the DataFrames
> returned directly by `evaluate_prompt` / `evaluate_response` / `evaluate_full_pipeline`.

```
import os
from datarobot_dome.api import ModerationPipeline

os.environ["DATAROBOT_ENDPOINT"]  = "<your-endpoint>"
os.environ["DATAROBOT_API_TOKEN"] = "<your-token>"
# TARGET_NAME is optional — sets the response column name used by postscore guards.
# Resolution order: TARGET_NAME env var → response_column_name in config → default "completion".
# os.environ["TARGET_NAME"] = "resultText"

pipeline = ModerationPipeline.from_yaml("moderation_config.yaml")

# ── Prompt evaluation (prescore guards) ───────────────────────────────────────
# sync
result, latency, prescore_df = pipeline.evaluate_prompt("What is DataRobot?")
# async (inside an async function / FastAPI route / agent)
result, latency, prescore_df = await pipeline.evaluate_prompt_async("What is DataRobot?")

if result.blocked:
    print(f"Blocked: {result.blocked_message}")
elif result.replaced:
    print(f"Prompt sanitized to: {result.replacement}")

# ── Response evaluation (postscore guards) ────────────────────────────────────
# sync
result, latency, postscore_df = pipeline.evaluate_response(
    "DataRobot is an AI platform.",
    prompt="What is DataRobot?",   # required for faithfulness / task-adherence guards
)
# async
result, latency, postscore_df = await pipeline.evaluate_response_async(
    "DataRobot is an AI platform.",
    prompt="What is DataRobot?",
)
print(f"Latency: {latency:.3f}s  Blocked: {result.blocked}  Metrics: {result.metrics}")

# ── Full pipeline: prescore → LLM → postscore ─────────────────────────────────
# sync
def my_llm(prompt: str) -> str:
    return "DataRobot is an AI platform."   # replace with your LLM call

result, prescore_df, postscore_df = pipeline.evaluate_full_pipeline("What is DataRobot?", my_llm)

# async (llm_callable must be an async coroutine)
async def my_async_llm(prompt: str) -> str:
    return "DataRobot is an AI platform."   # replace with your async LLM call

result, prescore_df, postscore_df = await pipeline.evaluate_full_pipeline_async(
    "What is DataRobot?", my_async_llm
)

if result.blocked:
    stage = "prompt" if result.prompt_evaluation.blocked else "response"
    blocked_eval = (
        result.prompt_evaluation if result.prompt_evaluation.blocked
        else result.response_evaluation
    )
    print(f"Blocked at {stage}: {blocked_eval.blocked_message}")
elif result.replaced:
    print(f"Text replaced. Response: {result.response}")
else:
    print(f"Response: {result.response}")
    print(f"Metrics: {result.response_evaluation.metrics}")
```

#### Agentic workflow example

For agents, the library can evaluate the full interaction trace — every tool call, intermediate
message, and final response — not just the last reply. This gives the `agent_goal_accuracy` guard
accurate context to judge whether the agent actually achieved the user's goal.

The interaction trace (`pipeline_interactions`) is a JSON-serialized [ragas.MultiTurnSample](https://docs.ragas.io) produced by the DataRobot agent after each task
run. Pass it directly to `evaluate_response`.

Config (`docs/examples/agent_goal_accuracy_config.yaml`):

```
targets:
  - target: _default
    guards:
      - name: Agent Goal Accuracy
        type: ootb
        ootb_type: agent_goal_accuracy
        stage: response
        is_agentic: true
        llm_type: llmGateway
        llm_gateway_model_id: "azure/gpt-4o-mini"
        intervention:
          action: report  # measure-only: block/replace are ignored by the library
          conditions: []
```

> Measure-only guard: `agent_goal_accuracy` (like `cost` and `guideline_adherence`) always
> forces `intervene=False` internally regardless of the `action` configured. The score is only
> available in `result.metrics["agent_goal_accuracy"]`; use it to make blocking decisions in
> your own code when needed.

Python — with full interaction trace (recommended for agentic pipelines):

```
import json
from datarobot_dome.api import ModerationPipeline

pipeline = ModerationPipeline.from_yaml("docs/examples/agent_goal_accuracy_config.yaml")

task = "Book a flight from NYC to London"

# chat_completion is the object returned by the DataRobot agent SDK.
# `pipeline_interactions` is attached when the agent has tool calls / multi-turn
# history; it is None for a plain single-turn response.
chat_completion = my_agent.run(task=task)
agent_response = chat_completion.choices[0].message.content
interactions_json = getattr(chat_completion, "pipeline_interactions", None)

result, latency, postscore_df = pipeline.evaluate_response(
    response=agent_response,
    prompt=task,
    pipeline_interactions=interactions_json,  # JSON str, or None
)

score = result.metrics.get("agent_goal_accuracy")
passed = score is not None and score >= 0.5
print(f"score={score}  passed={passed}")
```

Python, building the interaction trace manually (when not using the DataRobot agent SDK):

```
import json
from ragas import MultiTurnSample
from ragas.messages import AIMessage, HumanMessage, ToolCall, ToolMessage

# Reconstruct the trace from your agent's execution log.
sample = MultiTurnSample(
    user_input=[
        HumanMessage(content="Book a flight from NYC to London"),
        AIMessage(
            content="Searching for available flights…",
            tool_calls=[ToolCall(name="search_flights", args={"origin": "NYC", "dest": "LON"})],
        ),
        ToolMessage(content='[{"flight": "BA178", "price": 620}]'),
        AIMessage(content="I found BA178 departing tomorrow for $620. Shall I book it?"),
    ]
)
interactions_json = json.dumps(sample.to_dict())

result, latency, _ = pipeline.evaluate_response(
    response="I found BA178 departing tomorrow for $620. Shall I book it?",
    prompt="Book a flight from NYC to London",
    pipeline_interactions=interactions_json,
)
print(result.blocked, result.metrics)
```

> Without `pipeline_interactions` the guard falls back gracefully to evaluating the single
> prompt/response pair, which is useful during development before you have a live agent.

### From a plain Python dict

Use `ModerationPipeline.from_dict` when your configuration is already in dict form (e.g. loaded from JSON, fetched from an API, or assembled programmatically). The dict must follow the same schema as the YAML file.

#### Parameters

| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| config | dict | Yes | Guard configuration dictionary following the YAML schema |
| model_dir | str \| None | No | Base directory used to resolve relative asset paths (e.g. NeMo guardrails .co flow files). Defaults to os.getcwd() |

```
import os
from datarobot_dome.api import ModerationPipeline

os.environ["DATAROBOT_ENDPOINT"]  = "<your-endpoint>"
os.environ["DATAROBOT_API_TOKEN"] = "<your-token>"
# os.environ["TARGET_NAME"] = "resultText"  # optional; see the Environment variables section for resolution order

pipeline = ModerationPipeline.from_dict(
    {
        "targets": [
            {
                "target": "_default",
                "guards": [
                    {
                        "name": "Token Count",
                        "type": "ootb",
                        "ootb_type": "token_count",
                        "stage": "prompt",
                    }
                ],
            }
        ]
    },
    model_dir="/path/to/nemo_guardrails_dir",  # optional; only needed for NeMo guards
)

result, latency, prescore_df = pipeline.evaluate_prompt("Hello")
print(result.metrics)
```

### From a Pydantic config object

Use `ModerationPipeline.from_config` to build the configuration entirely in Python — no YAML file required. This is useful for dynamic configurations, programmatic guard registration, or when embedding moderation in a larger application.

#### Parameters

| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| config | ModerationConfig | Yes | A fully-constructed ModerationConfig Pydantic object |
| model_dir | str \| None | No | Base directory used to resolve relative asset paths (e.g. NeMo guardrails .co flow files). Defaults to os.getcwd() |

All schema types are importable from `datarobot_dome.schema`:

```
from datarobot_dome.schema import (
    ModerationConfig,
    TargetBlock,
    # Guard subtypes — pick the matching one per guard
    OOTBGuardSchema,
    ModelGuardSchema,
    NemoGuardrailsSchema,
    NemoEvaluatorSchema,
    # Nested schemas used inside guards
    AdditionalGuardConfigSchema,
    InterventionSchema,
    InterventionConditionSchema,
    ModelInfoSchema,
)
```

#### Schema type → guard type mapping

| Guard YAML type | Pydantic class |
| --- | --- |
| ootb | OOTBGuardSchema |
| model | ModelGuardSchema |
| nemo_guardrails | NemoGuardrailsSchema |
| nemo_evaluator | NemoEvaluatorSchema |

#### LLM Gateway example — hate speech / guideline adherence

```
import os
from datarobot_dome.api import ModerationPipeline
from datarobot_dome.schema import (
    AdditionalGuardConfigSchema,
    InterventionSchema,
    ModerationConfig,
    OOTBGuardSchema,
    TargetBlock,
)

os.environ["DATAROBOT_ENDPOINT"]  = "https://app.datarobot.com/api/v2"
os.environ["DATAROBOT_API_TOKEN"] = "<your-dr-token>"
# os.environ["TARGET_NAME"] = "resultText"  # optional; see the Environment variables section for resolution order

config = ModerationConfig(
    targets=[
        TargetBlock(
            target="_default",
            guards=[
                OOTBGuardSchema(
                    type="ootb",
                    name="Hate Speech",
                    stage="response",
                    ootb_type="agent_guideline_adherence",
                    llm_type="llmGateway",
                    llm_gateway_model_id="azure/gpt-4o-2024-11-20",
                    additional_guard_config=AdditionalGuardConfigSchema(
                        agent_guideline=(
                            "The response must not contain hate speech, slurs, or content "
                            "that demeans people based on race, religion, gender, nationality, "
                            "or any other protected characteristic."
                        )
                    ),
                    intervention=InterventionSchema(
                        action="report",
                        conditions=[],
                    ),
                )
            ],
        )
    ]
)

pipeline = ModerationPipeline.from_config(config)
# Pass model_dir when your config references NeMo guardrails flow files:
# pipeline = ModerationPipeline.from_config(config, model_dir="/path/to/nemo_guardrails_dir")

text = "People from that group are living in France."
result, latency, postscore_df = pipeline.evaluate_response(response=text, prompt="Describe this text.")
score = result.metrics.get("agent_guideline_adherence_score")
print(f"score={score}  latency={latency:.3f}s")
```

#### Model guard example

```
import os
from datarobot_dome.api import ModerationPipeline
from datarobot_dome.schema import (
    InterventionConditionSchema,
    InterventionSchema,
    ModerationConfig,
    ModelGuardSchema,
    ModelInfoSchema,
    TargetBlock,
)

os.environ["DATAROBOT_ENDPOINT"]  = "<your-endpoint>"
os.environ["DATAROBOT_API_TOKEN"] = "<your-token>"
# os.environ["TARGET_NAME"] = "resultText"  # optional; see the Environment variables section for resolution order

config = ModerationConfig(
    targets=[
        TargetBlock(
            target="_default",
            guards=[
                ModelGuardSchema(
                    type="model",
                    name="Toxicity",
                    stage="prompt",
                    deployment_id="<your-toxicity-deployment-id>",
                    model_info=ModelInfoSchema(
                        input_column_name="text",
                        target_name="toxicity_toxic_PREDICTION",
                        target_type="Binary",
                        class_names=[],
                    ),
                    intervention=InterventionSchema(
                        action="block",
                        message="Toxic content blocked.",
                        conditions=[
                            InterventionConditionSchema(comparand=0.5, comparator="greaterThan")
                        ],
                    ),
                )
            ],
        )
    ]
)

pipeline = ModerationPipeline.from_config(config)
```

### Streaming pipeline

`evaluate_full_pipeline_stream_async` is the primary high-level API for streaming.
It encapsulates prescore evaluation, the thread/queue bridge to `ModerationIterator`, and
postscore guard execution — callers supply only a prompt and a streaming LLM callable.

#### Method signatures

| Method | When to use |
| --- | --- |
| evaluate_full_pipeline_stream_async(prompt, llm_callable) | Preferred. Hides all internal state — no prescore_df required. |
| stream_response_async(completion, *, prompt, prescore_df, prescore_latency) | Advanced: when you need to inspect the EvaluationResult from prescore before starting the LLM stream (e.g. to act on a REPLACE result). |

#### evaluate_full_pipeline_stream_async parameters

| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| prompt | str | Yes | The user prompt |
| llm_callable | Callable[[str], AsyncIterator[ChatCompletionChunk]] | Yes | Sync callable that receives the (possibly sanitized) effective prompt and returns an async iterator of chunks. Called only when the prompt is not blocked. |

#### Chunk signals

| finish_reason | Meaning |
| --- | --- |
| None or "stop" | Normal chunk — content is in chunk.choices[0].delta.content |
| "content_filter" | A guard intervened. delta.content holds the block message. The LLM was never called if this is the first (and only) chunk. |
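
The two signals above can be handled by a small consumer that accumulates normal content and stops at the first `content_filter` chunk. The sketch below uses fake chunk objects so it runs offline; `collect_stream`, `fake_chunk`, and `fake_stream` are hypothetical helpers, and real chunks would come from the pipeline's streaming methods.

```python
import asyncio
from types import SimpleNamespace
from typing import AsyncIterator, Optional, Tuple


async def collect_stream(chunks: AsyncIterator) -> Tuple[str, Optional[str]]:
    """Return (text, block_message); block_message is None if no guard intervened."""
    parts = []
    async for chunk in chunks:
        choice = chunk.choices[0]
        if choice.finish_reason == "content_filter":
            # A guard intervened: delta.content carries the block message.
            return "".join(parts), choice.delta.content
        if choice.delta.content:
            parts.append(choice.delta.content)
    return "".join(parts), None


def fake_chunk(content, finish_reason=None):
    """Build a minimal object with the same attribute shape as a chat chunk."""
    delta = SimpleNamespace(content=content)
    return SimpleNamespace(choices=[SimpleNamespace(delta=delta, finish_reason=finish_reason)])


async def fake_stream(chunks):
    # Async generator that replays a fixed list of chunks.
    for chunk in chunks:
        yield chunk
```

Returning the block message separately from the accumulated text lets the caller decide whether to discard partial output when a guard fires mid-stream.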

#### Example

```
import asyncio
import os
from datarobot_dome.api import ModerationPipeline
from datarobot_dome.schema import (
    InterventionSchema, ModerationConfig, OOTBGuardSchema, TargetBlock,
)

os.environ["DATAROBOT_ENDPOINT"]  = "<your-endpoint>"
os.environ["DATAROBOT_API_TOKEN"] = "<your-token>"

pipeline = ModerationPipeline.from_config(
    ModerationConfig(
        targets=[
            TargetBlock(
                target="_default",
                guards=[
                    OOTBGuardSchema(
                        name="Prompt Token Limit",
                        type="ootb",
                        ootb_type="token_count",
                        stage="prompt",
                        intervention=InterventionSchema(
                            action="block",
                            conditions=[{"comparator": "greaterThan", "comparand": 200}],
                            message="Prompt too long.",
                        ),
                    ),
                ],
            )
        ]
    )
)

async def my_llm_stream(prompt: str):
    """Wrap a sync OpenAI stream as an async iterator."""
    import openai
    client = openai.OpenAI(
        api_key=os.environ["DATAROBOT_API_TOKEN"],
        base_url=f"{os.environ['DATAROBOT_ENDPOINT']}/genai/llmgw",
    )
    for chunk in client.chat.completions.create(
        model="azure/gpt-4o-2024-11-20",
        messages=[{"role": "user", "content": prompt}],
        stream=True,
    ):
        yield chunk

async def run(prompt: str) -> None:
    print(f"Prompt: {prompt!r}")
    async for chunk in pipeline.evaluate_full_pipeline_stream_async(prompt, my_llm_stream):
        finish_reason = chunk.choices[0].finish_reason
        content = chunk.choices[0].delta.content
        if finish_reason == "content_filter":
            print(f"[BLOCKED] {content}")
            return
        if content:
            print(content, end="", flush=True)
    print()

asyncio.run(run("What is DataRobot?"))
```

#### Advanced: stream_response_async

Use when you need the prescore `EvaluationResult` before streaming begins:

```
result, latency, prescore_df = await pipeline.evaluate_prompt_async(prompt)
if result.blocked:
    # handle block before ever calling the LLM
    return result.blocked_message

effective = result.replacement if result.replaced else prompt

async for chunk in pipeline.stream_response_async(
    my_llm_stream(effective),
    prompt=effective,
    prescore_df=prescore_df,      # must come from evaluate_prompt_async
    prescore_latency=latency,
):
    ...
```

### With DRUM

Place `moderation_config.yaml` alongside your custom model code, then:

```
drum score --verbose \
  --code-dir ./ \
  --target-type textgeneration \
  --input ./input.csv \
  --runtime-params-file values.yaml
```

## Testing guide

Set these environment variables before running any test (see [Environment variables](https://docs.datarobot.com/en/docs/api/code-first-tools/moderations-library/moderations-guardrails.html.md#environment-variables) for details):

```
export DATAROBOT_ENDPOINT="https://app.datarobot.com/api/v2"
export DATAROBOT_API_TOKEN="your-token"
export TARGET_NAME="resultText"
```

Guards fall into four groups based on the credentials they require:

| Group | Guard types | Extra credentials needed |
| --- | --- | --- |
| Local | token_count, rouge_1, cost, custom_metric | (none beyond the base vars above) |
| DataRobot deployment | type: model, any ootb with llm_type: datarobot or llm_type: llmGateway | Only DATAROBOT_API_TOKEN; provide a real deployment_id |
| External LLM provider | Any ootb with llm_type: openAi, azureOpenAi, google, amazon, nim | Provider-specific env var (see Environment variables) |
| NeMo | type: nemo_guardrails, type: nemo_evaluator | Provider key for NeMo Guardrails; DATAROBOT_API_TOKEN for NeMo Evaluator |

See [Guard types](https://docs.datarobot.com/en/docs/api/code-first-tools/moderations-library/moderations-guardrails.html.md#guard-types) for complete YAML examples per guard type and [Using the config in Python](https://docs.datarobot.com/en/docs/api/code-first-tools/moderations-library/moderations-guardrails.html.md#using-the-config-in-python) for Python usage patterns.

## Environment variables

### Always required

| Variable | Description |
| --- | --- |
| DATAROBOT_ENDPOINT | DataRobot instance URL, e.g. https://app.datarobot.com/api/v2 |
| DATAROBOT_API_TOKEN | DataRobot API token |
| TARGET_NAME | Optional. The name of the DataFrame column that holds the LLM response text (e.g. resultText). Resolution order for the response column (highest to lowest priority): (1) DRUM deployment target_name (always wins when MLOPS_DEPLOYMENT_ID is set), (2) TARGET_NAME env var, (3) response_column_name in the config file, (4) built-in default "completion". DRUM sets this automatically; in standalone Python you can set it here or declare response_column_name in the YAML/ModerationConfig, but the env var takes precedence if both are provided. |
| DISABLE_MODERATION | Optional. Set to true to disable all guards at runtime. |
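
The resolution order described for `TARGET_NAME` can be read as a four-step lookup. The sketch below only illustrates that documented priority; it is not the library's code, and `resolve_response_column` and its parameter names are hypothetical.

```python
from typing import Mapping, Optional


def resolve_response_column(
    env: Mapping[str, str],
    config_response_column: Optional[str] = None,
    drum_target_name: Optional[str] = None,
) -> str:
    """Pick the response column name following the documented priority."""
    # (1) Under a DRUM deployment, the deployment's target name wins.
    if env.get("MLOPS_DEPLOYMENT_ID") and drum_target_name:
        return drum_target_name
    # (2) TARGET_NAME environment variable.
    if env.get("TARGET_NAME"):
        return env["TARGET_NAME"]
    # (3) response_column_name from the YAML / ModerationConfig.
    if config_response_column:
        return config_response_column
    # (4) Built-in default.
    return "completion"
```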

### OTel tracing (optional)

OTel traces are emitted whenever `OTEL_EXPORTER_OTLP_ENDPOINT` is set.  The
remaining two variables are optional — their corresponding request headers are
omitted when the variable is absent, which allows traces to be forwarded to an
unauthenticated local OTLP collector such as the `af-component-agent-playground` UI without needing credentials.

| Variable | Required | Description |
| --- | --- | --- |
| OTEL_EXPORTER_OTLP_ENDPOINT | ✅ | Base URL of the OTLP HTTP collector, e.g. http://localhost:4318. The library appends /v1/traces automatically. |
| OTEL_SERVICE_NAME | ❌ | Adds X-DataRobot-Entity-Id to trace requests. Required when routing to the DataRobot production collector; omit for local collectors. |
| OTEL_COLLECTOR_TOKEN | ❌ | Adds Authorization: Bearer <token> to trace requests. Required for production/deployed collectors; omit for local collectors. |

Local playground example:

```
export OTEL_EXPORTER_OTLP_ENDPOINT="http://localhost:4318"
# OTEL_SERVICE_NAME and OTEL_COLLECTOR_TOKEN are not needed
```

Production example:

```
export OTEL_EXPORTER_OTLP_ENDPOINT="https://collector.datarobot.com"
export OTEL_SERVICE_NAME="deployment-abc123"
export OTEL_COLLECTOR_TOKEN="my-token"
```
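
To make the header behavior concrete, the sketch below derives the trace URL and request headers from the three variables. It is an illustration of the rules in the table, not the library's exporter; `otlp_trace_request` is a hypothetical helper, and using the `OTEL_SERVICE_NAME` value as the `X-DataRobot-Entity-Id` header value is an assumption.

```python
from typing import Dict, Mapping, Optional, Tuple


def otlp_trace_request(env: Mapping[str, str]) -> Optional[Tuple[str, Dict[str, str]]]:
    """Return (url, headers) for trace export, or None when tracing is off."""
    endpoint = env.get("OTEL_EXPORTER_OTLP_ENDPOINT")
    if not endpoint:
        return None
    # Assumption: a trailing slash on the base URL is tolerated.
    url = endpoint.rstrip("/") + "/v1/traces"
    headers: Dict[str, str] = {}
    if env.get("OTEL_SERVICE_NAME"):
        headers["X-DataRobot-Entity-Id"] = env["OTEL_SERVICE_NAME"]
    if env.get("OTEL_COLLECTOR_TOKEN"):
        headers["Authorization"] = f"Bearer {env['OTEL_COLLECTOR_TOKEN']}"
    return url, headers
```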

### deepeval telemetry

The `task_adherence` guard uses `deepeval` internally. By default, moderations opts out of
deepeval's usage telemetry — no `.deepeval/` directory is created and no data is sent externally.

To opt in, set `enable_deepeval_telemetry: true` in your config (only takes effect when a `task_adherence` guard is present; deepeval is loaded lazily):

```
enable_deepeval_telemetry: true   # default: false

guards:
  - name: Task Adherence
    type: ootb
    ootb_type: task_adherence
    stage: response
```

To opt out explicitly via environment variable (e.g. in CI or container environments):

```
export DEEPEVAL_TELEMETRY_OPT_OUT=YES  # opt out (library default)
unset DEEPEVAL_TELEMETRY_OPT_OUT       # opt in
```

### Credentials for LLM-eval guards using external providers

When your guard uses `llm_type: datarobot`, it reuses `DATAROBOT_API_TOKEN` — no extra variable needed.

For external providers (OpenAI, Azure OpenAI, Google, AWS), set a guard-specific env var. The variable name is built from the guard's type, stage, and ootb_type:

```
MLOPS_RUNTIME_PARAM_MODERATION_{TYPE}_{STAGE}_{OOTB_TYPE}_{PROVIDER_SUFFIX}
```

| Guard (ootb_type) | Provider | Environment variable |
| --- | --- | --- |
| task_adherence | OpenAI | MLOPS_RUNTIME_PARAM_MODERATION_OOTB_RESPONSE_TASK_ADHERENCE_OPENAI_API_KEY |
| task_adherence | Azure OpenAI | MLOPS_RUNTIME_PARAM_MODERATION_OOTB_RESPONSE_TASK_ADHERENCE_AZURE_OPENAI_API_KEY |
| faithfulness | OpenAI | MLOPS_RUNTIME_PARAM_MODERATION_OOTB_RESPONSE_FAITHFULNESS_OPENAI_API_KEY |
| faithfulness | Azure OpenAI | MLOPS_RUNTIME_PARAM_MODERATION_OOTB_RESPONSE_FAITHFULNESS_AZURE_OPENAI_API_KEY |
| agent_guideline_adherence | Azure OpenAI | MLOPS_RUNTIME_PARAM_MODERATION_OOTB_RESPONSE_AGENT_GUIDELINE_ADHERENCE_AZURE_OPENAI_API_KEY |
| agent_guideline_adherence | Google Vertex AI | MLOPS_RUNTIME_PARAM_MODERATION_OOTB_RESPONSE_AGENT_GUIDELINE_ADHERENCE_GOOGLE_SERVICE_ACCOUNT |
| agent_goal_accuracy | Azure OpenAI | MLOPS_RUNTIME_PARAM_MODERATION_OOTB_RESPONSE_AGENT_GOAL_ACCURACY_AZURE_OPENAI_API_KEY |
| agent_goal_accuracy | AWS Bedrock | MLOPS_RUNTIME_PARAM_MODERATION_OOTB_RESPONSE_AGENT_GOAL_ACCURACY_AWS_ACCOUNT |
| nemo_guardrails (prompt) | Azure OpenAI | MLOPS_RUNTIME_PARAM_MODERATION_NEMO_GUARDRAILS_PROMPT_AZURE_OPENAI_API_KEY |

Value format per provider:

```
# OpenAI / Azure OpenAI
'{"type":"credential","payload":{"credentialType":"api_token","apiToken":"YOUR_KEY"}}'

# Google Vertex AI
'{"type":"credential","payload":{"credentialType":"gcp","gcpKey":{...}}}'

# AWS Bedrock
'{"type":"credential","payload":{"credentialType":"s3","awsAccessKeyId":"...","awsSecretAccessKey":"...","awsSessionToken":"..."}}'
```
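
The naming pattern and the API-token value format can be assembled programmatically. The sketch below is inferred from the table rows and the value format in this section; `credential_env_var` and `api_token_credential` are hypothetical helpers, and omitting the `ootb_type` segment for NeMo guards is an assumption based on the `nemo_guardrails` row.

```python
import json
from typing import Optional

PREFIX = "MLOPS_RUNTIME_PARAM_MODERATION"


def credential_env_var(
    guard_type: str,
    stage: str,
    provider_suffix: str,
    ootb_type: Optional[str] = None,
) -> str:
    """Join the guard's type, stage, optional ootb_type, and provider suffix."""
    parts = [PREFIX, guard_type, stage]
    if ootb_type:  # The nemo_guardrails row has no ootb_type segment.
        parts.append(ootb_type)
    parts.append(provider_suffix)
    return "_".join(parts).upper()


def api_token_credential(api_key: str) -> str:
    """Serialize an OpenAI / Azure OpenAI key in the credential format shown here."""
    return json.dumps(
        {"type": "credential", "payload": {"credentialType": "api_token", "apiToken": api_key}}
    )
```

For example, `os.environ[credential_env_var("ootb", "response", "OPENAI_API_KEY", "faithfulness")] = api_token_credential("YOUR_KEY")` would set the variable listed for the faithfulness guard with OpenAI.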
