# Configure LLM provider fallback

> Configure LLM provider fallback - Learn how to configure primary and fallback LLM providers for
> automatic failover in your agentic workflows.

This Markdown file sits beside the HTML page at the same path (with a `.md` suffix). It summarizes the topic and lists links for tools and LLM context.

Companion generated at `2026-05-21T05:20:22.332570+00:00` (UTC).

## Primary page

- [Configure LLM provider fallback](https://docs.datarobot.com/en/docs/agentic-ai/agentic-develop/agentic-llm-fallback.html.md): Full documentation for this topic (Markdown sidecar).

## Sections on this page

- [Prerequisites](https://docs.datarobot.com/en/docs/agentic-ai/agentic-develop/agentic-llm-fallback.html.md#prerequisites): In-page section heading.
- [Configure fallback in code](https://docs.datarobot.com/en/docs/agentic-ai/agentic-develop/agentic-llm-fallback.html.md#configure-fallback-in-code): In-page section heading.
- [LLMConfig fields](https://docs.datarobot.com/en/docs/agentic-ai/agentic-develop/agentic-llm-fallback.html.md#llmconfig-fields): In-page section heading.
- [Framework examples](https://docs.datarobot.com/en/docs/agentic-ai/agentic-develop/agentic-llm-fallback.html.md#framework-examples): In-page section heading.
- [Configure fallback in workflow.yaml](https://docs.datarobot.com/en/docs/agentic-ai/agentic-develop/agentic-llm-fallback.html.md#configure-fallback-in-workflow-yaml): In-page section heading.
- [Mixing provider types](https://docs.datarobot.com/en/docs/agentic-ai/agentic-develop/agentic-llm-fallback.html.md#mixing-provider-types): In-page section heading.

## Related documentation

- [Agentic AI](https://docs.datarobot.com/en/docs/agentic-ai/index.html.md): Linked from this page.
- [Build](https://docs.datarobot.com/en/docs/agentic-ai/agentic-develop/index.html.md): Linked from this page.
- [Add Python packages](https://docs.datarobot.com/en/docs/agentic-ai/agentic-develop/agentic-python-packages.html.md): Linked from this page.

## Documentation content

You can configure a primary LLM provider with one or more fallback providers for automatic failover. When the primary is unavailable or returns an error, a `litellm.Router` retries the request and then moves on to the next fallback provider in the list. The `num_retries` option controls how many retries occur per provider before the router moves to the next one. Fallback works with any DataRobot-supported LLM provider, including the LLM gateway, hosted deployments, NIM deployments, and external APIs.
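
The retry-then-fallback order described above can be sketched in plain Python. This is an illustration only, not the `litellm.Router` implementation: `call_with_fallback`, `flaky_primary`, and `healthy_fallback` are hypothetical names, and the sketch assumes each provider gets one initial attempt plus `num_retries` retries before the next provider is tried.

```python
from typing import Callable, Optional, Sequence


def call_with_fallback(
    providers: Sequence[Callable[[str], str]],
    prompt: str,
    num_retries: int = 1,
) -> str:
    """Try providers in order; each gets 1 initial attempt plus num_retries retries."""
    last_error: Optional[Exception] = None
    for provider in providers:  # primary first, then fallbacks in list order
        for _ in range(1 + num_retries):
            try:
                return provider(prompt)
            except Exception as err:  # assumption: every error is retryable
                last_error = err
    raise RuntimeError("all providers failed") from last_error


calls = []  # records which provider handled each attempt


def flaky_primary(prompt: str) -> str:
    calls.append("primary")
    raise ConnectionError("primary unavailable")


def healthy_fallback(prompt: str) -> str:
    calls.append("fallback")
    return f"fallback answered: {prompt}"


result = call_with_fallback([flaky_primary, healthy_fallback], "ping", num_retries=1)
print(calls)   # ['primary', 'primary', 'fallback']
print(result)  # fallback answered: ping
```

With `num_retries=1`, the failing primary is attempted twice before the first fallback receives the request.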

## Prerequisites

- `datarobot-genai>=0.15.20` must be available in the execution environment. See [Add Python packages](https://docs.datarobot.com/en/docs/agentic-ai/agentic-develop/agentic-python-packages.html.md) for instructions.
- A working agent template (CrewAI, LangGraph, LlamaIndex, or DRAgent/NAT).
- At least two LLM providers or models configured (one primary, one or more fallbacks).

## Configure fallback in code

For DRUM-based templates (CrewAI, LangGraph, LlamaIndex), replace `get_llm()` with `get_router_llm()` in your `myagent.py` file.

### LLMConfig fields

The `primary` LLM and each entry in `fallbacks` are defined as `LLMConfig` objects. Set only the fields relevant to your provider type (see [Mixing provider types](https://docs.datarobot.com/en/docs/agentic-ai/agentic-develop/agentic-llm-fallback.html.md#mixing-provider-types) for a multi-fallback example):

| Field | Type | Description |
| --- | --- | --- |
| `use_datarobot_llm_gateway` | `bool` | Use the DataRobot LLM gateway as the provider. |
| `llm_default_model` | `str` | The model identifier (e.g., `azure/gpt-4o-mini`). |
| `llm_deployment_id` | `str` | DataRobot deployment ID for hosted LLM deployments. |
| `nim_deployment_id` | `str` | DataRobot NIM deployment ID. |
| `datarobot_endpoint` | `str` | DataRobot API endpoint URL. |
| `datarobot_api_token` | `str` | DataRobot API token. |

### Framework examples

The fallback configuration follows the same pattern in each DRUM-based template; only the import path for `get_router_llm` changes:

**CrewAI:**
```
from datarobot_genai.core.config import LLMConfig
from datarobot_genai.crewai.llm import get_router_llm

primary = LLMConfig(
    use_datarobot_llm_gateway=True,
    llm_default_model="{LLM_DEFAULT_MODEL}",
)
fallbacks = [
    LLMConfig(
        use_datarobot_llm_gateway=True,
        llm_default_model="anthropic/claude-opus-4-20250514",
    )
]

llm = get_router_llm(primary, fallbacks, {"num_retries": 1})
```

**LangGraph:**
```
from datarobot_genai.core.config import LLMConfig
from datarobot_genai.langgraph.llm import get_router_llm

primary = LLMConfig(
    use_datarobot_llm_gateway=True,
    llm_default_model="{LLM_DEFAULT_MODEL}",
)
fallbacks = [
    LLMConfig(
        use_datarobot_llm_gateway=True,
        llm_default_model="anthropic/claude-opus-4-20250514",
    )
]

llm = get_router_llm(primary, fallbacks, {"num_retries": 1})
```

**LlamaIndex:**
```
from datarobot_genai.core.config import LLMConfig
from datarobot_genai.llamaindex.llm import get_router_llm

primary = LLMConfig(
    use_datarobot_llm_gateway=True,
    llm_default_model="{LLM_DEFAULT_MODEL}",
)
fallbacks = [
    LLMConfig(
        use_datarobot_llm_gateway=True,
        llm_default_model="anthropic/claude-opus-4-20250514",
    )
]

llm = get_router_llm(primary, fallbacks, {"num_retries": 1})
```


> [!TIP] Multiple fallbacks
> You can specify multiple fallback providers in the `fallbacks` list. The router tries them in order if the primary fails.

## Configure fallback in workflow.yaml

For DRAgent/NAT templates, use `_type: datarobot-llm-router` with `primary` and `fallbacks` blocks in `workflow.yaml`:

```
# workflow.yaml
llms:
  datarobot_llm:
    _type: datarobot-llm-router
    primary:
      use_datarobot_llm_gateway: true
      llm_default_model: "{LLM_DEFAULT_MODEL}"
    fallbacks:
      - use_datarobot_llm_gateway: true
        llm_default_model: anthropic/claude-opus-4-20250514
    num_retries: 1
```

> [!NOTE] LLMConfig fields in YAML
> The `primary` and each item in `fallbacks` accept the same fields as `LLMConfig`: `use_datarobot_llm_gateway`, `llm_default_model`, `llm_deployment_id`, `nim_deployment_id`, `datarobot_endpoint`, and `datarobot_api_token`.
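
Because a misspelled key in YAML is easy to miss, a small pre-flight check can compare a router block against the field names listed in this note. The sketch below is a hypothetical helper, not part of `datarobot-genai`: `KNOWN_FIELDS` contains only the six fields documented on this page (the real `LLMConfig` may accept others), and `router_block` stands in for whatever dict your YAML parser returns for the `datarobot_llm` entry.

```python
# Field names documented for LLMConfig on this page (assumption: not exhaustive).
KNOWN_FIELDS = {
    "use_datarobot_llm_gateway",
    "llm_default_model",
    "llm_deployment_id",
    "nim_deployment_id",
    "datarobot_endpoint",
    "datarobot_api_token",
}


def unknown_fields(router_block: dict) -> dict:
    """Map each provider label to its sorted keys that are not in KNOWN_FIELDS."""
    providers = {"primary": router_block.get("primary", {})}
    for index, fallback in enumerate(router_block.get("fallbacks", [])):
        providers[f"fallbacks[{index}]"] = fallback

    problems = {}
    for label, fields in providers.items():
        extra = sorted(set(fields) - KNOWN_FIELDS)
        if extra:
            problems[label] = extra
    return problems


# Same shape as a `datarobot-llm-router` entry, written as a Python dict.
router_block = {
    "_type": "datarobot-llm-router",
    "primary": {
        "use_datarobot_llm_gateway": True,
        "llm_default_model": "azure/gpt-4o-mini",
    },
    "fallbacks": [{"llm_deploymentid": "YOUR_DEPLOYMENT_ID"}],  # typo on purpose
    "num_retries": 1,
}

problems = unknown_fields(router_block)
print(problems)  # {'fallbacks[0]': ['llm_deploymentid']}
```

An empty result means every key in `primary` and `fallbacks` matches a documented field name; it does not confirm that the values are valid.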

## Mixing provider types

The primary and fallback providers can use different provider types. For example, you can use the LLM gateway as the primary, a hosted deployment as the first fallback, and a second gateway model as the final fallback:

**Code (DRUM-based):**
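```
from datarobot_genai.core.config import LLMConfig
# Import get_router_llm from the module for your framework (CrewAI shown here).
from datarobot_genai.crewai.llm import get_router_llm

# Primary: a model served through the LLM gateway.
primary = LLMConfig(
    use_datarobot_llm_gateway=True,
    llm_default_model="azure/gpt-4o-mini",
)
fallbacks = [
    # First fallback: a hosted DataRobot LLM deployment.
    LLMConfig(
        llm_deployment_id="YOUR_DEPLOYMENT_ID",
    ),
    # Second fallback: a different model served through the LLM gateway.
    LLMConfig(
        use_datarobot_llm_gateway=True,
        llm_default_model="anthropic/claude-opus-4-20250514",
    ),
]

llm = get_router_llm(primary, fallbacks, {"num_retries": 1})
```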

**workflow.yaml (DRAgent/NAT):**
```
llms:
  datarobot_llm:
    _type: datarobot-llm-router
    primary:
      use_datarobot_llm_gateway: true
      llm_default_model: azure/gpt-4o-mini
    fallbacks:
      - llm_deployment_id: YOUR_DEPLOYMENT_ID
      - use_datarobot_llm_gateway: true
        llm_default_model: anthropic/claude-opus-4-20250514
    num_retries: 1
```


> [!WARNING] Retry and latency
> Each retry adds latency to the response. Set `num_retries` conservatively (e.g., `1`) to balance reliability and response time.
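
To see how quickly retries add up, you can estimate the worst case before choosing `num_retries`. The sketch below assumes the behavior described on this page (each provider gets one initial attempt plus `num_retries` retries) and a fixed per-request timeout. The helper names and the 30-second timeout are hypothetical, and any backoff delay between attempts is ignored.

```python
def worst_case_attempts(num_fallbacks: int, num_retries: int) -> int:
    """Upper bound on requests sent if every attempt on every provider fails."""
    providers = 1 + num_fallbacks            # primary plus fallbacks
    attempts_per_provider = 1 + num_retries  # initial call plus retries
    return providers * attempts_per_provider


def worst_case_seconds(num_fallbacks: int, num_retries: int, timeout_s: float) -> float:
    """Upper bound on time spent waiting, assuming each attempt runs to its timeout."""
    return worst_case_attempts(num_fallbacks, num_retries) * timeout_s


# Two fallbacks with num_retries=1, as in the mixed-provider example on this page.
print(worst_case_attempts(num_fallbacks=2, num_retries=1))             # 6
print(worst_case_seconds(num_fallbacks=2, num_retries=1, timeout_s=30.0))  # 180.0
```

Raising `num_retries` from 1 to 3 in the same setup doubles the bound from 6 to 12 attempts, which is why a small value is usually the safer default.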
