Generative AI documentation > Code-based GenAI > Use the DataRobot LLM gateway

Use the DataRobot LLM gateway¶

The DataRobot LLM gateway is a service that unifies and simplifies LLM access across DataRobot. It provides a DataRobot API endpoint to interface with LLMs hosted by external LLM providers. To request LLM responses from the DataRobot LLM gateway, you can use any API client that supports OpenAI-compatible chat completion API, for example, the OpenAI Python API library.

Note: Provisioning LLMs using the LLM gateway is available for DataRobot-managed (Cloud) instances and single-tenant SaaS (STS) self-managed instances with supported pricing plans; it is not available for on-premise installations. Contact your DataRobot representative for details. The gateway itself, as a service for calling LLMs, is available on all platforms.

Setup¶

The DataRobot LLM gateway is a premium feature; contact your DataRobot representative or administrator for information on enabling it. To use the DataRobot LLM gateway with the OpenAI Python API library, first, make sure the OpenAI client package is installed and imported. You must also import the DataRobot Python client.

In [ ]:

Copied!

%pip install openai
%pip install openai

In [ ]:

Copied!





import datarobot as dr
from openai import OpenAI
from datarobot.models.genai.llm_gateway_catalog import LLMGatewayCatalog
from pprint import pprint
import datarobot as dr
from openai import OpenAI
from datarobot.models.genai.llm_gateway_catalog import LLMGatewayCatalog
from pprint import pprint

Connect to the LLM gateway¶

Next, initialize the OpenAI client. The base URL is the DataRobot LLM gateway endpoint. The example below assembles this URL by combining your DataRobot API endpoint (retrieved from dr_client) and /genai/llmgw. Usually this results in https://app.datarobot.com/api/v2/genai/llmgw, https://app.eu.datarobot.com/api/v2/genai/llmgw, https://app.jp.datarobot.com/api/v2/genai/llmgw, or your organization's DataRobot API endpoint URL with /genai/llmgw appended to it. The API key is your DataRobot API key.

In [ ]:

Copied!

dr_client = dr.Client()

DR_API_TOKEN = dr_client.token
LLM_GATEWAY_BASE_URL = f"{dr_client.endpoint}/genai/llmgw"

client = OpenAI(
    base_url=LLM_GATEWAY_BASE_URL,
    api_key=DR_API_TOKEN,
)

print(f"Your LLM gateway URL is {LLM_GATEWAY_BASE_URL}.")
dr_client = dr.Client()

DR_API_TOKEN = dr_client.token
LLM_GATEWAY_BASE_URL = f"{dr_client.endpoint}/genai/llmgw"

client = OpenAI(
    base_url=LLM_GATEWAY_BASE_URL,
    api_key=DR_API_TOKEN,
)

print(f"Your LLM gateway URL is {LLM_GATEWAY_BASE_URL}.")

Select models and make requests¶

In your code, you can specify any supported provider LLM and set up the message to send to the LLM as a prompt. An optional argument is client_id, where you can specify the caller service to use for metering: genai-playground, custom-model, or moderations.

This example calls the LLM gateway catalog endpoint to get one of the latest supported LLMs from each provider.

In [ ]:

Copied!





# Get catalog entries using the SDK (active, non-deprecated models by default)
catalog_entries = LLMGatewayCatalog.list(limit=20)
first_model_by_provider = {}

# Iterate through the catalog to select the first supported LLM from each provider 
for entry in catalog_entries:
    # Get the provider name from the provider key
    provider_name = entry.provider
    
    # If the provider isn't already recorded, store the full model string
    if provider_name not in first_model_by_provider:
        model_name = entry.model 
        first_model_by_provider[provider_name] = model_name

# Store the list of models
models = list(first_model_by_provider.values())

print(f"Selected models from {len(first_model_by_provider)} providers:")
pprint(first_model_by_provider, sort_dicts=False)
# Get catalog entries using the SDK (active, non-deprecated models by default)
catalog_entries = LLMGatewayCatalog.list(limit=20)
first_model_by_provider = {}

# Iterate through the catalog to select the first supported LLM from each provider 
for entry in catalog_entries:
    # Get the provider name from the provider key
    provider_name = entry.provider
    
    # If the provider isn't already recorded, store the full model string
    if provider_name not in first_model_by_provider:
        model_name = entry.model 
        first_model_by_provider[provider_name] = model_name

# Store the list of models
models = list(first_model_by_provider.values())

print(f"Selected models from {len(first_model_by_provider)} providers:")
pprint(first_model_by_provider, sort_dicts=False)

After you define the client, models and message, you can make chat completion requests to the LLM gateway. The authentication uses DataRobot-provided credentials.

In [ ]:

Copied!





from IPython.display import display, Markdown

# Store a message to send to the LLM
message = [{"role": "user", "content": "Hello! What is your name and who made you?"}]

for model in models:
    response = client.chat.completions.create(
        model=model,
        messages=message,
    )
    response_text = response.choices[0].message.content
    output_as_markdown = f"""
**{model}:**

{response_text}

---
"""

    display(Markdown(output_as_markdown));
from IPython.display import display, Markdown

# Store a message to send to the LLM
message = [{"role": "user", "content": "Hello! What is your name and who made you?"}]

for model in models:
    response = client.chat.completions.create(
        model=model,
        messages=message,
    )
    response_text = response.choices[0].message.content
    output_as_markdown = f"""
**{model}:**

{response_text}

---
"""

    display(Markdown(output_as_markdown));

To further configure your chat completion request when making direct calls to an LLM gateway, specify LLM parameter settings like temperature, max_completion_tokens, and more. These parameters are also supported for custom models. For more information on the available parameters, see the OpenAI chat completion documentation.

In [ ]:

Copied!





model2 = models[0] if len(models) > 0 else "openai/gpt-4o"  # Use first model or fallback
message2 = [{"role": "user", "content": "Hello! What is your name and who made you? How do you feel about Agentic AI"}]
extra_body2 = {
    "temperature": 0.8,
    "max_completion_tokens": 2000,
}
response2 = client.chat.completions.create(
    model=model2,
    messages=message2,
    extra_body=extra_body2,
)

response_text2 = response2.choices[0].message.content
    
output_as_markdown2 = f"""
**{model2}:**

{response_text2}

---
"""

display(Markdown(output_as_markdown2));
model2 = models[0] if len(models) > 0 else "openai/gpt-4o"  # Use first model or fallback
message2 = [{"role": "user", "content": "Hello! What is your name and who made you? How do you feel about Agentic AI"}]
extra_body2 = {
    "temperature": 0.8,
    "max_completion_tokens": 2000,
}
response2 = client.chat.completions.create(
    model=model2,
    messages=message2,
    extra_body=extra_body2,
)

response_text2 = response2.choices[0].message.content
    
output_as_markdown2 = f"""
**{model2}:**

{response_text2}

---
"""

display(Markdown(output_as_markdown2));

Identify supported LLMs¶

To provide a list of LLMs supported by the LLM gateway, this example uses the LLM Gateway Catalog SDK to get the available models.

In [ ]:

Copied!





# Get all available models using the SDK convenience method
supported_llms = LLMGatewayCatalog.get_available_models()

print(f"Found {len(supported_llms)} available models:")
pprint(supported_llms[:10])  # Show first 10 models
if len(supported_llms) > 10:
    print(f"... and {len(supported_llms) - 10} more models")
# Get all available models using the SDK convenience method
supported_llms = LLMGatewayCatalog.get_available_models()

print(f"Found {len(supported_llms)} available models:")
pprint(supported_llms[:10])  # Show first 10 models
if len(supported_llms) > 10:
    print(f"... and {len(supported_llms) - 10} more models")

If you try to use an unsupported LLM, the LLM gateway returns an error message, relaying that the specified LLM is not in the LLM catalog.

In [ ]:

Copied!





# Verify model availability
unsupported_model = "unsupported-provider/random-llm"

try:
    # Check if the model is available before making the request
    model_entry = LLMGatewayCatalog.verify_model_availability(unsupported_model)
    print(f"Model {unsupported_model} is available: {model_entry.name}")
except ValueError as e:
    print(f"Model {unsupported_model} is not available: {e}")

# Alternative: still show the original error handling for comparison
messages3 = [
    {"role": "user", "content": "Hello!"}
]

try:
    response = client.chat.completions.create(
        model=unsupported_model,
        messages=messages3,
    )
    response.choices[0].message.content
except Exception as e:
    print(f"Direct API call error: {str(e)}")
# Verify model availability
unsupported_model = "unsupported-provider/random-llm"

try:
    # Check if the model is available before making the request
    model_entry = LLMGatewayCatalog.verify_model_availability(unsupported_model)
    print(f"Model {unsupported_model} is available: {model_entry.name}")
except ValueError as e:
    print(f"Model {unsupported_model} is not available: {e}")

# Alternative: still show the original error handling for comparison
messages3 = [
    {"role": "user", "content": "Hello!"}
]

try:
    response = client.chat.completions.create(
        model=unsupported_model,
        messages=messages3,
    )
    response.choices[0].message.content
except Exception as e:
    print(f"Direct API call error: {str(e)}")

You can also verify if a specific model is available before attempting to use it. This is useful for error handling and validation:

In [ ]:

Copied!





# Test different model IDs
test_models = [
    "azure/gpt-4o-2024-11-20",  # Example Azure model
    "openai/gpt-4o",            # Example OpenAI model  
    "non-existent-model"        # This should fail
]

for model_id in test_models:
    try:
        model_entry = LLMGatewayCatalog.verify_model_availability(model_id)
        print(f"✓ {model_id} is available:")
        pprint({
            "name": model_entry.name,
            "provider": model_entry.provider,
            "context_size": f"{model_entry.context_size:,} tokens",
            "active": model_entry.is_active,
            "deprecated": model_entry.is_deprecated
        }, sort_dicts=False)
    except ValueError as e:
        print(f"✗ {model_id} is not available: {e}")
    print()
# Test different model IDs
test_models = [
    "azure/gpt-4o-2024-11-20",  # Example Azure model
    "openai/gpt-4o",            # Example OpenAI model  
    "non-existent-model"        # This should fail
]

for model_id in test_models:
    try:
        model_entry = LLMGatewayCatalog.verify_model_availability(model_id)
        print(f"✓ {model_id} is available:")
        pprint({
            "name": model_entry.name,
            "provider": model_entry.provider,
            "context_size": f"{model_entry.context_size:,} tokens",
            "active": model_entry.is_active,
            "deprecated": model_entry.is_deprecated
        }, sort_dicts=False)
    except ValueError as e:
        print(f"✗ {model_id} is not available: {e}")
    print()

Advanced filtering¶

The LLM Gateway Catalog SDK provides advanced filtering capabilities to help you find the right models for your needs:

In [ ]:

Copied!





# Get all models including deprecated ones
all_entries = LLMGatewayCatalog.list(
    only_active=False,
    only_non_deprecated=False,
    limit=20
)

active_count = sum(1 for entry in all_entries if entry.is_active)
deprecated_count = sum(1 for entry in all_entries if entry.is_deprecated)

print(f"Found {len(all_entries)} total entries:")
print(f"  - Active: {active_count}")
print(f"  - Deprecated: {deprecated_count}")

# Show deprecated models with replacement info
deprecated_models = [e for e in all_entries if e.is_deprecated]
if deprecated_models:
    print("\nDeprecated models with replacements (first 3 models):")
    for entry in deprecated_models[:3]:
        deprecated_info = {
            "model": entry.model,
            "name": entry.name,
            "provider": entry.provider
        }
        if entry.retirement_date:
            deprecated_info["retirement_date"] = entry.retirement_date
        if entry.suggested_replacement:
            deprecated_info["suggested_replacement"] = entry.suggested_replacement
        
        pprint(deprecated_info, sort_dicts=False)
        print()
# Get all models including deprecated ones
all_entries = LLMGatewayCatalog.list(
    only_active=False,
    only_non_deprecated=False,
    limit=20
)

active_count = sum(1 for entry in all_entries if entry.is_active)
deprecated_count = sum(1 for entry in all_entries if entry.is_deprecated)

print(f"Found {len(all_entries)} total entries:")
print(f"  - Active: {active_count}")
print(f"  - Deprecated: {deprecated_count}")

# Show deprecated models with replacement info
deprecated_models = [e for e in all_entries if e.is_deprecated]
if deprecated_models:
    print("\nDeprecated models with replacements (first 3 models):")
    for entry in deprecated_models[:3]:
        deprecated_info = {
            "model": entry.model,
            "name": entry.name,
            "provider": entry.provider
        }
        if entry.retirement_date:
            deprecated_info["retirement_date"] = entry.retirement_date
        if entry.suggested_replacement:
            deprecated_info["suggested_replacement"] = entry.suggested_replacement
        
        pprint(deprecated_info, sort_dicts=False)
        print()

In [ ]:

Copied!





# Get detailed catalog entries with rich information
catalog_entries = LLMGatewayCatalog.list(limit=5)
print("Detailed catalog entries:")
for entry in catalog_entries:
    entry_info = {
        "name": entry.name,
        "model": entry.model,
        "provider": entry.provider,
        "context_size": f"{entry.context_size:,} tokens",
        "active": entry.is_active,
        "deprecated": entry.is_deprecated
    }
    pprint(entry_info, sort_dicts=False)
    print()
# Get detailed catalog entries with rich information
catalog_entries = LLMGatewayCatalog.list(limit=5)
print("Detailed catalog entries:")
for entry in catalog_entries:
    entry_info = {
        "name": entry.name,
        "model": entry.model,
        "provider": entry.provider,
        "context_size": f"{entry.context_size:,} tokens",
        "active": entry.is_active,
        "deprecated": entry.is_deprecated
    }
    pprint(entry_info, sort_dicts=False)
    print()