Use the Bolt-on Governance API¶
This notebook outlines how to use the Bolt-on Governance API with deployed LLM blueprints. LLM blueprints deployed from the playground implement the chat() hook in the custom model's custom.py file by default.
You can use the official Python library for the OpenAI API to make chat completion requests to DataRobot LLM blueprint deployments:
!pip install openai
from openai import OpenAI
Specify the ID of the LLM blueprint deployment and your DataRobot API token:
DEPLOYMENT_ID = "<SPECIFY_DEPLOYMENT_ID_HERE>"
DATAROBOT_API_TOKEN = "<SPECIFY_TOKEN_HERE>"
DEPLOYMENT_URL = f"https://app.datarobot.com/api/v2/deployments/{DEPLOYMENT_ID}"
Use the code below to create an OpenAI client:
client = OpenAI(base_url=DEPLOYMENT_URL, api_key=DATAROBOT_API_TOKEN)
Use the code below to request a chat completion. See the considerations below for more information on specifying the model parameter. Specifying a system message in the request overrides the system prompt configured in the LLM blueprint, and specifying other settings in the request, such as max_completion_tokens, overrides the corresponding settings of the LLM blueprint.
completion = client.chat.completions.create(
    model="datarobot-deployed-llm",
    messages=[
        {"role": "system", "content": "Answer with just a number."},
        {"role": "user", "content": "What is 2+3?"},
        {"role": "assistant", "content": "5"},
        {"role": "user", "content": "Now multiply the result by 4."},
        {"role": "assistant", "content": "20"},
        {"role": "user", "content": "Now divide the result by 2."},
    ],
)
print(completion)
This returns a ChatCompletion object if streaming is disabled and an Iterator[ChatCompletionChunk] if streaming is enabled. Use the following cell to request a chat completion with a streaming response.
streaming_response = client.chat.completions.create(
    model="datarobot-deployed-llm",
    messages=[
        {"role": "system", "content": "Explain your thoughts using at least 100 words."},
        {"role": "user", "content": "What would it take to colonize Mars?"},
    ],
    stream=True,
)

for chunk in streaming_response:
    content = chunk.choices[0].delta.content
    if content is not None:
        print(content, end="")
To return citations, the deployed LLM must have a vector database associated with it. When it does, the completion response includes citation-related keys, which are also accessible to custom models.
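Citation keys are deployment-specific, so the following cell is a minimal sketch for locating them: rather than assuming a particular key name, it prints every field the deployment returns beyond the standard OpenAI response schema.

# A minimal sketch: list the extra (non-OpenAI) fields on the response, which
# is where citation-related keys appear when a vector database is attached.
# model_extra is a pydantic v2 property exposed by the OpenAI client's objects.
extra_fields = completion.model_extra or {}
for key, value in extra_fields.items():
    print(f"{key}: {value}")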
Specify association ID and custom metrics¶
When making a chat request to a DataRobot-deployed text generation or agentic workflow custom model, you can optionally specify a custom association ID in place of the auto-generated ID by setting datarobot_association_id in the extra_body field of the chat request. You can also report values for arbitrary custom metrics defined for the deployment by setting datarobot_metrics in the same field. The extra_body field is a standard way to add parameters to an OpenAI chat request, allowing the chat client to pass model-specific parameters to an LLM.
If the field datarobot_association_id is found in extra_body, DataRobot uses that value instead of the automatically generated one. If the field datarobot_metrics is found in extra_body, DataRobot reports a custom metric for all the name:value pairs found inside. A matching custom metric for each name must already be defined for the deployment. If the reported value is a string, the custom metric must be the multiclass type, with the reported value matching one of the classes.
The deployed custom model must have an association ID column defined for DataRobot to process custom metrics from chat requests, regardless of whether extra_body is specified. Moderation must be configured for the custom model for the metrics to be processed.
extra_body = {
    # These values pass through to the LLM
    "llm_id": "azure-gpt-6",
    # If set here, replaces the auto-generated association ID
    "datarobot_association_id": "my_association_id_0001",
    # DataRobot captures these for custom metrics
    "datarobot_metrics": {
        "field1": 24,
        "field2": "example",
    },
}
completion = client.chat.completions.create(
    model="datarobot-deployed-llm",
    messages=[
        {"role": "system", "content": "Explain your thoughts using at least 100 words."},
        {"role": "user", "content": "What would it take to colonize Mars?"},
    ],
    max_tokens=512,
    extra_body=extra_body,
)
print(completion.choices[0].message.content)
Moderation and guardrails¶
Moderation guardrails help your organization block prompt injection and hateful, toxic, or inappropriate prompts and responses. To return datarobot_moderations, the deployed LLM must be running in an execution environment that has the moderation library installed, and the custom model code directory must contain a moderation_config.yaml file to configure the moderations. When using the Bolt-on Governance API with moderations configured, consider the following:
- If there are no guards configured for the response stage, the moderation library returns the stream obtained from the LLM unmodified.
- Not all response guards are applied to each chunk. The faithfulness, rouge-1, and NeMo guards are applied to the whole response once it is available, rather than to individual chunks, because they require the complete response to evaluate.
- If moderation is enabled and a streaming response is requested, the first chunk always contains the results of the prompt guards (if configured) and the response guards (excluding faithfulness, rouge-1, and NeMo). Access these results via chunk.datarobot_moderations, as shown in the sketch after this list.
- For every subsequent chunk that is not the last chunk, the response guards (excluding faithfulness, rouge-1, and NeMo) are applied and their results can be accessed from chunk.datarobot_moderations.
- The last chunk has all response guards applied: the per-chunk guards (excluding faithfulness, rouge-1, and NeMo) are applied to the chunk itself, while faithfulness, rouge-1, and NeMo are applied to the whole response.
- When streaming, guard custom metrics are aggregated across chunks and reported as follows:
    - latency: The sum of the guard's latency across all chunks.
    - score: The score of the blocked chunk if an intermediate chunk is blocked; otherwise, the score of the assembled response.
    - enforced: The logical OR of the enforced metric across all chunks.
Considerations¶
When using the Bolt-on Governance API, consider the following:
- If you implement the chat completion hook without modification, the chat() hook behaves differently than the score() hook. Specifically, the unmodified chat() hook passes the model parameter through the completion_create_params argument, while the score() hook specifies the model in the custom model code.
- If you add a deployed LLM to the playground, validation uses the value entered in the "Chat model ID" field as the model parameter value. Ensure the LLM deployment accepts this value as the model parameter. Alternatively, you can modify the implementation of the chat() hook to override the value of the model parameter, defining the intended model (for example, using a runtime parameter), as sketched after this list. For more information, see GenAI troubleshooting.
- The Bolt-on Governance API is also available in GPU environments for custom models running on datarobot-drum>=1.14.3.
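As a sketch of the override mentioned above, the custom.py below replaces the incoming model value inside the chat() hook. It is illustrative, not a drop-in implementation: it assumes load_model() returns an OpenAI client pointed at your LLM provider, and the runtime parameter name LLM_MODEL_NAME is hypothetical.

# custom.py (sketch): override the model parameter inside the chat() hook.
from datarobot_drum import RuntimeParameters


def chat(completion_create_params, model):
    # Ignore the model name the client sent and substitute the model this
    # deployment actually serves, read from a (hypothetical) runtime parameter.
    completion_create_params["model"] = RuntimeParameters.get("LLM_MODEL_NAME")
    # "model" is whatever load_model() returned; assumed here to be an OpenAI client.
    return model.chat.completions.create(**completion_create_params)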