Integrate tools into an agentic workflow¶
When building agents with external tools—either global tools or custom tools created as custom models in Workshop—the following components are required to call tools from an agent:
- The ToolClient class instance, defined in tools_client.py.
- The authorization context, defined in auth.py.
- A deployed tool's deployment_id, defined in model-metadata.yaml.
- The tool's metadata, defined in a tool module; for example, tool_ai_catalog_search.py.
An example agentic workflow using agentic tools deployed from the Registry could include the following files:
| File | Contents |
|---|---|
| custom.py | The custom model code, implementing the Bolt-on Governance API (the chat hook) to call the LLM and pass the request parameters to the agent (defined in agent.py). |
| agent.py | The agent code, implementing the agentic workflow in the MyAgent class and the ToolClient class instance required to interface with agentic tools. |
| tools_client.py | The ToolClient class code, defining the API endpoint and deployment ID for the deployed tool, getting the authorization context (if required), and providing interfaces for the score, score_unstructured, and chat hooks. |
| helpers.py | The code defining helper functions for the agent. |
| tool_deployment.py | The BaseTool class code, containing all necessary metadata for implementing tools. |
| tool.py | The code for interfacing with the deployed tool, defining the input arguments and schema. Often, this file won't be named tool.py, as you may implement more than one tool. In this example, this functionality is defined in tool_ai_catalog_search.py. |
| model-metadata.yaml | The custom model metadata and runtime parameters required by the agentic workflow. |
| requirements.txt | The libraries (and versions) required by the agentic workflow. |
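For reference, a minimal requirements.txt for this example might list the libraries referenced on this page. The exact packages and version pins depend on your template and agent framework, so treat the entries below as illustrative rather than prescriptive.

```
datarobot>=3.8.0
datarobot-drum
openai
crewai
pydantic
```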
Implement the ToolClient class instance¶
Every agent template and framework requires the ToolClient class, defined in tools_client.py, to offload tool call processing to deployed global tools. The tool client calls the deployed tool and returns the results to the agent. To import the ToolClient module into an agent.py file, use the import statement below:
from tools_client import ToolClient
The ToolClient is structured as shown in the datarobot-agent-templates repository. It defines the API endpoint and deployment ID for the deployed tool, gets the authorization context (if required), and provides interfaces for the score, score_unstructured, and chat hooks.
After you import the ToolClient into agent.py, you can define the tool_client property in the MyAgent class, initializing ToolClient with your DataRobot api_key and base_url. These keyword arguments are defined as environment variables in the runtime environment and passed to the ToolClient class's constructor (the __init__ method).
class MyAgent:
    # More agentic workflow code.

    @property
    def tool_client(self):
        return ToolClient(
            api_key=self.api_key,
            base_url=self.api_base,
        )

    # More agentic workflow code.
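The api_key and api_base values used above are typically populated when the agent is constructed. The sketch below shows one possible approach, assuming the MyAgent constructor accepts these as optional keyword arguments and falls back to the standard DataRobot environment variables (DATAROBOT_API_TOKEN and DATAROBOT_ENDPOINT); adjust the names to match your template.

```python
import os
from typing import Any, Optional


class MyAgent:
    def __init__(
        self,
        api_key: Optional[str] = None,
        api_base: Optional[str] = None,
        **kwargs: Any,
    ) -> None:
        # Fall back to the environment variables provided by the DataRobot runtime
        # when explicit values aren't passed by the chat hook.
        self.api_key = api_key or os.environ.get("DATAROBOT_API_TOKEN")
        self.api_base = api_base or os.environ.get("DATAROBOT_ENDPOINT")

    # More agentic workflow code (including the tool_client property above).
```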
(Optional) Initialize authorization context for external tools¶
Authorization context is required to allow downstream agents and tools to retrieve access tokens when connecting to external services. Two utility methods from the DataRobot Python SDK are implemented to set and retrieve the authorization context for the process:
- set_authorization_context: A method to set the authorization context for the current process.
- get_authorization_context: A method to retrieve the authorization context for the current process.
OAuth utility method availability
These utility methods are available in the DataRobot Python SDK starting with version 3.8.0.
You can review an example auth.py file below, or in the datarobot-agent-templates repository.
from contextvars import ContextVar
from typing import Any, Dict, cast

from openai.types.chat import CompletionCreateParams

from datarobot.models.genai.agent.auth import set_authorization_context


def initialize_authorization_context(
    completion_create_params: CompletionCreateParams,
) -> None:
    """Sets the authorization context for the agent.

    Authorization context is required for propagating information needed by downstream
    agents and tools to retrieve access tokens to connect to external services. When set,
    authorization context will be automatically propagated when using ToolClient class.
    """
    authorization_context = completion_create_params.get("authorization_context", {})
    set_authorization_context(cast(Dict[str, Any], authorization_context))
In the custom.py example below, the chat() hook calls initialize_authorization_context (imported at the top of the file) each time a chat request is made to the agentic workflow, providing any credentials required for external tools.
from helpers_telemetry import *

from agent import MyAgent
from auth import initialize_authorization_context

# from datarobot_drum import RuntimeParameters
from helpers import (
    CustomModelChatResponse,
    to_custom_model_response,
)
from openai.types.chat import CompletionCreateParams


def load_model(code_dir: str) -> str:
    """The agent is instantiated in this function and returned."""
    _ = code_dir
    return "success"


def chat(
    completion_create_params: CompletionCreateParams,
    model: str,
) -> CustomModelChatResponse:
    """When using the chat endpoint, this function is called.

    Agent inputs are in OpenAI message format and defined as the 'user' portion
    of the input prompt.
    """
    _ = model

    # Initialize the authorization context for downstream agents and tools to retrieve
    # access tokens for external services.
    initialize_authorization_context(completion_create_params)

    # Instantiate the agent; all fields from completion_create_params are passed to the
    # agent, allowing environment variables to be passed during execution.
    agent = MyAgent(**completion_create_params)

    # Execute the agent with the inputs.
    agent_result = agent.run(completion_create_params=completion_create_params)

    if isinstance(agent_result, tuple):
        return to_custom_model_response(
            *agent_result, model=completion_create_params["model"]
        )
    return to_custom_model_response(
        agent_result, model=completion_create_params["model"]
    )
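For local testing, you can call the chat() hook directly with a minimal OpenAI-style request body. This is a hedged sketch: the model name and message content are placeholders, and in production the authorization_context is populated by the platform rather than by hand.

```python
if __name__ == "__main__":
    # Minimal OpenAI-style request body; "model" and the message are placeholders.
    request = {
        "model": "my-agent",
        "messages": [{"role": "user", "content": "Find datasets related to churn."}],
        # "authorization_context": {...},  # set by the platform in production
    }
    load_model(".")
    response = chat(request, model=request["model"])
    print(response)
```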
When the authorization context is set, it is automatically propagated by the ToolClient class, defined in tools_client.py.
from typing import Any, Dict, cast

from datarobot.models.genai.agent.auth import get_authorization_context


class ToolClient:
    """Client for interacting with Agent Tools Deployments.

    This class provides methods to call the custom model tool using various hooks:
    `score`, `score_unstructured`, and `chat`. When the `authorization_context` is set,
    the client automatically propagates it to the agent tool. The `authorization_context`
    is required for retrieving access tokens to connect to external services.
    """

    # More ToolClient code.

    def _get_authorization_context(self) -> Dict[str, Any]:
        """Retrieve the authorization context.

        Returns:
            Dict[str, Any]: The authorization context.
        """
        authorization_context = get_authorization_context()
        return cast(Dict[str, Any], authorization_context)

    # More ToolClient code.
If you're implementing a custom tool that relies on an external service, you can use the @datarobot_tool_auth decorator to streamline retrieving the authorization context, extracting the relevant data, and connecting to the DataRobot API to obtain an OAuth access token from an OAuth provider configured in DataRobot. When only one OAuth provider is configured, the decorator doesn't require the provider parameter and uses the only available provider; however, if multiple providers are (or will be) available, you should define this parameter.
from datarobot.models.genai.agent.auth import datarobot_tool_auth, AuthType

# More tool code.


@datarobot_tool_auth(
    type=AuthType.OBO,  # on-behalf-of
    provider="google",  # required with multiple OAuth providers
)
def list_files_in_google_drive(folder_name: str, token: str = "") -> list[dict]:
    """The value for token parameter will be injected by the decorator."""
    # More tool code.
Interface with tool deployments¶
Global tools and tools custom-built in the Registry workshop must be deployed for the agent to call them. When these tools are deployed, communicating with them requires a deployment ID, used to interface with the tool through the DataRobot API. The primary method for providing a deployed tool's deployment ID to the agent is through environment variables, defined as runtime parameters in the agent's metadata. To provide this metadata, create or modify a model-metadata.yaml file to add the runtime parameter for each deployed tool the agent needs to communicate with. Define runtime parameters in runtimeParameterDefinitions.
runtimeParameterDefinitions:
  - fieldName: AI_CATALOG_SEARCH_TOOL_DEPLOYMENT_ID
    defaultValue: SET_VIA_PULUMI_OR_MANUALLY
    type: string
The example below illustrates a model-metadata.yaml file configured for an agentic workflow implementing two global tools, Search Data Registry and Get Data Registry Dataset. The field names in the example are those used by DataRobot agent templates implementing these tools; however, each fieldName is configurable and must match the implementation in the agent's code, located in the agent.py file.
---
name: agent_with_tools
type: inference
targetType: agenticworkflow
runtimeParameterDefinitions:
  - fieldName: OTEL_SDK_ENABLED
    defaultValue: true
    type: boolean
  - fieldName: LLM_DATAROBOT_DEPLOYMENT_ID
    defaultValue: SET_VIA_PULUMI_OR_MANUALLY
    type: string
  - fieldName: DATA_REGISTRY_SEARCH_TOOL_DEPLOYMENT_ID
    defaultValue: SET_VIA_PULUMI_OR_MANUALLY
    type: string
  - fieldName: DATA_REGISTRY_READ_TOOL_DEPLOYMENT_ID
    defaultValue: SET_VIA_PULUMI_OR_MANUALLY
    type: string
Unless you provided the required values as each parameter's defaultValue, you must set the runtime parameters to provide the deployment IDs to the agent's code. You can do this in two ways:
- Manually: Configure the values in the UI, in the Registry workshop, before deploying the agent.
- Automatically: Configure the values in a Pulumi script and pass them to the custom model (see the sketch after this list).
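As a hedged sketch of the automatic approach, the snippet below assumes the pulumi_datarobot provider's CustomModel resource accepts runtime parameter values; the resource and argument names here are illustrative, so check the provider documentation and the datarobot-agent-templates infrastructure code for the exact API.

```python
import pulumi_datarobot as datarobot

# Illustrative only: pass a deployed tool's ID to the agent's runtime parameter.
agent_custom_model = datarobot.CustomModel(
    "agent-with-tools",
    # ...other custom model arguments...
    runtime_parameter_values=[
        datarobot.CustomModelRuntimeParameterValueArgs(  # assumed args class name
            key="DATA_REGISTRY_SEARCH_TOOL_DEPLOYMENT_ID",
            type="string",
            value=search_tool_deployment.id,  # hypothetical deployment resource
        ),
    ],
)
```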
When the parameters are set, they're accessible in the agent code, as long as the RuntimeParameters class is imported from datarobot_drum, as shown in the simplified agent.py file example below.
import os

from datarobot_drum import RuntimeParameters


class MyAgent:
    # More agentic workflow code.

    @property
    def tool_ai_catalog_search(self) -> BaseTool:
        deployment_id = os.environ.get("AI_CATALOG_SEARCH_TOOL_DEPLOYMENT_ID")
        if not deployment_id:
            deployment_id = RuntimeParameters.get("AI_CATALOG_SEARCH_TOOL_DEPLOYMENT_ID")
        return SearchAICatalogTool(
            tool_client=self.tool_client,
            deployment_id=deployment_id,
        )

    @property
    def tool_ai_catalog_read(self) -> BaseTool:
        deployment_id = os.environ.get("AI_CATALOG_READ_TOOL_DEPLOYMENT_ID")
        if not deployment_id:
            deployment_id = RuntimeParameters.get("AI_CATALOG_READ_TOOL_DEPLOYMENT_ID")
        return ReadAICatalogTool(
            tool_client=self.tool_client,
            deployment_id=deployment_id,
        )

    # More agentic workflow code.
Define tool metadata¶
When building tools for an agent, the metadata defines how the agent's LLM should call the tool. The more detail the metadata provides, the more effectively the LLM uses the tool. The metadata includes the tool description and, for each argument, a schema and description. Each framework has its own way to define this metadata; however, in most cases you can use pydantic's BaseModel to define the tool's arguments.
from pydantic import Field
from pydantic import BaseModel as PydanticBaseModel


class SearchAICatalogArgs(PydanticBaseModel):
    search_terms: str = Field(
        default="",
        description="Terms for the search. Leave blank to return all datasets.",
    )
    limit: int = Field(
        default=20,
        description="The maximum number of datasets to return. "
        "Set to -1 to return all.",
    )
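Because the argument schema is a standard pydantic model, you can validate inputs and produce the payload sent to the deployed tool with model_dump(). A quick sanity check:

```python
args = SearchAICatalogArgs(search_terms="customer churn", limit=5)
print(args.model_dump())
# {'search_terms': 'customer churn', 'limit': 5}
```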
The example below implements a simple BaseTool subclass for CrewAI, defined in tool_deployment.py, containing all necessary metadata and available for reuse across multiple CrewAI tools.
from abc import ABC

from crewai.tools import BaseTool

from tools_client import ToolClient


class BaseToolWithDeployment(BaseTool, ABC):
    model_config = {
        "arbitrary_types_allowed": True
    }
    """Adds support for arbitrary types in Pydantic models, needed for the ToolClient."""

    tool_client: ToolClient
    """The tool client initialized by the agent with access to the ToolClient authorization context."""

    deployment_id: str
    """The DataRobot deployment ID of the custom model executing tool logic."""
The SearchAICatalogTool, defined in tool_ai_catalog_search.py, imports BaseToolWithDeployment from tool_deployment and builds on it.
import json
from typing import Dict, List, Type

from pydantic import BaseModel as PydanticBaseModel

from tool_deployment import BaseToolWithDeployment


class SearchAICatalogTool(BaseToolWithDeployment):
    name: str = "Search Data Registry"
    description: str = (
        "This tool provides a list of all available dataset names and their associated IDs from the Data Registry. "
        "You should always check to see if the dataset you are looking for can be found here. "
        "For future queries, you should use the associated dataset ID instead of the name to avoid ambiguity."
    )
    args_schema: Type[PydanticBaseModel] = SearchAICatalogArgs

    def _run(self, **kwargs) -> List[Dict[str, str]]:
        # Validate and parse the input arguments using the defined schema.
        validated_args = self.args_schema(**kwargs)

        # Call the tool deployment with the generated payload.
        result = self.tool_client.call(
            deployment_id=self.deployment_id,
            payload=validated_args.model_dump(),
        )

        # Format and return the results.
        return json.loads(result.data).get("datasets", [])
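Outside of a crew, you can exercise the tool directly through CrewAI's public run() method, which forwards keyword arguments to _run(). A minimal sketch, assuming the standard DataRobot environment variables are set and the deployment ID runtime parameter described earlier is available as an environment variable:

```python
import os

from tools_client import ToolClient

tool = SearchAICatalogTool(
    tool_client=ToolClient(
        api_key=os.environ["DATAROBOT_API_TOKEN"],
        base_url=os.environ["DATAROBOT_ENDPOINT"],
    ),
    deployment_id=os.environ["AI_CATALOG_SEARCH_TOOL_DEPLOYMENT_ID"],
)
datasets = tool.run(search_terms="customer churn", limit=5)
print(datasets)
```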
The example below uses the CrewAI framework to implement a tool through the BaseTool, Agent, and Task classes. The three properties below initialize the Data Registry search tool, define an LLM agent that searches the Data Registry, and define a task for that agent, in that order.
from crewai import Agent, Task
from crewai.tools import BaseTool

from tool_ai_catalog_search import SearchAICatalogTool


class MyAgent:
    # More agentic workflow code.

    @property
    def tool_ai_catalog_search(self) -> BaseTool:
        deployment_id = self.search_ai_catalog_deployment_id
        if not deployment_id:
            raise ValueError("Configure a deployment ID for the Search Data Registry tool.")
        return SearchAICatalogTool(
            tool_client=self.tool_client,
            deployment_id=deployment_id,
        )

    @property
    def agent_ai_catalog_searcher(self) -> Agent:
        return Agent(
            role="Expert Data Registry Searcher",
            goal="Search for and retrieve relevant files from Data Registry.",
            backstory="You are a meticulous analyst that is skilled at examining lists of files and "
            "determining the most appropriate file based on the context.",
            verbose=self.verbose,
            allow_delegation=False,
            llm=self.llm_with_datarobot_llm_gateway,
        )

    @property
    def task_ai_catalog_search(self) -> Task:
        return Task(
            description=(
                "You should search for a relevant dataset id in the Data Registry "
                "based on the provided dataset topic: {dataset_topic}."
            ),
            expected_output=(
                "Search for a list of relevant files in the Data Registry and "
                "determine the most relevant dataset id that matches the given topic. "
                "You should return the entire dataset id."
            ),
            agent=self.agent_ai_catalog_searcher,
            tools=[self.tool_ai_catalog_search],
        )

    # More agentic workflow code.
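To complete the workflow, the agent and task can be assembled into a crew and executed. The sketch below assumes a run method on MyAgent (the entry point called by the chat hook in custom.py; your template's implementation may differ) and uses CrewAI's Crew and kickoff APIs. The dataset_topic input fills the placeholder in the task description; in practice it would be derived from completion_create_params["messages"] rather than hard-coded.

```python
from crewai import Crew


class MyAgent:
    # More agentic workflow code.

    def run(self, completion_create_params) -> str:
        crew = Crew(
            agents=[self.agent_ai_catalog_searcher],
            tasks=[self.task_ai_catalog_search],
            verbose=self.verbose,
        )
        # The topic is hard-coded here for illustration; derive it from the
        # incoming chat messages in a real workflow.
        result = crew.kickoff(inputs={"dataset_topic": "customer churn"})
        return str(result)
```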
Agentic tool considerations¶
When deploying the application and agent separately from the agentic tool (for example, deploying the application and agent via Pulumi after local development and the tool manually in DataRobot), all components must be deployed by the same user. Custom models use the creator’s API key and require a shared scope to store and retrieve authentication data from the OAuth Providers Service.