Agent components¶
This overview details the components required to create an agent using DataRobot's agent framework. An agent artifact includes several standard files that contain metadata, hooks/functions, classes, and properties.
| Section | Description |
|---|---|
| Agent file structure | Describes important files and their organization for a DataRobot agent. |
| Functions and hooks | Details the mandatory functions and integration hooks needed for agent operation. |
| Agent class implementation | Details the general structure of the main agent class and its methods and properties. |
| Tool integration | Explains how agents use tools via the ToolClient class and framework-specific tool APIs. |
Agent file structure¶
Every DataRobot agent requires a specific set of files in the custom_model/ directory. These files work together to create a complete agent that can be deployed and executed by DataRobot.
custom_model/
├── __init__.py # Package initialization
├── agent.py # Main agent implementation, including prompts
├── custom.py # DataRobot integration hooks
├── config.py # Configuration management
├── mcp_client.py # MCP server integration (optional, for tool use)
└── model-metadata.yaml # Agent metadata configuration
| File | Description |
|---|---|
| `__init__.py` | Identifies the directory as a Python package and enables imports. |
| `model-metadata.yaml` | Defines the agent's configuration, runtime parameters, and deployment settings. |
| `custom.py` | Implements the DataRobot integration hooks (`load_model`, `chat`) for agent execution. |
| `agent.py` | Contains the main `MyAgent` class with core workflow logic and framework-specific implementation. |
| `config.py` | Manages configuration loading from environment variables, runtime parameters, and DataRobot credentials. |
| `mcp_client.py` | Provides MCP server connection management for tool integration (optional; only needed when using MCP tools). |
Agent metadata (model-metadata.yaml)¶
The model-metadata.yaml file tells DataRobot how to configure and deploy the agent. It defines the agent's type, name, and any required runtime parameters.
```yaml
---
name: agent_name
type: inference
targetType: agenticworkflow
runtimeParameterDefinitions:
  - fieldName: LLM_DEPLOYMENT_ID
    defaultValue: SET_VIA_PULUMI_OR_MANUALLY
    type: string
  - fieldName: LLM_DEFAULT_MODEL
    defaultValue: SET_VIA_PULUMI_OR_MANUALLY
    type: string
  - fieldName: LLM_DEFAULT_MODEL_FRIENDLY_NAME
    defaultValue: SET_VIA_PULUMI_OR_MANUALLY
    type: string
  - fieldName: USE_DATAROBOT_LLM_GATEWAY
    defaultValue: SET_VIA_PULUMI_OR_MANUALLY
    type: string
  - fieldName: MCP_DEPLOYMENT_ID
    defaultValue: SET_VIA_PULUMI_OR_MANUALLY
    type: string
  - fieldName: EXTERNAL_MCP_URL
    defaultValue: SET_VIA_PULUMI_OR_MANUALLY
    type: string
  - fieldName: SESSION_SECRET_KEY
    defaultValue: SET_VIA_PULUMI_OR_MANUALLY
    type: string
```
| Field | Description |
|---|---|
| `name` | The agent's display name in DataRobot, used for identification and deployment. |
| `type` | The agent model type. Must be `inference` for all DataRobot agents. |
| `targetType` | The agent target type. Must be `agenticworkflow` for agentic workflow deployments. |
| `runtimeParameterDefinitions` | Defines optional runtime parameters for LLM configuration, MCP server connections, and other agent settings. |
LLM Provider Configuration
Agents support multiple LLM provider configurations including:
- LLM Gateway direct: Use DataRobot's LLM Gateway directly.
- LLM blueprint with external LLMs: Connect to external providers (Azure OpenAI, Amazon Bedrock, Google Vertex AI, Anthropic, Cohere, TogetherAI).
- Deployed models: Use a DataRobot-deployed LLM via `LLM_DEPLOYMENT_ID`.
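The provider options above can be sketched as a simple routing decision. This is a hedged, illustrative helper, not part of the DataRobot API: the `llm_routing_mode` name and the `"external"` fallback are assumptions, and the precedence (a deployed model winning when `LLM_DEPLOYMENT_ID` is set) mirrors the `llm` property shown later in this page.

```python
# Hedged sketch: choosing an LLM routing mode from the runtime parameters
# defined in model-metadata.yaml (surfaced to the agent as environment
# variables). Helper name and "external" fallback are illustrative.
def llm_routing_mode(env: dict[str, str]) -> str:
    """A deployed model wins when LLM_DEPLOYMENT_ID is set; otherwise
    fall back to the LLM Gateway when it is enabled."""
    if env.get("LLM_DEPLOYMENT_ID"):
        return "deployment"
    if env.get("USE_DATAROBOT_LLM_GATEWAY", "").lower() == "true":
        return "gateway"
    return "external"
```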
Functions and hooks (custom.py)¶
Agents use specific function signatures called "hooks" to integrate with DataRobot. The custom.py file contains the required functions that DataRobot calls to execute the agent. These functions connect DataRobot and the agent's logic. For more information, see the structured model hooks documentation. The following DataRobot custom model hooks are implemented in custom.py:
| Component | Description |
|---|---|
| `load_model()` | One-time initialization function called when DataRobot starts the agent. |
| `chat()` | Main execution function called for each user interaction/chat message. |
Other DataRobot hooks
The score() and score_unstructured() functions can be implemented if required for specific use cases.
load_model() hook¶
The load_model() hook is called once to initialize the agent. This is where any one-time configuration can be defined.
```python
def load_model(code_dir: str) -> str:
    """The agent is instantiated in this function and returned.

    Args:
        code_dir: Path to the custom model directory

    Returns:
        str: "success" on successful initialization
    """
    _ = code_dir
    return "success"
```
chat() hook¶
The main entry point for the agent. DataRobot calls this function every time a user sends a message to the agent.
```python
def chat(
    completion_create_params: CompletionCreateParams,
    model: str,
) -> CustomModelChatResponse:
    """Main entry point for agent execution via chat endpoint.

    Args:
        completion_create_params: OpenAI-compatible completion parameters
        model: Model identifier

    Returns:
        CustomModelChatResponse: Formatted response with agent output
    """
```
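Because `completion_create_params` follows the OpenAI chat completions schema, a `chat()` implementation typically starts by pulling the latest user message from the `messages` array. The helper below is a hedged sketch (its name is illustrative), shown against a plain `dict` with that schema:

```python
# Hedged sketch: extracting the most recent user message from
# OpenAI-compatible completion parameters inside a chat() hook.
def latest_user_message(completion_create_params: dict) -> str:
    """Return the content of the most recent user message, or ""."""
    for message in reversed(completion_create_params.get("messages", [])):
        if message.get("role") == "user":
            return message.get("content", "")
    return ""


params = {
    "model": "agent",
    "messages": [
        {"role": "system", "content": "You are a helpful agent."},
        {"role": "user", "content": "Summarize the quarterly report."},
    ],
}
```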
Agent class implementation (agent.py)¶
The agent.py file contains the MyAgent class to implement the workflow logic. This is where you define how the agent behaves, the tools it uses, and how it processes inputs.
| Component | Description |
|---|---|
| `__init__()` | Initializes the agent with credentials, configuration, and framework-specific setup. |
| `invoke()` | Main execution method that processes inputs and returns framework-specific results. |
| `llm` | Property that returns the configured LLM instance for agent operations. |
Every agent follows this basic pattern, though the specific implementation varies by framework.
Framework-specific implementations
CrewAI, LangGraph, LlamaIndex, and NAT (NVIDIA NeMo Agent Toolkit) templates include framework-specific llm property and return types. The Generic Base template provides a minimal implementation that can be customized for any framework. NAT templates configure LLMs in workflow.yaml rather than through a Python llm property.
```python
class MyAgent:
    """Agent implementation following DataRobot patterns."""

    def __init__(
        self,
        api_key: Optional[str] = None,
        api_base: Optional[str] = None,
        model: Optional[str] = None,
        verbose: Optional[Union[bool, str]] = True,
        timeout: Optional[int] = 90,
        **kwargs: Any,
    ):
        """Initialize agent with credentials and configuration."""
        self.api_key = api_key or os.environ.get("DATAROBOT_API_TOKEN")
        self.api_base = api_base or os.environ.get("DATAROBOT_ENDPOINT")
        self.model = model
        self.timeout = timeout
        # ... other initialization

    @property
    def llm(self) -> LLM:  # Framework-specific type
        """Primary LLM configuration."""
        if os.environ.get("LLM_DEPLOYMENT_ID"):
            return self.llm_with_datarobot_deployment
        else:
            return self.llm_with_datarobot_llm_gateway

    def invoke(self, completion_create_params: CompletionCreateParams) -> Union[
        Generator[tuple[str, Any | None, dict[str, int]], None, None],
        tuple[str, Any | None, dict[str, int]],
    ]:
        """Main execution method - REQUIRED."""
        # Extract inputs
        inputs = create_inputs_from_completion_params(completion_create_params)
        # Execute agent workflow
        # Return results
        return response_text, pipeline_interactions, usage_metrics
```
__init__() method¶
The method that initializes the agent with configuration and credentials from DataRobot and framework-specific setup.
```python
class MyAgent:
    def __init__(
        self,
        api_key: Optional[str] = None,
        api_base: Optional[str] = None,
        model: Optional[str] = None,
        verbose: Optional[Union[bool, str]] = True,
        timeout: Optional[int] = 90,
        **kwargs: Any,
    ):
        """Initialize agent with DataRobot credentials and configuration."""
```
invoke() method¶
The core execution method that DataRobot calls to run the agent. This method must be implemented and should contain the agent's main workflow logic. All frameworks use the same return type pattern.
```python
def invoke(self, completion_create_params: CompletionCreateParams) -> Union[
    Generator[tuple[str, Any | None, dict[str, int]], None, None],
    tuple[str, Any | None, dict[str, int]],
]:
    """Main execution method - REQUIRED for DataRobot integration.

    Args:
        completion_create_params: Input parameters from DataRobot

    Returns:
        Union of generator (for streaming) or tuple (for non-streaming):
        - response_text: str - The agent's response
        - pipeline_interactions: Any | None - Event tracking data
        - usage_metrics: dict[str, int] - Token usage statistics
    """
```
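To make the non-streaming contract concrete, here is a hedged, standalone sketch of a minimal `invoke()`. The echo logic and zeroed usage numbers are placeholders for a real workflow; only the three-element return shape matches the contract above:

```python
# Hedged, non-streaming sketch of the invoke() return contract.
# The echo response and zeroed usage metrics are placeholders.
from typing import Any


def invoke(completion_create_params: dict) -> tuple[str, Any, dict[str, int]]:
    prompt = completion_create_params["messages"][-1]["content"]
    response_text = f"Echo: {prompt}"  # stand-in for real agent output
    pipeline_interactions = None       # no event tracking in this sketch
    usage_metrics = {"prompt_tokens": 0, "completion_tokens": 0, "total_tokens": 0}
    return response_text, pipeline_interactions, usage_metrics
```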
llm property¶
Defines which language model the agent uses for generating responses. The return type and implementation varies by framework.
The CrewAI, LangGraph, and LlamaIndex templates implement API base URL logic directly within their llm properties:
```python
@property
def llm(self) -> LLM:  # Framework-specific type
    """Primary LLM instance for agent operations.

    Returns:
        Framework-specific LLM type:
        - CrewAI: LLM
        - LangGraph: ChatLiteLLM
        - LlamaIndex: DataRobotLiteLLM
    """
    api_base = urlparse(self.api_base)
    if os.environ.get("LLM_DEPLOYMENT_ID"):
        # Handle deployment-specific URL construction
        # ... implementation details ...
        return LLM(model="openai/gpt-4o-mini", api_base=deployment_url, ...)
    else:
        # Handle LLM Gateway URL construction
        # ... implementation details ...
        return LLM(model="datarobot/azure/gpt-4o-mini", api_base=api_base.geturl(), ...)
```
The Generic Base template implements the llm property as follows:
```python
@property
def llm(self) -> Any:
    """Primary LLM instance for agent operations.

    Returns:
        Any: Minimal implementation for custom frameworks
    """
    if os.environ.get("LLM_DEPLOYMENT_ID"):
        return self.llm_with_datarobot_deployment
    else:
        return self.llm_with_datarobot_llm_gateway
```
The NAT template configures LLMs in workflow.yaml rather than through a Python property. You can use the DataRobot LLM gateway (_type: datarobot-llm-gateway), DataRobot deployments (_type: datarobot-llm-deployment), or DataRobot NIM deployments (_type: datarobot-nim). The example below uses the DataRobot LLM gateway:
```yaml
llms:
  datarobot_llm:
    _type: datarobot-llm-gateway
    model_name: azure/gpt-4o-mini  # Define the model name you want to use
    temperature: 0.0
```
The LLM a specific agent uses is defined through the llm_name in the definition of that agent in the functions section:
```yaml
functions:
  planner:
    _type: chat_completion
    llm_name: datarobot_llm  # Reference the LLM defined above
    system_prompt: |
      You are a content planner...
```
If more than one LLM is defined in the llms section, the various functions can use different LLMs to suit the task.
NAT-provided LLM interfaces
Alternatively, you can use any of the NAT-provided LLM interfaces instead of the LLM gateway. To use a NAT LLM interface, add the required configuration parameters such as api_key, url, and other provider-specific settings directly in the workflow.yaml file.
For information about using DataRobot deployments with NAT templates, see Configure LLM providers in code.
Tool integration¶
Agents can use tools to extend their capabilities. The `ToolClient` class enables agents to call DataRobot tool deployments. In version 11.3.1, the functionality of `helpers.py` from previous versions was moved to the `datarobot-genai` package, along with other adapter code that connects agents to DataRobot's DRUM server.
ToolClient usage consideration
The ToolClient is specifically designed for calling user-deployed global tools within DataRobot. As this client serves that specialized use case, it is not required for the majority of agent tool implementations.
Framework-specific tools¶
Each framework provides its own native tool APIs for defining custom tools:
- CrewAI: Tools are passed to `Agent` instances via the `tools` parameter.
- LangGraph: Tools are defined as part of graph nodes and edges.
- LlamaIndex: Tools are defined as functions and passed to agent constructors.
- NAT: Tools are defined in `workflow.yaml` as functions and referenced in the workflow's `tool_list` (the available tool types are defined by the `nat_tool` submodules).
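Across frameworks, a custom tool usually starts as a plain Python function that the framework then wraps with its own adapter (for example, LlamaIndex's `FunctionTool.from_defaults`, or CrewAI's tool decorators before passing it via `tools`). The function below is a hedged, framework-agnostic sketch; its name and behavior are illustrative:

```python
# Framework-agnostic sketch: a plain Python function used as a tool.
# Real tools would call external APIs; this returns a canned response.
def get_weather(city: str) -> str:
    """Return a canned weather report (stand-in for a real API call)."""
    return f"The weather in {city} is sunny."
```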
NAT agent configuration
Use react_agent instead of sequential_executor for flexible agents that decide which tools to run in which order, depending on the query.
Authorization context¶
The initialize_authorization_context() function from the datarobot-genai package is called in custom.py to automatically handle authentication for tools that require access tokens. This ensures tools can securely access external services using DataRobot's credential management system.