Agent components

This overview details the components required to create an agent using DataRobot's agent framework. An agent artifact includes several standard files that contain metadata, hooks/functions, classes, and properties.

Section Description
Agent file structure Describes important files and their organization for a DataRobot agent.
Functions and hooks Details the mandatory functions and integration hooks needed for agent operation.
Agent class implementation Details the general structure of the main agent class and its methods and properties.
Tool integration Explains how agents use tools via the ToolClient class and framework-specific tool APIs.

Agent file structure

Every DataRobot agent requires a specific set of files in the custom_model/ directory. These files work together to create a complete agent that can be deployed and executed by DataRobot.

custom_model/ directory
custom_model/
├── __init__.py           # Package initialization
├── agent.py              # Main agent implementation, including prompts
├── custom.py             # DataRobot integration hooks
├── config.py             # Configuration management
├── mcp_client.py         # MCP server integration (optional, for tool use)
└── model-metadata.yaml   # Agent metadata configuration
File Description
__init__.py Identifies the directory as a Python package and enables imports.
model-metadata.yaml Defines the agent's configuration, runtime parameters, and deployment settings.
custom.py Implements DataRobot integration hooks (load_model, chat) for agent execution.
agent.py Contains the main MyAgent class with core workflow logic and framework-specific implementation.
config.py Manages configuration loading from environment variables, runtime parameters, and DataRobot credentials.
mcp_client.py Provides MCP server connection management for tool integration (optional, only needed when using MCP tools).

Agent metadata (model-metadata.yaml)

The model-metadata.yaml file tells DataRobot how to configure and deploy the agent. It defines the agent's type, name, and any required runtime parameters.

model-metadata.yaml
---
name: agent_name
type: inference
targetType: agenticworkflow
runtimeParameterDefinitions:
  - fieldName: LLM_DEPLOYMENT_ID
    defaultValue: SET_VIA_PULUMI_OR_MANUALLY
    type: string
  - fieldName: LLM_DEFAULT_MODEL
    defaultValue: SET_VIA_PULUMI_OR_MANUALLY
    type: string
  - fieldName: LLM_DEFAULT_MODEL_FRIENDLY_NAME
    defaultValue: SET_VIA_PULUMI_OR_MANUALLY
    type: string
  - fieldName: USE_DATAROBOT_LLM_GATEWAY
    defaultValue: SET_VIA_PULUMI_OR_MANUALLY
    type: string
  - fieldName: MCP_DEPLOYMENT_ID
    defaultValue: SET_VIA_PULUMI_OR_MANUALLY
    type: string
  - fieldName: EXTERNAL_MCP_URL
    defaultValue: SET_VIA_PULUMI_OR_MANUALLY
    type: string
  - fieldName: SESSION_SECRET_KEY
    defaultValue: SET_VIA_PULUMI_OR_MANUALLY
    type: string
Field Description
name The agent's display name in DataRobot (used for identification and deployment).
type The agent model type. Must be inference for all DataRobot agents.
targetType The agent target type. Must be agenticworkflow for agentic workflow deployments.
runtimeParameterDefinitions Defines optional runtime parameters for LLM configuration, MCP server connections, and other agent settings.

LLM Provider Configuration

Agents support multiple LLM provider configurations, including:

  • LLM gateway direct: Use DataRobot's LLM Gateway directly
  • LLM blueprint with external LLMs: Connect to external providers (Azure OpenAI, Amazon Bedrock, Google Vertex AI, Anthropic, Cohere, TogetherAI)
  • Deployed models: Use a DataRobot-deployed LLM via LLM_DEPLOYMENT_ID

Functions and hooks (custom.py)

Agents use specific function signatures called "hooks" to integrate with DataRobot. The custom.py file contains the required functions that DataRobot calls to execute the agent. These functions connect DataRobot and the agent's logic. For more information, see the structured model hooks documentation. The following DataRobot custom model hooks are implemented in custom.py:

Component Description
load_model() One-time initialization function called when DataRobot starts the agent.
chat() Main execution function called for each user interaction/chat message.

Other DataRobot hooks

The score() and score_unstructured() functions can be implemented if required for specific use cases.

load_model() hook

The load_model() hook is called once to initialize the agent. This is where any one-time configuration can be defined.

custom.py
def load_model(code_dir: str) -> str:
    """The agent is instantiated in this function and returned.

    Args:
        code_dir: Path to the custom model directory

    Returns:
        str: "success" on successful initialization
    """
    _ = code_dir
    return "success"

chat() hook

The main entry point for the agent. DataRobot calls this function every time a user sends a message to the agent.

custom.py
def chat(
    completion_create_params: CompletionCreateParams,
    model: str,
) -> CustomModelChatResponse:
    """Main entry point for agent execution via chat endpoint.

    Args:
        completion_create_params: OpenAI-compatible completion parameters
        model: Model identifier

    Returns:
        CustomModelChatResponse: Formatted response with agent output
    """

Agent class implementation (agent.py)

The agent.py file contains the MyAgent class to implement the workflow logic. This is where you define how the agent behaves, the tools it uses, and how it processes inputs.

Component Description
__init__() Method to initialize the agent with credentials, configuration, and framework-specific setup.
invoke() Main execution method that processes inputs and returns framework-specific results.
llm Property that returns the configured LLM instance for agent operations.

Every agent follows this basic pattern, though the specific implementation varies by framework.

Framework-specific implementations

CrewAI, LangGraph, LlamaIndex, and NAT (NVIDIA NeMo Agent Toolkit) templates include framework-specific llm property and return types. The Generic Base template provides a minimal implementation that can be customized for any framework. NAT templates configure LLMs in workflow.yaml rather than through a Python llm property.

agent.py
class MyAgent:
    """Agent implementation following DataRobot patterns."""

    def __init__(
        self,
        api_key: Optional[str] = None,
        api_base: Optional[str] = None,
        model: Optional[str] = None,
        verbose: Optional[Union[bool, str]] = True,
        timeout: Optional[int] = 90,
        **kwargs: Any,
    ):
        """Initialize agent with credentials and configuration."""
        self.api_key = api_key or os.environ.get("DATAROBOT_API_TOKEN")
        self.api_base = api_base or os.environ.get("DATAROBOT_ENDPOINT")
        self.model = model
        self.timeout = timeout
        # ... other initialization

    @property
    def llm(self) -> LLM:  # Framework-specific type
        """Primary LLM configuration."""
        if os.environ.get("LLM_DEPLOYMENT_ID"):
            return self.llm_with_datarobot_deployment
        else:
            return self.llm_with_datarobot_llm_gateway

    def invoke(self, completion_create_params: CompletionCreateParams) -> Union[
        Generator[tuple[str, Any | None, dict[str, int]], None, None],
        tuple[str, Any | None, dict[str, int]],
    ]:
        """Main execution method - REQUIRED."""
        # Extract inputs
        inputs = create_inputs_from_completion_params(completion_create_params)

        # Execute agent workflow

        # Return results
        return response_text, pipeline_interactions, usage_metrics

__init__() method

Initializes the agent with configuration and credentials from DataRobot, along with any framework-specific setup.

agent.py
class MyAgent:
    def __init__(self, api_key: Optional[str] = None, 
                 api_base: Optional[str] = None,
                 model: Optional[str] = None,
                 verbose: Optional[Union[bool, str]] = True,
                 timeout: Optional[int] = 90,
                 **kwargs: Any):
        """Initialize agent with DataRobot credentials and configuration."""

invoke() method

The core execution method that DataRobot calls to run the agent. This method must be implemented and should contain the agent's main workflow logic. All frameworks use the same return type pattern.

agent.py
    def invoke(self, completion_create_params: CompletionCreateParams) -> Union[
        Generator[tuple[str, Any | None, dict[str, int]], None, None],
        tuple[str, Any | None, dict[str, int]],
    ]:
        """Main execution method - REQUIRED for DataRobot integration.

        Args:
            completion_create_params: Input parameters from DataRobot

        Returns:
            Union of generator (for streaming) or tuple (for non-streaming):
            - response_text: str - The agent's response
            - pipeline_interactions: Any | None - Event tracking data
            - usage_metrics: dict[str, int] - Token usage statistics
        """

llm property

Defines which language model the agent uses for generating responses. The return type and implementation vary by framework.

The CrewAI, LangGraph, and LlamaIndex templates implement API base URL logic directly within their llm properties:

agent.py (CrewAI/LangGraph/LlamaIndex)
@property
def llm(self) -> LLM:  # Framework-specific type
    """Primary LLM instance for agent operations.

    Returns:
        Framework-specific LLM type:
        - CrewAI: LLM
        - LangGraph: ChatLiteLLM  
        - LlamaIndex: DataRobotLiteLLM
    """
    api_base = urlparse(self.api_base)
    if os.environ.get("LLM_DEPLOYMENT_ID"):
        # Handle deployment-specific URL construction
        # ... implementation details ...
        return LLM(model="openai/gpt-4o-mini", api_base=deployment_url, ...)
    else:
        # Handle LLM Gateway URL construction
        # ... implementation details ...
        return LLM(model="datarobot/azure/gpt-4o-mini", api_base=api_base.geturl(), ...)

The Generic Base template implements the llm property as follows:

agent.py (Generic Base)
@property
def llm(self) -> Any:
    """Primary LLM instance for agent operations.

    Returns:
        Any: Minimal implementation for custom frameworks
    """
    if os.environ.get("LLM_DEPLOYMENT_ID"):
        return self.llm_with_datarobot_deployment
    else:
        return self.llm_with_datarobot_llm_gateway

The NAT template configures LLMs in workflow.yaml rather than through a Python property. You can use the DataRobot LLM gateway (_type: datarobot-llm-gateway), DataRobot deployments (_type: datarobot-llm-deployment), or DataRobot NIM deployments (_type: datarobot-nim). The example below uses the DataRobot LLM gateway:

workflow.yaml (NAT)
llms:
  datarobot_llm:
    _type: datarobot-llm-gateway
    model_name: azure/gpt-4o-mini  # Define the model name you want to use
    temperature: 0.0

Each agent function selects its LLM through the llm_name field in its definition in the functions section:

workflow.yaml (NAT)
functions:
  planner:
    _type: chat_completion
    llm_name: datarobot_llm  # Reference the LLM defined above
    system_prompt: |
      You are a content planner...

If more than one LLM is defined in the llms section, the various functions can use different LLMs to suit the task.
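For instance, a sketch of a workflow.yaml that defines two gateway LLMs and assigns them to different functions (the model names and function names here are illustrative):

```yaml
llms:
  fast_llm:
    _type: datarobot-llm-gateway
    model_name: azure/gpt-4o-mini   # cheaper model for routine steps
  reasoning_llm:
    _type: datarobot-llm-gateway
    model_name: azure/gpt-4o        # stronger model for planning

functions:
  planner:
    _type: chat_completion
    llm_name: reasoning_llm         # planning uses the stronger model
  summarizer:
    _type: chat_completion
    llm_name: fast_llm              # summaries use the cheaper model
```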

NAT-provided LLM interfaces

Alternatively, you can use any of the NAT-provided LLM interfaces instead of the LLM gateway. To use a NAT LLM interface, add the required configuration parameters such as api_key, url, and other provider-specific settings directly in the workflow.yaml file.

For information about using DataRobot deployments with NAT templates, see Configure LLM providers in code.

Tool integration

Agents can use tools to extend their capabilities with the ToolClient class, which enables agents to call DataRobot tool deployments. In version 11.3.1, the functionality of helpers.py from previous versions was moved to the datarobot-genai package, along with other adapter code that connects agents to DataRobot's DRUM server.

ToolClient usage consideration

The ToolClient is specifically designed for calling user-deployed global tools within DataRobot. As this client serves that specialized use case, it is not required for the majority of agent tool implementations.

Framework-specific tools

Each framework provides its own native tool APIs for defining custom tools:

  • CrewAI: Tools are passed to Agent instances via the tools parameter.
  • LangGraph: Tools are defined as part of graph nodes and edges.
  • LlamaIndex: Tools are defined as functions and passed to agent constructors.
  • NAT: Tools are defined in workflow.yaml as functions and referenced in the workflow's tool_list (the available tool types are defined by the nat_tool submodules).
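Across these frameworks, a custom tool typically starts as a plain Python function with a descriptive docstring, which each framework then wraps with its own tool API. A hypothetical example (the function name and behavior are illustrative stand-ins for a real API call):

```python
def get_weather(city: str) -> str:
    """Return a weather report for the given city.

    A plain function like this can be exposed as a tool through each
    framework's own API: passed to a CrewAI Agent via the tools parameter,
    bound to a LangGraph tool node, or handed to a LlamaIndex agent
    constructor (the exact wrapping API varies; see each framework's docs).
    """
    # Stub implementation; a real tool would call an external service here
    return f"Sunny in {city}"
```

Because the function signature and docstring are what the LLM sees when deciding whether to call a tool, keeping both clear and specific directly improves tool selection.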

NAT agent configuration

Use react_agent instead of sequential_executor for flexible agents that decide which tools to run in which order, depending on the query.

Authorization context

The initialize_authorization_context() function from the datarobot-genai package is called in custom.py to automatically handle authentication for tools that require access tokens. This ensures tools can securely access external services using DataRobot's credential management system.