# LLM custom inference template

> LLM custom inference template - The LLM custom inference model template enables you to deploy and
> accelerate your own LLM, along with "batteries-included" LLMs like Azure OpenAI, Google, and AWS.

This Markdown file sits beside the HTML page at the same path (with a `.md` suffix). It summarizes the topic and lists links for tools and LLM context.

Companion generated at `2026-05-06T18:17:09.575478+00:00` (UTC).

## Primary page

- [LLM custom inference template](https://docs.datarobot.com/en/docs/api/dev-learning/accelerators/custom-model-dev/llm-template.html): Full documentation for this topic (HTML).

## Related documentation

- [Developer documentation](https://docs.datarobot.com/en/docs/api/index.html): Linked from this page.
- [Developer learning](https://docs.datarobot.com/en/docs/api/dev-learning/index.html): Linked from this page.
- [AI accelerators](https://docs.datarobot.com/en/docs/api/dev-learning/accelerators/index.html): Linked from this page.
- [Custom model development](https://docs.datarobot.com/en/docs/api/dev-learning/accelerators/custom-model-dev/index.html): Linked from this page.

## Documentation content

[Access this AI accelerator on GitHub](https://github.com/datarobot-community/ai-accelerators/tree/main/generative_ai/LLM_custom_inference_model_template)

There is a wide variety of LLMs, such as OpenAI (not Azure), Gemini Pro, Cohere, and Claude. Managing and monitoring these LLMs is crucial to using them effectively. Data drift monitoring in DataRobot MLOps lets you detect changes in user prompts and their responses. Sidecar models can prevent jailbreaks, replace Personally Identifiable Information (PII), and evaluate LLM responses with a global model from the Registry. Data export shows what users wanted to know at each point in time and provides the data that should be included in a RAG system. Custom metrics track the KPIs that inform your decisions (for example, token costs, toxicity, and hallucination).
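As a concrete illustration of a custom metric such as token cost, the sketch below estimates the cost of one prompt/response pair. The per-token rates and the whitespace tokenizer are hypothetical placeholders, not DataRobot or provider APIs; a real metric would use the provider's tokenizer and published pricing.

```python
# Illustrative sketch of a token-cost custom metric.
# Rates and tokenization are placeholders, not real provider pricing.

def count_tokens(text: str) -> int:
    # Crude whitespace approximation; a real deployment would use the
    # provider's own tokenizer (e.g. tiktoken for OpenAI models).
    return len(text.split())

def token_cost(prompt: str, response: str,
               prompt_rate: float = 0.03 / 1000,       # hypothetical $/token
               completion_rate: float = 0.06 / 1000) -> float:
    """Estimated USD cost of one prompt/response pair."""
    return (count_tokens(prompt) * prompt_rate
            + count_tokens(response) * completion_rate)

cost = token_cost("What is data drift?",
                  "Data drift is a change in the input distribution over time.")
```

A value computed this way per request could be reported to a deployment's custom metric so that spend trends appear alongside drift and accuracy monitoring.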

In addition, DataRobot's RAG playground enables you to compare RAG systems built on the LLMs you want to try once you deploy those models in MLOps, helping you identify the best LLM to accelerate your business. Comparing a variety of LLMs is a key element of a successful RAG system.

The LLM custom inference model template enables you to deploy and accelerate your own LLM, along with "batteries-included" LLMs like Azure OpenAI, Google, and AWS.

Currently, DataRobot provides templates for OpenAI (not Azure), Gemini Pro, Cohere, and Claude. To use a template, follow the instructions outlined on GitHub.
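To give a sense of what such a template contains, here is a minimal sketch of a `custom.py` following the `load_model` / `score_unstructured` hook convention that DataRobot's custom model runner (DRUM) uses for unstructured models. The `stub_llm` function is a hypothetical placeholder for a real provider call (OpenAI, Gemini Pro, Cohere, Claude); the actual templates on GitHub wire in the provider SDKs and credentials.

```python
import json

def stub_llm(prompt: str) -> str:
    # Placeholder: a real template would call the provider's SDK here,
    # reading the API key from an environment variable or runtime parameter.
    return f"(stubbed completion for: {prompt})"

def load_model(code_dir: str):
    # Initialize and return the client/model object; the runner passes it
    # to the scoring hook on every request.
    return stub_llm

def score_unstructured(model, data, query=None, **kwargs):
    # `data` arrives as a JSON string such as {"prompt": "..."};
    # return the completion as a JSON string.
    payload = json.loads(data)
    completion = model(payload["prompt"])
    return json.dumps({"completion": completion})

# Local smoke test of the two hooks together.
result = score_unstructured(load_model("."), json.dumps({"prompt": "Hello"}))
```

Swapping `stub_llm` for a real provider client is the essence of adapting the template to your own LLM, while the hook signatures stay the same so MLOps monitoring and data export work unchanged.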
