Use an LLM custom inference model template¶
Access this AI accelerator on GitHub
There is a wide variety of LLMs, such as OpenAI (not Azure), Gemini Pro, Cohere, and Claude. Managing and monitoring these LLMs is crucial to using them effectively. Data drift monitoring in DataRobot MLOps lets you detect changes in user prompts and their responses. Sidecar models can prevent jailbreaks, replace Personally Identifiable Information (PII), and evaluate LLM responses with a global model from the Registry. Data export functionality shows you what a user wanted to know at each point in time and provides the data you should include in your RAG system. Custom metrics track your own KPIs to inform your decisions (e.g., token costs, toxicity, and hallucination).
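As a minimal sketch of what a token-cost custom metric might compute per request, the snippet below estimates the USD cost of a single LLM call from its token counts. The model names and per-1k-token prices are placeholder assumptions for illustration, not values from DataRobot or any provider's price list.

```python
# Placeholder pricing table: (prompt_price, completion_price) in USD per
# 1,000 tokens. These model names and rates are illustrative assumptions.
PRICE_PER_1K_TOKENS = {
    "example-model-a": (0.0015, 0.002),
    "example-model-b": (0.01, 0.03),
}

def token_cost(model: str, prompt_tokens: int, completion_tokens: int) -> float:
    """Estimate the USD cost of one LLM request from its token counts."""
    prompt_price, completion_price = PRICE_PER_1K_TOKENS[model]
    return (prompt_tokens / 1000) * prompt_price \
        + (completion_tokens / 1000) * completion_price

# Example: 1,000 prompt tokens and 1,000 completion tokens on the cheaper model
cost = token_cost("example-model-a", 1000, 1000)  # 0.0015 + 0.002 = 0.0035 USD
```

A value like this, computed per request, is the kind of number you could report to a deployment's custom metric so that cost trends show up alongside drift and accuracy monitoring.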
In addition, DataRobot's LLM Playground enables you to compare RAG systems built on the LLMs you want to try once you deploy those models in MLOps, helping you find the best LLM to accelerate your business. Comparing a variety of LLMs is a key element of a successful RAG system.
The LLM custom inference model template enables you to deploy and accelerate your own LLM, along with "batteries-included" LLMs like Azure OpenAI, Google, and AWS.
Currently, DataRobot provides templates for OpenAI (not Azure), Gemini Pro, Cohere, and Claude. To use a template, follow the instructions outlined on GitHub.