Generative model monitoring
Monitoring support for generative models is a premium feature. Contact your DataRobot representative or administrator for information on enabling this feature.
Feature flags: Enable Monitoring Support for Generative Models, Enable the Injection of Runtime Parameters for Custom Models
The text generation target type for custom and external models, a premium LLMOps feature, lets you deploy generative large language models (LLMs) to make predictions; monitor service, usage, and data drift statistics; and create custom metrics. DataRobot supports LLMs through two deployment methods:
Create a text generation model as a custom inference model in DataRobot: Create and deploy a text generation model using DataRobot's Custom Model Workshop. The custom model calls the LLM's API to generate text instead of performing inference directly, which lets DataRobot MLOps access the LLM's input and output for monitoring. To call the LLM's API, enable public network access for custom models.
Monitor a text generation model running externally: Create and deploy a text generation model on your infrastructure (local or cloud), using the monitoring agent to communicate the input and output of your LLM to DataRobot for monitoring.
Create and deploy a generative custom model
Custom inference models are user-created, pretrained models that you can upload to DataRobot (as a collection of files) via the Custom Model Workshop. You can then upload a model artifact to create, test, and deploy custom inference models to DataRobot's centralized deployment hub.
Add a generative custom model
To add a generative model to the Custom Model Workshop:
Click Model Registry > Custom Model Workshop and, on the Models tab, click + Add new model.
In the Add Custom Inference Model dialog box, under Target type, click Text Generation.
Enter a Model name and Target name. In addition, you can click Show Optional Fields to define the language used to build the model and provide a description.
Click Add Custom Model. The new custom model opens to the Assemble tab.
Assemble and deploy a generative custom model
To assemble, test, and deploy a generative model from the Custom Model Workshop:
On the left side of the Assemble tab, under Model, drag and drop files or click Browse local files to upload your LLM's custom model artifacts. Alternatively, you can import model files from a remote repository.
If you click Browse local files, you have the option of adding a Local Folder. The local folder should contain dependent files and additional assets required by your model, not the model itself. If the model file is included in the folder, it will not be accessible to DataRobot. Instead, the model file must exist at the root level. The root file can then point to the dependencies in the folder.
A basic LLM assembled in the Custom Model Workshop should include the following files:
custom.py: The custom model code, calling the LLM service's API through public network access for custom models.
model-metadata.yaml: The runtime parameters required by the generative model.
requirements.txt: The libraries (and versions) required by the generative model.
The dependencies from requirements.txt appear under Model Environment in the Model Dependencies box.
After you add the required model files, add training data. To provide a training baseline for drift monitoring, you should upload a dataset containing at least 20 rows of prompts and responses relevant to the topic your generative model is intended to answer questions about. These prompts and responses can be taken from documentation, manually created, or generated.
Next, click the Test tab, click + New test, and then click Start test to run the Startup and Prediction error tests, the only tests supported for the Text Generation target type.
Click Register to deploy, provide the model information, and click Add to registry.
The model opens on the Registered Models tab.
In the registered model version header, click Deploy, and then configure the deployment settings.
You can now make predictions as you would with any other DataRobot model.
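The custom model code described in the assembly steps can be sketched as follows. This is a minimal illustration, not DataRobot's reference implementation: it assumes the load_model and score hooks used by DataRobot custom models, a prompt column named promptText, a target named resultText, and a hypothetical call_llm helper standing in for your LLM provider's client. Replace each of these with the names used in your own model.

```python
"""Sketch of a text generation custom model; all names below are assumptions."""


def call_llm(prompt):
    # Hypothetical placeholder: replace with a request to your LLM service's API.
    # Public network access must be enabled for a real request to succeed.
    return f"(completion for: {prompt})"


def generate_completions(prompts, llm=call_llm):
    # Send each prompt to the LLM and collect the responses in the same order.
    return [llm(str(prompt)) for prompt in prompts]


def load_model(code_dir):
    # No artifact is loaded for inference; the "model" is the callable
    # that reaches the external LLM.
    return call_llm


def score(data, model, **kwargs):
    # `data` is assumed to be a pandas DataFrame with a "promptText" column.
    import pandas as pd  # imported here so the helpers above need only the standard library

    completions = generate_completions(data["promptText"].tolist(), llm=model)
    # Return one column named after the target (assumed here to be "resultText").
    return pd.DataFrame({"resultText": completions})
```

Keeping the LLM call behind a single function makes it straightforward to swap providers or to substitute a stub when running the Startup and Prediction error tests locally.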
Create and deploy an external generative model
External model packages allow you to register and deploy external generative models. You can use the monitoring agent to access MLOps monitoring capabilities with these model types.
To create and deploy a model package for an external generative model:
Click Model Registry and, on the Registered Models tab, click Add new package and select New external model package.
In the Register new external model dialog box, from the Prediction type list, click Text generation and add the required information about the agent-monitored generative model. To provide a training baseline for drift monitoring, in the Training data field, you should upload a dataset containing at least 20 rows of prompts and responses relevant to the topic your generative model is intended to answer questions about. These prompts and responses can be taken from documentation, manually created, or generated.
After you define all fields for the model package, click Register. The package is registered in the Model Registry and is available for use.
From the Model Registry > Registered Models tab, locate and deploy the generative model.
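For an externally hosted model, your application has to capture each prompt and response so that the monitoring agent can forward them to DataRobot. The sketch below shows only that capture step, under stated assumptions: generate is your own LLM call, report is a hypothetical callback standing in for whatever client hands records to the monitoring agent, and the record field names are illustrative. None of these are DataRobot APIs.

```python
import time


def monitored_generate(prompt, generate, report):
    """Call the LLM, then pass the prompt, response, and latency to `report`."""
    start = time.perf_counter()
    response = generate(prompt)
    elapsed_ms = (time.perf_counter() - start) * 1000.0
    # Hypothetical record layout; use the field names your deployment expects.
    report({"question": prompt, "answer": response, "elapsed_ms": elapsed_ms})
    return response
```

Wrapping the call this way keeps monitoring separate from generation, so the same function works whether the records go to a spooler, a queue, or a list in a test.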
Monitor a deployed generative model
Data drift for generative models
To monitor drift in a generative model's prediction data, DataRobot compares new prompts and responses to the prompts and responses in the training data you uploaded during model creation. To provide an adequate training baseline for comparison, the uploaded training dataset should contain at least 20 rows of prompts and responses relevant to the topic your model is intended to answer questions about. These prompts and responses can be taken from documentation, manually created, or generated.
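One way to prepare such a baseline is to write the prompt and response pairs to a CSV file and check the row count before uploading. The sketch below is illustrative only: the question and answer column names and the write_baseline helper are assumptions, and the 20-row minimum comes from the guidance above.

```python
import csv

MIN_ROWS = 20  # minimum baseline size recommended in the guidance above


def write_baseline(pairs, path, prompt_col="question", response_col="answer"):
    """Write (prompt, response) pairs to a CSV file and return the row count."""
    # Drop pairs where either side is empty after trimming whitespace.
    rows = [(p.strip(), r.strip()) for p, r in pairs if p.strip() and r.strip()]
    if len(rows) < MIN_ROWS:
        raise ValueError(f"need at least {MIN_ROWS} non-empty rows, got {len(rows)}")
    with open(path, "w", newline="", encoding="utf-8") as handle:
        writer = csv.writer(handle)
        writer.writerow([prompt_col, response_col])
        writer.writerows(rows)
    return len(rows)
```

Calling write_baseline with a list of pairs and a file path produces a two-column CSV that you can upload as training data; the call raises ValueError if fewer than 20 usable rows remain.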
To learn how to adjust the Data Drift dashboard to focus on the model, time period, or feature you're interested in, see the Configure the Data Drift dashboard documentation.
The Feature Details chart includes new functionality for text generation models, providing a word cloud visualizing differences in the data distribution for each token in the dataset between the training and scoring periods. By default, the Feature Details chart includes information about the question (or prompt) and answer (or model completion/output):
question: A word cloud visualizing the difference in data distribution for each user prompt token between the training and scoring periods, revealing how much each token contributes to data drift in the user prompt data.
answer: A word cloud visualizing the difference in data distribution for each model output token between the training and scoring periods, revealing how much each token contributes to data drift in the model output data.
The feature names for the generative model's input and output depend on the feature names in your model's data; therefore, the question and answer features in the example above will be replaced by the names of the input and output columns in your model's data.
You can also designate other features for data drift tracking; for example, you could decide to track the model's temperature, monitoring the level of creativity in the generative model's responses from high creativity (1) to low (0).
To interpret the feature drift word cloud for a text feature like question or answer, hover over a user prompt or model output token to view the following details:
The tokenized text represented by the word in the word cloud. Text size represents the token's drift contribution and text color represents the dataset prevalence. Stop words are hidden from this chart.
How much this particular token contributes to the feature's drift value, as reported in the Feature Drift vs. Feature Importance chart.
How much more often this particular token appears in the training data or the predictions data.
When your pointer is over the word cloud, you can scroll up to zoom in and view the text of smaller tokens.
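To make the hover details concrete, the sketch below compares token frequencies between a training sample and a scoring sample. It is an illustration only and not DataRobot's drift calculation: the whitespace tokenizer, the short stop word list, and the use of the absolute difference in relative frequency as a token's contribution are all assumptions.

```python
from collections import Counter

# Illustrative stop word list; the list DataRobot hides is not specified here.
STOP_WORDS = {"a", "an", "the", "is", "to", "of", "and", "in", "how", "do", "i"}


def token_shares(texts):
    """Return each non-stop-word token's share of all tokens in `texts`."""
    tokens = [t for text in texts for t in text.lower().split() if t not in STOP_WORDS]
    total = len(tokens)
    return {tok: n / total for tok, n in Counter(tokens).items()} if total else {}


def token_drift(training_texts, scoring_texts):
    """Map each token to (share difference, dataset where it is more prevalent)."""
    train = token_shares(training_texts)
    score = token_shares(scoring_texts)
    result = {}
    for tok in set(train) | set(score):
        diff = score.get(tok, 0.0) - train.get(tok, 0.0)
        if diff > 0:
            side = "scoring"
        elif diff < 0:
            side = "training"
        else:
            side = "equal"
        result[tok] = (abs(diff), side)
    return result
```

In this toy version, the first element of each tuple plays the role of text size in the word cloud and the second plays the role of text color.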