Skip to content

On-premise users: click in-app to access the full platform documentation for your version of DataRobot.

GenAI starter walkthrough

This walkthrough shows you how to work with generative AI (GenAI) in DataRobot, where you can generate text content using a variety of pre-trained large language models (LLMs). Or, the content can be tailored to your domain-specific data by building vector databases and leveraging them in the LLM prompts.

In this walkthrough, you will:

  1. Create a playground.
  2. Create a vector database.
  3. Build and then compare LLM blueprints.


Download the following demo dataset based on the DataRobot documentation to try out the GenAI functionality:

Download demo data

1. Add a playground

A playground, another type of Use Case asset, is the space for creating and interacting with LLM blueprints, comparing the response of each to determine which to use in production. In Workbench, create a Use Case and then add a playground via the Playgrounds tab or the Add dropdown:

Read more: Create a Use Case, Add a playground

2. Add internal data for the vector database (VDB)

To tailor results using domain-specific data, assign data to an LLM blueprint for leverage during RAG operations. (You can also test LLM blueprint responses without any grounding knowledge provided by a vector database. To do so, skip ahead to step 4 to configure the LLM blueprint.)

You can add data directly to a Use Case either from a local file or data connections or from the AI Catalog. This walkthrough adds the data directly.

Add, the data you downloaded above, to the Use Case you created.

Read more: Add internal data, import from the AI Catalog

3. Add the vector database

If you do not add a VDB, prompt responses are not augmented with relevant DataRobot documentation; the LLM response contains only details it was able to capture from its training on data found on the internet. Instead, associate the data you added above for more complete information.

There are a variety of methods for adding a VDB. For example, from the Data tab in a Use Case:

On the resulting Create vector database page, set or confirm the following and then click Create vector database:

Setting Description Set to...
Name Change the vector database name to reflect the settings. datarobot docs --jina 20% / 256
Data source Confirm this is the data you uploaded.
Embedding model Use the recommended by model by keeping the pre-selected option. jinaai/jina-embedding-t-en-v1
Chunk overlap This value will help to maintain continuity when the DataRobot documentation is grouped into smaller chunks of text to embed in the vector database. 20%

Read more: Add a VDB, embeddings

4. Add and configure an LLM blueprint

Open the playground you created in step 1—it is from here that you access the controls for creating LLM blueprints, interacting with and fine-tuning them, and saving them for comparison and potential future deployment. From the playground, choose Create LLM blueprint and begin the configuration.

Select an LLM from the Configuration dropdown to serve as the base model. This walkthrough uses Azure OpenAI GPT-3.5 Turbo model. Optionally, modify configuration settings.

From the Vector database tab, choose the VDB you added in step 3.

From the Prompting tab, optionally add a system prompt and change context awareness. This example leaves the blueprint configuration as context-aware, the default.

Read more: LLM blueprint configuration settings, context-awareness

Save the configuration.

5. Send a prompt

Chatting is the activity of sending prompts and receiving responses from the LLM. Once you have set the configuration for your LLM, send it a prompt (from the entry box in the lower center panel). Note that starting a chat locks in the LLM selection.

For example: Write a Python program to run DataRobot Autopilot.

Read more: Chatting

6. Follow-up with additional prompts

In the playground, you can ask follow-up questions to determine if configuration refinements are needed before saving the LLM blueprint. Because the LLM is context-aware, you can reference earlier conversation history to continue the "discussion" with additional prompts.

From within the previous conversation, ask the LLM to make a change to "that code":

Confidence scores and citations reported with the response help provide a trust measure for the response.

Read more: Confidence scores

7. Test and tune

Use the configuration area to test and tune prompts until you are satisfied with the system prompt and settings. (You must save the the configuration before submitting the first prompt.) Choose Edit LLM blueprint to rename the LLM blueprint to something more easily recognizable.

Read more: Blueprint actions

8. Create a comparison LLM blueprint

Create one or more LLM blueprints to run a comparison. (You can send prompts to a single blueprint in the comparison view.) These steps create an LLM blueprint that does not use a VDB.

Click on the menu for the LLM blueprint you just built and select Copy to new LLM blueprint:

Rename the copy, for example, GPT3.5 without VDB, and remove the vector database that is assigned.

Click Save configuration.

9. Set up LLM blueprint comparison

Now, with more than one LLM blueprint in your playground, you can compare responses side-by-side. Click Comparison in the breadcrumbs:

Select up to three blueprints to compare.

Notice that the LLM blueprint information indicates the version that does not leverage a VDB.

Read more: Compare blueprints

10. Compare LLM blueprints

To start the comparison, send a prompt to both blueprints. The responses show that the blueprint with grounding knowledge from the VDB provides a more comprehensive and useful response.

Notice that the response from the LLM blueprint that uses a vector database includes a citation link. Click the link to view the text chunks that the LLM blueprint retrieved to augment the prompt sent to the LLM.

When you are satisfied with the LLM blueprint results, you can choose Send to model workshop from the LLM blueprint actions dropdown to register and ultimately deploy it.

Read more: Deploy an LLM

Updated June 25, 2024