Skip to content

On-premise users: click in-app to access the full platform documentation for your version of DataRobot.

GenAI basic walkthrough

This walkthrough shows you how to work with DataRobot's GenAI, where you can generate text content using a variety of pre-trained large language models (LLMs). Or, the content can be tailored to your domain-specific data by building vector databases and leveraging them in the LLM prompts.

In this walkthrough, you will:

  1. Create a playground.
  2. Create a vector database.
  3. Build and then compare LLM blueprints.

Prerequisites

Download the following demo dataset based on the DataRobot documentation to try out the GenAI functionality:

Download demo data

1. Add a playground

A playground, another type of Use Case asset, is the space for creating and interacting with LLM blueprints, comparing the response of each to determine which to use in production. In Workbench, create a Use Case and then add a playground via the Playgrounds tab or the Add dropdown:

Read more: Create a Use Case, Add a playground

2. Add internal data for the vector database (VDB)

To tailor results using domain-specific data, assign data to an LLM blueprint for leverage during RAG operations. (You can also test LLM blueprint responses without any grounding knowledge provided by a vector database. To do so, skip ahead to step 4 to configure the LLM blueprint.)

You can add data directly to a Use Case either from a local file or data connections or from the AI Catalog. This walkthrough adds the data directly.

Add datarobot_english_documentation_5th_December.zip, the data you downloaded above, to the Use Case you created.

Read more: Add internal data, import from the AI Catalog

3. Add the vector database

If you do not add a VDB, prompt responses are not augmented with relevant DataRobot documentation; the LLM response contains only details it was able to capture from its training on data found on the internet. Instead, associate the data you added above for more complete information.

There are a variety of methods for adding a VDB. For example, from the Data tab in a Use Case:

On the resulting Create vector database page, set or confirm the following and then click Create vector database:

Setting Description Set to...
Name Change the vector database name to reflect the settings. datarobot docs - jina 20% / 256
Data source Confirm this is the data you uploaded. datarobot_english_documentation_5th_December.zip
Embedding model Use the recommended by model by keeping the pre-selected option. jinaai/jina-embedding-t-en-v1
Chunk overlap This value will help to maintain continuity when the DataRobot documentation is grouped into smaller chunks of text to embed in the vector database. 20%

Read more: Add a VDB, embeddings

4. Add and configure an LLM blueprint

Open the playground you created in step 1—it is from here that you access the controls for creating LLM blueprints, interacting with and fine-tuning them, and saving them for comparison and potential future deployment. Rename your blueprint to something you will recognize, and select an LLM from the Configuration dropdown to serve as the base model. This walkthrough uses Azure OpenAI GPT-3.5 Turbo model.

Optionally, modify configuration settings and/or add a system prompt, and select the vector database.

Read more: LLM blueprint configuration settings

5. Send a prompt

Chatting is the activity of sending prompts and receiving responses from the LLM. Once you have set the configuration for your LLM, send it a prompt (from the entry box in the lower center panel). Note that starting a chat locks in the LLM selection.

For example: Write a Python program to run DataRobot Autopilot.

Read more: Chatting

6. Follow-up with additional prompts

In the playground, you can ask follow-up questions to determine if configuration refinements are needed before saving the LLM blueprint. Because the LLM is context-aware, you can reference earlier conversation history to continue the "discussion" with additional prompts.

From within the previous conversation, ask the LLM to make a change to "that code":

Confidence scores and citations reported with the response help provide a trust measure for the response.

Read more: Confidence scores

7. Save the draft

Use the playground to test and tune prompts until you are satisfied with the system prompt and settings. Then, click Save as LLM blueprint either from the bottom of the right-hand panel or the menu. Saving the blueprint allows you to compare it to other blueprint setups, but be aware that once you save, you set the configuration for that blueprint. (To make modifications after saving, click Copy to new draft.)

Read more: Blueprint actions

8. Create a comparison LLM blueprint

You must create one or more additional LLM blueprints to run a comparison. These steps create an LLM blueprint that does not use a VDB.

Click on the menu for the LLM blueprint you just built and select Copy to new draft:

Rename the copy, for example, Chat with docs -- no VDB, and remove the vector database that is assigned.

Click Save as LLM blueprint.

9. Set up LLM blueprint comparison

Now, with more than one LLM blueprint in your playground, you can compare responses side-by-side. Click the Comparison tab and select up to three blueprints to compare.

Notice that the blueprint information indicates the version that does not leverage a VDB.

Read more: Compare blueprints

10. Compare LLM blueprints

To start the comparison, send a prompt to both blueprints. The responses show that the blueprint with grounding knowledge from the VDB provides a more comprehensive and useful response.

Notice that the response from the LLM blueprint that uses a vector database includes a citation link. Click the link to view the text chunks that the blueprint retrieved to augment the prompt sent to the LLM.

When you are satisfied with the LLM blueprint results, you can choose Send to model workshop to register and ultimately deploy it.

Read more: Deploy an LLM


Updated April 26, 2024