On-premise users: click in-app to access the full platform documentation for your version of DataRobot.

Suggested first steps

This page gets you started with two step-by-step tutorials, one for generative AI and one for predictive AI. Everything you need to complete each tutorial is included, so after finishing one, come back and try the other.

Both tutorials are suitable for all levels, regardless of whether the technology is new to you.

Availability information

DataRobot's Generative AI capabilities are a premium feature; contact your DataRobot representative for enablement information. However, you can try this functionality for yourself in a limited capacity in the DataRobot trial experience.

Invest an hour in completing these tutorials now to save time later: you will learn where to find information and how to fix avoidable mistakes.

Exercise 1: Generative AI playground

In the GenAI walkthrough, you will spend 20 minutes creating a Generative AI pipeline that starts with raw documents and ends with multiple chatbot options, from which you can deploy the best one.

Specifically, you will break up pages of technical documentation into chunks and store those chunks as vectors for easy retrieval. Next, you'll add and configure a large language model (LLM), which you can then chat with. Finally, you'll compare how different LLMs respond to natural language questions about the documents in your vector database.
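The chunk-and-retrieve steps described above can be illustrated with a minimal, standard-library-only sketch. This is not DataRobot's implementation: the function names (`chunk_text`, `embed`, `build_index`, `retrieve`) are hypothetical, and the word-count "embedding" is a deliberately simple stand-in for a real embedding model, used here only to show how chunks become vectors that can be ranked against a question. The LLM step is omitted.

```python
import math
import re
from collections import Counter


def chunk_text(text, chunk_size=40, overlap=10):
    """Split text into overlapping chunks of up to chunk_size words."""
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be non-negative and smaller than chunk_size")
    words = text.split()
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        if start + chunk_size >= len(words):
            break  # the last chunk already reaches the end of the text
    return chunks


def embed(text):
    """Toy embedding (assumption): a sparse vector of lowercase word counts."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(count * b[word] for word, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def build_index(documents, chunk_size=40, overlap=10):
    """Chunk every document and store each chunk next to its vector."""
    index = []
    for doc in documents:
        for chunk in chunk_text(doc, chunk_size, overlap):
            index.append((chunk, embed(chunk)))
    return index


def retrieve(index, question, top_k=2):
    """Return the top_k chunks most similar to the question."""
    query = embed(question)
    ranked = sorted(index, key=lambda item: cosine(query, item[1]), reverse=True)
    return [chunk for chunk, _ in ranked[:top_k]]


if __name__ == "__main__":
    # Illustrative sentences only, not taken from the demo data.
    docs = [
        "A deployment serves predictions from a model through an API endpoint.",
        "A vector database stores text chunks as vectors for retrieval.",
    ]
    index = build_index(docs, chunk_size=8, overlap=2)
    print(retrieve(index, "Where are text chunks stored?", top_k=1))
```

In a retrieval-augmented setup, the retrieved chunks would typically be passed to the LLM along with the user's question. The overlap between chunks is one common way to reduce the chance that a relevant passage is split across a chunk boundary.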

Download demo data

Source:
The provided ZIP file contains hundreds of pages of DataRobot product documentation. This is an example of how a company can build a chat agent that lets users ask detailed questions about its product. In your case, the product might be a refrigerator, a sound mixing board, or a piece of advanced software.

Exercise 2: End-to-end predictive AI

After completing this tutorial in roughly 40 minutes, you'll have completed more work than a typical machine learning team can achieve in a week.

This exercise is broken into two parts:

| Do this... | To accomplish this... |
|---|---|
| Build walkthrough | Prepare data, build multiple models, then evaluate and compare the performance. |
| Operate and govern walkthrough | Deploy the "best" model to an API endpoint, make batch predictions, and review deployment monitoring metrics. |

These datasets come pre-loaded in DataRobot Trial accounts. Other DataRobot users can download them here:

Download training data
Download scoring data

Source:

The Hospital Readmissions sample data comes from a study of 70,000 inpatients with diabetes published in BioMed Research International. The researchers collected this data from the Health Facts database provided by Cerner Corporation, which is a collection of clinical records across providers in the United States. Health Facts allows organizations that use Cerner’s electronic health system to voluntarily make their data available for research purposes. All the data was cleansed of PII in compliance with HIPAA.

Ready for more?

Visit the sample assets page for ready-to-use sample files and accompanying tutorials organized by problem type.


Updated July 30, 2024