Suggested first steps
Get started with a step-by-step generative or predictive AI walkthrough. Everything you need to complete these walkthroughs is included, so after completing one, come back and try the other.
Note
Both tutorials are suitable for all levels, regardless of whether the technology is new to you.
| Exercise | Estimated completion time |
|---|---|
| Generative AI (GenAI) playground | 20 minutes |
| End-to-end predictive AI (in two parts) | 40 minutes |
Availability information
DataRobot's Generative AI capabilities are a premium feature; contact your DataRobot representative for enablement information. However, you can try this functionality in a limited capacity by using the DataRobot trial experience.
Exercise 1: Generative AI playground
In the GenAI walkthrough, you will create a GenAI pipeline, starting with raw documents and ending with multiple chatbot options from which you can deploy the best one.
Specifically, you will break up pages of technical documentation into chunks and store those chunks as vectors for easy retrieval. Next, you'll add and configure a large language model (LLM), which you can then chat with. Finally, you'll compare how different LLMs respond to natural language questions about the documents in your vector database.
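To make the chunk-and-retrieve idea concrete, here is a minimal, standard-library-only sketch. It is not DataRobot's implementation: the function names (`chunk_text`, `embed`, `build_index`, `retrieve`) are hypothetical, and the bag-of-words vector is an assumed stand-in for the learned embedding model a real vector database would use.

```python
import math
import re
from collections import Counter


def chunk_text(text, chunk_size=40, overlap=10):
    """Split text into overlapping chunks of at most chunk_size words."""
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    words = text.split()
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        if start + chunk_size >= len(words):
            break  # the last chunk already reaches the end of the text
    return chunks


def embed(text):
    """Toy 'embedding' (assumption): a sparse word-count vector."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[token] * b[token] for token in a if token in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def build_index(chunks):
    """Store each chunk next to its vector (a stand-in for a vector database)."""
    return [(chunk, embed(chunk)) for chunk in chunks]


def retrieve(index, question, top_k=2):
    """Return the top_k chunks most similar to the question."""
    query = embed(question)
    ranked = sorted(index, key=lambda item: cosine(query, item[1]), reverse=True)
    return [chunk for chunk, _ in ranked[:top_k]]
```

In this sketch, you would call `build_index(chunk_text(document))` once per document and then `retrieve(index, question)` for each question; the returned chunks are the context that would be passed to the LLM alongside the question. The overlap between chunks is a common design choice so that a sentence split at a chunk boundary still appears whole in at least one chunk.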
Source:
The provided ZIP file contains hundreds of pages of DataRobot product documentation. This is an example of how a company can build a chat agent that lets users ask detailed questions about its product. In your case, the product might be a refrigerator, a sound mixing board, or a piece of advanced software.
Exercise 2: End-to-end predictive AI
By the end of this tutorial, you'll have done more work in under an hour than a typical machine learning team can complete in a week.
This exercise is broken into two parts:
| Do this walkthrough | To... |
|---|---|
| Build | Prepare data, build multiple models, then evaluate and compare the performance. |
| Operate and govern | Deploy the "best" model to an API endpoint, make batch predictions, and review deployment monitoring metrics. |
These datasets come pre-loaded in DataRobot Trial accounts. Other DataRobot users can download them here:
Download training data
Download scoring data
Source:
The Hospital Readmissions sample data comes from a study of 70,000 inpatients with diabetes, published in BioMed Research International. The researchers collected this data from the Health Facts database provided by Cerner Corporation, which is a collection of clinical records across providers in the United States. Health Facts allows organizations that use Cerner’s electronic health system to voluntarily make their data available for research purposes. All the data was cleansed of PII in compliance with HIPAA.
Ready for more?
Visit the sample assets page for ready-to-use sample files and accompanying tutorials organized by problem type. Review the Walkthrough section for step-by-step exercises designed for learning various features of the platform.