Sample assets
Learn DataRobot faster with these sample datasets, organized by problem type. For some of them, full tutorials are available so that you can work through the assets yourself, step by step.
Datasets for building
Generative
Name | Description | Problem type | Asset link(s) | Learn more |
---|---|---|---|---|
Space station research | ZIP file of the space station research papers and a CSV of evaluation prompts. | Retrieval Augmented Generation | Download .zip | Video Walkthrough |
Medical Research Abstracts | ZIP file containing individual text files. Each text file is the abstract of a medical research paper. | Retrieval Augmented Generation | Download .zip | AI Accelerator |
Technical Documentation | ZIP file containing the technical documentation for DataRobot as of late 2023. | Retrieval Augmented Generation | Download .zip | Walkthrough |
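The generative datasets above are distributed as ZIP archives of documents. As a minimal sketch of how such an archive could be read into memory before further processing, the following uses only the Python standard library. The function names, the `.txt` suffix filter, and the file names in the demo archive are illustrative assumptions, not the actual contents or layout of the DataRobot downloads.

```python
import io
import zipfile


def read_text_documents(zip_source, suffix=".txt", encoding="utf-8"):
    """Return {member_name: text} for every file in the archive ending in `suffix`."""
    documents = {}
    with zipfile.ZipFile(zip_source) as archive:
        for info in archive.infolist():
            # Skip directory entries and files with a different extension.
            if info.is_dir() or not info.filename.lower().endswith(suffix):
                continue
            raw = archive.read(info)
            documents[info.filename] = raw.decode(encoding, errors="replace")
    return documents


def build_demo_archive():
    """Create a tiny in-memory ZIP standing in for a downloaded dataset (hypothetical names)."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr("abstracts/paper_001.txt", "Hypothetical abstract one.")
        archive.writestr("abstracts/paper_002.txt", "Hypothetical abstract two.")
        archive.writestr("README.md", "Not an abstract.")
    buffer.seek(0)
    return buffer


if __name__ == "__main__":
    docs = read_text_documents(build_demo_archive())
    for name, text in sorted(docs.items()):
        print(name, len(text))
```

To try this on a real download, pass the path of the saved `.zip` file to `read_text_documents` instead of the demo buffer, and adjust `suffix` if the archive holds a different file type.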
Time series
Name | Description | Problem type | Features | Asset link(s) | Learn more |
---|---|---|---|---|---|
Car Sales, GUI and Code | Monthly sales volume for many vehicle makes and models, with additional contextual variables. | Multiseries, Multivariate time series | Numeric | Short and fuller versions of the data; a Python notebook | Video Walkthrough |
Demand forecasting by SKU by store | Weekly units sold by store and SKU for 50 products grouped into categories. | SKU-Level Demand Forecasting | Numeric, Categorical | Training File, Scoring File, Calendar File | AI Accelerator |
Regression
Name | Description | Problem type | Features | Asset link(s) | Learn more |
---|---|---|---|---|---|
Fuel Efficiency | Predict miles per gallon (MPG) from other vehicle attributes. | Regression | Numeric | Training Data | DR University Lab |
Wine Quality | Predict the quality score for white wines based on chemical composition. | Regression | Numeric | Training Data, Scoring File | DR University Lab |
Developer Salaries | Predict salaries from developer skills, using data from the Stack Overflow Developer Survey 2019. | Regression | Numeric, Categorical, Text | Training Data | DR University Lab |
Classification
Name | Description | Problem type | Features | Asset link(s) | Learn more |
---|---|---|---|---|---|
Hospital Readmissions | Predict whether a patient will be 'readmitted' to the hospital after being discharged. | Binary Classification | Numeric, Categorical, Text | Training Data | Walkthrough |
Loan Approvals | Predict whether a loan 'is_bad' based on information provided on an application. | Binary Classification | Numeric, Categorical, Text | Training Data, Scoring File | DR University Lab |
Flight Delays | Predict whether an airline departure will be delayed by 30 minutes or more. | Binary Classification | Numeric, Categorical | Training, Scoring | AI Accelerator |
Multiclass / multilabel classification
These projects can only be completed in DataRobot Classic.
Name | Description | Problem type | Features | Asset link(s) | Learn more |
---|---|---|---|---|---|
Plant Disease | ZIP file with several hundred images of plant leaves organized into folders by disease class. | Multiclass | Images | Download | — |
Apparel Multilabel | Pictures of clothing that fit into multiple categories, such as both 'blue' and 'dress'. | Multilabel | Images | Download | — |
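The Plant Disease archive is described as images organized into folders by disease class. As a rough sketch of how such a folder-per-class layout could be inspected before uploading, the following counts images per immediate parent folder using only the Python standard library. The folder names, file names, and the set of image extensions are assumptions made for illustration; the actual archive layout may differ, and the sketch does not apply to multilabel data, where one image carries several labels.

```python
import io
import zipfile
from collections import Counter
from pathlib import PurePosixPath

# Assumed image extensions; extend this set if the archive uses other formats.
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png"}


def count_images_per_class(zip_source):
    """Count image files per immediate parent folder, treated as the class label."""
    counts = Counter()
    with zipfile.ZipFile(zip_source) as archive:
        for info in archive.infolist():
            path = PurePosixPath(info.filename)
            if info.is_dir() or path.suffix.lower() not in IMAGE_SUFFIXES:
                continue
            if len(path.parts) < 2:
                continue  # an image at the archive root has no class folder
            counts[path.parent.name] += 1
    return counts


def build_demo_image_archive():
    """Create a tiny in-memory ZIP with a hypothetical folder-per-class layout."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr("plants/healthy/leaf_1.jpg", b"fake image bytes")
        archive.writestr("plants/rust/leaf_2.jpg", b"fake image bytes")
        archive.writestr("plants/rust/leaf_3.png", b"fake image bytes")
        archive.writestr("plants/notes.txt", b"not an image")
        archive.writestr("cover.jpg", b"image without a class folder")
    buffer.seek(0)
    return buffer


if __name__ == "__main__":
    for label, n in sorted(count_images_per_class(build_demo_image_archive()).items()):
        print(label, n)
```

A per-class count like this is a quick way to spot empty or heavily imbalanced classes before starting a multiclass project.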
Updated July 15, 2024