On-premise users: click in-app to access the full platform documentation for your version of DataRobot.

Sample assets

Learn DataRobot faster using these sample datasets. Some assets come with full tutorials, so you can work through them yourself, step by step. Datasets are organized by problem type.

Datasets for building

Generative

| Name | Description | Usage | Asset link(s) | Learn more |
|---|---|---|---|---|
| Space station research | ZIP file of the space station research papers and a CSV of evaluation prompts. | Retrieval Augmented Generation | Download .zip | Video, Walkthrough |
| Medical Research Abstracts | ZIP file containing individual text files. Each text file is the abstract of a medical research paper. | Retrieval Augmented Generation | Download .zip | AI Accelerator |
| Technical Documentation | ZIP file containing the technical documentation for DataRobot as of late 2023. | Retrieval Augmented Generation | Download .zip | Walkthrough |
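
Before uploading one of these archives, it can help to check what it contains. The following is a minimal sketch using only the Python standard library, assuming the archive holds UTF-8 `.txt` files as described for the Medical Research Abstracts asset. The helper names (`read_text_files`, `build_demo_archive`) and the file names inside the demo archive are hypothetical; the demo archive is built in memory so the sketch runs without the real download.

```python
import io
import zipfile


def read_text_files(archive):
    """Return {member name: text} for every .txt member of a ZIP archive."""
    documents = {}
    with zipfile.ZipFile(archive) as zf:
        for info in zf.infolist():
            # Skip directory entries and anything that is not a .txt file.
            if info.is_dir() or not info.filename.lower().endswith(".txt"):
                continue
            # Assumption: files are UTF-8; undecodable bytes are replaced.
            documents[info.filename] = zf.read(info).decode("utf-8", errors="replace")
    return documents


def build_demo_archive():
    """Create a tiny in-memory ZIP that stands in for the real download."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as zf:
        zf.writestr("abstract_001.txt", "Placeholder abstract one.")
        zf.writestr("abstract_002.txt", "Placeholder abstract two.")
        zf.writestr("README.md", "Not an abstract.")
    buffer.seek(0)
    return buffer


if __name__ == "__main__":
    docs = read_text_files(build_demo_archive())
    for name, text in sorted(docs.items()):
        print(name, len(text))
```

To inspect a downloaded file instead, pass its local path to `read_text_files` in place of the demo archive.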

Time series

| Name | Description | Usage | Features | Asset link(s) | Learn more |
|---|---|---|---|---|---|
| Car Sales, GUI and Code | Monthly sales volume for many vehicle makes and models with additional contextual variables. | Multiseries, Multivariate time series | Numeric | Short and fuller versions of data; a Python notebook | Video, Walkthrough |
| Demand forecasting by SKU by store | Weekly units sold by Store and SKU for 50 products grouped into categories. | SKU-Level Demand Forecasting | Numeric, Categorical | Training File, Scoring File, Calendar File | AI Accelerator |

Regression

| Name | Description | Usage | Features | Asset link(s) | Learn more |
|---|---|---|---|---|---|
| Fuel Efficiency | Predict the miles per gallon (MPG) based on other vehicle attributes. | Regression | Numeric | Training Data | DR University Lab |
| Wine Quality | Predict the quality score for white wines based on chemical composition. | Regression | Numeric | Training Data, Scoring File | DR University Lab |
| Developer Salaries | Predict salaries from developer skills, using the Stack Overflow Developer Survey 2019. | Regression | Numeric, Categorical, Text | Training Data | DR University Lab |

Classification

| Name | Description | Usage | Features | Asset link(s) | Learn more |
|---|---|---|---|---|---|
| Hospital Readmissions | Predict whether a patient will be 'readmitted' to the hospital after being discharged. | Binary Classification | Numeric, Categorical, Text | Training Data | Walkthrough |
| Loan Approvals | Predict whether a loan 'is_bad' based on information provided on an application. | Binary Classification | Numeric, Categorical, Text | Training Data, Scoring File | DR University Lab |
| Flight Delays | Predict whether an airline departure will be delayed by 30 minutes or more. | Binary Classification | Numeric, Categorical | Training, Scoring | AI Accelerator |

Multiclass / multilabel classification

These projects can only be completed in DataRobot Classic.

| Name | Description | Usage | Features | Asset link(s) | Learn more |
|---|---|---|---|---|---|
| Plant Disease | ZIP file with several hundred images of plant leaves organized into folders by disease class. | Multiclass | Images | Download | |
| Apparel Multilabel | Pictures of clothing that fit into multiple categories, such as both 'blue' and 'dress'. | Multilabel | Images | Download | |
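
For an image archive organized into folders by class, such as Plant Disease, the class balance can be checked by counting images per folder. The sketch below is illustrative only: it assumes each image's immediate parent folder is its class label, and the function names, folder names, and accepted file extensions are hypothetical rather than taken from the actual download. The demo archive is built in memory with empty placeholder files.

```python
import io
import zipfile
from collections import Counter
from pathlib import PurePosixPath

# Assumption: images use one of these extensions.
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png"}


def count_images_per_class(archive):
    """Count image files per parent folder, treating the folder as the class."""
    counts = Counter()
    with zipfile.ZipFile(archive) as zf:
        for info in zf.infolist():
            path = PurePosixPath(info.filename)
            if info.is_dir() or path.suffix.lower() not in IMAGE_SUFFIXES:
                continue
            if len(path.parts) < 2:
                continue  # an image at the archive root has no class folder
            counts[path.parent.name] += 1
    return counts


def build_demo_image_archive():
    """Create a tiny in-memory ZIP with hypothetical class folders."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as zf:
        zf.writestr("healthy/leaf_1.jpg", b"")
        zf.writestr("healthy/leaf_2.jpg", b"")
        zf.writestr("rust/leaf_3.jpg", b"")
        zf.writestr("notes.txt", b"not an image")
    buffer.seek(0)
    return buffer


if __name__ == "__main__":
    for label, n in sorted(count_images_per_class(build_demo_image_archive()).items()):
        print(label, n)
```

A heavily skewed count for one folder is worth knowing about before starting a multiclass project.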

Updated July 15, 2024