Prep your data¶
To prep your data using DataRobot Data Prep, you start by importing your data. You can import a local dataset or you can connect to an external data source. This quickstart walks you through importing a local dataset.
To complete the quickstart, first log in to DataRobot Data Prep. Once you are logged in, complete these steps:
- Add data to your library.
- Start a project.
- Prep your data in a project.
- Publish your data as an AnswerSet—a snapshot of your prepped data.
- Export your prepped data.
Add data to your Data Prep library¶
In this quickstart, you import a local dataset into your library. You can also import data directly into a project, or import data from external data sources. To learn about these other import options, see Work with datasets.
In DataRobot Data Prep, select Library on the top left.
On the top of the Library page, click + import.
Click Upload local file.
Browse for the file or drag the file to the drag-and-drop area.
Check the preview of the dataset in the lower right.
If your data looks correct, click Finish on the top right.
Data Prep imports your dataset into the library and you can begin prepping it.
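Before uploading, it can help to look at the first few rows of the local file yourself, much like the preview shown during import. The following is a minimal sketch that runs outside Data Prep using Python's standard csv module. The function name `preview_csv` is hypothetical, and the sketch assumes the local file is a comma-delimited, UTF-8 file with a header row, which may not match your dataset.

```python
import csv
import sys
from itertools import islice


def preview_csv(path, max_rows=5):
    """Return the header and the first few data rows of a CSV file."""
    # newline="" is the csv module's recommended way to open files.
    with open(path, newline="", encoding="utf-8") as handle:
        reader = csv.reader(handle)
        header = next(reader, [])  # empty list if the file has no rows
        rows = list(islice(reader, max_rows))
    return header, rows


if __name__ == "__main__" and len(sys.argv) > 1:
    # Hypothetical usage: python preview.py path/to/your_file.csv
    header, rows = preview_csv(sys.argv[1])
    print("Columns:", header)
    for row in rows:
        print(row)
```

If the columns or values look wrong here, they will likely look wrong in the import preview as well, so it is worth fixing the file first.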
Start a Data Prep project¶
You can start a new project from:
- The Library page, where you select the dataset you want to use as the starting point for your project.
- The Projects page, where you start with an empty project and then add your data to it.
Start a new project from the library¶
Select Library on the top left.
Locate the dataset you uploaded and click Create Project.
In the Start a new Project dialog, enter the project Name and an optional Description.
Start a new project from the Projects page¶
Rather than starting a project from the library, you can instead start from the Projects page:
Select Projects on the top left.
On the top of the Projects page, click + add.
Enter the project Name and an optional Description.
Click Save and Open.
Prep your data¶
Once you have started your project, you can begin prepping your data on the project preparation page.
Use the project Tools bar on the left to clean and transform your data.
See Work with project tools for detailed instructions.
Select the menu at the top of each column to apply column operations.
See Work with column data for detailed instructions.
To view, rearrange, and mute your data prep steps, use the Steps tool.
See Work with steps for detailed instructions.
Publish an AnswerSet¶
When you're ready to save and share the data you prepped, you can publish it to the library as an AnswerSet. An AnswerSet is like a dataset, except that it is the published result of your data prep. Once published, you can reuse the AnswerSet in other projects or export it to share with other applications.
To publish an AnswerSet for a project:
Click Steps in the Tools bar.
The Steps pane opens.
Click the step you want to publish an AnswerSet from.
Data Prep defaults to the last step in the project, which is the step at the top.
At the top of the Steps pane, click Publish.
The Publish AnswerSet to Library window appears.
Enter a name for the AnswerSet in the Name field and an optional Description, then click Publish.
Data Prep publishes the AnswerSet to the library. The "Publishing AnswerSet" message appears.
Click Show in Library to view the AnswerSet in the library.
The AnswerSet includes the steps up to and including the step you selected.
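One way to picture this behavior is to treat a project as an ordered list of steps, where publishing from a selected step applies every step up to and including it. The sketch below is a conceptual model only and does not reflect how Data Prep is implemented. The `Step` class, the `publish_from` function, and the example steps are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

Row = Dict[str, str]


@dataclass
class Step:
    """Hypothetical stand-in for one data prep step."""
    name: str
    apply: Callable[[List[Row]], List[Row]]


def publish_from(rows: List[Row], steps: List[Step], selected: int) -> List[Row]:
    """Apply steps[0..selected] in order and return the resulting snapshot."""
    if not 0 <= selected < len(steps):
        raise IndexError("selected step is out of range")
    result = [dict(row) for row in rows]  # copy so the source rows stay unchanged
    for step in steps[: selected + 1]:
        result = step.apply(result)
    return result


# Example steps, oldest first (the Steps pane lists the last step at the top).
steps = [
    Step("trim names", lambda rs: [{**r, "name": r["name"].strip()} for r in rs]),
    Step("drop empty names", lambda rs: [r for r in rs if r["name"]]),
    Step("uppercase names", lambda rs: [{**r, "name": r["name"].upper()} for r in rs]),
]

data = [{"id": "1", "name": " ada "}, {"id": "2", "name": "  "}]

# Selecting the second step includes steps 1 and 2, but not step 3.
snapshot = publish_from(data, steps, selected=1)
print(snapshot)
```

In this model, selecting an earlier step yields a snapshot that leaves out the later transformations, which mirrors why the default selection is the last step.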
Export your prepped data¶
You can export datasets and AnswerSets locally or to a connected data source. These steps show how to download a local copy of a previously published AnswerSet.
On the Library page, hover over the AnswerSet you want to export and click Export.
On the Exporting page, click Download locally.
On the Export Settings page, click Export.
The AnswerSet is downloaded to your computer as a CSV file. The Export Logs page appears.
See Export datasets for details.
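After the download finishes, you may want to confirm that the exported CSV looks the way you expect. The following is a minimal sketch using only Python's standard library and is independent of Data Prep. The function name `summarize_export` is hypothetical, and the sketch assumes the file is UTF-8 encoded with a header row.

```python
import csv
from collections import Counter


def summarize_export(path):
    """Return (row count, empty-cell count per column) for a CSV file."""
    empty_cells = Counter()
    row_count = 0
    with open(path, newline="", encoding="utf-8") as handle:
        reader = csv.DictReader(handle)
        for row in reader:
            row_count += 1
            for column, value in row.items():
                # Skip overflow cells (key None); short rows yield None values,
                # which are counted as empty along with blank strings.
                if column is not None and (value is None or value.strip() == ""):
                    empty_cells[column] += 1
    return row_count, dict(empty_cells)
```

Calling `summarize_export` with the path to the downloaded file returns the number of data rows and, for each column that has any, the number of empty cells. This gives a quick check that steps such as filtering or filling values had the intended effect.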