Tour the basics of Data Prep

In this topic, you'll take a look at the main components of the Data Prep application.

Library

The Data Prep library is where you:

  • Add and manage datasets.
  • Publish AnswerSets—your prepped datasets.
  • Set up datasets for automation.
  • Add new versions of your datasets.
  • Create profiles for your datasets.
  • View warnings or errors that occur when you import datasets.

After you import a dataset into the library, you can begin prepping your data in a project. When you finish prepping your data, you can publish it back to the library as an AnswerSet (a published dataset).

For a deeper look, see Data Prep library.

Projects

The Projects page lists all projects that you have permission to view.

After you add a dataset to a project, you can explore the dataset and clean, transform, or combine it with other data.

You can publish your prepared data to the library as an AnswerSet, which you can export or use in another project.

For a deeper look, see the Data Prep projects page.

Project preparation

You open a project by clicking it on the Projects page or by starting a new project from the library. Once you open a project, you can begin to prep your data.

Element | Description
Tools bar | On the left side of the project, the Tools bar provides the tools you use to prep your data.
Steps tool | The Steps tool saves each operation you perform so that you can replay, mute, and rearrange the steps.
Data Preview pane | Your data displays in the Data Preview pane.
Column operations menu | Above each column, a menu provides the operations you can apply to that column.
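
The idea behind the Steps tool (a recorded history of operations that can be replayed, muted, and rearranged) can be illustrated with a small conceptual sketch. This is not Data Prep's implementation or API; the `Step` class, the `replay` and `move` functions, and the sample rows are all hypothetical names invented for illustration.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, List

Row = Dict[str, Any]


@dataclass
class Step:
    """One recorded operation (hypothetical model, not a Data Prep class)."""
    name: str
    apply: Callable[[List[Row]], List[Row]]
    muted: bool = False


def replay(steps: List[Step], rows: List[Row]) -> List[Row]:
    """Apply every unmuted step, in order, to a copy of the rows."""
    result = [dict(r) for r in rows]
    for step in steps:
        if not step.muted:
            result = step.apply(result)
    return result


def move(steps: List[Step], old: int, new: int) -> None:
    """Rearrange the history by moving one step to a new position."""
    steps.insert(new, steps.pop(old))


# Illustrative sample data (assumed, not from the product).
raw = [{"city": " boston ", "sales": 10}, {"city": "Austin", "sales": None}]

steps = [
    Step("trim city", lambda rows: [{**r, "city": r["city"].strip()} for r in rows]),
    Step("title-case city", lambda rows: [{**r, "city": r["city"].title()} for r in rows]),
    Step("drop empty sales", lambda rows: [r for r in rows if r["sales"] is not None]),
]

cleaned = replay(steps, raw)      # all three steps applied in order

steps[2].muted = True             # mute the last step; it stays in the history
with_muted = replay(steps, raw)   # the row with empty sales is kept this time

move(steps, 2, 0)                 # rearrange: move the muted step to the front
```

Because the source rows are never modified and each replay starts from them, muting or reordering a step changes only the replayed result, which is the property that makes a step history safe to experiment with.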

For a deeper look, see the Data Prep project preparation page.

Data

You can import data from local files on your computer or from connected data sources. Your Data Prep system administrator must configure the data sources before you can import from them. Some examples of connected data sources are:

  • Cloud storage like Amazon S3
  • The Hadoop Distributed File System (HDFS)
  • Relational databases like MySQL
  • Secure File Transfer Protocol (SFTP)

Data Prep navigation

The Data Prep header provides navigation, help, and account management functions:

Element | Description
Navigation menu | Navigate between the Data Prep pages:
  • Library: Access your imported and published data.
  • Projects: Prepare your data.
  • Admin: Make connections to data sources and control users’ permissions.
  • Project Flows: Automate data prep processes.
  Note: The pages available to each user are based on the user's permissions.
Notification icon | Indicates when Data Prep generates a warning or error. If the icon is highlighted, hover over it to view the message.
Help | Get Data Prep help.
User menu | Access account-specific options like updating your password or logging out. You can also generate tokens that are used to manage application access and authorization. Your Data Prep system administrator will let you know when you need to generate tokens.
Search | Page-specific search. For example, on the Library page you can search for datasets, and on the Projects page you can search for projects.

Updated October 28, 2021