Data Prep library

On the Library page, you can add new datasets and manage existing datasets, including Data Prep AnswerSets that you publish from your projects. In the library, you can also export datasets, set them up for automation, add new versions, create profiles for your datasets, and view any warnings or errors that occurred when a dataset was imported.

The following tables describe the library layout, as well as the actions you take in the library to work with your datasets.

Library layout

The following table describes the sections of the Library page.

Section Description
Library tabs Select a tab:
  • Datasets: Manage datasets.
  • Export Logs: View the logs generated during exports.
  • Data Sources: Add new data sources. Import and export from data sources that have been added to the library.
Add a new dataset To add a new dataset to your library, click Datasets + on the top left. On the Select Datasets page, import one or more datasets by selecting data sources or local datasets. If there are any errors during the import, a red warning icon displays adjacent to the dataset's listing on the page. For more information on any errors, mouse over the dataset's name and click Edit Details.
Search for a dataset on the page To search for a dataset in your library, click the magnifying glass icon on the top right. In the field that displays, begin typing the name of the dataset you want to locate. Potential matches display as you continue to type.
Filter the datasets displayed on the page Filter the list of datasets by categories such as version, creation time, and owner.
Sort columns The Library page lists the filtered datasets. The library columns provide attributes for each dataset, including the type, version, status, number of rows, tags, and when and by whom the dataset was created.
Dataset actions You can perform actions on the datasets such as creating projects and exporting datasets. To delete a dataset, click the red X icon to the left of the dataset.

Library filters

Use the filters at the top of the page to filter the list of datasets displayed on the page.

The following table describes your options for filtering datasets.

Filter Description
Show Versions Toggle to display all versions of every dataset and AnswerSet or only the latest version of each.
Creation Time Select the last seven or 30 days.
Ready for use? This filter only displays when interactive mode is enabled for your Data Prep projects. It allows you to quickly see which datasets have finished loading their interactive portions and are ready for use in a project.
Completed? Displays all datasets that have successfully finished importing into the library.
Owner Displays the datasets and AnswerSets that you have imported and created from Data Prep projects.
Data Source Click this field to display all of the data sources used to import datasets. You can select more than one data source by continuing to click and select.
Tags Tags are descriptive words that enable you to organize your datasets. Click the Tags field to display all of the tags currently assigned to datasets in the library. To locate a dataset by a specific tag, type the tag name and press Enter. If you add multiple tags, only the datasets that contain all of the search tags are returned as matches. To add a new tag for a dataset, hover over the dataset and click inside the add tags field for that dataset.
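
The matching rule for multiple tags can be illustrated with a short sketch. This is not Data Prep code or a Data Prep API: the dataset names, the tags, and the `filter_by_tags` helper are all hypothetical. The sketch only models the rule that a dataset is returned when it carries every search tag.

```python
# Hypothetical in-memory model of library entries; not a Data Prep API.
datasets = {
    "sales_2021": {"finance", "quarterly"},
    "customers": {"crm"},
    "sales_forecast": {"finance", "quarterly", "forecast"},
}


def filter_by_tags(entries, search_tags):
    """Return the names of entries whose tags include every search tag."""
    wanted = set(search_tags)
    # A dataset matches only when the search tags are a subset of its tags.
    return sorted(name for name, tags in entries.items() if wanted <= tags)


print(filter_by_tags(datasets, ["finance"]))
print(filter_by_tags(datasets, ["finance", "forecast"]))
```

In this model, adding a second search tag can only narrow the result list, never widen it.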

Library columns

You can create a new column display order for the Library page by clicking a column's header and dragging it to a new location. To sort the library list by a particular column, click that column name. To sort on multiple columns, hold Shift and click additional columns.

Note

When you reorder items on the Library page, the changes are temporary and not retained when you leave the Library page or refresh your browser.
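
Sorting on multiple columns can be pictured as sorting on a compound key. The sketch below is illustrative only: the rows and the `sort_rows` helper are hypothetical, and it assumes that the first column you click acts as the primary sort key and each additional Shift-clicked column breaks ties, which the page itself does not state.

```python
from operator import itemgetter

# Hypothetical library rows; the column names are assumptions for illustration.
rows = [
    {"name": "orders", "type": "AnswerSet", "version": 2},
    {"name": "customers", "type": "Dataset", "version": 1},
    {"name": "accounts", "type": "AnswerSet", "version": 5},
]


def sort_rows(entries, columns):
    """Sort by the first column, using later columns to break ties."""
    # itemgetter with several names returns a tuple, which sorts left to right.
    return sorted(entries, key=itemgetter(*columns))


for row in sort_rows(rows, ["type", "name"]):
    print(row["type"], row["name"])
```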

The following table describes the columns on the Library page.

Column Description
Name Displays the name of the dataset when it was imported into the library. To change the name, hover over More Actions for the dataset and click edit details. The General page for the dataset displays. You can change the dataset name and update metadata fields.
Type Allows you to quickly identify which datasets in your library are AnswerSets that were created from Data Prep projects. If the interactive mode feature is enabled for your Data Prep projects, AnswerSets are represented with the partial icon to indicate that you were working in interactive mode when they were created.
Version # Displays the number of versions for each dataset or AnswerSet. If there is more than one version, click All Versions for the dataset to view all versions. To return to viewing all datasets, click All Datasets on the top left.

Keep in mind that if you filter by tags while viewing the All Versions page, the filter applies only to the All Versions page and not to the entire Library page.

A version number does not necessarily equal the number of versions of that dataset that are stored in the library.

A version number will not match the number of stored versions under the following conditions:
  • When an import is canceled before it completes, a version number is still generated for it, and each subsequent import receives the next version number in sequence.
  • When a particular version of a dataset is deleted, the version numbers for the remaining datasets are not decremented.
Status Describes a dataset's load status as it's being imported into the library. In most cases, the status quickly progresses to "completed." However, for larger datasets, you will see interim states that indicate that your dataset is continuing to successfully import. The interim states you may see also depend on:
  • Whether the row count of the dataset can be predetermined prior to import: In most cases, Data Prep knows the number of rows in a dataset before the import process even begins. However, there are cases where the count cannot be predetermined—for example, imports from Salesforce and queries on JDBC data sources.
  • Whether interactive mode is enabled for your projects: When interactive mode is enabled, you'll notice the status icon has two concentric circles. The inner circle represents the interactive portion of your dataset. When the interactive portion is ready to be used in a project, the inner circle becomes a green check mark. The outer circle will then begin to fill green as the remainder of the dataset continues to load into the library. If any errors occur while importing the interactive portion or the remainder, a red warning icon displays in the respective concentric circle to indicate which part of the dataset failed to import into the library. See Loading states for examples of loading states. See Failure states for examples of the failure states.
You may see a "Pending" state in this column if you did not finish selecting the parsing options for the dataset. In this case, you will also see a Click to Finish button in the Created column. Click the button to open the Import page and finish the import.
# of Rows Displays the number of rows in a dataset. You can preview rows from a dataset by hovering over the dataset and clicking the show preview link that displays in this column. While a dataset is being imported and its row count is predetermined, the number displayed in this column continues to increase until the import is finished. If the dataset fails to import successfully, the number of rows that were imported successfully is listed in this column. In this case, show preview displays a preview of those rows.
Tags Tags are labels that you can add to your datasets to help organize your data. To add tags to a dataset, click in the Tags column for that dataset, type a tag name, and then click the Add link that displays or press Enter.
Created Displays the user who imported the dataset and when it was imported. You may see a Click to Finish link in the column. This indicates the import was never initiated because the parse options were not finalized. Click this link to return to the Import page and finish the import process for the dataset.
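
The version-numbering behavior described for the Version # column can be modeled with a small sketch. The `DatasetVersions` class is hypothetical and is not part of Data Prep; it only reproduces the two stated rules: a canceled import still consumes a version number, and deleting a version does not renumber the remaining ones.

```python
class DatasetVersions:
    """Hypothetical model of version numbering for one dataset."""

    def __init__(self):
        self._next_number = 1  # next version number to hand out
        self.stored = []       # version numbers actually kept in the library

    def start_import(self, canceled=False):
        """Assign the next version number, even if the import is canceled."""
        number = self._next_number
        self._next_number += 1
        if not canceled:
            self.stored.append(number)
        return number

    def delete(self, number):
        """Remove one version; the remaining numbers stay unchanged."""
        self.stored.remove(number)


history = DatasetVersions()
history.start_import()               # version 1 is stored
history.start_import(canceled=True)  # version 2 is consumed but not stored
history.start_import()               # version 3 is stored
latest = history.start_import()      # version 4 is stored
history.delete(1)

print(latest, history.stored)
```

In this model the latest version number is 4 while only two versions remain stored, which is the kind of mismatch the conditions above describe.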

Loading states

Icon Description
Icon displayed when interactive mode is not enabled and row count can be predetermined.
Icon displayed when interactive mode is not enabled and row count cannot be predetermined.
Icon displayed when interactive mode is enabled and row count can be predetermined.
Icon displayed when interactive mode is enabled and row count cannot be predetermined.
Icon displayed when loading is complete.

Failure states

Icon Description
Interactive mode not enabled: Dataset failed to import.
Interactive mode: Interactive portion did not successfully import.
Interactive mode: Interactive portion imported successfully, but the remainder of the dataset failed to import.

Actions you can take for a dataset

When you hover over a dataset, three links appear that give you access to the actions available for that dataset.

The following table describes the actions you can perform on a dataset on the Library page.

Action Description
Create Project Create a new project using the dataset as your base dataset.
Export Export or download a dataset locally.
More Actions Provides additional options, depending on the features that are enabled for your Data Prep application:
  • edit details: Opens the dataset's General page. This is where you can update the dataset's name and metadata. This is also the page where you view warnings or errors that may have occurred during import. Datasets with warnings or errors are easy to locate in the list—they are flagged with a warning icon adjacent to the dataset name, the row color for the dataset is red, and the Status icon indicates a failure state.
  • add version: Add a new version of the existing dataset without overwriting the current version.
  • automate: Automate the dataset (if automation feature is enabled).
  • profile: Profile the dataset (if profiling feature is enabled).
  • open source: For any AnswerSet, open a project at the precise Step from which that AnswerSet was created.

Updated October 28, 2021