Pipeline workspaces

A data flow pipeline is a collection of connected modules that process data and pass the output to subsequent modules for further processing. Each pipeline contains the module specifications, connections, and configurations needed to implement your machine learning data flows. The pipelines you build and execute are contained in workspaces.
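
The idea of connected modules passing output downstream can be sketched in a few lines of Python. This is a minimal illustration only, not DataRobot's implementation: the module functions, the `modules` and `connections` dictionaries, and `run_pipeline` are all hypothetical names chosen for the example.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical modules: each receives a list of upstream outputs
# and returns its own output.
def read_rows(_inputs):
    return [{"name": "a", "value": 1}, {"name": "b", "value": 2}]

def double_values(inputs):
    (rows,) = inputs
    return [{**row, "value": row["value"] * 2} for row in rows]

def collect(inputs):
    (rows,) = inputs
    return rows

# Module specifications: module name -> function that does the work.
modules = {"reader": read_rows, "transform": double_values, "writer": collect}

# Connections: module name -> names of the modules it reads from.
connections = {"reader": [], "transform": ["reader"], "writer": ["transform"]}

def run_pipeline(modules, connections):
    """Run every module after the modules it depends on."""
    outputs = {}
    for name in TopologicalSorter(connections).static_order():
        upstream = [outputs[src] for src in connections[name]]
        outputs[name] = modules[name](upstream)
    return outputs

results = run_pipeline(modules, connections)
```

Here `results["writer"]` holds the rows produced by the reader after the transform has doubled each value.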


Currently, a workspace contains a single pipeline. In the future, a workspace may support additional pipelines and related data assets.

Workspaces page

The Workspaces page lets you manage the workspaces where you will build your pipelines.

To access the Workspaces page, open the AI Catalog and click the Workspaces tab.

You can perform the following actions on the Workspaces page:

Element: Description
Add new workspace: Click to add a workspace where you will create a pipeline.
Search field: Enter a string to search for workspaces.
Tags and Owner filters: Filter the workspace list by:
  • Tags: Filter by tags that you create. Create and add tags on a workspace's Info tab.
  • Owner: Filter by the owner of a workspace.
Workspace list: Click a workspace to view its details or edit it.
Sort list: Sort by Workspace, Created, or Last updated.
Last run: Date and time of the last run and the user who executed it.
Actions menu: Choose one of the following actions:
  • Edit Workspace: Edit an individual workspace.
  • Run Now: Run the entire pipeline in a workspace.
  • Duplicate Workspace: Duplicate a workspace.
  • Delete Workspace: Delete a workspace.
Schedule: Hover over the calendar icon to view schedule information, including how often the pipeline is scheduled to run and when it will run next.

Add a workspace

To create a new workspace:

  1. In the AI Catalog, click the Workspaces tab and click Add new workspace.

  2. Name the workspace by clicking the pencil icon next to the default workspace name. Then select the type of module to add.

    You can add a module that imports data to the pipeline, transforms data in the pipeline, or exports data from the pipeline.

    Once selected, the module appears in the workspace editor.

  3. Click the new module to select it (in this case, the CSV Reader module).

    You can perform the following actions in the workspace editor:

    Element: Description
    Pencil icon: Rename your workspace.
    Open/Close tabs: Select checkboxes to show or hide tabs in the workspace editor. Select Reset layout to default to display the default layout.
    Delete: Delete the current workspace.
    Close: Close the workspace editor and return to the Workspace Info page.
    Pipeline view: To view the pipeline in the workspace editor, click Graph. To view the YAML file for the pipeline, click pipeline.yaml.
    Connections tab: Add or remove ports through which the module accepts input, produces output, and connects to other modules.
    Details tab: Specify the configuration of an individual module, for example, by defining a file path, setting credentials, or indicating a delimiter. For Spark SQL modules, this tab lets you modify your Spark SQL code.
    Edit pipeline file: Displays the pipeline.yml tab, where you can edit the YAML file that specifies the pipeline's modules and connections. Click the Graph tab to return to the workspace editor.
    + Add new module: Select a module type to add. The new module appears in the workspace. Use the workspace module tabs on the right to edit it and update its connections and configuration.
    Module tile: The top left of a module tile shows the module type (S3, SQL, etc.). The name appears to the right of the type (in this case, "CSV Reader"). The number of rows successfully processed during the module's run appears beneath the module name. After selecting a module, you can:
    • Use the Connections and Details tabs to configure and edit it.
    • Use the actions menu on the upper right of a module tile to force a run, clone the module, or remove the module from the workspace.
    • Click Run to selection above the pipeline to run the pipeline up to the selected module.
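
One way to picture Run to selection is as running the selected module together with everything upstream of it, and nothing downstream. The sketch below illustrates that reading with a hypothetical `connections` mapping and a hypothetical helper `modules_to_run`; how the product actually decides what to execute is not described here, so treat this as an assumption.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

def modules_to_run(connections, selected):
    """Return the selected module and all of its upstream modules, in run order.

    connections maps each module name to the names of the modules it reads from.
    """
    needed = set()
    stack = [selected]
    while stack:
        name = stack.pop()
        if name not in needed:
            needed.add(name)
            stack.extend(connections[name])  # walk toward the data sources
    # Order the required modules so that inputs run before their consumers.
    subgraph = {name: connections[name] for name in needed}
    return list(TopologicalSorter(subgraph).static_order())

# Hypothetical four-module pipeline.
connections = {
    "csv_reader": [],
    "clean": ["csv_reader"],
    "aggregate": ["clean"],
    "export": ["aggregate"],
}

order = modules_to_run(connections, "clean")
```

With "clean" selected, `order` contains only "csv_reader" and "clean"; the "aggregate" and "export" modules are left out.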

Duplicate a workspace

You can duplicate any existing workspace. Duplicating copies everything from the original workspace except its run schedule and history, so you can reuse existing content without affecting the configuration of the original.

  1. Click the actions menu of the workspace you want to duplicate and click Duplicate Workspace.

  2. Enter a name for the new workspace and click Duplicate.
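
The copy semantics described above (everything is copied except the run schedule and history) can be modeled as follows. The `Workspace` class and its fields are hypothetical stand-ins invented for this sketch; they are not part of any DataRobot API.

```python
import copy
from dataclasses import dataclass, field
from typing import Optional

@dataclass
class Workspace:
    """Hypothetical model of a workspace, for illustration only."""
    name: str
    pipeline: dict = field(default_factory=dict)
    schedule: Optional[str] = None
    run_history: list = field(default_factory=list)

def duplicate_workspace(original: Workspace, new_name: str) -> Workspace:
    # Deep-copy the pipeline so edits to the duplicate never touch the original.
    # The schedule and run history are deliberately left at their empty defaults.
    return Workspace(name=new_name, pipeline=copy.deepcopy(original.pipeline))

original = Workspace(
    name="Sales pipeline",
    pipeline={"modules": ["CSV Reader"], "connections": []},
    schedule="daily",
    run_history=["2022-03-10 run"],
)

duplicate = duplicate_workspace(original, "Sales pipeline (copy)")
duplicate.pipeline["modules"].append("Spark SQL")  # only the duplicate changes
```

The deep copy is the design point: a shallow copy would share the nested module list, so editing the duplicate would also change the original.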

Use tags to filter workspaces

After adding and exiting a workspace, you can add tags to help you find it on the Workspaces page:

  1. In the Info tab of the workspace, click the pencil icon next to the Tags field.

  2. Enter one or more tag names, then press Return or click outside of the field.

    Tag names cannot contain spaces or any of these special characters: -$.{}"'#

    Use an underscore (_) to join the words of a multi-word tag name.

  3. Once you've created a tag, you can use it to filter workspaces on the Workspaces page.
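
If you generate tag names programmatically, you can check them against the character rule in step 2 before entering them. The helper names below are hypothetical, and the forbidden set is copied from the characters listed in step 2 plus the space character. Rejecting an empty tag name is an extra assumption of this sketch.

```python
# Characters listed in step 2 as not allowed, plus the space character.
FORBIDDEN = set("-$.{}\"'#") | {" "}

def invalid_characters(tag):
    """Return the forbidden characters found in tag, sorted and de-duplicated."""
    return sorted({ch for ch in tag if ch in FORBIDDEN})

def is_valid_tag(tag):
    # Assumption: an empty tag name is rejected as well.
    return bool(tag) and not invalid_characters(tag)

def suggest_tag(tag):
    """Replace spaces with underscores, as step 2 recommends for multi-word tags."""
    return tag.replace(" ", "_")
```

For example, `is_valid_tag("sales data")` is false because of the space, while `suggest_tag("sales data")` returns `"sales_data"`, which passes the check.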

Updated March 11, 2022