Export data

Once processing is complete, the goal is often to export the data to a storage location. Export modules write data from the pipeline to an external destination as a CSV file.

See the section on data processing limits for each module type.

AI Catalog Export module

The AI Catalog Export module exports the data from the pipeline as a snapshotted dataset in the AI Catalog. This lets you combine data from multiple sources, clean them, and create a dataset that can be used for training models.

In the Details tab, select an export type, either Create new dataset or Update existing dataset. The sections below describe the configuration options for each export type.

To create a new dataset and export it to the AI Catalog, select Create new dataset and use the following options to configure the AI Catalog Export module:

| Option | Description |
|--------|-------------|
| Dataset name | Specify the name for the dataset you're creating in the AI Catalog. |
| Description | Provide a description of the dataset. |
| Request timeout (seconds) | Specify the number of seconds before the request times out. DataRobot recommends adjusting this value only if the module encounters timeout errors, in which case you can increase it to account for network abnormalities. |
| Tags | Enter tags to apply to the dataset. You can then filter on these tags in the AI Catalog to find your dataset. |
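For orientation, the options above can be collected into a simple configuration sketch. Note that the keys below are illustrative only: they mirror the UI labels and do not correspond to any actual DataRobot API schema.

```python
# Illustrative sketch only: these keys mirror the UI options above and are
# NOT an actual DataRobot API schema.
export_config = {
    "export_type": "create_new_dataset",   # vs. "update_existing_dataset"
    "dataset_name": "cleaned_sales_data",  # name shown in the AI Catalog
    "description": "Sales records joined and cleaned in the pipeline",
    "request_timeout_seconds": 600,        # raise only if timeout errors occur
    "tags": ["sales", "cleaned"],          # filterable in the AI Catalog
}

for key, value in export_config.items():
    print(f"{key} = {value}")
```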

To update an existing AI Catalog dataset as a snapshot, select Update existing dataset and use the following options to configure the AI Catalog Export module:

| Option | Description |
|--------|-------------|
| Dataset | Select a dataset in the AI Catalog to update. |
| Request timeout (seconds) | Specify the number of seconds before the request times out. DataRobot recommends adjusting this value only if the module encounters timeout errors, in which case you can increase it to account for network abnormalities. |

CSV Writer module

The CSV Writer module exports the data from the pipeline to an AWS S3 location of your choice in a character-delimited format.

Use the following options to configure a CSV Writer module in the Details tab:

| Option | Description |
|--------|-------------|
| File path | Enter the path to the delimited text file, including the bucket name. |
| S3 Credentials | Use existing credentials from your profile's “Credential Management” section, or create a new set of credentials by providing the Access Key, Secret Key, and AWS Session Token. |
| AWS Region | Enter the region where the S3 bucket is located. The default is us-east-1. |
| Include header | Select to include a header row. |
| Overwrite existing file | Select to overwrite an existing file at the file path. |
| Double quote all strings | Select to enclose all string values in double quotes. |
| Encoding | Specify the character encoding for the data. The default is UTF-8. |
| Delimiter | Specify the field delimiter. The default is a comma. |
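The formatting options above are standard delimited-text choices. A minimal sketch of their effect using Python's stdlib `csv` module (the CSV Writer module itself writes to S3; this sketch writes to an in-memory buffer purely to illustrate the options):

```python
import csv
import io

# Sample rows; the pipeline would supply the processed data instead.
rows = [
    {"id": "1", "name": "alpha"},
    {"id": "2", "name": "beta"},
]

buffer = io.StringIO()
writer = csv.DictWriter(
    buffer,
    fieldnames=["id", "name"],
    delimiter=",",           # Delimiter: comma (the default)
    quoting=csv.QUOTE_ALL,   # Double quote all strings (quotes every field)
)
writer.writeheader()         # Include header: emit a header row first
writer.writerows(rows)

# The bytes actually written to storage would use the chosen Encoding,
# e.g. UTF-8 (the default).
payload = buffer.getvalue().encode("utf-8")
print(payload.decode("utf-8"))
```

With `quoting=csv.QUOTE_ALL`, every field is wrapped in double quotes, so the first line written is `"id","name"`.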

Updated April 19, 2022