# Prepare data

> Prepare data - Apply transformations to an external data source, such as a Snowflake dataset,
> creating a recipe that can then be published to generate a new output dataset.

This Markdown file sits beside the HTML page at the same path (with a `.md` suffix). It summarizes the topic and lists links for tools and LLM context.

Companion generated at `2026-05-06T18:17:10.054428+00:00` (UTC).

## Primary page

- [Prepare data](https://docs.datarobot.com/en/docs/workbench/nxt-workbench/dataprep/wrangle-data/index.html): Full documentation for this topic (HTML).

## Related documentation

- [NextGen UI documentation](https://docs.datarobot.com/en/docs/workbench/index.html): Linked from this page.
- [Workbench](https://docs.datarobot.com/en/docs/workbench/nxt-workbench/index.html): Linked from this page.
- [Data preparation](https://docs.datarobot.com/en/docs/workbench/nxt-workbench/dataprep/index.html): Linked from this page.
- [selecting a dataset from a data connection](https://docs.datarobot.com/en/docs/workbench/nxt-workbench/dataprep/add-data/connect.html#select-a-dataset): Linked from this page.
- [Data assets tile](https://docs.datarobot.com/en/docs/workbench/nxt-workbench/dataprep/explore-data/index.html): Linked from this page.
- [associated considerations](https://docs.datarobot.com/en/docs/workbench/nxt-workbench/dataprep/data-faq/index.html#considerations): Linked from this page.
- [Wrangler](https://docs.datarobot.com/en/docs/workbench/nxt-workbench/dataprep/wrangle-data/build-recipe/index.html): Linked from this page.
- [SQL Editor](https://docs.datarobot.com/en/docs/workbench/nxt-workbench/dataprep/wrangle-data/sql-editor.html): Linked from this page.
- [Publish a recipe](https://docs.datarobot.com/en/docs/workbench/nxt-workbench/dataprep/wrangle-data/pub-recipe.html): Linked from this page.
- [Supported data stores](https://docs.datarobot.com/en/docs/reference/data-ref/data-sources/index.html): Linked from this page.
- [Wrangling large Snowflake datasets](https://docs.datarobot.com/en/docs/reference/data-ref/wrangle-snowflake.html): Linked from this page.

## Documentation content

DataRobot's wrangling capabilities provide a seamless, scalable, and secure way to access and transform data for modeling. In Workbench, "wrangle" is a visual interface for executing data cleaning at the source, whether that source is the Data Registry in DataRobot or an external data source whose compute environment and distributed architecture you can leverage.

Why wrangle data in DataRobot?

- It's fully integrated in Workbench: find the right datasets, apply transformations, and see the effects of those transformations on your dataset in real time, all in one place.
- It's pushed down: when using a data connection, leverage the scale of your cloud data warehouse or lake.
- It's secure: limiting data movement means faster results, better performance, and enhanced security.

You can launch the data wrangler from the following areas in a Use Case:

- When selecting a dataset from a data connection, click Open in Wrangler in the top-right corner.
- On the Data assets tile, from the Actions menu next to a dataset.
- On the data explore page, from the Data actions dropdown.

When you wrangle a dataset, DataRobot pulls a uniform random sample of 10,000 rows and calculates exploratory data insights on that sample, all while connected to your data source. You then build a recipe of operations to apply to the entire dataset; the transformations are first applied to the live sample so you can verify that they behave as intended. When the recipe is ready to be published, it's pushed down to the data source, where it's executed to materialize an output dataset.
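The sample-then-publish flow described above can be sketched in plain Python. This is only a conceptual illustration, not DataRobot's implementation or API: the helper names (`uniform_sample`, `apply_recipe`), the two operations, and the in-memory dataset are all hypothetical, and the real product executes the published recipe inside the data source rather than in local Python.

```python
import random


def uniform_sample(rows, k, seed=0):
    """Return a uniform random sample of at most k rows, without replacement."""
    rng = random.Random(seed)
    if len(rows) <= k:
        return list(rows)
    return rng.sample(rows, k)


def apply_recipe(rows, recipe):
    """Apply each operation in order; an operation maps a list of rows to a new list."""
    for operation in recipe:
        rows = operation(rows)
    return rows


# Hypothetical operations, invented for this sketch.
def drop_missing_amount(rows):
    return [r for r in rows if r["amount"] is not None]


def add_amount_usd(rows):
    return [{**r, "amount_usd": r["amount"] / 100} for r in rows]


recipe = [drop_missing_amount, add_amount_usd]

# Stand-in for the source table: every tenth row has a missing amount.
full_dataset = [
    {"id": i, "amount": None if i % 10 == 0 else i * 100}
    for i in range(50_000)
]

# 1. Preview: the recipe runs on a 10,000-row sample for fast feedback.
sample = uniform_sample(full_dataset, 10_000)
preview = apply_recipe(sample, recipe)

# 2. Publish: the same recipe runs on the entire dataset to produce the output.
output = apply_recipe(full_dataset, recipe)
```

Because the recipe is an ordered list of operations that is independent of the rows it runs on, the same steps can be checked on the sample and then applied unchanged to the full dataset.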

DataRobot provides two different tools for wrangling data:

- Wrangler: A GUI-based tool that allows you to build a recipe from operations, each of which applies a specific transformation to the dataset.
- SQL Editor: A tool that allows you to build a recipe using SQL queries.
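To make the idea of a SQL-based recipe concrete, here is a minimal sketch that uses Python's built-in `sqlite3` module as a stand-in for an external data source. The table names (`orders`, `orders_prepared`), the columns, and the query are assumptions made up for illustration; this is not the SQL Editor itself, and the SQL dialect of your actual data source may differ.

```python
import sqlite3

# An in-memory SQLite database stands in for the external data source.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER, region TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [(1, "east", 10.0), (2, "west", None), (3, "east", 5.5), (4, "west", 20.0)],
)

# A hypothetical recipe expressed as one SQL query:
# drop rows with a missing amount, then aggregate by region.
recipe_sql = """
SELECT region, COUNT(*) AS order_count, SUM(amount) AS total_amount
FROM orders
WHERE amount IS NOT NULL
GROUP BY region
"""

# "Publishing" in this sketch: the query runs inside the database and its
# result is materialized as a new table, so no rows leave the data source.
conn.execute(f"CREATE TABLE orders_prepared AS {recipe_sql}")

result = conn.execute(
    "SELECT region, order_count, total_amount FROM orders_prepared ORDER BY region"
).fetchall()
print(result)
```

The point of the sketch is the shape of the workflow: the transformation logic lives in the query, and the output dataset is created where the data already resides.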

See the [associated considerations](https://docs.datarobot.com/en/docs/workbench/nxt-workbench/dataprep/data-faq/index.html#considerations) for important information about wrangling data in DataRobot.

This section covers the following topics:

| Topic | Description |
| --- | --- |
| Wrangler | Use Wrangler to build a recipe of one or more operations that allow you to interactively prepare data for modeling without moving it from your data source. |
| SQL Editor | Use the SQL Editor to create a recipe composed of SQL queries that you can then publish to your data source to generate an output dataset. |
| Publish a recipe | Publish a recipe to push down transformations to your data source and generate an output dataset. |
| **Reference** |  |
| Associated considerations | Important additional information for working with wrangling. |
| Supported data stores | A complete list of supported data stores. |
| Wrangling large Snowflake datasets | Tips for improving the performance of wrangling in Snowflake. |
