# Add wrangling operations

> Add wrangling operations - Add transformations that will be applied to the source data to prepare it
> for modeling.

This Markdown file sits beside the HTML page at the same path (with a `.md` suffix). It summarizes the topic and lists links for tools and LLM context.

Companion generated at `2026-05-06T18:17:10.053212+00:00` (UTC).

## Primary page

- [Add wrangling operations](https://docs.datarobot.com/en/docs/workbench/nxt-workbench/dataprep/wrangle-data/build-recipe/add-operation.html): Full documentation for this topic (HTML).

## Sections on this page

- [Join](https://docs.datarobot.com/en/docs/workbench/nxt-workbench/dataprep/wrangle-data/build-recipe/add-operation.html#join): In-page section heading.
- [Aggregate](https://docs.datarobot.com/en/docs/workbench/nxt-workbench/dataprep/wrangle-data/build-recipe/add-operation.html#aggregate): In-page section heading.
- [Filter row](https://docs.datarobot.com/en/docs/workbench/nxt-workbench/dataprep/wrangle-data/build-recipe/add-operation.html#filter-row): In-page section heading.
- [De-duplicate rows](https://docs.datarobot.com/en/docs/workbench/nxt-workbench/dataprep/wrangle-data/build-recipe/add-operation.html#de-duplicate-rows): In-page section heading.
- [Find and replace](https://docs.datarobot.com/en/docs/workbench/nxt-workbench/dataprep/wrangle-data/build-recipe/add-operation.html#find-and-replace): In-page section heading.
- [Compute a new feature](https://docs.datarobot.com/en/docs/workbench/nxt-workbench/dataprep/wrangle-data/build-recipe/add-operation.html#compute-a-new-feature): In-page section heading.
- [Rename features](https://docs.datarobot.com/en/docs/workbench/nxt-workbench/dataprep/wrangle-data/build-recipe/add-operation.html#rename-features): In-page section heading.
- [Remove features](https://docs.datarobot.com/en/docs/workbench/nxt-workbench/dataprep/wrangle-data/build-recipe/add-operation.html#remove-features): In-page section heading.
- [Time-aware operations](https://docs.datarobot.com/en/docs/workbench/nxt-workbench/dataprep/wrangle-data/build-recipe/add-operation.html#time-aware-operations): In-page section heading.
- [Operation actions](https://docs.datarobot.com/en/docs/workbench/nxt-workbench/dataprep/wrangle-data/build-recipe/add-operation.html#operation-actions): In-page section heading.
- [Preview up to this operation](https://docs.datarobot.com/en/docs/workbench/nxt-workbench/dataprep/wrangle-data/build-recipe/add-operation.html#preview-up-to-this-operation): In-page section heading.
- [Reorder operations](https://docs.datarobot.com/en/docs/workbench/nxt-workbench/dataprep/wrangle-data/build-recipe/add-operation.html#reorder-operations): In-page section heading.
- [Read more](https://docs.datarobot.com/en/docs/workbench/nxt-workbench/dataprep/wrangle-data/build-recipe/add-operation.html#read-more): In-page section heading.

## Related documentation

- [NextGen UI documentation](https://docs.datarobot.com/en/docs/workbench/index.html): Linked from this page.
- [Workbench](https://docs.datarobot.com/en/docs/workbench/nxt-workbench/index.html): Linked from this page.
- [Data preparation](https://docs.datarobot.com/en/docs/workbench/nxt-workbench/dataprep/index.html): Linked from this page.
- [Prepare data](https://docs.datarobot.com/en/docs/workbench/nxt-workbench/dataprep/wrangle-data/index.html): Linked from this page.
- [Wrangler](https://docs.datarobot.com/en/docs/workbench/nxt-workbench/dataprep/wrangle-data/build-recipe/index.html): Linked from this page.
- [view which queries were executed](https://docs.datarobot.com/en/docs/workbench/nxt-workbench/dataprep/explore-data/index.html#view-wrangling-recipe-sql): Linked from this page.
- [Derive time series features](https://docs.datarobot.com/en/docs/workbench/nxt-workbench/dataprep/wrangle-data/build-recipe/ts-wrangling.html#derive-time-series-features): Linked from this page.
- [publishing phase](https://docs.datarobot.com/en/docs/workbench/nxt-workbench/dataprep/wrangle-data/pub-recipe.html#configure-smart-downsampling): Linked from this page.
- [Description of summary statistics and histograms in DataRobot Classic.](https://docs.datarobot.com/en/docs/classic-ui/data/analyze-data/histogram.html): Linked from this page.

## Documentation content

A recipe is composed of operations—transformations that will be applied to the source data to prepare it for modeling. Note that operations are applied sequentially, so you may need to [reorder the operations](https://docs.datarobot.com/en/docs/workbench/nxt-workbench/dataprep/wrangle-data/build-recipe/add-operation.html#reorder-operations) in your recipe to achieve the desired result.

**Operation behavior**

When a wrangling recipe is pushed down to the connected cloud data platform, the operations are executed in that platform's environment. To understand how operations behave, refer to the documentation for your data platform:

- Snowflake documentation
- BigQuery documentation
- Databricks documentation

Once the dataset is materialized in DataRobot and added to your Use Case, you can go to the Data tab and [view which queries were executed](https://docs.datarobot.com/en/docs/workbench/nxt-workbench/dataprep/explore-data/index.html#view-wrangling-recipe-sql) by the cloud data platform during push down.

The table below describes the wrangling operations currently available in Workbench:

| Operation | Description |
| --- | --- |
| Join | Join datasets that are accessible via the same connection instance. |
| Aggregate | Apply mathematical aggregations to features in your dataset. |
| Filter row | Filter the rows in your dataset according to specified value(s) and conditions. |
| De-duplicate rows | Automatically remove all duplicate rows from your dataset. |
| Find and replace | Replace specific feature values in a dataset. |
| Compute new feature | Create a new feature using scalar subqueries, scalar functions, or window functions. |
| Rename features | Change the name of one or more features in your dataset. |
| Remove features | Remove one or more features from your dataset. |
| Derive time series features | Create customized feature engineering for time series experiments. |
| Lag features | Create one or more lags for a feature based on the ordering feature. |
| Derive rolling statistics (numeric) | Apply statistical methods to create rolling statistics for a numeric feature. |
| Derive rolling statistics (categorical) | Create rolling statistics for a categorical feature. |

**Q: Can I perform majority class downsampling for unbalanced datasets?**

Yes, you can enable majority class downsampling during the [publishing phase](https://docs.datarobot.com/en/docs/workbench/nxt-workbench/dataprep/wrangle-data/pub-recipe.html#configure-smart-downsampling) of wrangling. In Workbench, downsampling happens in-source and a sampling weight is generated. The target and weights are then passed along to the experiment.
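
As a rough illustration of what downsampling with a sampling weight means, the sketch below keeps a random fraction of the majority class and attaches a weight that compensates for the dropped rows. This is only an assumption about the general technique, not a description of how DataRobot or the data platform implements it; the function name, the `weight` column, and the `keep_fraction` parameter are hypothetical.

```python
import random

def downsample_majority(rows, target, majority_value, keep_fraction, seed=0):
    """Keep a fraction of majority-class rows and attach a compensating weight.

    Hypothetical helper: names and the weighting rule are assumptions for
    illustration, not DataRobot's implementation.
    """
    majority = [r for r in rows if r[target] == majority_value]
    minority = [r for r in rows if r[target] != majority_value]
    if not majority:
        return [dict(r, weight=1.0) for r in rows]

    # Keep at least one majority row; sample without replacement.
    n_keep = max(1, round(len(majority) * keep_fraction))
    sampled = random.Random(seed).sample(majority, n_keep)

    # Each kept majority row stands in for len(majority) / n_keep original rows.
    weight = len(majority) / n_keep
    return ([dict(r, weight=weight) for r in sampled]
            + [dict(r, weight=1.0) for r in minority])

# 8 majority rows (target 0) and 2 minority rows (target 1).
data = [{"id": i, "churn": 0} for i in range(8)] + [{"id": i, "churn": 1} for i in range(8, 10)]
balanced = downsample_majority(data, target="churn", majority_value=0, keep_fraction=0.5)
```

In this sketch the weights of the kept rows add up to the original row count, which is why the weights need to travel with the target to the experiment.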

To add an operation to your recipe:

1. In the right panel, either click + Add operation to add individual transformations to your recipe, or Import recipe to import an existing recipe.
2. Continue adding operations while analyzing their effect on the live sample. The live sample updates after DataRobot retrieves a new sample from the data source and applies the operation, allowing you to review the transformation in real time.
3. When you're done, you can publish the recipe.

### Join

Use the Join operation to combine datasets that are accessible via the same connection instance.

To join a table or dataset:

1. Click Join in the right panel.
2. Click + Select dataset to browse and select a dataset from your connection instance.
3. Once you've opened and profiled the dataset you want to add, click Select.
4. Select the appropriate Join type from the dropdown.
5. Select the Join condition, which defines how the two datasets are related. In this example, both datasets are related by `order_id`.
6. (Optional) If you populate the field below Prefix for features, all features added from the right dataset are marked with the specified prefix in the resulting dataset after the datasets are combined.
7. Click Add to recipe.
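
To make the effect of the join condition and the feature prefix concrete, here is a minimal Python sketch of a left join on `order_id`. It is not how the operation is executed (the join runs as SQL on your data platform); the `left_join` helper, the sample rows, and the `ship_` prefix are hypothetical, and the sketch assumes the prefix is simply prepended to each right-side feature name.

```python
def left_join(left, right, key, prefix=""):
    """Left-join two lists of row dicts on `key`, prefixing right-side features.

    Hypothetical helper for illustration only.
    """
    # Index right-side rows by the join key; one key may match several rows.
    index = {}
    for row in right:
        index.setdefault(row[key], []).append(row)
    right_cols = [c for c in (right[0] if right else {}) if c != key]

    joined = []
    for row in left:
        # A left row without a match is kept, with nulls for right-side features.
        for match in index.get(row[key], [None]):
            out = dict(row)
            for col in right_cols:
                out[prefix + col] = match[col] if match else None
            joined.append(out)
    return joined

orders = [{"order_id": 1, "amount": 20.0}, {"order_id": 2, "amount": 35.5}]
shipments = [{"order_id": 1, "carrier": "ACME"}]
result = left_join(orders, shipments, key="order_id", prefix="ship_")
```

Here `result` contains both orders; the second has `ship_carrier` set to `None` because no shipment matches it. Other join types differ in which unmatched rows they keep.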

### Aggregate

Use the Aggregate operation to apply the following mathematical aggregations to the dataset (available aggregations vary by feature type):

- Sum
- Min
- Max
- Median
- Avg
- Standard deviation
- Count
- Count distinct
- Most frequent (Snowflake only)

To add an aggregation:

1. Click Aggregate in the right panel.
2. Fill in the available fields:
3. (Optional) To apply aggregations to additional features in this grouping, click + Add feature.
4. Click Add to recipe. After adding the operation to the recipe, DataRobot renames aggregated features using the original name with the `_AggregationFunction` suffix attached. In this example, the new columns are `age_max` and `age_most_frequent`.
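
The naming pattern described above can be sketched in plain Python. This is an illustration under stated assumptions, not the pushed-down SQL: the `aggregate` and `most_frequent` helpers, the `city` grouping feature, and the sample rows are hypothetical.

```python
from collections import Counter, defaultdict

def most_frequent(values):
    """Return the most common value (ties resolve to the first one seen)."""
    return Counter(values).most_common(1)[0][0]

def aggregate(rows, group_by, feature, functions):
    """Group rows by `group_by` and aggregate `feature` with each function.

    Output features are named `<feature>_<function name>`.
    Hypothetical helper for illustration only.
    """
    groups = defaultdict(list)
    for row in rows:
        groups[row[group_by]].append(row[feature])

    result = []
    for key, values in groups.items():
        out = {group_by: key}
        for name, fn in functions.items():
            out[f"{feature}_{name}"] = fn(values)
        result.append(out)
    return result

rows = [
    {"city": "Austin", "age": 30},
    {"city": "Austin", "age": 30},
    {"city": "Austin", "age": 45},
    {"city": "Boston", "age": 52},
]
summary = aggregate(rows, "city", "age", {"max": max, "most_frequent": most_frequent})
```

Each output row carries the grouping feature plus `age_max` and `age_most_frequent`, mirroring the column names mentioned in the step above.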

### Filter row

Use the Filter row operation to filter the rows in your dataset according to specified value(s) and conditions.

To filter rows:

1. Click Filter row in the right panel.
2. Decide if you want to keep the rows that match the defined conditions or exclude them.
3. Choose the feature you want to filter. To do so, click inside the first field below Choose condition and select a feature from the dropdown.
4. In the dropdown below the feature, choose a condition type from the following options:

   | Condition type | Description |
   | --- | --- |
   | Equals | Return rows that are the same as the specified value or feature. |
   | Not equals | Return rows that are not the same as the specified value or feature. |
   | Less than | Return rows that are less than the specified value or feature. |
   | Less than or equals | Return rows that are either less than or equal to the specified value or feature. |
   | Greater than | Return rows that are greater than the specified value or feature. |
   | Greater than or equals | Return rows that are either greater than or equal to the specified value or feature. |
   | Is null | Return all rows that are null. |
   | Is not null | Return all rows that are not null. |
   | Between | Return a range between one value or feature and another value or feature. |
   | Contains | Return rows that contain the specified value or feature. |

5. Below the condition type, select either Value or Feature. Note that this step is not required for some condition types.
6. (Optional) Click Add condition to define additional filtering criteria.
7. Click Add to recipe.
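
The condition types and the keep/exclude choice can be sketched as a small Python predicate. This is a simplified illustration against a literal value only; the helper names are hypothetical, and the null handling and inclusive `between` bounds are assumptions of this sketch, since actual behavior is defined by your data platform.

```python
import operator

# Comparison condition types mapped to Python operators.
COMPARISONS = {
    "equals": operator.eq,
    "not_equals": operator.ne,
    "less_than": operator.lt,
    "less_than_or_equals": operator.le,
    "greater_than": operator.gt,
    "greater_than_or_equals": operator.ge,
}

def matches(row, feature, condition, value=None):
    """Return True if the row's feature satisfies the condition."""
    cell = row[feature]
    if condition == "is_null":
        return cell is None
    if condition == "is_not_null":
        return cell is not None
    if cell is None:
        return False  # assumption: a null cell never matches a comparison
    if condition == "between":
        low, high = value
        return low <= cell <= high  # assumption: both bounds are inclusive
    if condition == "contains":
        return value in cell
    return COMPARISONS[condition](cell, value)

def filter_rows(rows, feature, condition, value=None, keep=True):
    """Keep (keep=True) or exclude (keep=False) the rows that match."""
    return [r for r in rows if matches(r, feature, condition, value) == keep]

people = [{"age": 25}, {"age": 40}, {"age": None}]
over_30 = filter_rows(people, "age", "greater_than", 30)
non_null = filter_rows(people, "age", "is_null", keep=False)
```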

### De-duplicate rows

To de-duplicate rows, click De-duplicate rows in the right panel. This operation is immediately added to your recipe and applied to the live sample, removing all rows with duplicate information.
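
Conceptually, this is similar to keeping one copy of each distinct row. The sketch below shows that idea in Python; the `deduplicate` helper is hypothetical and assumes a row counts as a duplicate only when every feature value is identical.

```python
def deduplicate(rows):
    """Return rows with exact duplicates removed, keeping the first occurrence."""
    seen = set()
    unique = []
    for row in rows:
        # Build a hashable fingerprint from the (feature, value) pairs.
        fingerprint = tuple(sorted(row.items()))
        if fingerprint not in seen:
            seen.add(fingerprint)
            unique.append(row)
    return unique

rows = [
    {"id": 1, "city": "Austin"},
    {"id": 1, "city": "Austin"},  # exact duplicate, removed
    {"id": 2, "city": "Austin"},  # differs in one feature, kept
]
unique_rows = deduplicate(rows)
```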

### Find and replace

Use the Find and replace operation to quickly replace specific feature values in a dataset. This is helpful to, for example, fix typos in a dataset.

To find and replace a feature value:

1. Click Find and replace in the right panel.
2. Under Select feature, click the dropdown and choose the feature that contains the value you want to replace. DataRobot highlights the selected column.
3. Under Find, choose the match criteria (Exact, Partial, or Regular Expression) and enter the feature value you want to replace. Then, under Replace, enter the new value.
4. Click Add to recipe.
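
The three match criteria can be compared with a short Python sketch. The `find_and_replace` helper is hypothetical, and its semantics are assumptions: exact replaces a value only when the whole value matches, partial replaces just the matching substring, and regular expressions use Python's `re` syntax, which may differ from your data platform's dialect.

```python
import re

def find_and_replace(values, find, replace, mode="exact"):
    """Replace values in a feature using exact, partial, or regex matching."""
    result = []
    for value in values:
        if mode == "exact":
            # Replace only when the entire value equals the search term.
            result.append(replace if value == find else value)
        elif mode == "partial":
            # Replace every occurrence of the search term inside the value.
            result.append(value.replace(find, replace))
        elif mode == "regex":
            # Replace every match of the pattern inside the value.
            result.append(re.sub(find, replace, value))
        else:
            raise ValueError(f"unknown mode: {mode}")
    return result

exact = find_and_replace(["Nwe York", "Boston"], "Nwe York", "New York")
partial = find_and_replace(["Nwe York City"], "Nwe", "New", mode="partial")
by_regex = find_and_replace(["NY-001", "NY-42"], r"^NY-0*", "NY-", mode="regex")
```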

### Compute a new feature

Use the Compute new feature operation to create a new output feature from existing features in your dataset. By applying domain knowledge, you can create features that do a better job of representing your business problem to the model than those in the original dataset.

To compute a new feature:

1. Click Compute new feature in the right panel.
2. Enter a name for the new feature, and under Expression, define the feature using scalar subqueries, scalar functions, or window functions for your chosen cloud data platform:

   - Snowflake: See the Snowflake documentation for scalar subqueries, scalar functions, and window functions.
   - BigQuery: See the BigQuery documentation for scalar subqueries, scalar functions, and window functions.
   - Databricks: See the Databricks documentation for scalar subqueries, scalar functions, and window functions.
   - Spark SQL: See the Spark SQL documentation for scalar functions and window functions.

   This example uses `REGEXP_SUBSTR` to extract the first number from the `[<age_range_start> - <age_range_end>)` value in the `age` column, and `to_number` to convert the output from a string to a number.

   **Expression formatting**

   For guidance on how to format your Compute new feature expressions, see the Expression field, which provides an example based on your data connection.

3. Click Add to recipe.
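
The logic of the example in step 2 (extract the first number from an age range, then convert it to a number) can be sketched in Python. The exact SQL expression is not shown on this page; on Snowflake it might look something like `TO_NUMBER(REGEXP_SUBSTR(age, '[0-9]+'))`, but treat that as an assumption and check the Expression field for your connection. The `age_range_start` helper below is hypothetical.

```python
import re

def age_range_start(age):
    """Return the first number in an age range such as '[40 - 50)'.

    Mirrors the REGEXP_SUBSTR + to_number idea: find the first run of
    digits, then convert it from a string to an integer.
    """
    match = re.search(r"[0-9]+", age)
    return int(match.group()) if match else None

rows = [{"age": "[40 - 50)"}, {"age": "[0 - 10)"}]
computed = [dict(row, age_start=age_range_start(row["age"])) for row in rows]
```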

### Rename features

Use the Rename features operation to rename one or more features in the dataset.

To rename features:

1. Click Rename features in the right panel.

   **Rename specific features from the live sample**

   Alternatively, you can click the Actions menu next to the feature you want to rename. This opens the operation parameters in the right panel with the feature field already filled in.

2. Under Feature name, click inside the first field and choose the feature you want to rename. Then, enter the new feature name in the second field.
3. (Optional) Click Add feature to rename additional features.
4. Click Add to recipe.
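
As a small illustration, renaming several features at once amounts to applying an old-name to new-name mapping to every row. The `rename_features` helper and the feature names below are hypothetical.

```python
def rename_features(rows, mapping):
    """Rename features according to `mapping`; unmapped features are unchanged."""
    return [
        {mapping.get(name, name): value for name, value in row.items()}
        for row in rows
    ]

rows = [{"cust_id": 7, "amt": 19.99, "city": "Austin"}]
renamed = rename_features(rows, {"cust_id": "customer_id", "amt": "amount"})
```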

### Remove features

Use the Remove features operation to remove features from the dataset.

To remove features:

1. Click Remove features in the right panel.

   **Remove specific features from the live sample**

   Alternatively, you can click the Actions menu next to the feature you want to remove. This opens the operation parameters in the right panel with the feature field already filled in.

2. Under Feature name, click the dropdown and either start typing the feature name or scroll through the list to select the feature(s) you want to remove. Click outside of the dropdown when you're done selecting features. To remove every feature except the ones you selected, select the box next to Keep selected features and remove the rest.
3. Click Add to recipe.
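
The difference between removing the selected features and keeping only the selected features can be sketched as follows. The `remove_features` helper and its `keep_selected` flag are hypothetical names standing in for the checkbox described in step 2.

```python
def remove_features(rows, selected, keep_selected=False):
    """Drop the selected features, or keep only them if keep_selected is True."""
    selected = set(selected)
    return [
        # A feature survives when its membership in `selected` equals the flag.
        {name: value for name, value in row.items() if (name in selected) == keep_selected}
        for row in rows
    ]

rows = [{"id": 1, "age": 30, "notes": "n/a"}]
without_notes = remove_features(rows, ["notes"])
only_id_and_age = remove_features(rows, ["id", "age"], keep_selected=True)
```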

### Time-aware operations

For time-aware operations, see [time series data wrangling](https://docs.datarobot.com/en/docs/workbench/nxt-workbench/dataprep/wrangle-data/build-recipe/ts-wrangling.html). These operations include:

- Derive time series features
- Lag features
- Derive rolling statistics (numeric)
- Derive rolling statistics (categorical)

## Operation actions

After adding an operation to the recipe, you can access the Actions menu to the right of individual operations, allowing you to:

| Action | Description |
| --- | --- |
| Edit | Allows you to edit the conditions of an operation. |
| Skip step | Instructs DataRobot to skip specific operations when applying the recipe to the live preview. If you publish the recipe, these operations remain visible in the recipe's list of operations but are not applied to the output dataset. |
| Preview up to this operation | Applies only the operations above the selected operation to the live preview. |
| + Add operation above | Adds an operation directly above the selected operation. |
| + Add operation below | Adds an operation directly below the selected operation. |
| Import recipe above | Imports the operations from an existing recipe directly above the selected operation. |
| Import recipe below | Imports the operations from an existing recipe directly below the selected operation. |
| Duplicate | Makes a copy of the selected operation. |
| Delete | Deletes the operation from the recipe. |

### Preview up to this operation

The Preview up to this operation action allows you to quickly test different combinations of operations on the live sample. When you select Preview up to this operation, the action is added to the recipe panel. The live preview only displays the operations listed above this action, so you can drag-and-drop the action below/above operations to see how different operations affect the preview.

To view the preview without any operations applied, drag-and-drop the action to the top of the recipe.

> [!NOTE] Note
> This operation is ignored when the recipe is published and is not visible to other members working on the same recipe.
> 
> If you use the + Add operation below action on the operation directly above Preview up to this operation, the operation is added below Preview up to this operation and not applied to the preview. If you use the + Add operation below action on the operation directly below Preview up to this operation, the operation is added below Preview up to this operation and not applied to the preview.
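
One way to picture how Skip step and Preview up to this operation interact is to model the recipe as an ordered list of steps. The sketch below is a hypothetical model, not DataRobot's implementation: `PREVIEW_MARKER`, `apply_recipe`, and the step dictionaries are invented for illustration.

```python
PREVIEW_MARKER = "preview_up_to_this_operation"

def apply_recipe(rows, recipe, preview=True):
    """Apply recipe steps in order.

    In preview mode, stop at the preview marker. When publishing
    (preview=False), the marker is ignored. Skipped steps never run.
    """
    for step in recipe:
        if step == PREVIEW_MARKER:
            if preview:
                break      # the preview shows only the operations above the marker
            continue       # publishing ignores the marker
        if step.get("skip"):
            continue       # skipped steps stay in the recipe but are not applied
        rows = step["fn"](rows)
    return rows

recipe = [
    {"name": "drop null ages", "fn": lambda rows: [r for r in rows if r["age"] is not None]},
    {"name": "clear all rows", "fn": lambda rows: [], "skip": True},
    PREVIEW_MARKER,
    {"name": "adults only", "fn": lambda rows: [r for r in rows if r["age"] >= 18]},
]
sample = [{"age": 12}, {"age": None}, {"age": 30}]
previewed = apply_recipe(sample, recipe)                  # stops at the marker
published = apply_recipe(sample, recipe, preview=False)   # runs every unskipped step
```

Moving the marker to the front of the list in this model returns the sample unchanged, which matches dragging the action to the top of the recipe.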

### Reorder operations

All operations in a wrangling recipe are applied sequentially, so the order in which they appear affects the output dataset.

To move an operation to a new location, click and hold the operation you want to move, and then drag it to a new position.

The live sample updates to reflect the new order.
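
A small Python sketch shows why order matters, using two operations from this page: removing a feature and de-duplicating rows. The helper functions and sample rows are hypothetical.

```python
def drop_feature(rows, name):
    """Remove one feature from every row."""
    return [{k: v for k, v in row.items() if k != name} for row in rows]

def deduplicate(rows):
    """Remove exact duplicate rows, keeping the first occurrence."""
    unique = []
    for row in rows:
        if row not in unique:
            unique.append(row)
    return unique

rows = [{"id": 1, "city": "Austin"}, {"id": 2, "city": "Austin"}]

# Remove `id` first: the two rows become identical, so one is dropped.
remove_then_dedupe = deduplicate(drop_feature(rows, "id"))

# De-duplicate first: the rows still differ by `id`, so both survive.
dedupe_then_remove = drop_feature(deduplicate(rows), "id")
```

The same two operations produce one row in the first order and two rows in the second.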

## Read more

To learn more about the topics discussed on this page, see:

- Description of summary statistics and histograms in DataRobot Classic.
