NextGen experience > Data preparation > Wrangle data > Build recipes > Build a SQL recipe

Build a SQL recipe¶

The SQL Editor allows you to create a recipe comprised of SQL queries that enrich, transform, shape, and blend datasets together, which you can then publish to create a new output dataset.

To open the SQL Editor, click the Actions menu next to the dataset you want to work with and select Open SQL Editor.

	Element	Description
1	Info tab	Displays summary information and metadata for the SQL recipe.
2	Data inputs tab	Displays the data inputs and feature menu for associated with the SQL recipe.
3	Add data	Adds data inputs—from the same data engine—to the recipe.
4	Data inputs	Lists all data inputs currently added to the recipe.
5	Feature menu	Displays the features of the selected data input.
6	Editor	Allows you to enter SQL queries to manipulate your data.
7	Preview	Displays a preview of the above SQL queries.
8	Data engine	Displays the data engine used to perform SQL queries.
9	SQL reference documentation	Links to the SQL reference documentation of the appropriate data engine.
10	Run	Runs the SQL queries entered into the editor to update the preview.
11	Recipe actions	Provides options for working with the recipe: Publish: Opens publishing settings so that you can publish the recipe, pushing down the SQL queries to the data source and generating an output dataset. Clone: Makes and immediately opens a copy of the SQL recipe. Open in Wrangler: Opens the recipe in Wrangler. Doing this discards all SQL updates. Remove from Use Case: Removes the recipe from the Use Case.

You can also customize your view by clicking the Settings icon , which allows you to hide or display the following elements in the editor.

Add data inputs¶

To enrich your primary dataset, you can add data inputs from the same data engine as the original dataset. The original dataset is always positioned at the top of the data input list.

To add a data input:

Click Add data.
Select the data you want to add. You can make multiple selections at a time. Note that:
- If you're adding data from a live source, you must select the schema and table(s).
- If you're adding data to work with using the Spark engine (i.e., datasets), select one or more snapshotted or static datasets already associated with the current Use Case or that's in the Data Registry.
Click Add data input in the upper-right corner. All data inputs appear in the panel on the left.

Edit data inputs¶

To edit a data input, hover over the one you want to modify and click the pencil icon .

The metadata included in the Information section is different depending on whether the input is a static dataset or a live data source. See the tabs below for more details:

Static datasetLive data source

If the input is a static or snapshot dataset, the Information section displays the following:

Field	Description
Dataset name	The name of the dataset in DataRobot.
Dataset ID	The unique ID for the dataset in DataRobot.
Created on	The date and time the dataset was created in DataRobot.

If the input is a live data source, the Information section displays the following:

Field	Description
Data connection name	The name of the data connection associated with the data.
Full path	The full path of the data including the database, schema, and table.

The edit options are different for original data inputs and those added in the SQL Editor.

Original datasetData input

When editing the original data input, you have the following edit options:

	Element	Description
1	Alias name	Enter an alias name for the data input. An alias is a temporary name assigned to a table or column within a query to enhance readability and simplify complex queries.
2	Snapshot policy	Choose a snapshot policy for the data input: Latest: For dataset only. Use the latest available snapshot of the dataset. Fixed: For datasets only. Select a specific snapshot version, even if a more recent snapshot exists. Dynamic: For data connections only. Pull data from the associated data connection at the time you run the recipe.
3	Sampling method	Choose a sampling method to apply to the data input for the preview: Random: Instructs DataRobot to take a specified number of rows randomly from the data input. First-N Rows: Instructs DataRobot to use a specified number of rows for sampling. Date/time: Time-aware only. Instructs DataRobot to create a sample that contains the specified latest/earliest rows, ordered by date/time feature.
4	Number of rows	Enter the number of rows to pull from the source data for the sample.
5	Actions	The following actions are available when editing a data input: Save: Applies your changes to the data input. Cancel: Closes the Edit data input panel without making changes.

When editing a secondary data input, you have the following edit options:

	Element	Description
1	Alias name	Enter an alias name for the data input. An alias is a temporary name assigned to a table or column within a query to enhance readability and simplify complex queries.
2	Snapshot policy	Choose a snapshot policy for the data input: Latest: For dataset only. Use the latest available snapshot of the dataset. Fixed: For datasets only. Select a specific snapshot version, even if a more recent snapshot exists. Dynamic: For data connections only. Pull data from the associated data connection at the time you run the recipe.

Edit time-aware data inputs¶

If the original data input is time-aware, and you select Date/time as the sampling method, there are additional fields that must be filled in. For more information, see the documentation on time-aware wrangling.

Create a query¶

Once you've added data inputs, you can begin adding SQL queries to the editor. To access the SQL reference for your data engine, click the documentation icon .

Live data

If you are connected to a live data source (e.g., Snowflake, Databricks, or BigQuery), you can reference the full path from the data source to use them in the SQL query instead of adding inputs. The path must include the database, schema, and table name.

You can reference data inputs you've added using only the alias without providing the full path.

To enter a query you can either manually type the SQL query syntax in the editor or add features using the panel below your data inputs.

Add features via the panel¶

To add features from a data input, select the data input from the list. The panel below updates to display the features from the selected input.

From this menu, you can:

	Element	Description
1	Place name in editor	Adds the name of the data input.
2	Place all features in editor	Adds every feature in the data input.
3	Preview data	Opens a dataset preview in a separate, larger modal so that you can more easily analyze the features in your dataset.
4		Adds features individually.

When using the panel to add features, DataRobot moves the added feature(s) into the SQL editor at the point of your cursor.

Preview results¶

When the query is complete, click Run.

Use the window-shade scroll to display more rows in the preview; if necessary, use the horizontal scroll bar to scroll through all columns of a row.

If the query was not successful, DataRobot returns a notification banner.

Publish¶

From here, you can publish your SQL recipe to generate an output dataset.

Updated March 25, 2025

Was this page helpful?

Great! Let us know what you found helpful.

What can we do to improve the content?

Thanks for your feedback!