Skip to content

On-premise users: click in-app to access the full platform documentation for your version of DataRobot.

Build a SQL recipe

The SQL Editor allows you to create a recipe comprised of SQL queries that enrich, transform, shape, and blend datasets together, which you can then publish to create a new output dataset.

To open the SQL Editor, click the Actions menu next to the dataset you want to work with and select Open SQL Editor.

  Element Description
1 Info tab Displays summary information and metadata for the SQL recipe.
2 Data inputs tab Displays the data inputs and feature menu for associated with the SQL recipe.
3 Add data Adds data inputs—from the same data engine—to the recipe.
4 Data inputs Lists all data inputs currently added to the recipe.
5 Feature menu Displays the features of the selected data input.
6 Editor Allows you to enter SQL queries to manipulate your data.
7 Preview Displays a preview of the above SQL queries.
8 Data engine Displays the data engine used to perform SQL queries.
9 SQL reference documentation Links to the SQL reference documentation of the appropriate data engine.
10 Run Runs the SQL queries entered into the editor to update the preview.
11 Recipe actions Provides options for working with the recipe:
  • Publish: Opens publishing settings so that you can publish the recipe, pushing down the SQL queries to the data source and generating an output dataset.
  • Clone: Makes and immediately opens a copy of the SQL recipe.
  • Open in Wrangler: Opens the recipe in Wrangler. Doing this discards all SQL updates.
  • Remove from Use Case: Removes the recipe from the Use Case.

You can also customize your view by clicking the Settings icon , which allows you to hide or display the following elements in the editor.

Add data inputs

To enrich your primary dataset, you can add data inputs from the same data engine as the original dataset. The original dataset is always positioned at the top of the data input list.

To add a data input:

  1. Click Add data.
  2. Select the data you want to add. You can make multiple selections at a time. Note that:

    • If you're adding data from a live source, you must select the schema and table(s).
    • If you're adding data to work with using the Spark engine (i.e., datasets), select one or more snapshotted or static datasets already associated with the current Use Case or that's in the Data Registry.

  3. Click Add data input in the upper-right corner. All data inputs appear in the panel on the left.

Edit data inputs

To edit a data input, hover over the one you want to modify and click the pencil icon .

The metadata included in the Information section is different depending on whether the input is a static dataset or a live data source. See the tabs below for more details:

If the input is a static or snapshot dataset, the Information section displays the following:

Field Description
Dataset name The name of the dataset in DataRobot.
Dataset ID The unique ID for the dataset in DataRobot.
Created on The date and time the dataset was created in DataRobot.

If the input is a live data source, the Information section displays the following:

Field Description
Data connection name The name of the data connection associated with the data.
Full path The full path of the data including the database, schema, and table.

The edit options are different for original data inputs and those added in the SQL Editor.

When editing the original data input, you have the following edit options:

  Element Description
1 Alias name Enter an alias name for the data input. An alias is a temporary name assigned to a table or column within a query to enhance readability and simplify complex queries.
2 Snapshot policy Choose a snapshot policy for the data input:
  • Latest: For dataset only. Use the latest available snapshot of the dataset.
  • Fixed: For datasets only. Select a specific snapshot version, even if a more recent snapshot exists.
  • Dynamic: For data connections only. Pull data from the associated data connection at the time you run the recipe.
3 Sampling method Choose a sampling method to apply to the data input for the preview:
  • Random: Instructs DataRobot to take a specified number of rows randomly from the data input.
  • First-N Rows: Instructs DataRobot to use a specified number of rows for sampling.
  • Date/time: Time-aware only. Instructs DataRobot to create a sample that contains the specified latest/earliest rows, ordered by date/time feature.
4 Number of rows Enter the number of rows to pull from the source data for the sample.
5 Actions The following actions are available when editing a data input:
  • Save: Applies your changes to the data input.
  • Cancel: Closes the Edit data input panel without making changes.

When editing a secondary data input, you have the following edit options:

  Element Description
1 Alias name Enter an alias name for the data input. An alias is a temporary name assigned to a table or column within a query to enhance readability and simplify complex queries.
2 Snapshot policy Choose a snapshot policy for the data input:
  • Latest: For dataset only. Use the latest available snapshot of the dataset.
  • Fixed: For datasets only. Select a specific snapshot version, even if a more recent snapshot exists.
  • Dynamic: For data connections only. Pull data from the associated data connection at the time you run the recipe.

Edit time-aware data inputs

If the original data input is time-aware, and you select Date/time as the sampling method, there are additional fields that must be filled in. For more information, see the documentation on time-aware wrangling.

Create a query

Once you've added data inputs, you can begin adding SQL queries to the editor. To access the SQL reference for your data engine, click the documentation icon .

Live data

If you are connected to a live data source (e.g., Snowflake, Databricks, or BigQuery), you can reference the full path from the data source to use them in the SQL query instead of adding inputs. The path must include the database, schema, and table name.

You can reference data inputs you've added using only the alias without providing the full path.

To enter a query you can either manually type the SQL query syntax in the editor or add features using the panel below your data inputs.

Add features via the panel

To add features from a data input, select the data input from the list. The panel below updates to display the features from the selected input.

From this menu, you can:

  Element Description
1 Place name in editor Adds the name of the data input.
2 Place all features in editor Adds every feature in the data input.
3 Preview data Opens a dataset preview in a separate, larger modal so that you can more easily analyze the features in your dataset.
4 Adds features individually.

When using the panel to add features, DataRobot moves the added feature(s) into the SQL editor at the point of your cursor.

Preview results

When the query is complete, click Run.

Use the window-shade scroll to display more rows in the preview; if necessary, use the horizontal scroll bar to scroll through all columns of a row.

If the query was not successful, DataRobot returns a notification banner.

Publish

From here, you can publish your SQL recipe to generate an output dataset.


Updated March 25, 2025