Publish a recipe
Once the recipe is built and the live sample looks ready for modeling, you can publish the recipe, pushing it down as a query to the data source. There, the recipe is applied to the entire dataset, materializing a new output dataset. The output is sent back to DataRobot and added to the Use Case.
See the associated considerations for important additional information.
To publish a recipe:
1. After you're done wrangling a dataset, click Publish recipe.
2. Enter a name for the output dataset. DataRobot uses this name to register the dataset in the AI Catalog and Data Registry.
3. (Optional) Configure Automatic downsampling.
DataRobot sends the published recipe to Snowflake, where it is applied to the source data to create a new output dataset. In DataRobot, the output dataset is registered in the Data Registry and added to your Use Case.
Automatic downsampling reduces the size of a dataset by randomly sampling rows from the majority class. Consider enabling automatic downsampling if the size of your source data exceeds DataRobot's file size limit.
To configure downsampling:
1. Enable the Automatic downsampling toggle in the Publishing Settings modal.
2. Specify the Maximum number of rows and Estimated size in megabytes.
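Conceptually, majority-class downsampling works like the sketch below. This is an illustrative, standalone example only — DataRobot performs the actual downsampling server-side, and the function name and row format here are hypothetical:

```python
import random

def downsample_majority(rows, label_key, max_rows, seed=0):
    """Illustrative sketch of majority-class downsampling:
    keep every minority-class row and randomly sample the
    majority class until the dataset fits within max_rows."""
    if len(rows) <= max_rows:
        return list(rows)
    # Group rows by their class label.
    by_class = {}
    for row in rows:
        by_class.setdefault(row[label_key], []).append(row)
    # Identify the majority class (the one with the most rows).
    majority = max(by_class, key=lambda c: len(by_class[c]))
    # Keep all rows from the non-majority classes.
    kept = [r for c, rs in by_class.items() if c != majority for r in rs]
    # Fill the remaining budget with a random sample of the majority class.
    budget = max(max_rows - len(kept), 0)
    rng = random.Random(seed)
    kept.extend(rng.sample(by_class[majority], min(budget, len(by_class[majority]))))
    return kept

# Example: 90 negatives and 10 positives, capped at 50 rows.
rows = [{"y": 0} for _ in range(90)] + [{"y": 1} for _ in range(10)]
sampled = downsample_majority(rows, "y", max_rows=50)
```

Because only the majority class is sampled, the minority class is preserved in full, which is why this technique suits imbalanced data.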
From here, you can:
- Add more data.
- View Exploratory Data Insights to determine if you want to continue data wrangling.
- Use the dataset to set up an experiment and start modeling.
To learn more about the topics discussed on this page, see: