Transform data

DataRobot supports multiple methods of feature engineering—automatic and manual feature transformations for single datasets, as well as Feature Discovery for multiple datasets. See the table below to learn about the feature transformation options in DataRobot.

| Topic | Description | Dataset | Notes |
|---|---|---|---|
| Automatic transformations | | | |
| Automatic feature transformations | Understand date-type feature transformations generated by DataRobot. | Primary | Calculated during EDA1. |
| Interaction-based transformations | Transform features based on interactions within your primary dataset by enabling an advanced option. | Primary | Enabled in project and calculated during EDA2. |
| Feature Discovery | Perform multi-dataset, interaction-based feature creation. | Secondary | Configured in project and calculated during EDA2. |
| Automatic modeling transformations | Understand the automated feature engineering DataRobot performs as part of the modeling process. | All | Performed during modeling. |
| Manual transformations | | | |
| Manual feature transformations | Manually transform features in your dataset, including variable type transformations. | Primary | Transformed in project. |
| AI Catalog transformations | | | |
| Prepare data in AI Catalog with Spark SQL | Enrich, transform, shape, and blend datasets using Spark SQL queries within the AI Catalog. | | |

What is feature engineering?

Feature engineering is the process of preparing a dataset for machine learning by changing existing features or deriving new features to improve model performance. Automated Feature Engineering uses AI to accelerate the transformation of data into machine learning assets, allowing you to build better machine learning models in less time.

Feature engineering takes place after data preparation and ingestion, and before model building.

During EDA1, DataRobot analyzes and profiles every feature in each dataset—detecting feature types, automatically transforming date-type features, and assessing feature quality.
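To make the idea of a date-type transformation concrete, here is a minimal sketch of deriving calendar features from a raw date value. It uses only the Python standard library. The function name and the output feature names are assumptions chosen for illustration; they are not DataRobot's implementation or the exact features it generates.

```python
from datetime import date


def derive_date_features(d: date) -> dict:
    """Derive simple calendar features from a single date value.

    Illustrative only: the feature names are hypothetical and do not
    mirror the names DataRobot assigns to its derived features.
    """
    return {
        "year": d.year,
        "month": d.month,
        "day_of_month": d.day,
        "day_of_week": d.isoweekday(),  # ISO convention: Monday=1 ... Sunday=7
    }


features = derive_date_features(date(2024, 12, 6))
print(features)
```

Each derived value becomes its own column, so a model can pick up seasonal or weekly patterns that a raw date string would hide.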

Before model building, you can take further advantage of Automated Feature Engineering by enabling interaction-based transformations for primary datasets or by defining relationships between multiple datasets using Feature Discovery. You can also manually transform features in your dataset using functions, including changing a feature's variable type.
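The multi-dataset case can be sketched in general terms as follows: rows in a secondary dataset are aggregated per key and the results are joined onto the primary dataset as new features. This is a conceptual illustration in plain Python, not DataRobot's Feature Discovery algorithm; the datasets, column names, and the `add_aggregates` helper are all hypothetical.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical primary dataset: one row per customer.
customers = [
    {"customer_id": 1, "churned": 0},
    {"customer_id": 2, "churned": 1},
]

# Hypothetical secondary dataset: many transactions per customer.
transactions = [
    {"customer_id": 1, "amount": 20.0},
    {"customer_id": 1, "amount": 40.0},
    {"customer_id": 2, "amount": 5.0},
]


def add_aggregates(primary, secondary, key, value):
    """Join per-key count, sum, and mean of `value` onto each primary row."""
    grouped = defaultdict(list)
    for row in secondary:
        grouped[row[key]].append(row[value])

    enriched = []
    for row in primary:
        values = grouped.get(row[key], [])
        enriched.append({
            **row,
            f"{value}_count": len(values),
            f"{value}_sum": sum(values),
            # No secondary rows for this key: leave the mean undefined.
            f"{value}_mean": mean(values) if values else None,
        })
    return enriched


enriched = add_aggregates(customers, transactions, "customer_id", "amount")
for row in enriched:
    print(row)
```

The relationship between the two datasets is just the shared key (`customer_id` here); defining that relationship is what makes the aggregated features possible.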

During EDA2, DataRobot uses these known interactions, or relationships, to discover relevant features for your ML models and automatically transforms them to address the unique requirements of each algorithm in the blueprint library.
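As a rough illustration of why different algorithms call for different transformations, the sketch below encodes the same categorical feature in two ways: one-hot encoding, which is commonly used for linear models, and ordinal encoding, which tree-based models can often use directly. Both helpers are hypothetical and written for this example; they do not represent the preprocessing tasks DataRobot applies in its blueprints.

```python
def one_hot(values):
    """Encode each value as a 0/1 vector with one position per category."""
    categories = sorted(set(values))
    return [[1 if v == c else 0 for c in categories] for v in values]


def ordinal(values):
    """Encode each value as the integer index of its category (sorted order)."""
    index = {c: i for i, c in enumerate(sorted(set(values)))}
    return [index[v] for v in values]


colors = ["red", "green", "red", "blue"]

one_hot_rows = one_hot(colors)   # columns ordered as: blue, green, red
ordinal_codes = ordinal(colors)  # blue=0, green=1, red=2

print(one_hot_rows)
print(ordinal_codes)
```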

After model building, navigate to the Leaderboard and select a model. You can view the transformations DataRobot performed for that model during the modeling process in the following places:

| Feature | Description | Location |
|---|---|---|
| Blueprint | Displays preprocessing, modeling algorithms, and post-processing tasks for the selected model. | Click Describe > Blueprints. |
| Data Quality Handling report | Displays feature and imputation information for supported blueprint tasks. | Click Describe > Data Quality Handling. |
| Coefficients | Allows you to download coefficients and preprocessing information, including feature transformations, for supported model types. | Click Describe > Coefficients and click Export. |

Updated December 6, 2024