Feature Discovery settings
The Feature Discovery process uses a variety of heuristics to determine the list of features to derive in a DataRobot project. In Feature Discovery Settings, you can control which transformations DataRobot will try when deriving new features (feature engineering controls), as well as set DataRobot to automatically remove redundant features and those with low impact (feature reduction).
To access Feature Discovery Settings, click the settings gear on the Define Relationships page.
Feature engineering controls
You can influence how DataRobot conducts feature engineering by setting feature engineering controls. You might want to do this to:
- Use your domain knowledge to guide the feature engineering process and improve the quality of the derived features.
- Speed up feature engineering.
- Improve accuracy by deriving more features, for example, using categorical statistics, skewness, and kurtosis.
- Exclude specific transforms that might be too complex to explain to business stakeholders. You can exclude these features post-modeling, but doing so adds complexity to the modeling process.
Set the feature engineering options in the relationship editor prior to EDA2.
In Feature Discovery Settings, click the Feature Engineering tab. Consider which feature engineering transformations make the most sense for your project and select the ones you want DataRobot to try when deriving new features.
Hover over a transformation to view a tooltip that describes it.
Latest vs. Latest within window
| Transformation | Description | Default |
|---|---|---|
| Latest | Generates new features by exploring all historical data up to the end point of any defined feature derivation windows (FDWs). Note that this method ignores all FDW start points. | Disabled |
| Latest within window | Generates new features within the defined FDW. For time-aware feature engineering, only the data within the FDW is required when making predictions. | Enabled |
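The difference between the two transformations can be illustrated with a minimal sketch. This is not DataRobot's implementation: the function names, the sample data, and the choice to treat both window boundaries as inclusive are assumptions made purely for illustration.

```python
from datetime import datetime, timedelta

# Hypothetical helpers for illustration only; not part of any DataRobot API.
# Assumption: both window boundaries are treated as inclusive.

def latest(records, fdw_end):
    """Most recent value at or before the FDW end point, ignoring the FDW start."""
    eligible = [(ts, value) for ts, value in records if ts <= fdw_end]
    return max(eligible, key=lambda r: r[0])[1] if eligible else None

def latest_within_window(records, fdw_start, fdw_end):
    """Most recent value that falls inside the FDW, or None if the window is empty."""
    eligible = [(ts, value) for ts, value in records if fdw_start <= ts <= fdw_end]
    return max(eligible, key=lambda r: r[0])[1] if eligible else None

# Example FDW: from 30 days to 1 day before the prediction point.
prediction_point = datetime(2024, 3, 1)
fdw_start = prediction_point - timedelta(days=30)
fdw_end = prediction_point - timedelta(days=1)

# Both records are older than the FDW start.
transactions = [
    (datetime(2023, 12, 15), 120.0),
    (datetime(2024, 1, 10), 95.0),
]

old_only_latest = latest(transactions, fdw_end)                           # 95.0
old_only_windowed = latest_within_window(transactions, fdw_start, fdw_end)  # None

# Adding a record inside the window makes both methods agree.
with_recent = transactions + [(datetime(2024, 2, 20), 40.0)]
recent_latest = latest(with_recent, fdw_end)                              # 40.0
recent_windowed = latest_within_window(with_recent, fdw_start, fdw_end)   # 40.0
```

In this sketch, Latest still finds a value when the window itself is empty because it reaches back through all history, whereas Latest within window needs only the rows inside the FDW, which matches the note above about the data required at prediction time.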
When you're done, click Save changes.
Feature reduction
During Feature Discovery, DataRobot generates new features, then removes those that have low impact or are redundant. This is called feature reduction. To include all derived features when building models instead, disable feature reduction as follows:
In the relationship configuration (the Define Relationships page), click the settings gear. Select the Feature Reduction tab and toggle off Use supervised feature reduction.