Work with feature lists
DataRobot builds models using feature lists, which are subsets of the features in your dataset. DataRobot automatically generates several feature lists. You can also create custom feature lists, using domain knowledge to select the features most likely to help build accurate models.
The following are examples of the feature lists DataRobot generates.
Feature list | Description |
---|---|
All Features (default) | Includes all dataset features; performs no feature engineering. |
Informative Features | Includes features that are potentially valuable for modeling. Features that are unlikely to be useful are excluded, for example, reference IDs, features that contain empty values, and features that are derived from the target. DataRobot also creates derived features, such as date type features (for example, day of the week and day of the month). |
Raw Features | Includes all features present in your dataset when uploaded. |
Univariate Selections | Includes features that meet a certain threshold for non-linear correlation with the target. This list is available after the target is set. |
DR Reduced Features | Includes the features DataRobot determines to be most important based on Feature Impact scores from a particular model. This list is generated after the models are built. |
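
As a rough illustration of the kind of filtering that separates Informative Features from All Features, the sketch below drops columns that are entirely empty, hold a single value, or duplicate an earlier column. The three rules, the function name `informative_features`, and the sample columns are assumptions made for illustration; DataRobot's actual criteria are broader and are not reproduced here.

```python
def informative_features(columns):
    """Return names of columns that pass three simple checks.

    `columns` maps a feature name to its list of values (None = empty).
    """
    kept = []
    seen = set()
    for name, values in columns.items():
        non_empty = [v for v in values if v is not None]
        if not non_empty:
            continue  # entirely empty
        if len(set(non_empty)) < 2:
            continue  # only one distinct value
        signature = tuple(values)
        if signature in seen:
            continue  # exact duplicate of an earlier column
        seen.add(signature)
        kept.append(name)
    return kept


# Hypothetical patient data, not taken from the tutorial dataset.
sample = {
    "age": [54, 61, 47, 70],
    "age_copy": [54, 61, 47, 70],
    "discharge_code": [None, None, None, None],
    "country": ["US", "US", "US", "US"],
    "num_medications": [12, 7, 19, 7],
}
print(informative_features(sample))
```

Here only `age` and `num_medications` survive: the other three columns are a duplicate, an empty column, and a constant.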
Takeaways
This tutorial shows how to:
- View feature lists
- Create new feature lists
- Compare models built with different feature lists
- Run Autopilot on a specific feature list
View feature lists
View the features in the lists that DataRobot generates automatically.
- Import your dataset and select a target.
  The sample dataset featured in this tutorial contains patient data. The goal is to predict the likelihood of patient readmission to the hospital. The target feature is readmitted.
- Scroll down to the Project Data tab.
  By default, the All Features list displays. This list identifies features DataRobot determines to be non-informative. In this example, some features have too few values and some are duplicates.
- Click the Feature List dropdown menu and select the Informative Features list.
  The Informative Features list displays, and the non-informative features are removed.
Create a feature list
Select features to build your own custom feature lists.
- In the Project Data tab, select features using the check boxes to the left of the feature names.
- Click + Create feature list and enter the new feature list name to save your custom feature list.
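
The same step can also be scripted. The sketch below assumes a `project` object that behaves like `datarobot.Project` from the DataRobot Python client, whose `create_featurelist` method is understood to accept a list name and a list of feature names; check the signature against your installed client version. The helper name `save_custom_list` and the feature names in the usage comment are hypothetical.

```python
def save_custom_list(project, name, features):
    """Create a custom feature list from hand-picked feature names.

    `project` is assumed to expose create_featurelist(name=..., features=...),
    as datarobot.Project is understood to do in the DataRobot Python client.
    """
    # Remove repeated names while keeping the order the user chose.
    unique = list(dict.fromkeys(features))
    if not unique:
        raise ValueError("Select at least one feature.")
    return project.create_featurelist(name=name, features=unique)


# Usage sketch (requires a real project; the values are placeholders):
#   import datarobot as dr
#   project = dr.Project.get("<project-id>")
#   save_custom_list(project, "Clinical subset", ["age", "num_medications"])
```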
Create a feature list from an existing list
Use the menu to select an existing feature list, then add or remove features to create a new feature list.
- Click Menu on the top left of the Project Data tab and click Select features by feature list.
- Add or remove features using the check boxes to the left of the feature names.
- Click + Create feature list and enter the new feature list name to save your custom feature list.
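
Conceptually, deriving a list from an existing one is a matter of removing some names and appending others. A minimal sketch in plain Python follows; the function `derive_feature_list` and all feature names are hypothetical and are not part of DataRobot.

```python
def derive_feature_list(base, add=(), remove=()):
    """Return a new list: `base` minus `remove`, plus `add` at the end.

    If a name appears in both `add` and `remove`, it is left out.
    """
    removed = set(remove)
    result = [f for f in base if f not in removed]
    for feature in add:
        if feature not in result and feature not in removed:
            result.append(feature)
    return result


# Hypothetical feature names for illustration.
informative = ["age", "num_medications", "admission_type", "diag_1"]
custom = derive_feature_list(
    informative,
    add=["time_in_hospital"],
    remove=["diag_1"],
)
print(custom)
```

The original list is left unchanged, which mirrors the UI behavior described above: the existing list is only a starting point for a new one.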
Filter and select by var type
Filter and select features by variable data type.
- Click Menu on the top left of the Project Data tab and click Select features by var type.
- Add or remove features using the check boxes to the left of the feature names.
- Click + Create feature list and enter the new feature list name to save your custom feature list.
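
Selecting by variable type amounts to filtering feature names on a type label. The sketch below shows the idea in plain Python; the function `select_by_var_type`, the feature names, and the type labels are assumptions for illustration and may not match the labels DataRobot displays.

```python
def select_by_var_type(feature_types, wanted):
    """Return feature names whose variable type is in `wanted`."""
    wanted = set(wanted)
    return [name for name, var_type in feature_types.items() if var_type in wanted]


# Hypothetical features and type labels.
feature_types = {
    "age": "Numeric",
    "admission_type": "Categorical",
    "num_medications": "Numeric",
    "discharge_notes": "Text",
}
numeric_only = select_by_var_type(feature_types, ["Numeric"])
print(numeric_only)
```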
Build and compare models
Compare models built with different feature lists by comparing optimization metrics. Choose metrics based on the project type, for example, regression, binary classification, or multiclass.
- After loading your data and setting your target, click the Feature List dropdown menu and select a feature list.
- Click Start to begin modeling.
- When modeling is complete, click the Models tab at the top.
  The Leaderboard lists the generated models and indicates the feature list that was used to generate each.
- In the Metric dropdown menu, select a metric to use to compare the models.
  This example uses LogLoss as the metric. For LogLoss, the lower the value, the more accurate the model.
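
To make the "lower is better" reading of LogLoss concrete, the sketch below computes the standard binary log loss for two sets of predictions and picks the lower score. The labels, probabilities, and list names are invented for illustration; they are not results from the tutorial dataset, and the function is a textbook formula rather than DataRobot's implementation.

```python
import math


def log_loss(y_true, y_prob, eps=1e-15):
    """Mean binary cross-entropy; lower values indicate better predictions."""
    total = 0.0
    for y, p in zip(y_true, y_prob):
        p = min(max(p, eps), 1 - eps)  # clip to avoid log(0)
        total += -(y * math.log(p) + (1 - y) * math.log(1 - p))
    return total / len(y_true)


# Hypothetical labels and predicted probabilities from two models,
# each imagined as trained on a different feature list.
actual = [1, 0, 1, 0]
scores = {
    "Informative Features": log_loss(actual, [0.9, 0.2, 0.8, 0.1]),
    "Custom list": log_loss(actual, [0.6, 0.4, 0.7, 0.5]),
}
best = min(scores, key=scores.get)
print(best, round(scores[best], 4))
```

In this made-up case the first model assigns probabilities closer to the true labels, so its LogLoss is lower and it would rank higher on the Leaderboard when sorted by this metric.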
Rerun Autopilot on a feature list
After you build your models, you might decide to customize a feature list and generate more models.
- On the Data tab, click the Feature Lists tab.
- Click the menu to the right of the feature list you want to use to build new models and select Rerun Autopilot.
- In the Rerun Modeling window, select the Modeling mode and click Rerun.
Tip
You can also view, edit, export, and delete feature lists using the menu on the right of each feature list.
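
For scripted workflows, a rerun might look like the sketch below. It assumes a `project` object that behaves like `datarobot.Project` from the DataRobot Python client, with `get_featurelists()` returning objects that have `name` and `id` attributes and `start_autopilot` accepting a feature list ID; confirm both against your installed client version. The helper name `rerun_autopilot_on` is hypothetical.

```python
def rerun_autopilot_on(project, featurelist_name):
    """Start Autopilot on the feature list with the given name.

    `project` is assumed to expose get_featurelists() and
    start_autopilot(featurelist_id), as datarobot.Project is understood to do.
    Returns the ID of the feature list that was used.
    """
    for featurelist in project.get_featurelists():
        if featurelist.name == featurelist_name:
            project.start_autopilot(featurelist.id)
            return featurelist.id
    raise LookupError(f"No feature list named {featurelist_name!r}")


# Usage sketch (requires a real project; the values are placeholders):
#   import datarobot as dr
#   project = dr.Project.get("<project-id>")
#   rerun_autopilot_on(project, "Clinical subset")
```

Looking the list up by name keeps the script aligned with what the Feature Lists tab shows, at the cost of failing if the name is misspelled.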
Learn more
Documentation:
- Explore feature list details.
- Assess feature impact—how important a feature is in the context of a particular model.