Transform features

The following sections describe manual, user-created transformations. Transformed features do not replace the original, raw features; rather, they are provided as new, additional features for building models.

Note

Transformed features (including numeric features created as user-defined functions) cannot be used for special variables, such as Weight, Offset, Exposure, and Count of Events.

Variable type transformations

DataRobot bases variable type assignment on the values seen during EDA; these values are displayed in various areas throughout NextGen. There are times, however, when you may need to change the type. For example, area codes may be interpreted as numeric when you would rather they map to categories. Or a categorical feature may be encoded as a number (for example, 1=yes, 2=no) and, without transformation, is interpreted as numeric.

Variable type transformations are available only when they are appropriate to the feature type, so there are certain cases where you cannot perform a transformation. These include columns that DataRobot has identified as special columns, for both integral and float values. (Date columns are a special case and do support transforms.) Additionally, a column that is all numeric except for a single unique non-numeric value is treated as special. In this case, DataRobot converts the unique value to NaN and disallows conversion to prevent losing the value.
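The "all numeric except for a single unique non-numeric value" case can be illustrated with a small sketch. This is not DataRobot's implementation; the helper name `parse_numeric_column` and the sample data are hypothetical, and the sketch only shows how such a column might be recognized and why the lone value ends up as NaN.

```python
import math

def parse_numeric_column(values):
    """Parse each raw string as a float (hypothetical helper, not a DataRobot API).

    Returns the parsed values, with NaN in place of anything that does not
    parse, plus the set of distinct non-numeric strings encountered.
    """
    parsed, non_numeric = [], set()
    for raw_value in values:
        try:
            parsed.append(float(raw_value))
        except ValueError:
            parsed.append(math.nan)      # the non-numeric value is represented as NaN
            non_numeric.add(raw_value)
    return parsed, non_numeric

# Sample column: numeric except for one distinct non-numeric value, "unknown".
raw = ["12", "7.5", "unknown", "3", "unknown"]
parsed, non_numeric = parse_numeric_column(raw)

# Exactly one distinct non-numeric value matches the case described above.
is_special = len(non_numeric) == 1
```

Once "unknown" has been replaced by NaN, it can no longer be told apart from a genuinely missing value, which is the loss that disallowing the conversion prevents.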

Note

When converting from a numeric variable type to categorical, be aware that DataRobot drops any digits after the decimal point. In other words, the value is truncated to become an integer. The same applies when transforming floats with missing values to categorical: the new feature is truncated, not rounded. For example, 9.9 becomes 9, not 10.
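A minimal sketch of the difference between truncating and rounding, using only the Python standard library. The function `numeric_to_category` is a hypothetical illustration, not DataRobot code. Keeping missing values as `None` is an assumption made for the example, and negative values are left out because the text above does not describe how they are handled.

```python
import math

def numeric_to_category(value):
    """Return a category label by truncating the decimal part (illustrative only).

    Missing values (None or NaN) are passed through as None; this handling
    is an assumption for the sketch.
    """
    if value is None or (isinstance(value, float) and math.isnan(value)):
        return None
    return str(math.trunc(value))        # drops the fractional part, no rounding

values = [9.9, 9.1, 10.0, float("nan"), 415.0]
labels = [numeric_to_category(v) for v in values]
# labels == ['9', '9', '10', None, '415']

# Rounding, by contrast, would map 9.9 to 10.
rounded = round(9.9)
```

Because 9.9 and 9.1 fall into the same category after truncation, check whether the fractional part carries meaning before converting a float feature.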

Tip

When making predictions, DataRobot expects the columns in the prediction data to be the same as in the original data. If a model uses the original variable plus the transformed variable, the prediction data must use the original feature name. DataRobot calculates the derived features internally.
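The idea that prediction data carries only the original columns, while derived features are computed from them, can be sketched as follows. Everything here is hypothetical and for illustration only: the `add_derived_features` helper, the feature name `area_code (Categorical)`, and the sample row are assumptions, not DataRobot's internal behavior or naming.

```python
def add_derived_features(rows, transforms):
    """Add derived columns to each row from its parent column (illustrative only).

    rows: list of dicts keyed by original feature name.
    transforms: {new_feature_name: (parent_feature_name, function)}.
    """
    result = []
    for row in rows:
        new_row = dict(row)              # leave the caller's row untouched
        for new_name, (parent_name, func) in transforms.items():
            new_row[new_name] = func(row[parent_name])
        result.append(new_row)
    return result

# Hypothetical transformation: numeric area code to a categorical label.
transforms = {
    "area_code (Categorical)": ("area_code", lambda v: str(int(v))),
}

# Prediction data contains only the original feature names.
prediction_rows = [{"area_code": 415.0, "balance": 120.5}]
scoring_rows = add_derived_features(prediction_rows, transforms)
```

The uploaded rows need only `area_code`; the derived column is added on the scoring side, which is why the prediction file should not rename or drop the original feature.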

Availability in NextGen

You can perform transformations on dataset features from the Features tile of an experiment or from the data explore page in NextGen.

Transform a feature

The feature transformation workflow below is the same across NextGen. To transform a feature:

  1. From the Features tile of either an experiment or the data explore page, do one of the following:

    • In the table, to the right of the feature you want to transform, click the icon.

    • Select the feature you want to transform and click Create feature transform.

  2. The options displayed in the resulting window are based on the original variable type of the feature:

          Element               Description
      1   Transformation type   Displays the new variable type of the feature after the transformation is performed.
      2   New feature name      Provides a field to rename the new feature. By default, DataRobot uses the existing feature name with the new variable type appended.
      3   Create feature        Creates the new feature. The new feature is then listed below the original.

          Element                  Description
      1   Transformation options   Specifies a new feature type, selected from a dropdown of the variable types available for the current feature. DataRobot performs specific transformations for numeric and categorical variable types.
      2   New feature name         Provides a field to rename the new feature. By default, DataRobot uses the existing feature name with the new variable type appended.
      3   Create feature           Creates the new feature. The new feature is then listed below the original.

    Date features allow you to select which date-specific derivations to apply, and whether the result should be considered a categorical or numeric value.

  3. Click Create feature. The transformed feature appears under the original feature. It can be included in any new feature list and can also be used for modeling. When you use a model that contains transformed features for predictions, DataRobot automatically includes the new feature in any uploaded dataset.

    You can create any number of transformations from the same feature. By default, DataRobot applies a unique name to each transformation. If you inadvertently create duplicate features, DataRobot marks them as such and ignores them in processing.