Modeling

When the input dataset has spatial structure, Location AI’s modeling enhancements extend DataRobot’s traditional automated feature engineering and broaden the available model options. Location AI accomplishes this with several targeted techniques, including:

  • Automated feature engineering of location variable geometric properties.
  • Creation of derived features from spatially lagged variables.
  • Derivation of features that characterize spatial hot spots, cold spots, and transitions.

Automated location feature engineering

Location AI’s ability to ingest, autorecognize, and transform geospatial data extends what DataRobot model blueprints can do. For example, the geometric properties of row-level geometries can be strong predictors in machine learning models. Location AI makes use of them by automatically deriving features from the properties of the input geometries. DataRobot derives the following geometric properties for each geometry type:

  • MultiPoints
    • Centroid
  • Lines/MultiLines
    • Centroid
    • Length
    • Minimum bounding rectangle area
  • Polygons/MultiPolygons
    • Centroid
    • Perimeter
    • Area
    • Minimum bounding rectangle area
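
As a rough illustration of the polygon properties listed above, the following sketch computes a centroid, perimeter, area, and bounding rectangle area for a single polygon ring in pure Python. It is not DataRobot's implementation: the function name `polygon_features` and the output keys are hypothetical, planar coordinates are assumed, and the bounding rectangle is taken to be axis-aligned, which may differ from how Location AI defines the minimum bounding rectangle.

```python
import math


def polygon_features(ring):
    """Derive simple geometric features from one polygon ring.

    Hypothetical helper for illustration only.
    `ring` is a list of (x, y) vertices in planar coordinates; the first
    vertex is not repeated at the end. The ring must have non-zero area.
    """
    n = len(ring)
    twice_area = 0.0
    cx = cy = 0.0
    perimeter = 0.0
    for i in range(n):
        x0, y0 = ring[i]
        x1, y1 = ring[(i + 1) % n]  # wrap around to close the ring
        cross = x0 * y1 - x1 * y0   # shoelace term for this edge
        twice_area += cross
        cx += (x0 + x1) * cross
        cy += (y0 + y1) * cross
        perimeter += math.hypot(x1 - x0, y1 - y0)

    signed_area = twice_area / 2.0
    xs = [x for x, _ in ring]
    ys = [y for _, y in ring]
    return {
        "centroid_x": cx / (6.0 * signed_area),
        "centroid_y": cy / (6.0 * signed_area),
        "perimeter": perimeter,
        "area": abs(signed_area),
        # Assumption: axis-aligned bounding rectangle, not a rotated one.
        "bounding_rectangle_area": (max(xs) - min(xs)) * (max(ys) - min(ys)),
    }


rectangle = polygon_features([(0, 0), (4, 0), (4, 3), (0, 3)])
triangle = polygon_features([(0, 0), (4, 0), (0, 3)])
print(rectangle)
print(triangle)
```

The triangle and the rectangle share the same bounding rectangle area but differ in area and perimeter, which is the kind of shape information these derived features can expose to a model.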

As with DataRobot’s automated derivation of date features, automatically derived geometry features are displayed within the Data tab as child features of the parent "Location" type feature.

Derived spatial lag features

Location AI derives spatially lagged features to capture the spatial structure of the data (i.e., spatial autocorrelation) and to inform DataRobot models of spatial dependence patterns. To access the Location AI Spatial Neighborhood Featurizer, search the Leaderboard for models that include a spatial featurizer, then expand a model and view its blueprint to see the individual tasks.

Location AI implements several techniques for automatically deriving spatially lagged features from the input dataset, including:

  • Spatial Lag: A k-nearest neighbor approach that calculates mean neighborhood values of numeric features at varying spatial lags and neighborhood sizes.

  • Spatial Kernel: A spatial kernel neighborhood approach that characterizes the spatial dependence structure of all numeric variables using varying kernel sizes, with neighbors weighted by distance.
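
The two techniques above can be sketched in a few lines of pure Python. This is a minimal illustration under stated assumptions, not the Spatial Neighborhood Featurizer itself: the function names `knn_spatial_lag` and `kernel_spatial_lag` are hypothetical, distances are Euclidean in planar coordinates, each point is excluded from its own neighborhood, and a Gaussian kernel is assumed because the kernel Location AI uses is not stated here.

```python
import math


def knn_spatial_lag(points, values, k):
    """Mean of `values` over each point's k nearest neighbors (self excluded).

    Hypothetical helper; requires at least two points.
    """
    lags = []
    for i, (xi, yi) in enumerate(points):
        # Sort all other points by Euclidean distance to point i.
        by_distance = sorted(
            (math.hypot(xi - xj, yi - yj), j)
            for j, (xj, yj) in enumerate(points)
            if j != i
        )
        neighbors = [j for _, j in by_distance[:k]]
        lags.append(sum(values[j] for j in neighbors) / len(neighbors))
    return lags


def kernel_spatial_lag(points, values, bandwidth):
    """Distance-weighted mean of the other points' values.

    Assumption: Gaussian kernel, with `bandwidth` playing the role of
    the kernel size. Very small bandwidths can underflow the weights to zero.
    """
    lags = []
    for i, (xi, yi) in enumerate(points):
        weighted_sum = weight_total = 0.0
        for j, (xj, yj) in enumerate(points):
            if j == i:
                continue
            distance = math.hypot(xi - xj, yi - yj)
            weight = math.exp(-0.5 * (distance / bandwidth) ** 2)
            weighted_sum += weight * values[j]
            weight_total += weight
        lags.append(weighted_sum / weight_total)
    return lags


# Four locations along a line; the last one is far away with a large value.
points = [(0, 0), (1, 0), (2, 0), (10, 0)]
values = [1.0, 2.0, 3.0, 100.0]

knn_lags = knn_spatial_lag(points, values, k=2)
kernel_lags = kernel_spatial_lag(points, values, bandwidth=1.0)
print(knn_lags)
print(kernel_lags)
```

Calling either function with several values of `k` or `bandwidth` yields one derived column per setting, which corresponds to the "varying neighborhood sizes" and "varying kernel sizes" described above.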

Derived local autocorrelation features

In addition to capturing spatial dependence structure in neighborhood features, Location AI uses local indicators of spatial association to capture hot spots and cold spots of spatial similarity within the context of the entire input dataset. The Spatial Neighborhood Featurizer calculates neighborhood indicators of association for all non-target numeric variables. The derived features characterize the relative magnitude of local spatial dependence in the input dataset. Features derived this way surface strong local spatial dependence structures to DataRobot models, which can improve model accuracy where hot spots, cold spots, or abrupt transitions in feature values are present.
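
To make the idea of a local indicator of spatial association concrete, the sketch below computes local Moran's I, one widely used indicator of this kind. Which statistic Location AI actually computes is not stated here, so treat this as an assumption. The function name `local_morans_i`, the k-nearest neighbor weights, and the row-standardized (equal) weighting are likewise illustrative choices.

```python
import math


def local_morans_i(points, values, k):
    """Local Moran's I per point, using equal weights over k nearest neighbors.

    Hypothetical helper. Assumes planar coordinates, at least two points,
    and values that are not all identical (otherwise the variance is zero).
    """
    n = len(values)
    mean = sum(values) / n
    deviations = [v - mean for v in values]
    variance = sum(d * d for d in deviations) / n

    result = []
    for i, (xi, yi) in enumerate(points):
        by_distance = sorted(
            (math.hypot(xi - xj, yi - yj), j)
            for j, (xj, yj) in enumerate(points)
            if j != i
        )
        neighbors = [j for _, j in by_distance[:k]]
        # Row-standardized weights: the neighborhood mean of the deviations.
        neighbor_lag = sum(deviations[j] for j in neighbors) / len(neighbors)
        result.append(deviations[i] * neighbor_lag / variance)
    return result


# A run of high values next to a run of low values along a line.
points = [(0, 0), (1, 0), (2, 0), (3, 0), (4, 0), (5, 0)]
values = [9.0, 9.0, 9.0, 1.0, 1.0, 1.0]

local_i = local_morans_i(points, values, k=2)
print(local_i)
```

In this example the points inside each run score clearly positive (similar values cluster together, high for the hot spot and low for the cold spot), while the two points at the boundary score zero because their neighborhoods mix high and low values. Negative scores would indicate a value that contrasts with its neighbors.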


Updated February 5, 2024