Time series (V7.2)¶
September 13, 2021
The DataRobot v7.2.0 release includes many new time series features, described below. See also details of Release 7.2.0 in the AutoML and MLOps release notes.
New time series features¶
See details of the following new GA features:
- Nowcasting predicts current values
- Partial history blueprints run as part of Autopilot
- External model comparison
- Remove redundant features from Feature Impact
- New series humility rule
See details of the following new preview features:
- Time series data prep tool
- Restore features pruned during derivation
- High-resolution calendars
- Scoring Code for time series
Generally available features¶
The following new features are now generally available.
Nowcasting to predict current values¶
Available as a GA feature, nowcasting is a method of time series modeling that predicts the current value of a target based on past and present data—a forecast window in which the start and end times are 0 (now). In other words, based on the current input values and recent history, what is the target right now? (Forecasting, by contrast, predicts future values based on past and present data.) With release 7.2, DataRobot automatically applies forecast window (FW) settings of [0, 0] for the forecast start and end times when nowcasting is selected at project start. Additionally, the Feature Derivation Window (FDW) end is set at a single time step prior to the current time step, allowing derivation of additional features for the target without risking target leakage.
For details, see nowcasting.
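As a rough illustration of the window settings described above, the following library-free sketch builds training rows for a [0, 0] forecast window. The function name `nowcast_rows` and the three-step Feature Derivation Window are assumptions made for this example; this is not DataRobot's implementation.

```python
from typing import List, Tuple


def nowcast_rows(
    target: List[float], inputs: List[float], fdw_start: int = -3
) -> List[Tuple[List[float], float]]:
    """Build (features, label) pairs for a [0, 0] forecast window.

    Target-derived features come only from steps fdw_start..-1 relative to
    the current step, so the value being predicted never appears in its own
    feature row. The current input value (step 0) is allowed.
    """
    rows = []
    history = -fdw_start  # number of past steps each row needs
    for t in range(history, len(target)):
        target_lags = target[t + fdw_start:t]  # ends one step before t
        current_input = inputs[t]              # present-time covariate
        rows.append((target_lags + [current_input], target[t]))
    return rows


rows = nowcast_rows(
    target=[10.0, 11.0, 12.0, 13.0, 14.0],
    inputs=[1.0, 2.0, 3.0, 4.0, 5.0],
)
for features, label in rows:
    print(features, "->", label)
```

Each row pairs the target at the current step with lags that stop one step earlier, which is the leakage guard the FDW end setting provides.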
Partial history blueprints run in Autopilot¶
Previously available as a preview feature, the advanced option to allow partial history in predictions has been enhanced for general availability. Now, Autopilot runs approximately 40% fewer models, eliminating those models that produce less accurate results in the event of partial history availability. When enabled, Autopilot supports New Series blueprints as well. In addition, the feature now supports:
- single series projects
- row-based mode models
- classification projects
- projects with “do not derive target” enabled.
For details, see time series advanced options.
External model comparison improvements¶
Support for baseline accuracy comparison allows you to upload predictions from a non-DataRobot time series model and compare them with the predictions of DataRobot's time series models. With improvements to the preview version, the feature now lets you specify multiple forecast distances and also provides API support. Because existing metrics have been redesigned to scale to the external baseline (the uploaded predictions), you can now get an at-a-glance accuracy measure and comparison from the Leaderboard.
For details, see external prediction comparison.
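To make the idea of a metric scaled to an external baseline concrete, here is a minimal sketch that divides a model's mean absolute error by the external model's mean absolute error at each forecast distance. The choice of MAE, the ratio form, and the name `mae_scaled_to_baseline` are assumptions for illustration; the release notes do not state the exact formula DataRobot uses.

```python
import math
from collections import defaultdict
from typing import Dict, List, Tuple

# (forecast_distance, actual, model_prediction, external_prediction)
Row = Tuple[int, float, float, float]


def mae_scaled_to_baseline(rows: List[Row]) -> Dict[int, float]:
    """Return model MAE divided by external-baseline MAE per forecast distance.

    Values below 1.0 mean the model beats the uploaded predictions at that
    distance. Both error sums cover the same rows, so the ratio of sums
    equals the ratio of means.
    """
    abs_err = defaultdict(lambda: [0.0, 0.0])
    for distance, actual, model_pred, external_pred in rows:
        abs_err[distance][0] += abs(actual - model_pred)
        abs_err[distance][1] += abs(actual - external_pred)
    return {
        distance: (model / baseline if baseline else math.inf)
        for distance, (model, baseline) in sorted(abs_err.items())
    }


scaled = mae_scaled_to_baseline([
    (1, 100.0, 98.0, 95.0),
    (1, 110.0, 113.0, 100.0),
    (2, 120.0, 110.0, 110.0),
    (2, 130.0, 125.0, 150.0),
])
print(scaled)
```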
Remove redundant features from Feature Impact¶
Previously available only in AutoML, the feature redundancy check is now performed when DataRobot calculates Feature Impact for time series projects, with an option to remove the redundant features. This matters when you want to create one or more feature lists based on a model's top feature importances. Note that some highly impactful naive features may be retained if they are important to the time series project.
For details, see the Feature Impact documentation.
New series humility rule¶
Now generally available, this feature lets you set a humility rule that uses a replacement model from the model registry instead of the Leaderboard. This decouples the model from a specific project and allows you to use model packages. Using a backup model from any compatible project provides more flexibility. (Compatibility means using the same target, date/time partitioning column, feature derivation window, forecast distances, series name, etc.) This feature requires access to MLOps.
For details, see multiseries series humility rules.
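The compatibility requirement can be pictured as a field-by-field comparison of two models' time series settings. The sketch below is hypothetical: the `TimeSeriesSettings` class and its field names are invented for this example and cover only the settings named above, not the full set DataRobot checks.

```python
from dataclasses import dataclass, fields
from typing import List, Tuple


@dataclass(frozen=True)
class TimeSeriesSettings:
    # Hypothetical container for the settings listed in the release note.
    target: str
    datetime_column: str
    feature_derivation_window: Tuple[int, int]
    forecast_distances: Tuple[int, ...]
    series_id_column: str


def incompatibilities(
    primary: TimeSeriesSettings, backup: TimeSeriesSettings
) -> List[str]:
    """List the settings on which the backup model differs from the primary."""
    return [
        f.name
        for f in fields(primary)
        if getattr(primary, f.name) != getattr(backup, f.name)
    ]


primary = TimeSeriesSettings("sales", "date", (-28, -1), (1, 2, 3), "store_id")
backup = TimeSeriesSettings("sales", "date", (-28, -1), (1, 2, 3, 4), "store_id")
print(incompatibilities(primary, backup))
```

An empty list would indicate that, on these settings at least, the backup model could stand in for the primary one.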
Preview features¶
The following features are part of the preview program.
Time series data prep tool¶
With version 7.2, the time series data prep tool, available from both the time series Start page and the AI Catalog, brings three substantial improvements:
- Feature aggregation and imputation on text target features, not just numeric and categorical.
- Guardrails to alert you when DataRobot has imputed more than 50% of target values (which could impact model accuracy).
- Ability to apply data prep transformations to prediction datasets from the Leaderboard.
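The imputation guardrail can be sketched as follows: aggregate irregular observations into regular time steps, fill the empty steps, and flag the result when more than half of the target values were imputed. Summation as the aggregation, forward fill as the imputation, and the name `prepare_series` are assumptions for this example and do not describe how the data prep tool works internally.

```python
from datetime import datetime, timedelta
from typing import List, Optional, Tuple


def prepare_series(
    observations: List[Tuple[datetime, float]],
    start: datetime,
    step: timedelta,
    n_steps: int,
    max_imputed_fraction: float = 0.5,
) -> Tuple[List[float], float, bool]:
    """Sum observations per time step, forward-fill gaps, and flag heavy imputation."""
    buckets: List[Optional[float]] = [None] * n_steps
    for timestamp, value in observations:
        index = (timestamp - start) // step  # timedelta // timedelta gives an int
        if 0 <= index < n_steps:
            buckets[index] = (buckets[index] or 0.0) + value

    filled: List[float] = []
    imputed = 0
    last = 0.0  # used only if the very first step is empty
    for value in buckets:
        if value is None:
            imputed += 1
            value = last
        filled.append(value)
        last = value

    fraction = imputed / n_steps
    return filled, fraction, fraction > max_imputed_fraction


values, fraction, warn = prepare_series(
    observations=[
        (datetime(2021, 9, 1, 8), 3.0),
        (datetime(2021, 9, 1, 17), 2.0),
        (datetime(2021, 9, 4, 9), 7.0),
    ],
    start=datetime(2021, 9, 1),
    step=timedelta(days=1),
    n_steps=5,
)
if warn:
    print(f"Warning: {fraction:.0%} of target values were imputed")
```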
Restore features pruned during derivation¶
As part of the time series functionality, DataRobot generates derived features and then runs a feature reduction algorithm, removing features it detects as low impact. There may, however, be features that you want included in the generated feature lists or evaluated for feature impact. With this release, once EDA2 completes, you can add these features back into your available derived modeling data and create new feature lists that include them.
GA documentation (as of release 7.3)
High-resolution calendars¶
You can now derive calendar event-related features at a much more granular, timestamp-based level. Starting from a timestamp instead of the start of the day helps to capture the effect of a specified calendar event, such as a promotional sale that lasts from 9:30am to 11:30am. Additionally, you can now specify durations to make the event more specific. To ensure accuracy, DataRobot provides guardrails to support calendar-derived features based on calendar events that overlap.
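A timestamp-plus-duration event can be modeled as a half-open interval, which makes it easy both to flag the rows an event covers and to detect overlapping events. The sketch below is illustrative only: `CalendarEvent`, `active_events`, and `overlapping_pairs` are hypothetical names, and the release notes do not describe how DataRobot's own guardrails handle overlaps.

```python
from datetime import datetime, timedelta
from typing import List, NamedTuple, Tuple


class CalendarEvent(NamedTuple):
    name: str
    start: datetime
    duration: timedelta

    @property
    def end(self) -> datetime:
        return self.start + self.duration


def active_events(timestamp: datetime, events: List[CalendarEvent]) -> List[str]:
    """Names of events whose [start, end) interval contains the timestamp."""
    return [e.name for e in events if e.start <= timestamp < e.end]


def overlapping_pairs(events: List[CalendarEvent]) -> List[Tuple[str, str]]:
    """Pairs of events whose intervals intersect, ordered by start time."""
    ordered = sorted(events, key=lambda e: e.start)
    pairs = []
    for i, first in enumerate(ordered):
        for second in ordered[i + 1:]:
            if second.start >= first.end:
                break  # later events start even later, so none can overlap
            pairs.append((first.name, second.name))
    return pairs


events = [
    CalendarEvent("promo", datetime(2021, 9, 13, 9, 30), timedelta(hours=2)),
    CalendarEvent("flash", datetime(2021, 9, 13, 11, 0), timedelta(hours=1)),
]
print(active_events(datetime(2021, 9, 13, 10, 0), events))
print(overlapping_pairs(events))
```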
Scoring Code for time series¶
Now available for preview, time series models can be exported in a Java-based Scoring Code package. Scoring Code is a portable, low-latency method of using DataRobot models outside of the DataRobot application. The following blueprints may produce Scoring Code:
- AUTOARIMA with Fixed Error Terms
- ElasticNet Regressor (L2 / Gamma Deviance)
- ElasticNet Regressor (L2 / Poisson Deviance)
- Eureqa Generalized Additive Model
- eXtreme Gradient Boosted Trees Regressor
- eXtreme Gradient Boosted Trees Regressor with Early Stopping
- eXtreme Gradient Boosting on ElasticNet Predictions
- Light Gradient Boosting on ElasticNet Predictions
- Performance Clustered Elastic Net Regressor with Forecast Distance Modeling
- Performance Clustered eXtreme Gradient Boosting on Elastic Net Predictions
- RandomForest Regressor
- Ridge Regressor using Linearly Decaying Weights with Forecast Distance Modeling
- Ridge Regressor with Forecast Distance Modeling
- Vector Autoregressive Model (VAR) with Fixed Error Terms
The following capabilities are currently covered:
- Calendars (non-high resolution)
- Cross-Series
- Zero Inflated / Naive Binary
- Nowcasting (Historical Range Predictions)
- Blind History Gaps
Time series fixed issues¶
The following issues have been fixed since release 7.1.0.
- TIME-8176: Fixes an issue in which Prediction Explanations failed to compute with new series modelers.
- TIME-8425: Anomaly assessment records are now filtered properly when backtest 0 is specified as the filtering condition.
- TIME-8992: Fixes an issue with custom feature lists for KIA new series modelers.
- TIME-9074: Fixes an issue that caused an error when computing a valid forecast point range because the minimum row count required for the validation was incorrect.