Time series (V7.1)¶
June 14, 2021
The DataRobot v7.1.0 release includes many new time series features, described below. See also the other new AutoML features for more details.
New time series features¶
See details of the following new features, below:
- New badge identifies baseline models in time series projects
- Time series calendar improvements
- EWMA setting learns more from recent data
- Default to asynchronous partitioning for better accuracy
- Beta: Data prep improvements
- Beta: Cold start and partial history blueprints
- Beta: New Series humility rule
New badge identifies baseline models in time series projects¶
DataRobot now uses a badge on the Leaderboard to identify which model serves as the baseline model for a time series project.
The baseline model is the model that uses the most recent value that matches the longest periodicity. That is, while a project could have multiple naive predictions with different periodicities, DataRobot uses the naive predictions with the longest periodicity to compute the MASE score.
MASE is a measure of the accuracy of forecasts, and is a comparison of one model to a naive baseline model. It is one of the many selectable optimization metrics; this release also introduces access to documentation from the metric dropdown.
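As a rough illustration of the comparison described above, the following sketch computes a MASE-style score as the ratio of a model's mean absolute error to that of a naive baseline that repeats the value seen one period earlier. The function names, the candidate periods, and the sample numbers are all hypothetical, and the ratio-of-MAE formulation is an assumption; DataRobot's exact computation is not shown in this document.

```python
def naive_forecast(history, actuals, period):
    """Predict each point with the value observed `period` steps earlier."""
    if period < 1 or period > len(history):
        raise ValueError("period must be between 1 and len(history)")
    series = list(history) + list(actuals)
    start = len(history)
    return [series[start + i - period] for i in range(len(actuals))]


def mae(actuals, predictions):
    """Mean absolute error between two equal-length sequences."""
    return sum(abs(a - p) for a, p in zip(actuals, predictions)) / len(actuals)


def mase(actuals, model_predictions, baseline_predictions):
    """Model MAE scaled by the naive baseline MAE (assumed formulation)."""
    baseline_error = mae(actuals, baseline_predictions)
    if baseline_error == 0:
        raise ValueError("baseline MAE is zero; the ratio is undefined")
    return mae(actuals, model_predictions) / baseline_error


# Hypothetical data: four historical points followed by four points to score.
history = [10, 20, 30, 40]
actuals = [12, 18, 33, 44]
model_predictions = [11, 19, 32, 42]

# Assumed candidate periodicities; the baseline uses the longest one.
candidate_periods = [1, 4]
longest_period = max(candidate_periods)

baseline_predictions = naive_forecast(history, actuals, longest_period)
score = mase(actuals, model_predictions, baseline_predictions)
print(round(score, 3))  # 0.455
```

Under this formulation, a score below 1 means the model's average error is smaller than the naive baseline's, and a score above 1 means the baseline did better.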
Time series calendar improvements¶
With this release, DataRobot offers calendar improvements that increase size limits and AI Catalog flexibility. DataRobot now supports file sizes up to 10MB for calendars used in time series projects. Additionally, calendars uploaded as a local file are automatically added to the AI Catalog. When files are in the catalog, you can view them and use them in a project. You can also download them and, for example, edit and re-upload them. This is particularly useful for a generated calendar, as you can generate it in DataRobot, download and customize it, and re-upload it to the AI Catalog. In all cases, you can share any calendar file with other users in your organization.
EWMA setting learns more from recent data¶
This release introduces an exponentially weighted moving average (EWMA) setting, available from the Advanced options link, that applies EWMA operations to features. EWMA places greater weight and significance on the most recent data points when measuring trend direction over time, so more recent values have more influence on the variance than older values.
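To make the weighting concrete, here is a minimal sketch of the standard recursive EWMA formula, in which each smoothed value blends the newest observation with the previous smoothed value. The function name and the `alpha` smoothing parameter are illustrative assumptions; this document does not describe how DataRobot parameterizes the setting or derives features from it.

```python
def ewma(values, alpha):
    """Exponentially weighted moving average of a sequence.

    alpha is the weight given to the newest observation (0 < alpha <= 1);
    larger values make the average react faster to recent data.
    """
    if not 0 < alpha <= 1:
        raise ValueError("alpha must be in the interval (0, 1]")
    smoothed = []
    for x in values:
        if not smoothed:
            # Seed the average with the first observation.
            smoothed.append(float(x))
        else:
            smoothed.append(alpha * x + (1 - alpha) * smoothed[-1])
    return smoothed


print(ewma([1, 2, 3], alpha=0.5))  # [1.0, 1.5, 2.25]
```

With `alpha=0.5`, the newest point contributes half of each smoothed value, the point before it a quarter, and so on, which is why older observations fade quickly.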
Beta: Data prep improvements¶
Release 7.0 introduced a beta feature that handles gaps in time-based mode, allowing the use of datasets with irregular time steps. This release brings improvements to that feature, including access to the tool from both the start screen and the AI Catalog.
Default to asynchronous partitioning for better accuracy¶
Asynchronous partitioning, an optional capability for time series projects in v6.3, is now the default behavior. With it, DataRobot automatically adjusts backtests so that they sufficiently cover important events and are representative of your data. Backtests are no longer generated based purely on data length, but are customized to the target so that they highlight or include specific regions of interest, for example, seasonal events, holidays, or regular anomalies. In other words, this capability identifies areas that may have insufficient backtest coverage and then improves the data that is sampled for each backtest by automatically adjusting the bounds of the training and validation partitions. The previous ability to manually customize backtest start and end dates is still available.
Beta: Cold start and partial history blueprints¶
"Cold start" is the ability to model on series that were not seen in the training data; partial history refers to prediction datasets with series history that is only partially known (historical rows are partially available within the feature derivation window). While some blueprints are designed to predict on new series given some history, others are not. Because this can lead to suboptimal predictions, DataRobot previously returned an error when making predictions for a series for which the full history was not provided. (The full history was needed to derive the features for specific forecast points.) With this release, time series introduces blueprints optimized for cold start and also for partial history modeling. The new blueprints are run as part of Autopilot and are available for multiseries regression projects.
Set the Advanced Option to include models that support partial history datasets.
Beta: "New Series" humility rule¶
New series humility rules were introduced in v7.0.0, creating rules that are triggered by series that were unseen in training data. With this release, and available as a beta feature, you can now set the humility rule to use a replacement model from the model registry instead of the Leaderboard. This decouples the model from a specific project and allows you to use model packages. Using a backup model from any compatible project provides more flexibility. (Compatibility means using the same target, date/time partitioning column, feature derivation window, forecast distances, series name, etc.) This feature requires access to MLOps.
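The compatibility requirement can be pictured as a field-by-field comparison of two model configurations. The sketch below is purely illustrative: the dictionary keys and the `incompatible_fields` helper are hypothetical names rather than part of any DataRobot API, and the field list covers only the settings named above (the source's "etc." implies there are others).

```python
# Hypothetical keys for the settings named in the compatibility description.
COMPATIBILITY_FIELDS = (
    "target",
    "datetime_partition_column",
    "feature_derivation_window",
    "forecast_distances",
    "series_name",
)


def incompatible_fields(primary, replacement):
    """Return the compared settings on which the two configurations differ."""
    return [
        field
        for field in COMPATIBILITY_FIELDS
        if primary.get(field) != replacement.get(field)
    ]


# Hypothetical configurations for a deployed model and a candidate backup.
primary_config = {
    "target": "sales",
    "datetime_partition_column": "date",
    "feature_derivation_window": (-28, 0),
    "forecast_distances": (1, 7),
    "series_name": "store_id",
}
replacement_config = dict(primary_config, forecast_distances=(1, 14))

print(incompatible_fields(primary_config, replacement_config))
# ['forecast_distances']
```

An empty list would indicate that the candidate matches on every compared setting.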
Time series fixed issues¶
The following issues have been fixed since release 7.0.2.
Time series¶
- TIME-7599: Fixes an issue that caused time series projects using a catalog dataset to fail if the dataset was deleted after project creation and before the feature derivation process.
- TIME-7800: Fixes broken model export for Eureqa GAM models.
- TIME-7847: Fixes Prediction Explanation computation for time series bulk predictions launched with the Prediction API.
- TIME-8176: Fixes an issue where Prediction Explanations failed to compute in some cases with New Series Modelers.
- TIME-8425: The Anomaly assessment records route now filters output properly when backtest 0 is specified as the filtering condition.