December 13, 2021
New time series features¶
See details of the following new GA features:
- Segmented modeling for multiseries projects
- Time series data prep tool now GA, adds quality check
- High-resolution calendars now GA
- Restore features removed by reduction
- Text column support for multiseries projects
- Multiclass confusion matrix now supports backtests
- Accuracy Over Time performance improvements
See details of the following new public preview features:
- Scoring Code for time series
- Time series Predictor support in the AI App Builder
Generally available features¶
The following new features are now generally available.
Segmented modeling for multiseries projects¶
No single model can handle extreme data diversity or forecast the complexity of human buying patterns at a detailed level. Complex demand forecasting typically requires deep statistical expertise and a large budget for lengthy development projects and big data architectures. Before multiseries with segmented modeling was released, you had to segment your datasets yourself, set up a model factory to model each segment, and then deploy a model for each segment. Segmented modeling is now generally available in DataRobot, and you can build projects with up to 100 segments.
A segment is a group of series. Each segment runs Autopilot and has its own Leaderboard. DataRobot then selects and prepares a champion model from each segment’s Leaderboard and feeds that champion to the project’s Combined Model. You can override DataRobot’s champion selection; when you do, the Combined Model updates with the new model information and the deployment is updated to reflect the change. With segmented modeling, all segments are represented in the Combined Model, which requires only a single deployment.
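The relationship between segments, champions, and the Combined Model can be pictured with a minimal sketch. This is a conceptual illustration only, not DataRobot's implementation or API: `Candidate`, `pick_champion`, `CombinedModel`, the error values, and the series names are all hypothetical, and the sketch assumes the champion is simply the Leaderboard entry with the lowest error.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

# Hypothetical types for illustration only; none of these names are DataRobot APIs.
Forecaster = Callable[[List[float]], float]


@dataclass
class Candidate:
    name: str
    error: float          # assumed validation error on the segment's Leaderboard
    predict: Forecaster


def pick_champion(leaderboard: List[Candidate]) -> Candidate:
    """Assumption: the champion is the candidate with the lowest error."""
    return min(leaderboard, key=lambda c: c.error)


class CombinedModel:
    """Routes each series to the champion of the segment it belongs to."""

    def __init__(self, series_to_segment: Dict[str, str],
                 leaderboards: Dict[str, List[Candidate]]) -> None:
        self.series_to_segment = series_to_segment
        self.champions = {seg: pick_champion(lb) for seg, lb in leaderboards.items()}

    def override_champion(self, segment: str, candidate: Candidate) -> None:
        """Manual override; later predictions for the segment use the new model."""
        self.champions[segment] = candidate

    def predict(self, series_id: str, history: List[float]) -> float:
        segment = self.series_to_segment[series_id]
        return self.champions[segment].predict(history)


def last_value(history: List[float]) -> float:
    return history[-1]


def mean_value(history: List[float]) -> float:
    return sum(history) / len(history)


leaderboards = {
    "grocery": [Candidate("last value", 2.0, last_value),
                Candidate("mean", 3.5, mean_value)],
    "apparel": [Candidate("last value", 5.0, last_value),
                Candidate("mean", 4.1, mean_value)],
}
series_to_segment = {"store_1_milk": "grocery", "store_1_jeans": "apparel"}

# One object serves every segment, mirroring the single-deployment idea.
combined = CombinedModel(series_to_segment, leaderboards)
```

In this sketch, calling `combined.predict("store_1_jeans", history)` uses the apparel champion, and `override_champion` swaps that champion without creating a second entry point.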
For details, see multiseries modeling with segmentation.
Time series data prep tool now GA, adds quality check¶
The time series data prep tool is now generally available. With this release, the tool adds a data quality check that ensures imputed features are not leaking the imputed target. This is only a potential problem for known in advance (KA) features. Any features identified as high or moderate risk for imputation leakage are removed from the set of KA features.
Additionally, when a deployment is created from a model that used a prepped dataset, the model package has the information necessary to apply the transformations to a prediction dataset originating in the AI Catalog.
For details, see the time series data prep tool documentation.
High-resolution calendars now GA¶
With this release, when you upload your own calendar file, you can derive calendar event-related features at a much more granular, timestamp-based level. You can also specify event durations to define events more precisely. To ensure accuracy, DataRobot provides guardrails for features derived from calendar events that overlap. If your calendar has only full-day events, see the calendar file requirements for using this feature with Scoring Code (public preview).
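To make the idea of timestamp-based events with durations concrete, here is a minimal sketch of how an event window could be matched against a row's timestamp. It is an illustration under stated assumptions, not DataRobot's feature derivation logic: the `Event` type, the half-open window `[start, start + duration)`, the sample events, and the choice to report every overlapping event are all hypothetical.

```python
from datetime import datetime, timedelta
from typing import List, NamedTuple


class Event(NamedTuple):
    """Hypothetical calendar event: a name, a start timestamp, and a duration."""
    name: str
    start: datetime
    duration: timedelta


def active_events(events: List[Event], timestamp: datetime) -> List[str]:
    """Return the names of all events whose window covers the timestamp.

    Assumption: an event covers the half-open interval [start, start + duration).
    Overlapping events are all reported, sorted by name.
    """
    return sorted(
        e.name for e in events
        if e.start <= timestamp < e.start + e.duration
    )


events = [
    # A full-day event and a three-hour event that overlaps it.
    Event("black_friday", datetime(2021, 11, 26, 0, 0), timedelta(days=1)),
    Event("flash_sale", datetime(2021, 11, 26, 9, 0), timedelta(hours=3)),
]

during_sale = active_events(events, datetime(2021, 11, 26, 10, 30))
after_sale = active_events(events, datetime(2021, 11, 26, 13, 0))
next_day = active_events(events, datetime(2021, 11, 27, 0, 0))
```

With day-level calendars, both sample events would apply to every row on November 26. With timestamps and durations, only rows between 09:00 and 12:00 fall inside the shorter event.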
For details, see calendar file information.
Restore features removed by reduction¶
The ability to restore derived features to your modeling data, even if they are low impact, is now generally available. You can then create new feature lists that include them. As an improvement over the public preview version, you can click the index column to re-sort the list so that restored features, which are also marked with an icon, appear first.
For full details, see the feature restoration documentation.
Text column support for multiseries projects¶
In addition to numeric and categorical types, you can now select a text column as the multiseries ID. Previously, all variable types were accepted at selection time but could cause problems at build time. Improved testing now ensures that only valid types are available for selection.
Multiclass confusion matrix now supports backtests¶
With this release, the Data Selection dropdown in the multiclass confusion matrix now allows you to base the display on an individual backtest, all backtests, or the holdout partition (if unlocked).
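The three data selections can be thought of as filters applied before the matrix is counted. The sketch below is a conceptual illustration only, not how DataRobot computes the chart: the `Row` layout, the partition labels such as `backtest_1` and `holdout`, and the `all_backtests` keyword are assumptions made for the example.

```python
from collections import Counter
from typing import Iterable, Tuple

# Hypothetical row layout: (partition, actual class, predicted class).
Row = Tuple[str, str, str]


def confusion_matrix(rows: Iterable[Row], selection: str) -> Counter:
    """Count (actual, predicted) pairs for the selected partition.

    selection is a single partition label ("backtest_1", "holdout", ...)
    or "all_backtests" to pool every partition whose label starts with "backtest_".
    """
    counts: Counter = Counter()
    for partition, actual, predicted in rows:
        if selection == "all_backtests":
            selected = partition.startswith("backtest_")
        else:
            selected = partition == selection
        if selected:
            counts[(actual, predicted)] += 1
    return counts


rows = [
    ("backtest_1", "low", "low"),
    ("backtest_1", "high", "medium"),
    ("backtest_2", "low", "low"),
    ("backtest_2", "medium", "medium"),
    ("holdout", "high", "high"),
]

one_backtest = confusion_matrix(rows, "backtest_1")
all_backtests = confusion_matrix(rows, "all_backtests")
holdout_only = confusion_matrix(rows, "holdout")
```

Each returned `Counter` maps an `(actual, predicted)` pair to its count, so off-diagonal pairs such as `("high", "medium")` are the misclassifications for that selection.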
For details, see the Confusion Matrix documentation.
Accuracy Over Time performance improvements¶
This release brings performance improvements to the Accuracy Over Time tab. The chart helps to visualize how predictions change over time, plotting predicted and actual values for selectable backtests, resolutions, and forecast distances. Because the chart is complex, computing it is expensive and, depending on dataset size, can require substantial internal resources. Computation has now been optimized, resulting in faster performance and less load on resources.
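A rough sense of why this computation grows with dataset size comes from a minimal sketch of the aggregation behind such a chart: filter rows to one forecast distance, bucket them by the chosen resolution, and average predicted and actual values per bucket. This is an assumption-laden illustration, not DataRobot's implementation; the row layout, the `accuracy_over_time` function, and the monthly bucketing are hypothetical.

```python
from collections import defaultdict
from datetime import date
from typing import Callable, Dict, Hashable, Iterable, Tuple

# Hypothetical row layout: (timestamp, forecast distance, actual, predicted).
Row = Tuple[date, int, float, float]


def accuracy_over_time(
    rows: Iterable[Row],
    forecast_distance: int,
    bucket: Callable[[date], Hashable],
) -> Dict[Hashable, Tuple[float, float]]:
    """Return {bucket: (mean actual, mean predicted)} for one forecast distance."""
    sums = defaultdict(lambda: [0.0, 0.0, 0])  # [actual sum, predicted sum, count]
    for timestamp, distance, actual, predicted in rows:
        if distance != forecast_distance:
            continue
        entry = sums[bucket(timestamp)]
        entry[0] += actual
        entry[1] += predicted
        entry[2] += 1
    return {key: (a / n, p / n) for key, (a, p, n) in sorted(sums.items())}


def by_month(day: date) -> Tuple[int, int]:
    """Monthly resolution: group rows by (year, month)."""
    return (day.year, day.month)


rows = [
    (date(2021, 10, 1), 1, 10.0, 12.0),
    (date(2021, 10, 15), 1, 20.0, 18.0),
    (date(2021, 11, 1), 1, 30.0, 33.0),
    (date(2021, 11, 1), 7, 30.0, 40.0),  # different forecast distance, filtered out
]

monthly = accuracy_over_time(rows, forecast_distance=1, bucket=by_month)
```

Every combination of backtest, resolution, and forecast distance needs its own pass of this kind, which is why the work scales with both the data and the number of selectable options.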
For details, see the Accuracy Over Time documentation.
Public preview features¶
The following features are part of the public preview program.
Scoring Code for time series¶
Scoring Code public preview capabilities for time series have expanded with this release. In addition to the blueprints and features supported in release 7.2, this release brings support for Forecast Distance (FD) splits and Weighted Rolling Windows.
If you want Scoring Code support for a project using calendars, and your calendar has only full-day events (such as holidays), ask your platform administrator to set the Disable High-Resolution Calendars for Time Series Projects feature flag for your account.
Time series Predictor support in the AI App Builder¶
You can now build AI-powered Predictor applications for both multiseries and single-series projects; this capability is available for public preview. In your time series deployment, click the actions menu and select Create Application. Once the application is created, upload batch predictions to populate the new Time Series Forecasting widget, which allows you to navigate between multiple time unit resolutions, view calendar events (if uploaded), compare forecasts against actuals for new data, and view insights for Prediction Explanations over time.
Time series fixed issues¶
The following customer-reported issues have been fixed since release 7.2.0.
TIME-9790: Fixes downsampled training predictions for Forecast Distance split models in non-downsampled time series projects.
TIME-9425: Fixes an issue that occasionally caused a blank page when accessing smart-sampled OTV projects with custom backtest settings.
TIME-9796: Fixes a crash in the Forecast vs Actual chart when changing series.