Date/time partitioning advanced options¶
DataRobot's default partitioning settings are optimized for the specific dataset and target feature selected. For most users, the defaults that DataRobot selects provide optimized modeling. See the time series customization documentation to gain an understanding of how DataRobot calculates partitions and advice on setting window sizes.
If you do choose to change partitioning, the content below describes changing backtest partitions.
Expand the Show Advanced options link to set details of the partitioning method. When you enable time-aware modeling, Advanced options opens to the date/time partitioning method by default. The Backtesting section of date/time partitioning provides tools for configuring backtests for your time-aware projects.
DataRobot detects the date and/or time format (standard GLIBC strings) for the selected feature. Verify that it is correct. If the format displayed does not accurately represent the date column(s) of your dataset, modify the original dataset to match the detected format and re-upload it.
Configure the backtesting partitions. You can set them from the dropdowns (applies global settings) or by clicking the bars in the visualization (applies individual settings). Individual settings override global settings. Once you modify settings for an individual backtest, any changes to the global settings are not applied to the edited backtest.
Date/date range representation
DataRobot uses date points to represent dates and date ranges within the data, applying the following principles:
All date points adhere to ISO 8601, UTC (e.g., 2016-05-12T12:15:02+00:00), an internationally accepted way to represent dates and times, with some small variation in the duration format. Specifically, there is no support for ISO weeks (e.g., P5W).
Models are trained on data between two ISO dates. DataRobot displays these dates as a date range, but inclusion decisions and all key boundaries are expressed as date points. When you specify a date, DataRobot includes start dates and excludes end dates.
Once changes are made to formats using the date partitioning column, DataRobot converts all charts, selectors, etc. to this format for the project.
Set backtest partitions globally¶
The following table describes global settings:
|Number of backtests
|Configures the number of backtests for your project, the time-aware equivalent of cross-validation (but based on time periods or durations instead of random rows).
|Configures the size of the testing data partition.
|Configures spaces in time, representing gaps between model training and model deployment.
|Sets whether to use duration or rows as the basis for partitioning, and whether to use random or latest data.
See the table above for a description of the backtesting section's display elements.
When changing partition year/month/day settings, note that the month and year values rebalance to fit the larger class (for example, 24 months becomes two years) when possible. However, because DataRobot cannot account for leap years or days in a month as it relates to your data, it cannot convert days into the larger container.
Set the number of backtests¶
You can change the number of backtests, if desired. The default number of backtests is dependent on the project parameters, but you can configure up to 20. Before setting the number of backtests, use the histogram to validate that the training and validation sets of each fold will have sufficient data to train a model. Requirements are:
For OTV, backtests require at least 20 rows in each validation and holdout fold and at least 100 rows in each training fold. If you set a number of backtests that results in any of the partitions not meeting that criteria, DataRobot only runs the number of backtests that do meet the minimums (and marks the display with an asterisk).
For time series, backtests require at least 4 rows in validation and holdout and at least 20 rows in the training fold. If you set a number of backtests that results in any of the partitions not meeting that criteria, the project could fail. See the time series partitioning reference for more information.
By default, DataRobot creates a holdout fold for training models in your project. In some cases, however, you may want to create a project without a holdout set. To do so, uncheck the Add Holdout fold box. If you disable the holdout fold, the holdout score column does not appear on the Leaderboard (and you have no option to unlock holdout). Any tabs that provide an option to switch between Validation and Holdout will not show the Holdout option.
If you build a project with a single backtest, the Leaderboard does not display a backtest column.
Set the validation length¶
To modify the duration, perhaps because of a warning message, click the dropdown arrow in the Validation length box and enter duration specifics. Validation length can also be set by clicking the bars in the visualization. Note the change modifications make in the testing representation:
Set the gap length¶
(Optional) Set the gap length from the Gap Length dropdown. Initially set to zero, DataRobot does not process a gap in testing. When set, DataRobot excludes the data that falls in the gap from use in training or evaluation of the model. Gap length can also be set by clicking the bars in the visualization.
Set rows or duration¶
By default, DataRobot ensures that each backtest has the same duration, either the default or the values set from the dropdown(s) or via the bars in the visualization. If you want the backtest to use the same number of rows, instead of the same length of time, use the Equal rows per backtest toggle:
Time series projects also have an option to set row or duration for the training data, used as the basis for feature engineering, in the training window format section.
Once you have selected the mechanism/mode for assigning data to backtests, select the sampling method, either Random or Latest, to select how to assign rows from the dataset.
Setting the sampling method is particularly useful if a dataset is not distributed equally over time. For example, if data is skewed to the most recent date, the results of using 50% of random rows versus 50% of the latest will be quite different. By selecting the data more precisely, you have more control over the data that DataRobot trains on.
Change backtest partitions¶
If you don't modify any settings, DataRobot disperses rows to backtests equally. However, you can customize an individual backtest's gap, training, validation, and holdout data by clicking the corresponding bar or the pencil icon () in the visualization. Note that:
You can only set holdout in the Holdout backtest ("backtest 0"), you cannot change the training data size in that backtest.
If, during the initial partitioning detection, the backtest configuration of the ordering (date/time) feature, series ID, or target results in insufficient rows to cover both validation and holdout, DataRobot automatically disables holdout. If other partitioning settings are changed (validation or gap duration, start/end dates, etc.), holdout is not affected unless manually disabled.
When Equal rows per backtest is checked (which sets the partitions to row-based assignment), only the Training End date is applicable.
When Equal rows per backtest is checked, the dates displayed are informative only (that is, they are approximate) and they include padding that is set by the feature derivation and forecast point windows.
Edit individual backtests¶
Regardless of whether you are setting training, gaps, validation, or holdout, elements of the editing screens function the same. Hover on a data element to display a tooltip that reports specific duration information:
Click a section (1) to open the tool for modifying the start and/or end dates; click in the box (2) to open the calendar picker.
Triangle markers provide indicators of corresponding boundaries. The larger blue triangle () marks the active boundary—the boundary that will be modified if you apply a new date in the calendar picker. The smaller orange triangle () identifies the other boundary points that can be changed but are not currently selected.
The current duration for training, validation, and gap (if configured) is reported under the date entry box:
Once you have made changes to a data element, DataRobot adds an EDITED label to the backtest.
There is no way to remove the EDITED label from a backtest, even if you manually reset the durations back to the original settings. If you want to be able to apply global duration settings across all backtests, copy the project and restart.
Modify training and validation¶
To modify the duration of the training or validation data for an individual backtest:
- Click in the backtest to open the calendar picker tool.
- Click the triangle for the element you want to modify—options are training start (default), training end/validation start, or validation end.
- Modify dates as required.
A gap is a period between the end of the training set and the start of the validation set, resulting in data being intentionally ignored during model training. You can set the gap length globally or for an individual backtest.
To set a gap, add time between training end and validation start. You can do this by ending training sooner, starting validation later or both.
Click the triangle at the end of the training period.
Click the Add Gap link.
DataRobot adds an additional triangle marker. Although they appear next to each other, both the selected (blue) and inactive (orange) triangles represent the same date. They are slightly spaced to make them selectable.
(Optional) Set the Training End Date using the calendar picker. The date you set will be the beginning of the gap period (training end = gap start).
Click the orange Validation Start Date marker; the marker changes to blue, indicating that it's selected.
(Optional) Set the Validation Start Date (validation start = gap end).
The gap is represented by a yellow band; hover over the band to view the duration.
Modify the holdout duration¶
To modify the holdout length, click in the red (holdout area) of backtest 0, the holdout partition. Click the displayed date in the Holdout Start Date to open the calendar picker and set a new date. If you modify the holdout partition and the new size results in potential problems, DataRobot displays a warning icon next to the Holdout fold. Click the warning icon () to expand the dropdown and reset the duration/date fields.
Lock the duration¶
You may want to make backtest date changes without modifying the duration of the selected element. You can lock duration for training, for validation, or for the combined period. To lock duration, click the triangle at one end of the period. Next, hold the Shift key and select the triangle at the other end of the locked duration. DataRobot opens calendar pickers for each element:
Change the date in either entry. Notice that the other date updates to mirror the duration change you made.
Interpret the display¶
The date/time partitioning display represents the training and validation data partitions as well as their respective sizes/durations. Use the visualization to ensure that your models are validating on the area of interest. The chart shows, for each backtest, the specific time period of values for the training, validation, and if applicable, holdout and gap data. Specifically, you can observe, for each backtest, whether the model will be representing an interesting or relevant time period. Will the scores represent a time period you care about? Is there enough data in the backtest to make the score valuable?
The following table describes elements of the display:
|The binned distribution of values (i.e., frequency), before downsampling, across the dataset. This is the same information as displayed in the feature’s histogram.
|Available Training Data
|The blue color bar indicates the training data available for a given fold. That is, all available data minus the validation or holdout data.
|Primary Training Data
|The dashed outline indicates the maximum amount of data you can train on to get scores from all backtest folds. You can later choose any time window for training, but depending on what you select, you may not then get all backtest scores. (This could happen, for example, if you train on data greater than the primary training window.) If you train on data less than or equal to the Primary Training Data value, DataRobot completes all backtest scores. If you train on data greater than this value, DataRobot runs fewer tests and marks the backtest score with an asterisk (*). This value is dependent on (changed by) the number of configured backtests.
|A gap between the end of the training set and the start of the validation set, resulting in the data being intentionally ignored during model training.
|A set of data indicated by a green bar that is not used for training (because DataRobot selects a different section at each backtest). It is similar to traditional validation, except that it is time based. The validation set starts immediately at the end of the primary training data (or the end of the gap).
|Holdout (only if Add Holdout fold is checked)
|The reserved (never seen) portion of data used as a final test of model quality once the model has been trained and validated. When using date/time partitioning, holdout is a duration or row-based portion of the training data instead of a random subset. By default, the holdout data size is the same as the validation data size and always contains the latest data. (Holdout size is user-configurable, however.)
|Time- or row-based folds used for training models. The Holdout backtest is known as "backtest 0" and labeled as Holdout in the visualization. For small datasets and for the highest-scoring model from Autopilot, DataRobot runs all backtests. For larger datasets, the first backtest listed is the one DataRobot uses for model building. Its score is reported in the Validation column of the Leaderboard. Subsequent backtests are not run until manually initiated on the Leaderboard.
Additionally, the display includes Target Over Time and Observations histograms. Use these displays to visualize the span of times where models are compared, measured, and assessed—to identify "regions of interest." For example, the displays help to determine the density of data over time, whether there are gaps in the data, etc.
In the displays, the green represents the selection of data that DataRobot is validating the model on. The "All Backtest" score is the average of this region. The gradation marks each backtest and its potential overlap with training data.
Study the Target Over Time graph to find interesting regions where there is some data fluctuation. It may be interesting to compare models over these regions. Use the Observations chart to determine whether, roughly speaking, the amount of data in a particular backtest is suitable.
Finally, you can click the red, locked holdout section to see where in the data the holdout scores are being measured and whether it is a consistent representation of your dataset.
Backtesting is conceptually the same as cross-validation in that it provides the ability to test a predictive model using existing historical data. That is, you can evaluate how the model would have performed historically to estimate how the model will perform in the future. Unlike cross-validation, however, backtests allow you to select specific time periods or durations for your testing instead of random rows, creating in-sequence, instead of randomly sampled, “trials” for your data. So, instead of saying “break my data into 5 folds of 1000 random rows each,” with backtests you say “simulate training on 1000 rows, predicting on the next 10. Do that 5 times.” Backtests simulate training the model on an older period of training data, then measure performance on a newer period of validation data. After models are built, through the Leaderboard you can change the training range and sampling rate. DataRobot then retrains the models on the shifted training data.
If the goal of your project is to predict forward in time, backtesting gives you a better understanding of model performance (on a time-based problem) than cross-validation. For time series problems, this equates to more confidence in your predictions. Backtesting confirms model robustness by allowing you to see whether a model consistently outperforms other models across all folds.
The number of backtests that DataRobot defaults to is dependent on the project parameters, but you can configure the build to include up to 20 backtests for additional model accuracy. Additional backtests provide you with more trials of your model so that you can be more sure about your estimates. You can carefully configure the duration and dates so that you can, for example, generate “10 two-month predictions.” Once configured to avoid specific periods, you can ask “Are the predictions similar?” or for two similar months, “Are the errors the same?”
Large gaps in your data can make backtesting difficult. If your dataset has long periods of time without any observed data, it is prudent to review where these gaps fall in your backtests. For example, if a validation window has too few data points, choosing a longer data validation window will ensure more reliable validation scores. While using more backtests may give you a more reliable measure of model performance, it also decreases the maximum training window available to the earliest backtest fold.
Configuring gaps allows you to reproduce time gaps usually observed between model training and model deployment (a period for which data is not to be used for training). It is useful in cases where, for example:
- Only older data is available for training (because ground truth is difficult to collect).
- When a model’s validation and subsequent deployment takes weeks or months.
- To deliver predictions in advance for review or actions.
A simple example: in insurance, it can take roughly a year for a claim to "develop" (the time between filing and determining the claim payout). For this reason, an actuary is likely to price 2017 policies based on models trained with 2015 data. To replicate this practice, you can insert a one-year gap between the training set and the validation set. This ensures that model evaluation is more correct. Other examples include when pricing needs regulator approval, retail sales for a seasonal business, and pricing estimates that rely on delayed reporting.