The Period Accuracy insight is off by default. Contact your DataRobot representative or administrator for information on enabling the feature.
In some use cases, certain time periods are more significant than others. This is particularly true for financial markets. For example, a trader may only be interested in a model's performance over the first 4 hours of each trading day. Period Accuracy lets you specify the more important periods within your training dataset. DataRobot then provides aggregate accuracy metrics for those periods and surfaces the results on the Leaderboard.
Using a selected optimization (accuracy) metric, you can use the Period Accuracy insight to compare these specified periods against the metric score of the model as a whole. In the example above, the RMSE for a model's validation period says little about how that model performs during the hours that matter most to the trader.
To use the insight, simply:
- Upload a period file and import it to a project after models are built.
- Set filters for calculating period performance.
The insight is available for both OTV and single- and multiseries time series projects in Evaluate > Period Accuracy.
Create a period file¶
The first step in using Period Accuracy is to create a period file. Similar to a calendar file, the period file indicates the name of each period and its start date/time (and, by extension, its duration). Unlike calendar files, which support ranges, the period file is a two-column CSV that includes:
Column 1: The date/time column.
This is the feature used to build the project; its label must match the name of the feature exactly. The data populating the date/time feature column should represent all the time steps you want to visualize in the insight. For example, if the project has daily data from January 30, 2022 through February 8, 2023, and you want to visualize all of that data, the first column would contain 375 entries, one per date in that range.
Column 2: The period column.
The period column represents how you would like to group the data in the insight—it represents the core of what the insight should visualize, giving more information about the accuracy of the model within the defined subset of the data, so define it based on how you want to understand your data. In the above example, you could:
- Mark all dates in January as members of the January bucket by entering the string "January" in column 2 for every applicable date. Next, mark all dates in February as "February".
- Group by weekday by labeling each Sunday with the string "Sunday", each Monday with the string "Monday", and so on.
- Represent dates corresponding to Monday through Friday as the string "weekday" and the dates corresponding to Saturday and Sunday as "weekend".
Once the period file is created, save it locally or upload it to the AI Catalog.
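As an illustration of the two-column layout, the following is a minimal sketch that builds a weekday/weekend period file for the daily date range used in the example above. It assumes the project's date/time feature is named date, that dates are written in ISO format, and that the second column is headed period; the header name, the date format, and the helper function names are all hypothetical and not taken from DataRobot.

```python
import csv
import io
from datetime import date, timedelta


def build_period_rows(start, end, label_for):
    """Return (date string, period label) pairs for every day from start through end."""
    rows = []
    current = start
    while current <= end:
        rows.append((current.isoformat(), label_for(current)))
        current += timedelta(days=1)
    return rows


def weekday_or_weekend(day):
    # date.weekday() returns 0 for Monday through 6 for Sunday.
    return "weekday" if day.weekday() < 5 else "weekend"


def to_csv(rows, date_column="date", period_column="period"):
    """Serialize rows as a two-column CSV string with a header row.

    The column names are assumptions: the first must match the project's
    date/time feature, and "period" is a hypothetical header for column 2.
    """
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow([date_column, period_column])
    writer.writerows(rows)
    return buffer.getvalue()


rows = build_period_rows(date(2022, 1, 30), date(2023, 2, 8), weekday_or_weekend)
period_csv = to_csv(rows)

# Show the header and the first two data rows.
for line in period_csv.splitlines()[:3]:
    print(line)
```

Swapping weekday_or_weekend for a function that returns the month name or the day name would produce the other groupings described in the list.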
Time steps in a period file¶
Defining specific time periods within a date feature depends on the granularity of your data (e.g., you need hourly data to view hourly predictions). To show results that match data granularity, add multiple rows in the period file to match the times of interest. For example:
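Your date/time feature is date and you have hourly data for each day. You are interested in sales between 11:00am and 1:00pm each weekday. Your period file would contain one row for each hourly timestamp in that window on each weekday, with every row assigned the same period name.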
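A minimal sketch of generating rows for this scenario follows. It assumes each hourly timestamp marks the start of its hour, so the 11:00 and 12:00 rows together cover 11:00am to 1:00pm. The period name midday, the column header period, the timestamp format, and the one-week date range are hypothetical choices for illustration only.

```python
from datetime import datetime, timedelta


def hourly_period_rows(first_day, num_days, start_hour, end_hour, label):
    """Return (timestamp, label) rows for weekday hours in [start_hour, end_hour)."""
    rows = []
    for offset in range(num_days):
        day = first_day + timedelta(days=offset)
        if day.weekday() >= 5:  # skip Saturday (5) and Sunday (6)
            continue
        for hour in range(start_hour, end_hour):
            stamp = day.replace(hour=hour, minute=0, second=0, microsecond=0)
            rows.append((stamp.strftime("%Y-%m-%d %H:%M:%S"), label))
    return rows


# One hypothetical week starting Monday, January 31, 2022.
rows = hourly_period_rows(datetime(2022, 1, 31), 7, 11, 13, "midday")

print("date,period")
for stamp, label in rows[:4]:
    print(f"{stamp},{label}")
```

Hours outside the window and weekend days are simply left out of the file, so only the times of interest are grouped into the period.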
Generate Period Accuracy¶
Period Accuracy must be computed for each model in a project. However, once a period file is uploaded to one model in the project, it is available to all models. You can upload multiple period files to a project, which may be useful for examining data in different ways (for example, each day, weekday vs weekend, etc.).
To view insights, open a model's Period Accuracy tab and, using the dropdowns, set filters for calculating period performance. Only project-applicable filters are visible.
|Period file||Select a period file. From there, you can also:
|Backtest||Select the backtest to display results for. Although DataRobot runs all backtests when building the project, you must individually train a backtest's model and compute its validation predictions before you can view period insights for that backtest. If you select a backtest that is not yet calculated, DataRobot prompts you to run the calculations.|
|Series to plot (multiseries only)||If the project is multiseries, select a series to plot.|
|Forecast distance (time series and multiseries only)||Set the window of time to base the visualization on. See more details in Accuracy Over Time.|
Click Compute period insights to start calculations. Once computed, changing any filter—other than series, where applicable—requires rerunning the calculations.
Interpret Period Accuracy¶
Once calculations complete, the results, based on the validation data, are shown in a table. You can also generate over time histograms.
|Period name||The name of the period, identified by column 2 in the period file.|
|Observations||The number of data points that fall within the defined period. The period is based on the applied period file and filters (backtest, series, and forecast distance, as applicable).|
|Earliest / latest date||The first and last timestamp found in the period.|
|Predicted / actual||The average predicted and actual values observed in the selected backtest.|
||The model's score for the observations in the period. In other words, if you were to create a project with just this period in the validation data, the displayed value is the value that would display on the Leaderboard. The red/green values below the score indicate the percentage variance from the Leaderboard score. Note that whether a higher or lower score is preferable (red/green, up/down) depends on the metric type.|
|Plot over time||A link to display the Over Time chart for the selected period. Click and scroll down to see the histogram.|
* You can change the reported metric using the Leaderboard dropdown:
When you click Plot over time, the histogram shows a point for each observation in the selected period, visualizing actual and predicted values. This helps to understand how the model performs on each row of the period of interest.
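To make the table columns concrete, the following is a rough sketch of how per-period values of this kind could be derived from validation predictions. It is not DataRobot's implementation: it uses RMSE as the metric, assumes "percentage variance" means the relative difference between the period score and the overall score, and runs on made-up records. All function and field names are hypothetical.

```python
import math
from collections import defaultdict


def rmse(pairs):
    """Root mean squared error over (actual, predicted) pairs."""
    return math.sqrt(sum((a - p) ** 2 for a, p in pairs) / len(pairs))


def period_summary(records, period_of):
    """Summarize records per period.

    records: list of (timestamp, actual, predicted) tuples.
    period_of: dict mapping timestamp -> period name (the period file).
    """
    overall = rmse([(a, p) for _, a, p in records])
    grouped = defaultdict(list)
    for stamp, actual, predicted in records:
        if stamp in period_of:  # rows absent from the period file are skipped
            grouped[period_of[stamp]].append((stamp, actual, predicted))
    summary = {}
    for name, rows in grouped.items():
        score = rmse([(a, p) for _, a, p in rows])
        # Assumed definition: relative difference from the overall score.
        pct = 100.0 * (score - overall) / overall if overall else 0.0
        summary[name] = {
            "observations": len(rows),
            "earliest": min(s for s, _, _ in rows),
            "latest": max(s for s, _, _ in rows),
            "mean_actual": sum(a for _, a, _ in rows) / len(rows),
            "mean_predicted": sum(p for _, _, p in rows) / len(rows),
            "rmse": score,
            "pct_vs_overall": pct,
        }
    return overall, summary


# Made-up validation rows: (timestamp, actual, predicted).
records = [
    ("2022-01-31", 10.0, 12.0),
    ("2022-02-01", 20.0, 20.0),
    ("2022-02-05", 30.0, 26.0),
    ("2022-02-06", 40.0, 42.0),
]
period_of = {
    "2022-01-31": "weekday",
    "2022-02-01": "weekday",
    "2022-02-05": "weekend",
    "2022-02-06": "weekend",
}

overall, summary = period_summary(records, period_of)
for name, stats in summary.items():
    print(name, stats["observations"], round(stats["rmse"], 3), round(stats["pct_vs_overall"], 1))
```

For an error metric such as RMSE, a negative percentage means the model does better in that period than overall; for metrics where higher is better, the sign reads the other way.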
- Only the first 1000 series are computed.
- Maximum period file size is 5MB. An unlimited number of files are allowed.
- Insight export is not supported.
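Given the file size limit above and the two-column layout described earlier, a quick local check before uploading can catch common mistakes. The sketch below is an assumption-laden illustration, not a DataRobot utility: it reads "5MB" as 5 * 1024 * 1024 bytes, assumes the file has a header row, and the function name check_period_file is hypothetical.

```python
import csv
import io

MAX_PERIOD_FILE_BYTES = 5 * 1024 * 1024  # assumption: "5MB" read as 5 MiB


def check_period_file(csv_text, date_feature):
    """Return a list of problems found in a period file given as CSV text."""
    problems = []
    if len(csv_text.encode("utf-8")) > MAX_PERIOD_FILE_BYTES:
        problems.append("file is larger than 5MB")
    rows = list(csv.reader(io.StringIO(csv_text)))
    if not rows:
        return problems + ["file is empty"]
    header, body = rows[0], rows[1:]
    if len(header) != 2:
        problems.append("expected exactly two columns")
    elif header[0] != date_feature:
        # The first column label must match the date/time feature exactly.
        problems.append(f"first column must be named {date_feature!r}")
    for line_number, row in enumerate(body, start=2):
        if len(row) != 2 or not row[0].strip() or not row[1].strip():
            problems.append(f"line {line_number}: expected a date/time and a period name")
    return problems


print(check_period_file("date,period\n2022-01-30,weekend\n", "date"))
print(check_period_file("Date,period\n2022-01-30,\n", "date"))
```

An empty list means none of these particular checks failed; it does not guarantee that the file will be accepted on upload.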