Evaluate

The Evaluate tabs provide key plots and statistics needed to judge and interpret a model’s effectiveness:

Leaderboard tab	Description	Source
Accuracy Over Space	Provides a spatial residual mapping within an individual model.	Validation, Cross-Validation, Holdout (selectable)
Accuracy over Time	Visualizes how predictions change over time.	Computed separately for each backtest and the Holdout fold and can be viewed in the UI. Plots can be computed on both Validation and Training data.
Advanced Tuning	Allows you to manually set model parameters, overriding the DataRobot selections.	Internal grid search set
Anomaly Assessment	Plots data for the selected backtest and provides SHAP explanations for up to 500 anomalous points.	Computed separately for each backtest and the Holdout fold and can be viewed in the UI. Plots can be computed on both Validation and Training data.
Anomaly over Time	Plots how anomalies occur across the timeline of your data.	Computed separately for each backtest and the Holdout fold and can be viewed in the UI. Plots can be computed on both Validation and Training data.
Confusion Matrix for multiclass projects	Compares actual data values with predicted data values in multiclass projects.	Validation, Cross-Validation, or Holdout (selectable). For binary classification projects, use the confusion matrix on the ROC Curve tab.
Feature Fit	Removed. See Feature Effects.
Forecasting Accuracy	Provides a visual indicator of how well a model predicts at each forecast distance in the project’s forecast window.	Computed separately for each backtest and the Holdout fold; only the validation subset of each fold is scored. Validation predictions are filtered by the forecast distance and the metrics are computed on the filtered predictions. UI/API does not provide access to individual backtests but rather to validation (backtest 0=most recent backtest), backtesting (averaged across all backtests), and Holdout.
Forecast vs Actual	Compares how predictions made from different forecast points behave at different times in the future.	Computed separately for each backtest and the Holdout fold and can be viewed in the UI. Plots can be computed on both Validation and Training data.
Lift Chart	Depicts how well a model segments the target population and how capable it is of predicting the target.	Validation, Cross-Validation, Holdout (selectable)
Period Accuracy	Shows model performance over periods within the training dataset.	Validation, Holdout (selectable). Computed separately for each backtest and Holdout.
Residuals	Clearly visualizes the predictive performance and validity of a regression model.	Validation, Cross-Validation, Holdout (selectable)
ROC Curve	Explores classification, performance, and statistics related to a selected model at any point on the probability scale.	Validation data
Series Insights (clustering)	Provides information on the cluster to which each series belongs, along with series information, including rows and dates. Histograms for each cluster show the number of series, the number of total rows, and the percentage of the dataset that belongs to that cluster.	Computed for each series in the clustering backtest.
Series Insights (multiseries)	Provides series-specific information.	Computed separately for each backtest and the Holdout fold; only the validation subset of each fold is scored. Validation predictions are filtered by the forecast distance and the metrics are computed on the filtered predictions. UI/API does not provide access to individual backtests but rather to validation (backtest 0=most recent backtest), backtesting (averaged across all backtests), and Holdout.
Stability	Provides an at-a-glance summary of how well a model performs on different backtests.	Computed separately for each backtest and the Holdout fold; only the validation subset of each fold is scored.
Training Dashboard	Provides an understanding about training activity, per iteration, for Keras-based models.	Training, but validated on an internal holdout of the training data.
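
The Source column for Forecasting Accuracy and Series Insights (multiseries) says that validation predictions are filtered by forecast distance and that metrics are computed on the filtered predictions. The following is a minimal sketch of that idea in plain Python, not the DataRobot implementation or API. The function name `rmse_by_forecast_distance`, the row layout, the sample values, and the choice of RMSE as the metric are all assumptions made for illustration.

```python
from collections import defaultdict
from math import sqrt


def rmse_by_forecast_distance(rows):
    """Group (forecast_distance, actual, predicted) rows by distance
    and return the RMSE of each group, ordered by distance.

    Hypothetical helper for illustration only; not part of any DataRobot API.
    """
    squared_errors = defaultdict(list)
    for distance, actual, predicted in rows:
        # Filter step: each prediction only contributes to its own distance.
        squared_errors[distance].append((actual - predicted) ** 2)
    # Metric step: compute RMSE on each filtered group.
    return {
        distance: sqrt(sum(errors) / len(errors))
        for distance, errors in sorted(squared_errors.items())
    }


# Made-up validation predictions: (forecast distance, actual, predicted).
validation_rows = [
    (1, 10.0, 11.0),
    (1, 12.0, 11.0),
    (2, 10.0, 13.0),
    (2, 12.0, 8.0),
]

scores = rmse_by_forecast_distance(validation_rows)
for distance, rmse in scores.items():
    print(f"forecast distance {distance}: RMSE {rmse:.3f}")
```

Under these assumptions, the error typically grows with forecast distance, which is the pattern a per-distance accuracy view is meant to expose.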
