The ROC curve visualization (on the ROC Curve tab) helps you explore classification, performance, and statistics for a selected model. ROC curves plot the true positive rate against the false positive rate for a given data source.
ROC curve explained¶
The two important characteristics of the curve to consider are the area under the curve (AUC) and the shape of the curve.
The curve is highlighted with the following elements:
- Circle—an indicator of the new threshold value. Each time you set a new display threshold, the position of the circle on the curve changes.
- Intercept lines—gray lines that provide a visual reference for the selected threshold.
ROC curve shape¶
Use the ROC curve to assess model quality. The curve, drawn based on each value in the dataset, plots the true positive rate against the false positive rate. Some takeaways from an ROC curve:
An ideal curve grows quickly for small x-values, and slows for values of x closer to 1.
The curve illustrates the tradeoff between sensitivity and specificity. An increase in sensitivity results in a decrease in specificity.
A "perfect" ROC curve yields a point in the top left corner of the chart (coordinate (0,1)), indicating no false negatives and no false positives (a high true positive rate and a low false positive rate).
The closer the curve comes to the 45-degree diagonal of the ROC space, the less accurate the model and closer it is to a random assignment model.
The shape of the curve is determined by the overlap of the classification distributions.
Area under the ROC curve¶
The AUC (area under the curve) is literally the lower-right area under the ROC Curve.
AUC does not display automatically in the Metrics pane. Click Select metrics and select Area Under the Curve (AUC) to display it.
AUC is a metric for binary classification that considers all possible thresholds and summarizes performance in a single value, reported in the bottom right of the graph. The larger the area under the curve, the more accurate the model, however:
An AUC of 0.5 suggests that predictions based on this model are no better than a random guess.
An AUC of 1.0 suggests that predictions based on this model are perfect, and because a perfect model is highly uncommon, it is likely flawed (target leakage is a common cause of this result).
StackExchange provides an excellent explanation of AUC.
Kolmogorov-Smirnov (KS) metric¶
For binary classification projects, the KS optimization metric measures the maximum distance between two non-parametric distributions.
The KS metric evaluates and ranks models based on the degree of separation between true positive and false positive distributions.
The KS metric does not display automatically in the Metrics pane. Click Select metrics and select Kolmogorov-Smirnov Score to display it.
For a complete description of the the Kolmogorov–Smirnov test (K–S test or KS test), see the Wikipedia article on the topic.