Why Do ROC Curves in Scikit-Learn Have More Data Points Than Thresholds? Explaining the Underlying Mechanism
Receiver Operating Characteristic (ROC) curves are a cornerstone of binary classification evaluation, visualizing the trade-off between the True Positive Rate (TPR) and False Positive Rate (FPR) across different classification thresholds. A common point of confusion among practitioners is: Why do ROC curves generated by Scikit-Learn’s roc_curve function often have more data points than the number of unique predicted probabilities (or "thresholds") from the model?
At first glance, one might expect a 1:1 relationship between thresholds and ROC points—each threshold yields one (FPR, TPR) pair. However, Scikit-Learn’s implementation includes subtle mechanisms that result in more points than the number of unique predicted probabilities. In this blog, we’ll demystify this behavior by breaking down how Scikit-Learn computes ROC curves, exploring the role of unique probabilities, and clarifying why extra points appear.
Table of Contents#
- Understanding ROC Curves and Thresholds
- The Common Misconception: Thresholds vs. ROC Points
- Scikit-Learn’s ROC Curve Implementation: Under the Hood
- Why Extra Points? The Role of Unique Probabilities and "Infinity" Threshold
- Practical Example: Visualizing ROC Points and Thresholds
- Key Takeaways
- References
1. Understanding ROC Curves and Thresholds#
Before diving into Scikit-Learn’s implementation, let’s recap the basics of ROC curves and thresholds.
What is a ROC Curve?#
A ROC curve plots the True Positive Rate (TPR) on the y-axis against the False Positive Rate (FPR) on the x-axis for different classification thresholds.
- TPR (Sensitivity/Recall): The proportion of actual positives correctly classified as positive: TPR = TP / (TP + FN).
- FPR (1 − Specificity): The proportion of actual negatives incorrectly classified as positive: FPR = FP / (FP + TN).
Role of Thresholds#
Classification models (e.g., logistic regression) output probabilities for the positive class. A threshold converts these probabilities into class labels: samples with probability ≥ threshold are classified as positive; others as negative.
For example, a threshold of 0.5 means "classify as positive if the model is ≥50% confident." Changing the threshold shifts the balance between TPR and FPR:
- A higher threshold (e.g., 0.8) reduces FPR but may lower TPR (fewer positives).
- A lower threshold (e.g., 0.2) increases TPR but may raise FPR (more false positives).
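This thresholding step is easy to sketch directly. The probabilities below are made-up values for illustration:

```python
import numpy as np

# Hypothetical predicted probabilities for six samples
probs = np.array([0.9, 0.8, 0.55, 0.4, 0.2, 0.1])

# A threshold converts probabilities into class labels
labels_at_05 = (probs >= 0.5).astype(int)  # -> [1, 1, 1, 0, 0, 0]
labels_at_08 = (probs >= 0.8).astype(int)  # -> [1, 1, 0, 0, 0, 0]

print(labels_at_05)
print(labels_at_08)
```

Raising the threshold from 0.5 to 0.8 flips one sample from positive to negative, which is exactly the TPR/FPR trade-off the ROC curve traces out.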
Expected Relationship: Thresholds → ROC Points#
Intuition suggests a 1:1 relationship: each threshold should generate one (FPR, TPR) pair, and thus one ROC point. For example, if a model outputs 10 unique probabilities, we might expect 10 ROC points. But Scikit-Learn often returns more points. Why?
2. The Common Misconception: Thresholds vs. ROC Points#
The confusion arises from a misunderstanding of what constitutes a "threshold" in Scikit-Learn’s roc_curve function. Users often assume:
"The number of ROC points equals the number of unique predicted probabilities."
In reality, Scikit-Learn’s roc_curve prepends an extra np.inf threshold so that the ROC curve starts at (0, 0) (no samples classified as positive); the (1, 1) endpoint arises naturally from the lowest threshold. This extra threshold leads to more ROC points than the number of unique predicted probabilities.
3. Scikit-Learn’s ROC Curve Implementation: Under the Hood#
To understand why ROC points outnumber unique probabilities, let’s dissect Scikit-Learn’s roc_curve workflow. Here’s a simplified step-by-step breakdown (based on the source code):
Step 1: Sort Predictions and Labels#
First, roc_curve sorts the predicted probabilities (y_score) in descending order, along with their corresponding true labels (y_true). This ensures we evaluate thresholds from the highest to lowest confidence.
Step 2: Compute Cumulative TP and FP#
Next, it calculates cumulative true positives (TP) and false positives (FP) as the threshold decreases. For example, with sorted probabilities [0.9, 0.7, 0.5, 0.3], lowering the threshold from 0.9 to 0.7 includes the next sample, and so on.
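Steps 1 and 2 can be sketched in a few lines of NumPy (the labels and scores here are hypothetical, not taken from Scikit-Learn's source):

```python
import numpy as np

# Hypothetical true labels and predicted scores
y_true = np.array([0, 1, 1, 0, 1, 0])
y_score = np.array([0.3, 0.9, 0.7, 0.4, 0.7, 0.1])

# Step 1: sort labels by descending score
order = np.argsort(y_score)[::-1]
y_sorted = y_true[order]

# Step 2: cumulative true/false positives as the threshold sweeps downward
tps = np.cumsum(y_sorted)       # tps -> [1 2 3 3 3 3]
fps = np.cumsum(1 - y_sorted)   # fps -> [0 0 0 1 2 3]
print(tps, fps)
```

Each position in `tps`/`fps` corresponds to lowering the threshold just enough to include one more sample, which is how the curve is built incrementally.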
Step 3: Identify Unique Thresholds#
Scikit-Learn extracts unique predicted probabilities as candidate thresholds. Duplicate probabilities are merged to avoid redundant calculations (e.g., if 10 samples all have probability 0.5, they contribute one threshold).
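A quick sketch of this deduplication, using made-up scores:

```python
import numpy as np

# Hypothetical scores with duplicates
y_score = np.array([0.5, 0.5, 0.9, 0.5, 0.7, 0.9])

# Duplicate probabilities collapse to one candidate threshold each,
# sorted descending to match the sweep from highest to lowest confidence
candidate_thresholds = np.unique(y_score)[::-1]
print(candidate_thresholds)  # -> [0.9 0.7 0.5]
```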
Step 4: Prepend the "Infinity" Threshold#
Crucially, roc_curve adds an infinity threshold (np.inf) to the list of thresholds. This threshold ensures the ROC curve starts at (0, 0), representing the scenario where no samples are classified as positive (TP=0, FP=0).
Step 5: Optional: Drop Redundant Thresholds#
By default, roc_curve uses drop_intermediate=True, which removes thresholds whose (FPR, TPR) points are collinear with their neighbors. If three consecutive points lie on a straight segment, the middle one is dropped: it adds nothing to the shape of the curve, so removing it simplifies the plot without changing the curve itself.
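To see the effect of this parameter, we can compare both settings on the same hypothetical data (the same arrays used in the practical example below):

```python
import numpy as np
from sklearn.metrics import roc_curve

y_true = np.array([0, 1, 1, 0, 1, 0, 1, 0])
y_score = np.array([0.3, 0.7, 0.7, 0.2, 0.9, 0.4, 0.9, 0.1])

# All unique scores kept as thresholds (plus the prepended np.inf)
_, _, thr_all = roc_curve(y_true, y_score, drop_intermediate=False)
# Default: thresholds producing collinear points are dropped
_, _, thr_dropped = roc_curve(y_true, y_score)

print(len(thr_all), len(thr_dropped))
```

For this data the model separates the classes perfectly, so several points lie on the straight segment from (0, 1) to (1, 1); the default setting keeps only 4 of the 7 thresholds.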
Key Outcome#
The final list of thresholds includes:
- All unique predicted probabilities (sorted descending).
- An additional "infinity" threshold (np.inf), prepended to the list.
Thus, with drop_intermediate=False, the number of thresholds (and ROC points) equals the number of unique predicted probabilities + 1.
4. Why Extra Points? The Role of Unique Probabilities and "Infinity" Threshold#
The critical insight is:
ROC points = Number of thresholds = Number of unique predicted probabilities + 1 (for the infinity threshold), assuming no thresholds are dropped (i.e., drop_intermediate=False, or no collinear points exist; see Step 5 above).
Example: Unique Probabilities + Infinity Threshold#
Suppose we have 3 unique predicted probabilities: [0.8, 0.5, 0.2]. Scikit-Learn’s roc_curve will generate 4 thresholds:
[np.inf, 0.8, 0.5, 0.2].
Each threshold corresponds to one ROC point:
- np.inf: Classify no samples as positive → (FPR=0, TPR=0).
- 0.8: Classify samples ≥0.8 as positive → (FPR₁, TPR₁).
- 0.5: Classify samples ≥0.5 as positive → (FPR₂, TPR₂).
- 0.2: Classify samples ≥0.2 as positive → (FPR₃, TPR₃).
Thus, 4 thresholds → 4 ROC points—one more than the 3 unique probabilities.
What About the (1, 1) Endpoint?#
You might wonder: Does Scikit-Learn add a "negative infinity" threshold to force the curve to end at (1, 1)? In practice, this is rarely needed. When the threshold drops below the lowest predicted probability, all samples are classified as positive, resulting in (FPR=1, TPR=1). This is naturally captured by the lowest unique probability threshold.
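A small check confirms this: the last threshold returned is simply the minimum score, and the curve's final point is (1, 1). The data here is made up:

```python
import numpy as np
from sklearn.metrics import roc_curve

# Hypothetical labels and scores
y_true = np.array([0, 1, 1, 0])
y_score = np.array([0.2, 0.9, 0.6, 0.4])

fpr, tpr, thresholds = roc_curve(y_true, y_score)
# At the lowest threshold (the minimum score), every sample is classified
# positive, so the curve ends at (1, 1) with no need for a -inf threshold.
print(fpr[-1], tpr[-1], thresholds[-1])
```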
5. Practical Example: Visualizing ROC Points and Thresholds#
Let’s test this with code. We’ll generate synthetic data, compute the ROC curve, and verify the number of points vs. unique probabilities.
Step 1: Generate Data#
import numpy as np
from sklearn.metrics import roc_curve
import matplotlib.pyplot as plt
# True labels (0s and 1s)
y_true = np.array([0, 1, 1, 0, 1, 0, 1, 0])
# Predicted probabilities (with duplicates and unique values)
y_score = np.array([0.3, 0.7, 0.7, 0.2, 0.9, 0.4, 0.9, 0.1])
Step 2: Compute ROC Curve#
We pass drop_intermediate=False so that every unique probability is kept as a threshold (the default setting would prune collinear points for this data):
fpr, tpr, thresholds = roc_curve(y_true, y_score, drop_intermediate=False)
Step 3: Analyze Outputs#
Let’s inspect the results:
print("Unique predicted probabilities:", np.unique(y_score))
print("Thresholds from roc_curve:", thresholds)
print("Number of ROC points (FPR/TPR):", len(fpr))
print("Number of thresholds:", len(thresholds))
Output:#
Unique predicted probabilities: [0.1 0.2 0.3 0.4 0.7 0.9]
Thresholds from roc_curve: [inf 0.9 0.7 0.4 0.3 0.2 0.1]
Number of ROC points (FPR/TPR): 7
Number of thresholds: 7
Key Observations:#
- There are 6 unique predicted probabilities.
- roc_curve returns 7 thresholds (6 unique probabilities + np.inf).
- The number of ROC points (len(fpr) = 7) equals the number of thresholds.
Step 4: Plot the ROC Curve#
plt.plot(fpr, tpr, marker='o', label='ROC Curve')
plt.xlabel('False Positive Rate (FPR)')
plt.ylabel('True Positive Rate (TPR)')
plt.title('ROC Curve: Points vs. Thresholds')
plt.legend()
plt.grid(True)
plt.show()
The plot will show 7 points, each corresponding to a threshold. The first point is (0, 0) (from np.inf), and subsequent points follow the sorted thresholds.
6. Key Takeaways#
- ROC Points = Thresholds: Scikit-Learn’s roc_curve returns the same number of ROC points as thresholds.
- Thresholds Include np.inf: To ensure the curve starts at (0, 0), roc_curve prepends np.inf to the list of thresholds.
- Unique Probabilities + 1: With drop_intermediate=False, the number of thresholds (and thus ROC points) equals the number of unique predicted probabilities plus 1 (for np.inf).
- Drop Intermediate Thresholds: The drop_intermediate=True parameter (default) removes thresholds whose points are collinear with their neighbors, so the final count can be smaller than unique probabilities + 1; the np.inf starting point is always kept.
7. References#
- Scikit-Learn Documentation: sklearn.metrics.roc_curve
- Fawcett, T. (2006). "An introduction to ROC analysis." Pattern Recognition Letters, 27(8), 861–874.
- Scikit-Learn Source Code: roc_curve implementation
By understanding Scikit-Learn’s inclusion of the np.inf threshold, you can confidently interpret ROC curves and avoid confusion about the number of points. The extra points are not a bug—they’re a deliberate design choice to ensure the curve is complete and interpretable!