ROC curves upside down #31683

pedro-w started this conversation in General
Jun 30, 2025 · 0 comments

I've been fitting an SVC to experimental data and found that RocCurveDisplay sometimes plots the ROC curve upside down.
I know this happens if the positive label gets mixed up between fitting and plotting, but here it seems to happen at random.
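
For reference, the label mix-up case can be sketched without any library. This is only an illustration, not the cause of the problem described here: the helper `pairwise_auc` is a hypothetical name, and it computes the AUC as the fraction of (positive, negative) pairs that the raw ScoreA value ranks correctly. Swapping which class is treated as positive mirrors the curve, so the two AUC values sum to 1.

```python
# ScoreA values copied from load_data() in the code further down.
negs = [0.14, 0.59, 1.13, 2.60, 2.92, 2.98, 3.99, 4.08, 4.43, 7.73, 10.98]
poss = [
    1.84, 2.15, 2.73, 3.46, 3.59, 3.63, 3.67, 3.75, 4.49, 5.22, 5.33,
    5.35, 5.51, 5.69, 5.72, 5.90, 5.98, 6.29, 7.96, 7.98, 8.21, 8.62,
    9.27, 10.88, 11.84, 13.11, 19.12, 20.09, 21.99, 25.00, 35.00,
]


def pairwise_auc(pos_scores, neg_scores):
    """Hypothetical helper: AUC as the fraction of (pos, neg) pairs where
    the positive sample has the higher score. Ties count as half."""
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos_scores) * len(neg_scores))


# "POS" treated as the positive class: the intended orientation.
auc_right = pairwise_auc(poss, negs)
# "NEG" treated as the positive class: the mixed-up orientation.
auc_swapped = pairwise_auc(negs, poss)

print(f"AUC with POS as positive: {auc_right:.3f}")
print(f"AUC with NEG as positive: {auc_swapped:.3f}")
```

An "upside down" ROC curve therefore corresponds to an AUC below 0.5, which is the mirror image of the correctly oriented one.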

It only happens when I pass probability=True to the model, which the docs say uses randomness to compute the probabilities. I investigated further and got down to a much smaller code example. Passing in a RandomState gives repeatable behaviour: some random states plot the ROC curve correctly and some plot it wrongly.
From this I found that predict_proba seems to do one of two things depending on the random state, and this leads to one of the two ROC curves.
I couldn't get any further because the behaviour seems very data dependent: deleting just a few data rows made the problem go away, although truncating my data to 2 decimal places by hand changed nothing. Sorting the data by increasing value also seems to be necessary.
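
One way to separate the SVM fit from the probability step is to compare decision_function with predict_proba across seeds. The sketch below is a diagnostic under stated assumptions, not a confirmed explanation: it uses integer seeds rather than the RandomState objects in my notebook and does not sort the data, so it may not reproduce the flip. According to the scikit-learn docs, random_state only affects the shuffling used for the probability estimates, so decision_function should be identical for both seeds.

```python
import numpy as np
from sklearn.metrics import roc_auc_score
from sklearn.svm import SVC

# ScoreA values copied from load_data() in the code further down.
negs = [0.14, 0.59, 1.13, 2.60, 2.92, 2.98, 3.99, 4.08, 4.43, 7.73, 10.98]
poss = [
    1.84, 2.15, 2.73, 3.46, 3.59, 3.63, 3.67, 3.75, 4.49, 5.22, 5.33,
    5.35, 5.51, 5.69, 5.72, 5.90, 5.98, 6.29, 7.96, 7.98, 8.21, 8.62,
    9.27, 10.88, 11.84, 13.11, 19.12, 20.09, 21.99, 25.00, 35.00,
]
X = np.array(negs + poss).reshape(-1, 1)
y = np.array(["NEG"] * len(negs) + ["POS"] * len(poss))

decisions = []
aucs = []
for seed in (1, 2):  # integer seeds are an assumption of this sketch
    model = SVC(probability=True, kernel="rbf", random_state=seed).fit(X, y)

    # Raw SVM scores; positive values favour model.classes_[1] ("POS").
    dec = model.decision_function(X)
    # Calibrated probabilities for "POS"; these depend on the seed.
    proba = model.predict_proba(X)[:, 1]

    decisions.append(dec)
    aucs.append(roc_auc_score(y, dec))

    # A negative correlation would mean the probabilities rank the
    # samples in the opposite order to the raw scores.
    corr = np.corrcoef(dec, proba)[0, 1]
    print(f"seed={seed}  AUC(decision_function)={aucs[-1]:.3f}  "
          f"corr(decision, proba)={corr:+.3f}")
```

If the correlation comes out negative for one seed, the inversion is happening in the probability calibration rather than in the SVM itself, and plotting the ROC curve from decision_function would avoid the dependence on random_state.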

Here is an illustration - the first column is a 'correct' ROC curve and the corresponding probabilities, and the second is a 'wrong' one. Note that the y-scale of probabilities is very different.

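[Image: 2x2 grid of plots. Top row: ROC curves; bottom row: predicted probabilities. Left column: the 'correct' result; right column: the 'wrong' result.]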

Any comments welcome, especially if I am doing something daft here.

Thanks.

Here is the code, from a Jupyter notebook:

    from matplotlib import pyplot as plt
    from sklearn.metrics import RocCurveDisplay
    from sklearn.svm import SVC
    import numpy as np
    from numpy.random import RandomState
    import pandas as pd

    # Two different random states: one gives the 'correct' curve, one the 'wrong' one.
    rndst1 = RandomState(1)
    rndst2 = RandomState(2)


    def load_data():
        negs = [
            0.14, 0.59, 1.13, 2.60, 2.92, 2.98, 3.99, 4.08, 4.43, 7.73, 10.98,
        ]
        poss = [
            1.84, 2.15, 2.73, 3.46, 3.59, 3.63, 3.67, 3.75, 4.49, 5.22, 5.33,
            5.35, 5.51, 5.69, 5.72, 5.90, 5.98, 6.29, 7.96, 7.98, 8.21, 8.62,
            9.27, 10.88, 11.84, 13.11, 19.12, 20.09, 21.99, 25.00, 35.00,
        ]
        return pd.DataFrame(
            {
                "Class": np.concat([["NEG"] * len(negs), ["POS"] * len(poss)]),
                "ScoreA": np.concat([negs, poss]),
            }
        )


    model1 = SVC(probability=True, kernel="rbf", random_state=rndst1)
    model2 = SVC(probability=True, kernel="rbf", random_state=rndst2)

    data = load_data()
    # Sorting by increasing value seems to be necessary to trigger the problem.
    data.sort_values(by="ScoreA", inplace=True)
    Xs = data[["ScoreA"]]
    ys = data["Class"]

    model1.fit(Xs, ys)
    model2.fit(Xs, ys)

    # Column 1 is taken as the probability of the "POS" class.
    probs1 = model1.predict_proba(Xs)[:, 1]
    probs2 = model2.predict_proba(Xs)[:, 1]
    probs = pd.DataFrame({"probs1": probs1, "probs2": probs2})
    display(probs.describe().style.format(precision=2))  # display() is provided by Jupyter

    # Top row: ROC curves; bottom row: predicted probabilities.
    fig, axs = plt.subplots(2, 2, figsize=(10, 10))
    fig.tight_layout()
    RocCurveDisplay.from_predictions(y_pred=probs1, y_true=ys, pos_label="POS", ax=axs[0, 0])
    RocCurveDisplay.from_predictions(y_pred=probs2, y_true=ys, pos_label="POS", ax=axs[0, 1])
    xv = np.arange(len(probs1))
    axs[1, 0].plot(xv, probs1)
    axs[1, 1].plot(xv, probs2)

I am using scikit-learn 1.5.2, NumPy 2.1.3 and Python 3.12.10.
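
The `[:, 1]` indexing in the code above assumes that column 1 of predict_proba is the "POS" class. A minimal check of that assumption, using the same ScoreA values, is to look the column up from `classes_` instead of hard-coding it. The variable name `pos_index` and the integer seed are illustrative choices, not part of my notebook.

```python
import numpy as np
from sklearn.svm import SVC

# ScoreA values copied from load_data() above.
negs = [0.14, 0.59, 1.13, 2.60, 2.92, 2.98, 3.99, 4.08, 4.43, 7.73, 10.98]
poss = [
    1.84, 2.15, 2.73, 3.46, 3.59, 3.63, 3.67, 3.75, 4.49, 5.22, 5.33,
    5.35, 5.51, 5.69, 5.72, 5.90, 5.98, 6.29, 7.96, 7.98, 8.21, 8.62,
    9.27, 10.88, 11.84, 13.11, 19.12, 20.09, 21.99, 25.00, 35.00,
]
X = np.array(negs + poss).reshape(-1, 1)
y = np.array(["NEG"] * len(negs) + ["POS"] * len(poss))

model = SVC(probability=True, kernel="rbf", random_state=0).fit(X, y)

# classes_ holds the labels in sorted order; the columns of
# predict_proba follow the same order.
print("classes_:", list(model.classes_))
pos_index = list(model.classes_).index("POS")

# Select the "POS" column by name rather than by a hard-coded position.
pos_proba = model.predict_proba(X)[:, pos_index]
print("column used for POS:", pos_index)
```

With the labels "NEG" and "POS" the lookup gives column 1, so the column order does not appear to be the source of the flipped curve.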
