Is cost-sensitive learning compatible with probability calibration? #31329

Unanswered
lcrmorin asked this question in Q&A

Regarding the problem of class imbalance, it seems the consensus is now to use cost-sensitive learning: that is, to use cost imbalance (instead of class imbalance) as weights in the evaluation metric and then use those weights for learning. The idea is to get closer to real-world metrics ($$$). I understand and agree with this.

However, there is also the problem of probability calibration. To me it seems that using cost imbalance would break the calibration in probability. Am I right in thinking this?

I would be tempted to fit a model with weights and then use a probability calibration approach. But I am not sure that it would work as expected: typically, wouldn't the probability calibration approach need to be weighted too?

  • If the probability calibration approach is not weighted: doesn't that defeat the purpose of weighting the model in the first place?
  • If the probability calibration approach is weighted: doesn't that break the natural interpretation of probability? Can we change the definition of probability to match this (from expected number of events to expected loss)?
  • Or would it make sense to use both unweighted and weighted calibration approaches and report both a natural probability (% of events) and a weighted probability (% of loss)?

Edit: does the weighting change the ranking of the first step? Maybe we don't need to weight the first step and only the second step should be weighted?
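For concreteness, here is a minimal sketch of the two-step workflow I have in mind (the cost ratio of 10 for the positive class is made up, and the prefit-then-calibrate split is only one possible setup):

```python
# Sketch only: weighted fit, then calibration with and without the same weights.
import numpy as np
from sklearn.calibration import CalibratedClassifierCV
from sklearn.datasets import make_classification
from sklearn.ensemble import HistGradientBoostingClassifier
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=5000, weights=[0.9, 0.1], random_state=0)
X_train, X_cal, y_train, y_cal = train_test_split(X, y, stratify=y, random_state=0)

# Step 1: cost-sensitive fit, positives weighted by a made-up cost ratio of 10.
model = HistGradientBoostingClassifier(random_state=0)
model.fit(X_train, y_train, sample_weight=np.where(y_train == 1, 10.0, 1.0))

# Step 2a: unweighted calibration -> aims at the natural (% of events) probabilities.
cal_natural = CalibratedClassifierCV(model, method="isotonic", cv="prefit")
cal_natural.fit(X_cal, y_cal)

# Step 2b: weighted calibration -> aims at the cost-weighted distribution instead.
cal_weighted = CalibratedClassifierCV(model, method="isotonic", cv="prefit")
cal_weighted.fit(X_cal, y_cal, sample_weight=np.where(y_cal == 1, 10.0, 1.0))
```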



@lcrmorin

Maybe @glemaitre or @jeremiedbb have an input on this?

@glemaitre

To me it seems that using cost imbalance would break the calibration in probability. Am I right in thinking this?

Because cost-sensitive learning is a post-hoc operation, it will not affect the calibration of the predictive model. What will affect the calibration of the model are:

  • whether you are using a proper scoring rule: only such a loss will make sure that you get the best probability estimates
  • whether the loss also includes some regularization: then you are not only minimizing the proper scoring rule, and thus it will have an effect on the probability estimates (see the sketch after this list)
  • the number of samples: the theory is based on an infinite number of samples, which is not the case in practice, and thus it will have an impact on the estimates
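As a rough illustration of the regularization point (nothing specific to cost-sensitive learning; the data and penalty strengths are arbitrary), the same log-loss objective with different amounts of L2 regularization gives different probability estimates, which shows up in the Brier score on held-out data:

```python
# Rough illustration: same log-loss objective, different L2 strength,
# different probability estimates.
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import brier_score_loss
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=5000, n_informative=10, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

for C in (0.001, 1.0, 1000.0):  # smaller C = stronger regularization
    model = LogisticRegression(C=C, max_iter=1000).fit(X_train, y_train)
    proba = model.predict_proba(X_test)[:, 1]
    print(f"C={C}: Brier score = {brier_score_loss(y_test, proba):.4f}")
```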

So the above is a general discussion. We think that there are details to refine when it comes to tuning the hyperparameters, with something related to https://arxiv.org/pdf/2501.19195. In short, it might be better to tune a ranking metric and have an internal calibration as well (so "refine").

So to the larger question of calibrating with weights: if you are adding weights, then you calibrate the model on the weighted target probability rather than the original one. So if the aim is to get the true probability estimate from the original target, then you don't want to apply any weights. Weights could be useful if you applied some sampling in the process and you want to shift back to the original distribution.
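A sketch of that last point, under made-up assumptions (negatives undersampled at a 20% rate): weighting the calibration data by the inverse sampling rate lets the calibrator approximately see the original distribution again.

```python
# Hypothetical sketch: undersample negatives, then use importance weights
# (1 / sampling rate) during calibration to shift back to the original distribution.
import numpy as np
from sklearn.calibration import CalibratedClassifierCV
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=20000, weights=[0.95, 0.05], random_state=0)
X_train, X_cal, y_train, y_cal = train_test_split(X, y, stratify=y, random_state=0)

rng = np.random.default_rng(0)
sampling_rate = 0.2  # keep 20% of the negatives (naive undersampling, made up)

keep_train = (y_train == 1) | (rng.random(len(y_train)) < sampling_rate)
model = LogisticRegression(max_iter=1000).fit(X_train[keep_train], y_train[keep_train])

keep_cal = (y_cal == 1) | (rng.random(len(y_cal)) < sampling_rate)
X_cal_sub, y_cal_sub = X_cal[keep_cal], y_cal[keep_cal]

# Weight the surviving negatives so the calibrator approximately sees the
# original class distribution rather than the resampled one.
w_cal = np.where(y_cal_sub == 0, 1.0 / sampling_rate, 1.0)
calibrated = CalibratedClassifierCV(model, method="sigmoid", cv="prefit")
calibrated.fit(X_cal_sub, y_cal_sub, sample_weight=w_cal)

print("original base rate :", y.mean())
print("mean predicted prob:", calibrated.predict_proba(X_cal)[:, 1].mean())
```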

@lcrmorin

Because the cost-sensitive learning is a post-hoc operation, it will not affect the calibration of the predictive model.

But it changes what it is calibrated to? (Natural probabilities or weighted probabilities)

Thanks for the links @glemaitre (although it seems a bit like the Deep Learning community reinventing the wheel). In my field it is usual to split the modelling into a ranking step and a calibration step.

I think I can reformulate my question as: which step should be weighted?

  • It is not clear whether the ranking step is impacted by weighting, so one could drop the weights for that step.
  • It seems clearer that the calibration step would be impacted by weighting. But can the weighting be performed independently from the first step?
  • And if we need both natural and weighted probabilities, can we perform one ranking step and then two calibration steps?
@glemaitre

But it changes what it is calibrated to? (Natural probabilities or weighted probabilities)

No, because the post-hoc operation is just finding the cut-off point to go from probability estimates to a classification decision.

Since the original estimator is found by minimizing a proper scoring rule, you cannot get better probability estimates. Potentially, you can refine those estimates with an additional calibration step.

However, those steps come before any cost-sensitive learning, as presented in the example. In short, the perfectly calibrated classifier provides probability estimates, and we find the threshold that transforms them into classification decisions; thus it does not change the calibration of the estimator.
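As a sketch of this reading of cost-sensitive learning (the gain/cost values below are made up), only the decision threshold is tuned, e.g. with TunedThresholdClassifierCV (scikit-learn >= 1.5), while the probability estimates are left untouched:

```python
# Sketch: the probabilistic model stays as-is; only the threshold that turns
# probabilities into decisions is tuned against a (hypothetical) business metric.
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import confusion_matrix, make_scorer
from sklearn.model_selection import TunedThresholdClassifierCV

def business_gain(y_true, y_pred):
    # Hypothetical cost matrix: a missed positive costs 10, a false alarm costs 1.
    tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()
    return -(10 * fn + 1 * fp)

X, y = make_classification(n_samples=5000, weights=[0.9, 0.1], random_state=0)

tuned = TunedThresholdClassifierCV(
    LogisticRegression(max_iter=1000),
    scoring=make_scorer(business_gain),
    cv=5,
).fit(X, y)
print("tuned decision threshold:", tuned.best_threshold_)
```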

which step should be weighted?

Therefore, no step should be weighted if the aim is to stick to the "natural" probabilities. Weighting should only be used if, somewhere in the process, you do not minimize the original problem but an altered one (e.g. resampling).

@lcrmorin

Thanks @glemaitre for your answer.

I think I was confused by the post-hoc operation you mention, as you are discussing the binarisation threshold, while I was discussing the calibration step (typically isotonic regression or splines). I'd say we don't use the binarisation step that much in practice. The level of the threshold would mainly depend on the ranks (and change if the metric used is weighted by costs).

I generally need both weighted and unweighted metrics (probability and expected loss) and was wondering if there is an option to have only one model for ranking and two 'calibration steps', but I guess it is not the way to go. (The risk decomposition in the paper you mentioned would decompose a cost-weighted metric into two weighted terms.)
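To make that "one ranking model, two calibration steps" idea concrete (whether or not it is the way to go), a rough sketch with a single fitted classifier and two isotonic calibrators, one unweighted and one using a made-up cost of 10 for the positive class:

```python
# Sketch: one ranking model, two calibration mappings on top of its scores.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import HistGradientBoostingClassifier
from sklearn.isotonic import IsotonicRegression
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=10000, weights=[0.9, 0.1], random_state=0)
X_train, X_cal, y_train, y_cal = train_test_split(X, y, stratify=y, random_state=0)

ranker = HistGradientBoostingClassifier(random_state=0).fit(X_train, y_train)
scores_cal = ranker.predict_proba(X_cal)[:, 1]

# Natural probabilities: % of events.
cal_natural = IsotonicRegression(y_min=0, y_max=1, out_of_bounds="clip")
cal_natural.fit(scores_cal, y_cal)

# Cost-weighted "probabilities": closer to a share of expected loss.
cal_weighted = IsotonicRegression(y_min=0, y_max=1, out_of_bounds="clip")
cal_weighted.fit(scores_cal, y_cal, sample_weight=np.where(y_cal == 1, 10.0, 1.0))

scores_new = ranker.predict_proba(X_cal[:5])[:, 1]
print("natural :", cal_natural.predict(scores_new))
print("weighted:", cal_weighted.predict(scores_new))
```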

@glemaitre

OK, so to be sure I get it properly: here you refer to cost-sensitive learning as directly weighting the loss that is minimized by the estimator?

If that is the case, then I agree that your estimator is not calibrated with respect to the original target distribution.

If you recalibrate the model without weights, it should make the estimator predict in the original target distribution. If you pass weights during the calibration, then you are forcing your model to predict in the weighted target distribution, I guess. Then I'm wondering whether it is actually needed to recalibrate the model, because minimizing the cost-sensitive loss in the first step should have been enough. But it might be similar to the non-sensitive case: you can end up with a model that is not perfectly calibrated and still want to reduce the calibration loss.

So, in the end, if the aim is to predict in the reweighted target distribution, then I think I agree with you that you need to weight both the estimator loss and the calibration score.
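A small, hypothetical check of that consistency: if the target is the reweighted distribution, the calibration evaluation should use the same weights, e.g. a weighted Brier score.

```python
# Toy numbers only: compare the unweighted and cost-weighted Brier scores
# of the same probability estimates (cost ratio of 10 is made up).
import numpy as np
from sklearn.metrics import brier_score_loss

y_true = np.array([0, 0, 0, 1, 1])
proba = np.array([0.1, 0.2, 0.3, 0.6, 0.9])
costs = np.where(y_true == 1, 10.0, 1.0)

print("unweighted Brier:", brier_score_loss(y_true, proba))
print("weighted Brier  :", brier_score_loss(y_true, proba, sample_weight=costs))
```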

