FIX metadata routing for scorer #31654


Open

tomMoral wants to merge 7 commits into scikit-learn:main from tomMoral:FIX_scorer_metadata_routing

Conversation

tomMoral (Contributor) commented Jun 25, 2025 (edited)

Fixes #27977

This PR adds metadata routing to the estimator being scored when a scorer is called.
One thing that is not easy to understand is that the routing needs to be dynamic here, as it depends on the estimator that is being scored.
For now it is implemented for a single scorer; I still have to see how to do this for multiple scorers...

TODO

  • Add a test
  • Check that this works for multiple scorers

github-actions (bot) commented Jun 25, 2025 (edited)

❌ Linting issues

This PR is introducing linting issues. Here's a summary of the issues. Note that you can avoid having linting issues by enabling pre-commit hooks. Instructions to enable them can be found here.

You can see the details of the linting issues under the lint job here


ruff check

ruff detected issues. Please run `ruff check --fix --output-format=full` locally, fix the remaining issues, and push the changes. Here you can see the detected issues. Note that the installed ruff version is ruff=0.11.7.

```
sklearn/metrics/_scorer.py:41:5: F401 [*] `..utils.metadata_routing.process_routing` imported but unused
   |
39 |     _routing_enabled,
40 |     get_routing_for_object,
41 |     process_routing,
   |     ^^^^^^^^^^^^^^^ F401
42 | )
43 | from ..utils.validation import _check_response_method
   |
   = help: Remove unused import: `..utils.metadata_routing.process_routing`

sklearn/metrics/_scorer.py:616:89: E501 Line too long (91 > 88)
    |
614 |         return Bunch(
615 |             score={
616 |                 k: kwargs[k] for k in self.get_metadata_routing().consumes("score", kwargs)
    |                                                                                         ^^^ E501
617 |             }
618 |         )
    |

Found 2 errors.
[*] 1 fixable with the `--fix` option.
```

ruff format

ruff detected issues. Please run `ruff format` locally and push the changes. Here you can see the detected issues. Note that the installed ruff version is ruff=0.11.7.

```diff
--- sklearn/metrics/_scorer.py
+++ sklearn/metrics/_scorer.py
@@ -551,9 +551,7 @@
         self._metadata_request = MetadataRequest(owner=self.__class__.__name__)
         estimator_request = getattr(
-            estimator,
-            "_metadata_request",
-            estimator._get_default_requests()
+            estimator, "_metadata_request", estimator._get_default_requests()
         )
         if hasattr(estimator_request, "score"):
             self._metadata_request.score = copy.deepcopy(estimator_request.score)
@@ -613,7 +611,8 @@
     def _route_scorer_metadata(self, estimator, kwargs):
         return Bunch(
             score={
-                k: kwargs[k] for k in self.get_metadata_routing().consumes("score", kwargs)
+                k: kwargs[k]
+                for k in self.get_metadata_routing().consumes("score", kwargs)
             }
         )

--- sklearn/model_selection/_validation.py
+++ sklearn/model_selection/_validation.py
@@ -363,16 +363,20 @@
                 estimator=estimator,
                 # TODO(SLEP6): also pass metadata to the predict method for
                 # scoring?
-                method_mapping=MethodMapping().add(caller="fit", callee="fit")
+                method_mapping=MethodMapping().add(caller="fit", callee="fit"),
             )
         )
         try:
             routed_params = process_routing(router, "fit", **params)
             scorer_router = scorers._route_scorer_metadata(estimator, params)
-            routed_params.scorer = Bunch(score={
-                k: v for step in scorer_router.values()
-                for params in step.values() for k, v in params.items()
-            })
+            routed_params.scorer = Bunch(
+                score={
+                    k: v
+                    for step in scorer_router.values()
+                    for params in step.values()
+                    for k, v in params.items()
+                }
+            )
         except UnsetMetadataPassedError as e:
             # The default exception would mention `fit` since in the above
             # `process_routing` code, we pass `fit` as the caller. However,
```

2 files would be reformatted, 922 files already formatted

Generated for commit: db99d2c. Link to the linter CI: here

tomMoral (Contributor, Author) commented:

I'm fighting a bit with sample_weight, which is a complicated edge case 😓
I think better documenting the expected behavior would help.

glemaitre self-requested a review June 26, 2025 15:36
StefanieSenger (Contributor) left a comment (edited)


Hi @tomMoral, thanks for fixing this complicated bug. I was looking into this a bit and I have a few comments that mainly deal with the case the new test is written for (scoring done via the default scoring and the estimator's predict method). I have only touched the surface and am not sure how to handle this on a more structural level (the others will be able to help you better). I still hope it's a bit useful and helps move this forward. Otherwise feel free to ignore. :)

Maybe this PR could deal only with scoring a single metric, and multimetric scoring could be a separate PR? I think it's easier to implement and review if separated.

```
@@ -396,17 +429,57 @@ def _score(self, method_caller, estimator, X, y_true, **kwargs):
            )

        pos_label = None if is_regressor(estimator) else self._get_pos_label()
        response_method = _check_response_method(estimator, self._response_method)
        routed_params = self._route_scorer_metadata(estimator, kwargs)
```
StefanieSenger (Contributor) commented Jul 11, 2025 (edited)

self._route_scorer_metadata(estimator, kwargs) returns the metadata that should be routed to the estimator's response method via routed_params.estimator[method_name] (which makes it possible to calculate y_pred taking the metadata into account).

But it doesn't add the metadata (passed into score() in test_scorer_metadata_routing) to the scoring via routed_params.score.score.
I think that's because _Scorer._score_func (such as accuracy_score()) does not expect a param called "metadata" (only "sample_weight", maybe). A param called "metadata" gets filtered out.

I think we cannot test a param "metadata" with scikit-learn's metrics, because they don't know what that is.

Comment on lines +464 to +467

```python
.add(
    score=self,
    method_mapping=MethodMapping().add(caller="score", callee="score"),
)
```
StefanieSenger (Contributor) commented Jul 11, 2025 (edited)

This reminds me of how we create a MetadataRouter for splitters, like cross_validate, where we pretend the splitter function is a caller named fit, and then route to the callees split and score through the router.

In general, for adding a request to consume metadata by a method of the present instance (self), there is the MetadataRouter.add_self_request(self) method. In this case here, we route to self._score_func.
I wonder, and am a bit confused about, whether that would be a self-request or not, and whether it matters.

```
@@ -1392,6 +1393,31 @@ def score2(y_true, y_pred, common_arg=None, sample_weight=None):
    multi_scorer(clf, X, y, common_arg=1, sample_weight=1)


@config_context(enable_metadata_routing=True)
def test_scorer_metadata_routing():
```
StefanieSenger (Contributor) commented:

For testing whether metadata is actually routed and used, we have used test classes from the sklearn/tests/metadata_routing_common.py module in the past. These accept a _Registry() that is used to store routed metadata.

See sklearn/compose/tests/test_column_transformer.test_metadata_routing_for_column_transformer for an example.

Reviewers

StefanieSenger left review comments

glemaitre: awaiting requested review

At least 1 approving review is required to merge this pull request.

Assignees
No one assigned
Projects
None yet
Milestone
No milestone
Development

Successfully merging this pull request may close these issues.

Routing metadata to the response_method used by a scorer
2 participants
@tomMoral, @StefanieSenger
