Movatterモバイル変換


[0]ホーム

URL:


Skip to content

Navigation Menu

Sign in
Appearance settings

Search code, repositories, users, issues, pull requests...

Provide feedback

We read every piece of feedback, and take your input very seriously.

Saved searches

Use saved searches to filter your results more quickly

Sign up
Appearance settings

MNT Refactor get_metadata_routing method in MetadataRequest to directly use _get_metadata_request instead#31695

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to ourterms of service andprivacy statement. We’ll occasionally send you account related emails.

Already on GitHub?Sign in to your account

Open
StefanieSenger wants to merge9 commits intoscikit-learn:main
base:main
Choose a base branch
Loading
fromStefanieSenger:refactor_router_requester

Conversation

StefanieSenger
Copy link
Contributor

@StefanieSengerStefanieSenger commentedJul 3, 2025
edited
Loading

What does this implement/fix? Explain your changes.

This PR de-couples the double-handling ofMetadataRouter andMetadataRequest objects in metadata routing ( a bit), by not using theget_metadata_routing method inMetadataRequest anymore, but instead to directly use_get_metadata_request.

This change has affect in several parts of our metadata routing implementation and the tests and there simplifies the code and makes it more intuitive to read.

The goal of this PR is to increase maintainability.

While at it, I also:

  • re-named__metadata_request__* class attribute into__metadata_request__{method} and made clear to mention its a class attribute in our internal documentation to make it easier to distinguish it from the_metadata_request instance attribute present in consumers at first sight
  • made some other small improvements for the external or internal documentation in metadata routing

@github-actionsGitHub Actions
Copy link

github-actionsbot commentedJul 3, 2025
edited
Loading

✔️ Linting Passed

All linting checks passed. Your pull request is in excellent shape! ☀️

Generated for commit:20c5dcd. Link to the linter CI:here

@StefanieSengerStefanieSenger changed the titleRe-factorMNT Refactor double-handling of MetadataRouter and MetadataRequest objectsJul 3, 2025
@StefanieSengerStefanieSenger added Metadata Routingall issues related to metadata routing, slep006, sample props No Changelog Needed and removed module:metrics module:model_selection labelsJul 3, 2025
Comment on lines +996 to +1002
with pytest.warns(UserWarning, match="`transform` method which consumes metadata"):
rct = RoutingCustomTransformer(
estimator=CustomTransformer()
.set_fit_request(metadata=True)
.set_transform_request(metadata=True)
)
rct.fit_transform(X=[[1]], metadata="metadata")
Copy link
ContributorAuthor

@StefanieSengerStefanieSengerJul 4, 2025
edited
Loading

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

This test was added to make sure the warning is also shown when a transformer is not a pure consumer (but also a router). It refers to the change insklearn.base.TransformerMixin.fit_transform.

deftest_default_requests():
classOddEstimator(BaseEstimator):
deftest_class_level_requests():
classStubEstimator(BaseEstimator):
Copy link
ContributorAuthor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

I'm renamingOddEstimator because the old naming implies we are intentionally demonstrating wrong usage, but that's not our purpose here.

@@ -210,19 +210,19 @@ def test_request_type_is_valid(val, res):


@config_context(enable_metadata_routing=True)
def test_default_requests():
class OddEstimator(BaseEstimator):
def test_class_level_requests():
Copy link
ContributorAuthor

@StefanieSengerStefanieSengerJul 4, 2025
edited
Loading

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

I'm renamingtest_default_requests because the term "default" is not referring to default / auto routing here and could be mis-understood. What is meant are the class level requests.

@StefanieSenger
Copy link
ContributorAuthor

StefanieSenger commentedJul 5, 2025
edited
Loading

Log of my though process (that hopefully helps me move forwards):

The de-coupeling of_get_metadata_request for consumers andget_metadata_routing that was previously used in both consumers and routers and is now only used in routers is a separate issue than removing theadd_self_request in routers by using their_get_metadata_request method, that is still present in them.
I understand I should do it in the same PR nevertheless.

I wonder if we want to fully automate the setting of self requests, or if it should be integrated inMetadataRouter.add(). But if the latter, I wonder if it really makes things simpler, because we would force estimator developers to define a method_mapping in .add, that is not easily available for self requests (or only in a hacky way).
So I tend to fully automate, which means that I'd have to manipulateMetadataRouter.__init__(). I have hardly seen us add any functionality to any init method in scikit-learn, and I think for a good reason (predictability or so?). Still, I will make a suggestion to automate the setting of self requests for router, because otherwise, we force us and developers of custom meta-estimators to pass a complicated method mapping intoMetadataRouter.add()...

Since then I need to pass an object intoMetadataRouter.__init__(), so I can create a self request, I wonder if I should make the existingowner become an object type, instead of a string, or if I should add a new init param that would be the object. Maybe better to pass a new param, because then theowner string can still get a user-defined name?

But potentially, those could both be merged into the same param, since passingMetadataRouter(owner=self.__class__.__name__, owning_obj=self) seems a bit unnecessary.

@StefanieSenger
Copy link
ContributorAuthor

StefanieSenger commentedJul 5, 2025
edited
Loading

I understood that this PR cannot do much about the double-handling ofMetadataRouter andMetadataRequest objects (which I had thought it was intending in the beginning), because that depends on several other places and choices taken. But it does part of the job by taking care of certain methods. I will update the PR title to better fit that.

@StefanieSengerStefanieSenger changed the titleMNT Refactor double-handling of MetadataRouter and MetadataRequest objectsMNT Refactor get_metadata_routing method in MetadataRequest to directly use _get_metadata_request insteadJul 5, 2025
@StefanieSenger
Copy link
ContributorAuthor

I've experimented with my above idea to add a self-request automatically to routers, but it fails for routers that are pure routers but not also consumers. Given that, I no longer see a benefit in removing or refining theadd_self_request method fromMetadataRouter. We would still require some input from estimator developers on whether a self request should be added or not.

If that input is done by callingMetadataRouter().add_self_request(self) inget_metadata_routing() or through - in the simplest case, boolean paramater (likeadd_self_request=True) feels like an implementation detail.

Also, the original change of this PR, the de-tangling of_get_metadata_request andget_metadata_routing methods for consumers doesn't provide a new possibility here, becauseMetadataRouter().add_self_request had already internally usedobj._get_metadata_request().

At this point, I am tempted to think this refactoring is complete, unless there is some specific thing that we want to do withadd_self_request?

This PR makes the code more readable, because we can clearly see if we deal with a consumer based on the method called on an object. It also adds some improvements to the internal and the API docs and personally, it has given me more insights and more ideas for refactorings (which I will open separate PRs for).

What is left that I will go through our docs and examples to check whether the right method is used everywhere. I will do that and then un-draft this PR. Do you consider it complete,@adrinjalali?

@@ -46,15 +45,20 @@
MetadataRouter
~~~~~~~~~~~~~~

The ``MetadataRouter`` objects are constructed via a `get_metadata_routing` method,
Copy link
ContributorAuthor

@StefanieSengerStefanieSengerJul 5, 2025
edited
Loading

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

TheMetadataRouter objects are constructed via aget_metadata_routing method

@adrinjalali, Do you think it makes sense to renameget_metadata_routing intoget_metadata_router?
That would be more precise and we spare people to wonder about whether there is a difference between "metadata router" and "metadata routing". In fact, this method returns a MetadataRouter object.

That would be for a new PR and could be done without deprecation (because we are still in experimental stage), I believe?

@StefanieSengerStefanieSenger marked this pull request as ready for reviewJuly 5, 2025 17:45
Comment on lines -548 to -550
Consumer-only classes such as simple estimators return a serialized
version of this class as the output of `get_metadata_routing()`.

Copy link
ContributorAuthor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

That's not true (anymore).

get_routing_for_object(),_get_metadata_request() andget_metadata_routing() all return unserialised output.

Sign up for freeto join this conversation on GitHub. Already have an account?Sign in to comment
Reviewers
No reviews
Assignees
No one assigned
Labels
Metadata Routingall issues related to metadata routing, slep006, sample propsmodule:utilsNo Changelog Needed
Projects
None yet
Milestone
No milestone
Development

Successfully merging this pull request may close these issues.

1 participant
@StefanieSenger

[8]ページ先頭

©2009-2025 Movatter.jp