Addition of More Pooling Methods #2048


Open
billpsomas wants to merge 1 commit into huggingface:main from billpsomas:poolings

Conversation

billpsomas

Hi there!

I am the author of the ICCV 2023 paper titled "Keep It SimPool: Who Said Supervised Transformers Suffer from Attention Deficit?", which focuses on benchmarking pooling techniques for CNNs and Transformers. It also introduces a new, simple, attention-based pooling mechanism with great localization properties.

In this pull request, I have implemented and rigorously tested the following pooling methods:

  • Generalized Max Pooling
  • LSE Pooling (a minimal sketch of this one follows the list)
  • HOW Pooling
  • Slot Attention (Pooling)
  • SimPool
  • ViT Pooling
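
For reference, LSE (log-sum-exp) pooling is the simplest of these to sketch. The snippet below only illustrates the operation and is not the PR's actual implementation; the class name LSEPool2d and the default temperature r are assumptions.

```python
import math
import torch
import torch.nn as nn

class LSEPool2d(nn.Module):
    """Log-sum-exp pooling over spatial positions (illustrative sketch).

    Smoothly interpolates between average pooling (r -> 0) and
    max pooling (r -> inf) via the temperature r.
    """
    def __init__(self, r: float = 10.0):
        super().__init__()
        self.r = r

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (B, C, H, W) -> (B, C)
        b, c, h, w = x.shape
        x = x.reshape(b, c, h * w)
        # (1/r) * log(mean(exp(r * x))), computed stably with logsumexp
        return (torch.logsumexp(self.r * x, dim=-1) - math.log(h * w)) / self.r
```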

I believe these additions will be beneficial to the library, offering users cutting-edge options for pooling in their models. These methods have shown promising results in my research and experiments, and I am excited about their potential impact on a wider range of applications.

I am looking forward to your feedback and am happy to make any further adjustments as needed.
Thank you for considering this contribution to the library.

Cheers :)

@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

@rwightman
Collaborator

@billpsomas not ignoring this, just a lot to digest and consider re integrating and testing such pooling layers in a sensible fashion...

Without diving into the details and mechanics of the integration and testing challenges, some nits on style and attribution: I recognize some of the styles here, I see some lucidrains, I see some code that looks very Microsoft-like, and others. I'd like to have attribution as to where bits and pieces here came from in the file / class level docstrings...

And then, I'd like to have the style unified: make the lucidrains-style to_q etc. -> q, the MS-like query_proj -> q (or merge into a single qkv projection if they are all the same dim), etc.

Some of the above might be a bit of work, so don't jump in right away. It looks like you've integrated these layers into the pooling factory in your own timm fork; does that work well? Have they all been tested? Any comparative results combining these layers with classic CNNs like ResNets, or ViT / ViT-hybrids?
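
To illustrate the suggested unification, a fused qkv projection in place of separate to_q / to_k / to_v (or query_proj / key_proj / value_proj) layers would look roughly like the sketch below; the module name and single-head layout are hypothetical, not code from this PR.

```python
import torch
import torch.nn as nn

class FusedQKVAttention(nn.Module):
    """Single-head attention with one fused qkv projection (illustrative only)."""
    def __init__(self, dim: int):
        super().__init__()
        # One linear layer replaces separate q/k/v projections of the same dim
        self.qkv = nn.Linear(dim, dim * 3)
        self.scale = dim ** -0.5

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (B, N, C); split the fused projection back into q, k, v
        q, k, v = self.qkv(x).chunk(3, dim=-1)
        attn = (q @ k.transpose(-2, -1)) * self.scale
        return attn.softmax(dim=-1) @ v
```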

@fffffgggg54 mentioned this pull request Dec 18, 2023
@billpsomas
Author

Hello Ross and sorry for the late reply,

I've indeed integrated everything into my own timm fork and have run a lot of experiments for my paper. You can find experimental results using ResNet-18 in Figure 3 of the paper (https://arxiv.org/pdf/2309.06891.pdf). I've also tested some of the methods with ResNet-50 and ViT-S, and everything worked well for me. In Figure 3 you will notice that even more pooling methods are included; I have also integrated these into my timm fork but did not add them here. Maybe in another PR.

Now, about attribution as to where the code came from:

GMP: https://github.com/VChristlein/dgmp/blob/master/dgmp.py
LSE: custom implementation
HOW: https://github.com/gtolias/how/tree/master/how/layers
Slot Attention: modified https://github.com/evelinehong/slot-attention-pytorch/blob/master/model.py
SimPool: custom implementation
ViT: https://github.com/sooftware/speech-transformer/blob/master/speech_transformer/attention.py

I know this is not a priority, but it would be nice to have some extra pooling options in the library.

In my case, I modified https://github.com/huggingface/pytorch-image-models/blob/main/timm/layers/adaptive_avgmax_pool.py#L124-L161 so that you can select the pooling of your choice through the pool_type argument.
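
As a rough illustration of that kind of dispatch (hypothetical names, heavily simplified compared to the real adaptive_avgmax_pool.py):

```python
import torch.nn as nn

def select_pool(pool_type: str) -> nn.Module:
    """Hypothetical factory mapping a pool_type string to a pooling module."""
    if pool_type == 'avg':
        return nn.AdaptiveAvgPool2d(1)
    if pool_type == 'max':
        return nn.AdaptiveMaxPool2d(1)
    # branches for 'gmp', 'lse', 'how', 'slot', 'simpool', 'vit' would go here
    raise ValueError(f'Unknown pool_type: {pool_type}')
```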

Cheers!

