Addition of More Pooling Methods #2048
Conversation
HuggingFaceDocBuilderDev commented Dec 7, 2023
The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.
@billpsomas not ignoring this, just a lot to digest and consider re integrating and testing such pooling layers in a sensible fashion...

Without diving into the details and mechanics of the integration and testing challenges, some nits on style and attribution. I recognize some of the styles here: I see some lucidrains, I see some code that looks very Microsoft-like, and others. I'd like to have attribution as to where the bits and pieces came from in the file / class level docstrings. And then, I'd like to have the style unified: the lucid-style `to_q` etc. -> `q`, the MS-like `query_proj` -> `q` (or merged into a single `qkv` if they are all the same dim), etc., along the lines of the sketch below. Some of the above might be a bit of work, so don't jump in right away.

It looks like you've integrated these layers into the pooling factory in your own timm fork; does that work well? Have they all been tested? Any comparative results combining these layers with classic CNNs like ResNets, or ViT / ViT-hybrids?
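For concreteness, the unified naming might look something like the following hypothetical sketch of an attention pooling head with a fused `qkv` projection (illustrative only, not code from this PR):

```python
import torch
import torch.nn as nn

class AttentionPool(nn.Module):
    """Illustrative naming convention only: short, fused projection names
    (qkv, proj) instead of to_q / to_k / to_v or query_proj / key_proj."""

    def __init__(self, dim: int, num_heads: int = 8):
        super().__init__()
        assert dim % num_heads == 0
        self.num_heads = num_heads
        self.scale = (dim // num_heads) ** -0.5
        self.qkv = nn.Linear(dim, dim * 3)  # fused q/k/v, all same dim
        self.proj = nn.Linear(dim, dim)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (B, N, C) token sequence -> (B, C) pooled output
        B, N, C = x.shape
        qkv = self.qkv(x).reshape(B, N, 3, self.num_heads, C // self.num_heads)
        q, k, v = qkv.permute(2, 0, 3, 1, 4).unbind(0)  # each (B, heads, N, head_dim)
        attn = (q @ k.transpose(-2, -1) * self.scale).softmax(dim=-1)
        x = (attn @ v).transpose(1, 2).reshape(B, N, C)
        return self.proj(x).mean(dim=1)  # simple token average after attention
```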
Hello Ross, and sorry for the late reply. I've indeed integrated everything into my own timm fork and have run a lot of experiments for my paper. You can find experimental results using ResNet-18 in Figure 3 of the paper (https://arxiv.org/pdf/2309.06891.pdf). I've also tested some of them with ResNet-50 and ViT-S. For me, everything worked well.

In Figure 3 you will notice that even more poolings are included. I have also integrated these into my timm fork, but did not add them here. Maybe in another PR.

Now, about attribution as to where the code came from:

GMP: https://github.com/VChristlein/dgmp/blob/master/dgmp.py

I know this is not a priority, but it would be nice to have some extra poolings in the library. In my case, I modified https://github.com/huggingface/pytorch-image-models/blob/main/timm/layers/adaptive_avgmax_pool.py#L124-L161 so that you can pass the pooling of your choice through the `pool_type` argument, along the lines of the sketch below. Cheers!
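Roughly, the idea is a string-keyed dispatch like the following simplified sketch (not timm's actual factory code; the `'gmp'` and `'simpool'` entries stand in for the layers proposed in this PR and are commented out so the snippet runs standalone):

```python
import torch.nn as nn

# Simplified sketch of a pool_type-keyed factory, in the spirit of timm's
# SelectAdaptivePool2d. The 'gmp' / 'simpool' entries are placeholders for
# the new layers in this PR.
_POOL_LAYERS = {
    'avg': lambda output_size=1, **kw: nn.AdaptiveAvgPool2d(output_size),
    'max': lambda output_size=1, **kw: nn.AdaptiveMaxPool2d(output_size),
    # 'gmp': GMP,          # generalized max pooling (this PR)
    # 'simpool': SimPool,  # attention-based pooling (this PR)
}

def create_pool2d(pool_type: str = 'avg', **kwargs) -> nn.Module:
    if pool_type not in _POOL_LAYERS:
        raise ValueError(f'Unknown pool_type: {pool_type}')
    return _POOL_LAYERS[pool_type](**kwargs)
```

A model head would then pick its pooling with, e.g., `create_pool2d('avg')` or, once registered, `create_pool2d('simpool', ...)`.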
Hi there!
I am the author of the ICCV 2023 paper titled "Keep It SimPool: Who Said Supervised Transformers Suffer from Attention Deficit?", which focuses on benchmarking pooling techniques for CNNs and Transformers. It also introduces a new, simple, attention-based pooling mechanism with great localization properties.
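At a high level, SimPool is a single cross-attention step in which a global-average-pooled query attends over the feature tokens. The sketch below conveys the idea only and omits details from the paper (e.g. the exact normalization scheme and the γ exponent applied to CNN features); see the paper for the precise formulation:

```python
import torch
import torch.nn as nn

class SimPoolSketch(nn.Module):
    """Rough sketch of attention-based pooling in the spirit of SimPool:
    a GAP-initialized query attends over the tokens, and the attention
    weights aggregate the raw (unprojected) features."""

    def __init__(self, dim: int):
        super().__init__()
        self.norm_q = nn.LayerNorm(dim)
        self.norm_k = nn.LayerNorm(dim)
        self.q = nn.Linear(dim, dim, bias=False)
        self.k = nn.Linear(dim, dim, bias=False)
        self.scale = dim ** -0.5

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (B, N, C) tokens -> (B, C) pooled representation
        q = self.q(self.norm_q(x.mean(dim=1, keepdim=True)))  # (B, 1, C) GAP query
        k = self.k(self.norm_k(x))                            # (B, N, C)
        attn = (q @ k.transpose(-2, -1) * self.scale).softmax(dim=-1)  # (B, 1, N)
        return (attn @ x).squeeze(1)  # values are the raw tokens, no value projection
```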
In this pull request, I have implemented and rigorously tested several additional pooling methods, among them GMP (generalized max pooling, attributed above) and the paper's SimPool; a sketch of GMP follows below.
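As one example, generalized max pooling replaces the plain average with a ridge-regularized linear solve so that every local descriptor contributes equally to the pooled vector. A minimal sketch following the dgmp reference linked above (the default λ and its use as a fixed scalar are assumptions for illustration):

```python
import torch
import torch.nn as nn

class GeneralizedMaxPool2d(nn.Module):
    """Minimal GMP sketch: solve for weights alpha such that the pooled
    descriptor has (regularized) unit similarity to every local feature."""

    def __init__(self, lamb: float = 1e3):
        super().__init__()
        self.lamb = lamb  # regularization strength (assumed default)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (B, C, H, W) feature map -> (B, C) pooled descriptor
        B, C, H, W = x.shape
        phi = x.flatten(2)                       # (B, C, N) local descriptors
        K = phi.transpose(1, 2) @ phi            # (B, N, N) Gram matrix
        eye = torch.eye(K.shape[-1], device=x.device, dtype=x.dtype)
        ones = x.new_ones(B, K.shape[-1], 1)
        # alpha solves (K + lamb * I) alpha = 1 (batched)
        alpha = torch.linalg.solve(K + self.lamb * eye, ones)
        return (phi @ alpha).squeeze(-1)         # (B, C)
```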
I believe these additions will benefit the library, offering users cutting-edge pooling options for their models. These methods have shown promising results in my research and experiments, and I am excited about their potential impact on a wider range of applications.
I am looking forward to your feedback and am happy to make any further adjustments as needed.
Thank you for considering this contribution to the library.
Cheers :)