Addition of More Pooling Methods #2048


Open
billpsomas wants to merge 1 commit into huggingface:main from billpsomas:poolings

Conversation

billpsomas

Hi there!

I am the author of the ICCV 2023 paper titled "Keep It SimPool: Who Said Supervised Transformers Suffer from Attention Deficit?", which focuses on benchmarking pooling techniques for CNNs and Transformers. It also introduces a new, simple, attention-based pooling mechanism with great localization properties.

In this pull request, I have implemented and rigorously tested the following pooling methods:

  • Generalized Max Pooling
  • LSE Pooling (a minimal sketch of this one follows the list)
  • HOW Pooling
  • Slot Attention (Pooling)
  • SimPool
  • ViT Pooling
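
For reference, LSE (log-sum-exp) pooling is the simplest of these to sketch. The snippet below only illustrates the operation and is not the PR's actual implementation; the class name LSEPool2d and the default temperature r are assumptions.

```python
import math
import torch
import torch.nn as nn

class LSEPool2d(nn.Module):
    """Log-sum-exp pooling over spatial positions (illustrative sketch).

    Smoothly interpolates between average pooling (r -> 0) and
    max pooling (r -> inf) via the temperature r.
    """
    def __init__(self, r: float = 10.0):
        super().__init__()
        self.r = r

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (B, C, H, W) -> (B, C)
        b, c, h, w = x.shape
        x = x.reshape(b, c, h * w)
        # (1/r) * log(mean(exp(r * x))), computed stably with logsumexp
        return (torch.logsumexp(self.r * x, dim=-1) - math.log(h * w)) / self.r
```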

I believe these additions will be beneficial to the library, offering users cutting-edge options for pooling in their models. These methods have shown promising results in my research and experiments, and I am excited about their potential impact on a wider range of applications.

I am looking forward to your feedback and am happy to make any further adjustments as needed.
Thank you for considering this contribution to the library.

Cheers :)

@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

@rwightman
Collaborator

@billpsomas not ignoring this, just a lot to digest and consider re integrating and testing such pooling layers in a sensible fashion...

Without diving into the details and mechanics of the integration and testing challenges, some nits on style and attribution: I recognize some of the styles here, I see some lucidrains, I see some code that looks very Microsoft-like, and others. I'd like to have attribution as to where bits and pieces here came from in the file / class level docstrings...

And then, I'd like to have the style unified: make the lucidrains-style to_q etc. -> q, the MS-like query_proj -> q (or merge into a single qkv projection if they are all the same dim), etc.

Some of the above might be a bit of work, so don't jump in right away. It looks like you've integrated these layers into the pooling factory in your own timm fork; does that work well? Have they all been tested? Any comparative results combining these layers with classic CNNs like ResNets, or ViT / ViT-hybrids?
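
To illustrate the suggested unification, a fused qkv projection in place of separate to_q / to_k / to_v (or query_proj / key_proj / value_proj) layers would look roughly like the sketch below; the module name and single-head layout are hypothetical, not code from this PR.

```python
import torch
import torch.nn as nn

class FusedQKVAttention(nn.Module):
    """Single-head attention with one fused qkv projection (illustrative only)."""
    def __init__(self, dim: int):
        super().__init__()
        # One linear layer replaces separate q/k/v projections of the same dim
        self.qkv = nn.Linear(dim, dim * 3)
        self.scale = dim ** -0.5

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (B, N, C); split the fused projection back into q, k, v
        q, k, v = self.qkv(x).chunk(3, dim=-1)
        attn = (q @ k.transpose(-2, -1)) * self.scale
        return attn.softmax(dim=-1) @ v
```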

@fffffgggg54 mentioned this pull request Dec 18, 2023
@billpsomas
Author

Hello Ross and sorry for the late reply,

I've indeed integrated everything into my own timm fork and have run a lot of experiments for my paper. You can find experimental results using ResNet-18 in Figure 3 of the paper (https://arxiv.org/pdf/2309.06891.pdf). I've also tested some of the methods with ResNet-50 and ViT-S, and everything worked well for me. In Figure 3 you will notice that even more pooling methods are included; I have also integrated these into my timm fork but did not add them here. Maybe in another PR.

Now, about attribution as to where the code came from:

GMP: https://github.com/VChristlein/dgmp/blob/master/dgmp.py
LSE: custom implementation
HOW: https://github.com/gtolias/how/tree/master/how/layers
Slot Attention: modified https://github.com/evelinehong/slot-attention-pytorch/blob/master/model.py
SimPool: custom implementation
ViT: https://github.com/sooftware/speech-transformer/blob/master/speech_transformer/attention.py

I know this is not a priority, but it would be nice to have some extra pooling options in the library.

In my case, I modified https://github.com/huggingface/pytorch-image-models/blob/main/timm/layers/adaptive_avgmax_pool.py#L124-L161 so that you can select the pooling of your choice through the pool_type argument.
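
As a rough illustration of that kind of dispatch (hypothetical names, heavily simplified compared to the real adaptive_avgmax_pool.py):

```python
import torch.nn as nn

def select_pool(pool_type: str) -> nn.Module:
    """Hypothetical factory mapping a pool_type string to a pooling module."""
    if pool_type == 'avg':
        return nn.AdaptiveAvgPool2d(1)
    if pool_type == 'max':
        return nn.AdaptiveMaxPool2d(1)
    # branches for 'gmp', 'lse', 'how', 'slot', 'simpool', 'vit' would go here
    raise ValueError(f'Unknown pool_type: {pool_type}')
```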

Cheers!

