- Quan Cui,
- Bingchen Zhao,
- Zhao-Min Chen,
- Borui Zhao,
- Renjie Song,
- Boyan Zhou,
- Jiajun Liang &
- Osamu Yoshie
Part of the book series: Lecture Notes in Computer Science (LNCS, volume 13686)
Included in the following conference series: European Conference on Computer Vision (ECCV)
Abstract
This work simultaneously considers the discriminability and transferability properties of deep representations in the typical supervised learning task, i.e., image classification. Through a comprehensive temporal analysis, we observe a trade-off between these two properties: discriminability keeps increasing as training progresses, while transferability diminishes sharply in the later training period. From the perspective of information-bottleneck theory, we reveal that the incompatibility between discriminability and transferability is attributed to the over-compression of input information. More importantly, we investigate why and how the InfoNCE loss can alleviate the over-compression, and further present a learning framework, named contrastive temporal coding (CTC), to counteract the over-compression and alleviate the incompatibility. Extensive experiments validate that CTC successfully mitigates the incompatibility, yielding discriminative and transferable representations. Noticeable improvements are achieved on the image classification task and on challenging transfer learning tasks. We hope that this work will raise awareness of the significance of the transferability property in the conventional supervised learning setting.
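The abstract centers on the InfoNCE loss as the tool that counteracts over-compression. As a hedged illustration of the loss itself (not the paper's CTC framework), the sketch below computes InfoNCE for a single anchor with one positive and a set of negatives, using cosine similarity and a temperature, all in plain Python; the toy vectors and the `temperature=0.1` default are assumptions for the example.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv)

def info_nce(anchor, positive, negatives, temperature=0.1):
    """InfoNCE loss for one anchor: negative log-softmax score of the
    positive among the positive and all negatives."""
    logits = [cosine(anchor, positive) / temperature]
    logits += [cosine(anchor, n) / temperature for n in negatives]
    m = max(logits)  # subtract the max for numerical stability
    log_denom = m + math.log(sum(math.exp(l - m) for l in logits))
    return -(logits[0] - log_denom)

# Toy check: a positive aligned with the anchor yields a small loss,
# a positive orthogonal to it (with an aligned negative) a large one.
anchor = [1.0, 0.0]
loss_easy = info_nce(anchor, [1.0, 0.0], [[0.0, 1.0], [-1.0, 0.0]])
loss_hard = info_nce(anchor, [0.0, 1.0], [[1.0, 0.0], [-1.0, 0.0]])
```

In practice this loss is computed over a batch with learned representations; the pairwise toy version above only shows the shape of the objective.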
Q. Cui and B. Zhao contributed equally.
Notes
- 1.
Representations refer to the outputs of the backbone, which are processed with global average pooling in popular models [17].
- 2.
We use the Mutual Information Neural Estimation (MINE) [2] method to calculate the mutual information between continuous variables.
- 3.
Proofs are provided in Appendix A.1.
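Note 2 states that MINE [2] is used to estimate mutual information between continuous variables; MINE does this by training a critic network, which is beyond a short sketch. As a hedged stand-in that illustrates the quantity being estimated, the snippet below computes exact empirical mutual information for discrete toy variables from their joint and marginal frequencies; the toy samples are assumptions for the example.

```python
import math
from collections import Counter

def mutual_information(pairs):
    """Exact mutual information I(X; Y) in nats for a list of (x, y)
    samples, using empirical joint and marginal distributions."""
    n = len(pairs)
    pxy = Counter(pairs)                # joint counts over (x, y)
    px = Counter(x for x, _ in pairs)   # marginal counts over x
    py = Counter(y for _, y in pairs)   # marginal counts over y
    mi = 0.0
    for (x, y), c in pxy.items():
        p_joint = c / n
        # p_joint * log(p_joint / (p(x) * p(y)))
        mi += p_joint * math.log(p_joint * n * n / (px[x] * py[y]))
    return mi

# Perfectly correlated binary variables: I(X; Y) = H(X) = log 2.
dep = [(0, 0), (1, 1)] * 50
# Independent binary variables: I(X; Y) = 0.
ind = [(0, 0), (0, 1), (1, 0), (1, 1)] * 25
```

MINE generalizes this idea to continuous variables by maximizing a neural lower bound on the same quantity, which is why it is needed for the representation-level measurements in the paper.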
References
Alemi, A.A., Fischer, I., Dillon, J.V., Murphy, K.: Deep variational information bottleneck. In: ICLR (2017)
Belghazi, M.I., et al.: MINE: mutual information neural estimation. arXiv:1801.04062 (2018)
Chen, C., Zheng, Z., Ding, X., Huang, Y., Dou, Q.: Harmonizing transferability and discriminability for adapting object detectors. In: CVPR (2020)
Chen, T., Kornblith, S., Norouzi, M., Hinton, G.: A simple framework for contrastive learning of visual representations. In: ICML (2020)
Chen, T., Kornblith, S., Swersky, K., Norouzi, M., Hinton, G.: Big self-supervised models are strong semi-supervised learners. In: NeurIPS (2020)
Chen, X., Fan, H., Girshick, R., He, K.: Improved baselines with momentum contrastive learning. arXiv:2003.04297 (2020)
Chen, X., Wang, S., Long, M., Wang, J.: Transferability vs. discriminability: batch spectral penalization for adversarial domain adaptation. In: ICML (2019)
Coates, A., Ng, A., Lee, H.: An analysis of single-layer networks in unsupervised feature learning. In: AISTATS (2011)
Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: AutoAugment: learning augmentation strategies from data. In: CVPR (2019)
Darlow, L.N., Crowley, E.J., Antoniou, A., Storkey, A.J.: CINIC-10 is not ImageNet or CIFAR-10. arXiv:1810.03505 (2018)
Feng, Y., Jiang, J., Tang, M., Jin, R., Gao, Y.: Rethinking supervised pre-training for better downstream transferring. In: ICLR (2022)
Fu, J., Zheng, H., Mei, T.: Look closer to see better: recurrent attention convolutional neural network for fine-grained image recognition. In: CVPR (2017)
Furlanello, T., Lipton, Z., Tschannen, M., Itti, L., Anandkumar, A.: Born again neural networks. In: ICML (2018)
He, K., Fan, H., Wu, Y., Xie, S., Girshick, R.: Momentum contrast for unsupervised visual representation learning. In: CVPR (2020)
He, K., Girshick, R., Dollár, P.: Rethinking ImageNet pre-training. In: ICCV (2019)
He, K., Gkioxari, G., Dollár, P., Girshick, R.: Mask R-CNN. In: ICCV (2017)
He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. In: CVPR (2016)
Hinton, G., Vinyals, O., Dean, J.: Distilling the knowledge in a neural network. arXiv:1503.02531 (2015)
Hjelm, R.D., et al.: Learning deep representations by mutual information estimation and maximization. In: ICLR (2019)
Hu, J., Shen, L., Sun, G.: Squeeze-and-excitation networks. In: CVPR (2018)
Kang, B., et al.: Decoupling representation and classifier for long-tailed recognition. In: ICLR (2020)
Khosla, P., et al.: Supervised contrastive learning. arXiv:2004.11362 (2020)
Kornblith, S., Chen, T., Lee, H., Norouzi, M.: Why do better loss functions lead to less transferable features? In: NeurIPS (2021)
Kornblith, S., Shlens, J., Le, Q.V.: Do better ImageNet models transfer better? In: CVPR (2019)
Krizhevsky, A., Hinton, G., et al.: Learning multiple layers of features from tiny images (2009)
Lin, T.Y., Dollár, P., Girshick, R., He, K., Hariharan, B., Belongie, S.: Feature pyramid networks for object detection. In: CVPR (2017)
Lin, T.Y., et al.: Microsoft COCO: common objects in context. In: ECCV (2014)
Long, J., Shelhamer, E., Darrell, T.: Fully convolutional networks for semantic segmentation. In: CVPR (2015)
Long, M., Cao, Y., Wang, J., Jordan, M.: Learning transferable features with deep adaptation networks. In: ICML (2015)
Maji, S., Rahtu, E., Kannala, J., Blaschko, M., Vedaldi, A.: Fine-grained visual classification of aircraft. arXiv:1306.5151 (2013)
Mao, H., Chen, X., Fu, Q., Du, L., Han, S., Zhang, D.: Neuron campaign for initialization guided by information bottleneck theory. In: CIKM (2021)
Oord, A.V.D., Li, Y., Vinyals, O.: Representation learning with contrastive predictive coding. arXiv:1807.03748 (2018)
Park, T., Efros, A.A., Zhang, R., Zhu, J.Y.: Contrastive learning for unpaired image-to-image translation. In: ECCV (2020)
Ren, S., He, K., Girshick, R., Sun, J.: Faster R-CNN: towards real-time object detection with region proposal networks. In: NeurIPS (2015)
Russakovsky, O., et al.: ImageNet large scale visual recognition challenge. In: IJCV (2015)
Sariyildiz, M.B., Kalantidis, Y., Larlus, D., Alahari, K.: Concept generalization in visual representation learning. In: ICCV (2021)
Schroff, F., Kalenichenko, D., Philbin, J.: FaceNet: a unified embedding for face recognition and clustering. In: CVPR (2015)
Shao, J., Wen, X., Zhao, B., Xue, X.: Temporal context aggregation for video retrieval with contrastive learning. In: WACV (2021)
Shwartz-Ziv, R., Tishby, N.: Opening the black box of deep neural networks via information. arXiv:1703.00810 (2017)
Tian, Y., Krishnan, D., Isola, P.: Contrastive multiview coding. arXiv:1906.05849 (2019)
Tian, Y., Sun, C., Poole, B., Krishnan, D., Schmid, C., Isola, P.: What makes for good views for contrastive learning? In: NeurIPS (2020)
Tishby, N., Zaslavsky, N.: Deep learning and the information bottleneck principle. In: ITW (2015)
Tripuraneni, N., Jordan, M., Jin, C.: On the theory of transfer learning: the importance of task diversity. In: NeurIPS (2020)
Van Horn, G., et al.: The INaturalist species classification and detection dataset. In: CVPR (2018)
Wah, C., Branson, S., Welinder, P., Perona, P., Belongie, S.: The Caltech-UCSD Birds-200-2011 dataset (2011)
Wang, X., Zhang, R., Shen, C., Kong, T., Li, L.: Dense contrastive learning for self-supervised visual pre-training. arXiv:2011.09157 (2020)
Wen, Y., Zhang, K., Li, Z., Qiao, Y.: A discriminative feature learning approach for deep face recognition. In: ECCV (2016)
Wu, M., Zhuang, C., Mosse, M., Yamins, D., Goodman, N.: On mutual information in contrastive learning for visual representations. arXiv:2005.13149 (2020)
Wu, Z., Xiong, Y., Yu, S.X., Lin, D.: Unsupervised feature learning via non-parametric instance discrimination. In: CVPR (2018)
Xiao, T., Li, S., Wang, B., Lin, L., Wang, X.: Joint detection and identification feature learning for person search. In: CVPR (2017)
Yalniz, I.Z., Jégou, H., Chen, K., Paluri, M., Mahajan, D.: Billion-scale semi-supervised learning for image classification. arXiv:1905.00546 (2019)
You, K., Liu, Y., Wang, J., Long, M.: LogME: practical assessment of pre-trained models for transfer learning. In: ICML (2021)
Zhao, B., Cui, Q., Song, R., Qiu, Y., Liang, J.: Decoupled knowledge distillation. In: CVPR (2022)
Zhao, H., Shi, J., Qi, X., Wang, X., Jia, J.: Pyramid scene parsing network. In: CVPR (2017)
Zheng, H., Fu, J., Mei, T., Luo, J.: Learning multi-attention convolutional neural network for fine-grained image recognition. In: ICCV (2017)
Zhou, B., Cui, Q., Wei, X.S., Chen, Z.M.: BBN: bilateral-branch network with cumulative learning for long-tailed visual recognition. In: CVPR (2020)
Zhu, R., Zhao, B., Liu, J., Sun, Z., Chen, C.W.: Improving contrastive learning by visualizing feature transformation. In: ICCV (2021)
Zoph, B., et al.: Rethinking pre-training and self-training. In: NeurIPS (2020)
Acknowledgement
This work was supported in part by the Zhejiang Provincial Natural Science Foundation of China under Grant No. LQ22F020006. We thank anonymous reviewers from ECCV 2022 for insightful comments.
Author information
Authors and Affiliations
MEGVII Technology, Beijing, China
Quan Cui, Bingchen Zhao, Zhao-Min Chen, Borui Zhao, Renjie Song & Jiajun Liang
Waseda University, Tokyo, Japan
Quan Cui & Osamu Yoshie
University of Edinburgh, Edinburgh, UK
Bingchen Zhao
Wenzhou University, Wenzhou, China
Zhao-Min Chen
ByteDance, Beijing, China
Boyan Zhou
Corresponding author
Correspondence to Renjie Song.
Editor information
Editors and Affiliations
Tel Aviv University, Tel Aviv, Israel
Shai Avidan
University College London, London, UK
Gabriel Brostow
Google AI, Accra, Ghana
Moustapha Cissé
University of Catania, Catania, Italy
Giovanni Maria Farinella
Facebook (United States), Menlo Park, CA, USA
Tal Hassner
Electronic supplementary material
Below is the link to the electronic supplementary material.
Rights and permissions
Copyright information
© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Cui, Q. et al. (2022). Discriminability-Transferability Trade-Off: An Information-Theoretic Perspective. In: Avidan, S., Brostow, G., Cissé, M., Farinella, G.M., Hassner, T. (eds) Computer Vision – ECCV 2022. ECCV 2022. Lecture Notes in Computer Science, vol 13686. Springer, Cham. https://doi.org/10.1007/978-3-031-19809-0_2
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-19808-3
Online ISBN: 978-3-031-19809-0
eBook Packages: Computer Science, Computer Science (R0)