Part of the book series: Lecture Notes in Computer Science (LNIP, volume 12544)
Abstract
Deep neural networks have recently advanced the state of the art in image compression and surpassed many traditional compression algorithms. Training such networks involves carefully trading off the entropy of the latent representation against reconstruction quality. What counts as quality depends crucially on the observer of the images, which the vast majority of the literature assumes to be human. In this paper, we aim to go beyond this notion of compression quality and look at human visual perception and image classification simultaneously. To that end, we use a family of loss functions that allows us to optimize deep image compression depending on the observer and to interpolate between human-perceived visual quality and classification accuracy, enabling a more unified view of image compression. Our extensive experiments show that using perceptual loss functions to train a compression system preserves classification accuracy much better than traditional codecs such as BPG, without requiring retraining of classifiers on compressed images. For example, compressing ImageNet to 0.25 bpp reduces Inception-ResNet classification accuracy by only 2%. At the same time, when using a human-friendly loss function, the same compression system achieves competitive performance in terms of MS-SSIM. By combining these two objective functions, we show that there is a pronounced trade-off in compression quality between the human visual system and classification accuracy.
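The interpolation idea in the abstract can be illustrated with a minimal sketch. This is not the paper's actual loss: the function names, the weights `alpha` and `beta`, the use of mean squared error for both distortion terms, and the `features` callable (a stand-in for activations of a fixed classifier) are all assumptions made for illustration. The sketch only shows how one objective could mix a pixel-based distortion with a feature-based distortion and add a rate term measured in bits per pixel (bpp).

```python
from typing import Callable, Sequence


def mean_squared_error(a: Sequence[float], b: Sequence[float]) -> float:
    """Mean squared error between two equally long, non-empty sequences."""
    if len(a) != len(b) or len(a) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def bits_per_pixel(total_bits: float, height: int, width: int) -> float:
    """Rate of a compressed image expressed in bits per pixel (bpp)."""
    return total_bits / (height * width)


def observer_loss(
    original: Sequence[float],
    reconstruction: Sequence[float],
    features: Callable[[Sequence[float]], Sequence[float]],
    total_bits: float,
    height: int,
    width: int,
    alpha: float,
    beta: float,
) -> float:
    """Hypothetical observer-dependent rate-distortion objective.

    alpha = 0 uses only the pixel-based distortion (a stand-in for a
    human-oriented metric); alpha = 1 uses only the feature-based
    distortion (a stand-in for a classifier-oriented metric).
    beta weights the rate term.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    pixel_term = mean_squared_error(original, reconstruction)
    feature_term = mean_squared_error(features(original), features(reconstruction))
    distortion = (1.0 - alpha) * pixel_term + alpha * feature_term
    return distortion + beta * bits_per_pixel(total_bits, height, width)
```

As a sense of scale for the rate term, a 256 x 256 image stored at 0.25 bpp occupies 256 * 256 * 0.25 = 16384 bits, or 2048 bytes. Sweeping `alpha` from 0 to 1 in such a setup would move the optimization target from the pixel-based term to the feature-based term, which is the kind of interpolation the abstract describes.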
Notes
1. The source code is available at https://github.com/DS3Lab/odlc.
Acknowledgments
CZ and the DS3Lab gratefully acknowledge the support from the Swiss National Science Foundation (Project Number 200021_184628), Innosuisse/SNF BRIDGE Discovery (Project Number 40B2-0_187132), European Union Horizon 2020 Research and Innovation Programme (DAPHNE, 957407), Botnar Research Centre for Child Health, Swiss Data Science Center, Alibaba, Cisco, eBay, Google Focused Research Awards, Oracle Labs, Swisscom, Zurich Insurance, Chinese Scholarship Council, and the Department of Computer Science at ETH Zurich.
Author information
Authors and Affiliations
Department of Computer Science, ETH Zürich, Switzerland
Maurice Weber, Cedric Renggli & Ce Zhang
ZHAW School of Engineering, Winterthur, Switzerland
Helmut Grabner
Corresponding author
Correspondence to Maurice Weber.
Editor information
Editors and Affiliations
University of Tübingen, Tübingen, Germany
Zeynep Akata
University of Tübingen, Tübingen, Germany
Andreas Geiger
Czech Technical University in Prague, Prague, Czech Republic
Torsten Sattler
Copyright information
© 2021 Springer Nature Switzerland AG
About this paper
Cite this paper
Weber, M., Renggli, C., Grabner, H., Zhang, C. (2021). Observer Dependent Lossy Image Compression. In: Akata, Z., Geiger, A., Sattler, T. (eds) Pattern Recognition. DAGM GCPR 2020. Lecture Notes in Computer Science, vol 12544. Springer, Cham. https://doi.org/10.1007/978-3-030-71278-5_10
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-71277-8
Online ISBN: 978-3-030-71278-5
eBook Packages: Computer Science, Computer Science (R0)