- Simone Scardapane (ORCID: orcid.org/0000-0003-0881-8344),
- Elena Nieddu,
- Donatella Firmani (ORCID: orcid.org/0000-0003-0358-3208) &
- Paolo Merialdo (ORCID: orcid.org/0000-0002-3852-8092)
Part of the book series: Proceedings of the International Neural Networks Society (INNS, volume 1)
Included in the following conference series: INNSBDDL (INNS Big Data and Deep Learning)
Abstract
The design of activation functions is a growing research area in the field of neural networks. In particular, instead of using fixed point-wise functions (e.g., the rectified linear unit), several authors have proposed ways of learning these functions directly from the data in a non-parametric fashion. In this paper we focus on the kernel activation function (KAF), a recently proposed framework wherein each function is modeled as a one-dimensional kernel model, whose weights are adapted through standard backpropagation-based optimization. One drawback of KAFs is the need to select a single kernel function and its associated hyper-parameters. To partially overcome this problem, we motivate an extension of the KAF model, in which multiple kernels are linearly combined at every neuron, inspired by the literature on multiple kernel learning. We provide an application of the resulting multi-KAF to a realistic use case, handwritten Latin OCR, on a large dataset collected in the context of the ‘In Codice Ratio’ project. Results show that multi-KAFs can improve the accuracy of the convolutional networks previously developed for the task, with faster convergence, even with a smaller number of overall parameters.
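To make the construction concrete, below is a minimal sketch of how a multi-KAF activation could look in PyTorch. It assumes the standard KAF form f(s) = Σ_i α_i κ(s, d_i) over a fixed dictionary d, with two base kernels (Gaussian and Laplacian) linearly combined per neuron; the class name, kernel choice, and initialization are illustrative assumptions, not the authors' reference implementation.

```python
import torch
import torch.nn as nn

class MultiKAF(nn.Module):
    """Hypothetical multi-kernel activation function (multi-KAF) layer.

    Each neuron's activation is f(s) = sum_i alpha_i * k(s, d_i), where k is
    itself a learned linear combination of base kernels (here: Gaussian and
    Laplacian) evaluated against a fixed dictionary d.
    """

    def __init__(self, num_neurons: int, dict_size: int = 20,
                 bound: float = 3.0, gamma: float = 1.0):
        super().__init__()
        self.gamma = gamma
        # Fixed dictionary of points sampled uniformly over [-bound, bound],
        # shared by all neurons (as in the original KAF formulation).
        self.register_buffer('dictionary',
                             torch.linspace(-bound, bound, dict_size))
        # Mixing coefficients alpha, one per (neuron, dictionary element),
        # adapted by standard backpropagation.
        self.alpha = nn.Parameter(0.1 * torch.randn(num_neurons, dict_size))
        # Per-neuron weights combining the two base kernels; initialized
        # uniformly so both kernels contribute equally at the start.
        self.kernel_weights = nn.Parameter(0.5 * torch.ones(num_neurons, 2))

    def forward(self, s: torch.Tensor) -> torch.Tensor:
        # s: (batch, num_neurons) pre-activations.
        diff = s.unsqueeze(-1) - self.dictionary       # (batch, neurons, dict)
        k_gauss = torch.exp(-self.gamma * diff ** 2)   # Gaussian kernel
        k_lapl = torch.exp(-self.gamma * diff.abs())   # Laplacian kernel
        # Linear combination of kernels, then the usual KAF expansion.
        w = self.kernel_weights                        # (neurons, 2)
        mixed = w[:, 0:1] * k_gauss + w[:, 1:2] * k_lapl
        return (mixed * self.alpha).sum(dim=-1)        # (batch, neurons)


# Usage: replace a fixed point-wise nonlinearity with a multi-KAF layer.
activation = MultiKAF(num_neurons=64)
y = activation(torch.randn(32, 64))  # shape (32, 64)
```

Because both the expansion coefficients α and the kernel-combination weights are ordinary parameters, the whole activation is trained end-to-end by backpropagation, as the abstract describes.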
The work of S. Scardapane was supported in part by Italian MIUR, “Progetti di Ricerca di Rilevante Interesse Nazionale”, GAUChO project, under Grant 2015YPXH4W_004.
Notes
- 1.
- 2. The dataset is available on the web at http://www.dia.uniroma3.it/db/icr/.
References
Aiolli, F., Donini, M.: EasyMKL: a scalable multiple kernel learning algorithm. Neurocomputing 169, 215–224 (2015)
Clevert, D.A., Unterthiner, T., Hochreiter, S.: Fast and accurate deep network learning by exponential linear units (ELUs). In: Proceedings of the 2016 International Conference on Learning Representations, ICLR (2016)
Eisenach, C., Wang, Z., Liu, H.: Nonparametrically learning activation functions in deep neural nets. In: 5th International Conference for Learning Representations (Workshop Track) (2017)
Firmani, D., Maiorino, M., Merialdo, P., Nieddu, E.: Towards knowledge discovery from the Vatican Secret Archives. In Codice Ratio-episode 1: machine transcription of the manuscripts. In: Proceedings of the 24th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp. 263–272. ACM (2018)
Firmani, D., Merialdo, P., Nieddu, E., Scardapane, S.: In Codice Ratio: OCR of handwritten Latin documents using deep convolutional networks. In: 11th International Workshop on Artificial Intelligence for Cultural Heritage, AI*CH 2017, pp. 9–16 (2017)
Genton, M.G.: Classes of kernels for machine learning: a statistics perspective. J. Mach. Learn. Res. 2(Dec), 299–312 (2001)
Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of the 14th International Conference on Artificial Intelligence and Statistics, AISTATS, p. 275 (2011)
Gönen, M., Alpaydın, E.: Multiple kernel learning algorithms. J. Mach. Learn. Res. 12(Jul), 2211–2268 (2011)
Goodfellow, I.J., Warde-Farley, D., Mirza, M., Courville, A., Bengio, Y.: Maxout networks. In: Proceedings of the 30th International Conference on Machine Learning, ICML (2013)
Goodfellow, I., Bengio, Y., Courville, A.: Deep Learning. MIT Press, Cambridge (2016)
He, K., Zhang, X., Ren, S., Sun, J.: Delving deep into rectifiers: surpassing human-level performance on ImageNet classification. In: Proceedings of the IEEE International Conference on Computer Vision, ICCV, pp. 1026–1034 (2015)
Ioffe, S., Szegedy, C.: Batch normalization: accelerating deep network training by reducing internal covariate shift. In: Proceedings of the 32nd International Conference on Machine Learning, ICML, pp. 448–456 (2015)
Jin, X., Xu, C., Feng, J., Wei, Y., Xiong, J., Yan, S.: Deep learning with S-shaped rectified linear activation units. In: Proceedings of the Thirtieth AAAI Conference on Artificial Intelligence (2016)
Liu, W., Principe, J.C., Haykin, S.: Kernel Adaptive Filtering: A Comprehensive Introduction. Wiley, Hoboken (2011)
Oneto, L., Navarin, N., Donini, M., Ridella, S., Sperduti, A., Aiolli, F., Anguita, D.: Learning with kernels: a local Rademacher complexity-based analysis with application to graph kernels. IEEE Trans. Neural Netw. Learn. Syst. 29(10), 4660–4671 (2017)
Ramachandran, P., Zoph, B., Le, Q.V.: Searching for activation functions. arXiv preprint arXiv:1710.05941 (2017)
Scardapane, S., Van Vaerenbergh, S., Totaro, S., Uncini, A.: Kafnets: kernel-based non-parametric activation functions for neural networks. Neural Netw. (2018, in press)
Scardapane, S., Van Vaerenbergh, S., Hussain, A., Uncini, A.: Complex-valued neural networks with non-parametric activation functions. IEEE Trans. Emerg. Top. Comput. Intell. (2018, in press)
Siniscalchi, S.M., Salerno, V.M.: Adaptation to new microphones using artificial neural networks with trainable activation functions. IEEE Trans. Neural Netw. Learn. Syst. 28(8), 1959–1965 (2017)
Zhang, X., Trmal, J., Povey, D., Khudanpur, S.: Improving deep neural network acoustic models using generalized maxout networks. In: 2014 IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP, pp. 215–219. IEEE (2014)
Author information
Authors and Affiliations
DIET Department, Sapienza University of Rome, Rome, Italy
Simone Scardapane
Department of Engineering, Roma Tre University, Rome, Italy
Elena Nieddu, Donatella Firmani & Paolo Merialdo
Corresponding author
Correspondence to Simone Scardapane.
Editor information
Editors and Affiliations
Department of Informatics, Bioengineering, Robotics, and Systems Engineering, University of Genova, Genoa, Italy
Luca Oneto
Department of Mathematics, University of Padova, Padua, Italy
Nicolò Navarin
Department of Mathematics, University of Padova, Padua, Italy
Alessandro Sperduti
Department of Informatics, Bioengineering, Robotics, and Systems Engineering, University of Genova, Genoa, Italy
Davide Anguita
Copyright information
© 2020 Springer Nature Switzerland AG
About this paper
Cite this paper
Scardapane, S., Nieddu, E., Firmani, D., Merialdo, P. (2020). Multikernel Activation Functions: Formulation and a Case Study. In: Oneto, L., Navarin, N., Sperduti, A., Anguita, D. (eds) Recent Advances in Big Data and Deep Learning. INNSBDDL 2019. Proceedings of the International Neural Networks Society, vol 1. Springer, Cham. https://doi.org/10.1007/978-3-030-16841-4_33
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-16840-7
Online ISBN: 978-3-030-16841-4
eBook Packages: Intelligent Technologies and Robotics