Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 10639)
Abstract
Recurrent neural networks (RNNs) and connectionist temporal classification (CTC) have shown success in many sequence labeling tasks, owing to their ability to handle problems where the alignment between the inputs and the target labels is unknown. The residual network is a recent convolutional neural network architecture that works well in various computer vision tasks. In this paper, we combine these architectures into a new network for handwritten digit string recognition. First, we design a residual network to extract features from input images; then we employ an RNN to model the contextual information within the feature sequences and predict recognition results. At the top of the network, a standard CTC layer is applied to calculate the loss and yield the final results. These three parts compose an end-to-end trainable network. The proposed architecture achieves the highest performance on ORAND-CAR-A and ORAND-CAR-B, with recognition rates of 89.75% and 91.14%, respectively. In addition, experiments on a generated captcha dataset with much longer strings show the potential of the proposed network to handle long strings.
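The abstract does not say how the CTC layer turns per-frame RNN outputs into a digit string. A common choice for a standard CTC output is best-path (greedy) decoding: take the top class at each time step, merge consecutive repeats, then drop blanks. The sketch below illustrates that general technique and is not the authors' implementation. The class layout (index 0 as blank, indices 1 to 10 as digits 0 to 9) and the helper names ctc_best_path_decode and one_hot are assumptions made for illustration.

```python
from itertools import groupby
from typing import List, Sequence

# Assumption: index 0 is the CTC blank; digits 0-9 map to indices 1-10.
BLANK = 0
NUM_CLASSES = 11


def ctc_best_path_decode(frame_scores: Sequence[Sequence[float]]) -> str:
    """Greedy (best-path) CTC decoding of per-frame class scores."""
    # Step 1: pick the highest-scoring class at every time step.
    best_path: List[int] = [
        max(range(len(frame)), key=frame.__getitem__) for frame in frame_scores
    ]
    # Step 2: merge consecutive repeats; step 3: drop blanks.
    collapsed = [label for label, _ in groupby(best_path) if label != BLANK]
    # Map class indices back to digit characters.
    return "".join(str(label - 1) for label in collapsed)


def one_hot(index: int, num_classes: int = NUM_CLASSES) -> List[float]:
    """Hypothetical helper: a score vector that puts all mass on one class."""
    return [1.0 if i == index else 0.0 for i in range(num_classes)]


# Eight frames whose top classes are 3, 3, blank, 3, 8, blank, blank, 1.
# The blank between the two runs of class 3 keeps the repeated digit "2".
frames = [one_hot(i) for i in [3, 3, 0, 3, 8, 0, 0, 1]]
print(ctc_best_path_decode(frames))  # prints "2270"
```

The blank class is what lets the network emit the same digit twice in a row, which matters for digit strings such as the amounts in ORAND-CAR; without a blank between them, two adjacent runs of the same class would collapse into one digit.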
Author information
Authors and Affiliations
Shanghai Key Laboratory of Multidimensional Information Processing, Department of Computer Science and Technology, East China Normal University, Shanghai, 200062, China
Hongjian Zhan, Qingqing Wang & Yue Lu
Corresponding author
Correspondence to Yue Lu.
Editor information
Editors and Affiliations
Guangdong University of Technology, Guangzhou, China
Derong Liu
Guangdong University of Technology, Guangzhou, China
Shengli Xie
South China University of Technology, Guangzhou, China
Yuanqing Li
Institute of Automation, Chinese Academy of Sciences, Beijing, China
Dongbin Zhao
King Fahd University of Petroleum and Minerals, Dhahran, Saudi Arabia
El-Sayed M. El-Alfy
Copyright information
© 2017 Springer International Publishing AG
About this paper
Cite this paper
Zhan, H., Wang, Q., Lu, Y. (2017). Handwritten Digit String Recognition by Combination of Residual Network and RNN-CTC. In: Liu, D., Xie, S., Li, Y., Zhao, D., El-Alfy, ES. (eds) Neural Information Processing. ICONIP 2017. Lecture Notes in Computer Science, vol 10639. Springer, Cham. https://doi.org/10.1007/978-3-319-70136-3_62
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-70135-6
Online ISBN: 978-3-319-70136-3
eBook Packages: Computer Science, Computer Science (R0)