Hybrid HMM/BLSTM system for multi-script keyword spotting in printed and handwritten documents with identification stage

  • Original Article
  • Neural Computing and Applications

Abstract

In this paper, we propose a novel script-independent approach for word spotting in printed and handwritten multi-script documents. Since each writing type and script needs to be processed by a specific spotting engine, the proposed system proceeds in two stages. First, a preliminary script identification stage recognizes, at a single level, both the writing type and the script of the input document image. Second, a specific word spotting method is used to spot query words in a large collection of documents. The proposed spotting system is based on a hybrid architecture that combines a deep bidirectional long short-term memory (BLSTM) neural network with a hidden Markov model (HMM). It takes advantage of the DNN's strong representation learning power and the HMM's sequential modeling ability. The global system has been evaluated on a mixed corpus of public databases such as KHATT and PKHATT for Arabic script and RIMES for Latin script. The experimental results on script identification and keyword spotting confirm the effectiveness of the proposed approach.
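The two-stage flow described in the abstract can be sketched roughly as follows. This is only an illustrative outline, not the authors' implementation: the names `spot_keyword`, `Hit`, `identify`, and `engines`, as well as the class labels, are hypothetical, and the identification and spotting models are passed in as plain callables.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

# Hypothetical label pair: (writing type, script), e.g. ("handwritten", "arabic").
DocClass = Tuple[str, str]


@dataclass
class Hit:
    """One spotted occurrence of the query (hypothetical result type)."""
    line_index: int
    score: float


# Stage 1 maps a document image to its class; stage 2 engines spot a query in it.
Identifier = Callable[[object], DocClass]
Spotter = Callable[[object, str], List[Hit]]


def spot_keyword(
    image: object,
    query: str,
    identify: Identifier,
    engines: Dict[DocClass, Spotter],
) -> List[Hit]:
    """Identify writing type and script, then route to the matching engine."""
    doc_class = identify(image)  # stage 1: script identification
    if doc_class not in engines:
        raise ValueError(f"no spotting engine registered for {doc_class}")
    return engines[doc_class](image, query)  # stage 2: class-specific spotting
```

Keeping the engines in a dictionary keyed by the identified class mirrors the idea that each writing type and script gets its own spotting engine, while the caller sees a single entry point.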

Notes

  1. The precision/recall break-even point is the value at which the precision is equal to the recall.
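As a minimal sketch of this definition on a ranked result list (the excerpt does not say how the authors compute it, so the function name `break_even_point` and its inputs are assumptions): if the list is cut at rank k equal to the number of relevant items, precision (TP/k) and recall (TP/number of relevant items) share the same denominator and are therefore equal.

```python
from typing import Sequence


def break_even_point(scores: Sequence[float], relevant: Sequence[bool]) -> float:
    """Precision (= recall) when the ranked list is cut at the number of relevant items."""
    if len(scores) != len(relevant):
        raise ValueError("scores and relevant must have the same length")
    n_relevant = sum(1 for r in relevant if r)
    if n_relevant == 0:
        raise ValueError("at least one relevant item is required")
    # Rank candidates from highest to lowest spotting score.
    ranked = sorted(zip(scores, relevant), key=lambda pair: pair[0], reverse=True)
    # Count relevant items among the top n_relevant results.
    true_positives = sum(1 for _, r in ranked[:n_relevant] if r)
    # Precision = TP / k and recall = TP / n_relevant coincide when k == n_relevant.
    return true_positives / n_relevant
```

For example, with scores `[0.9, 0.8, 0.7, 0.6, 0.5]` and relevance flags `[True, False, True, True, False]`, three items are relevant and two of them appear in the top three, so the break-even point is 2/3.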

Author information

Authors and Affiliations

  1. MIRACL Laboratory, University of Sfax, Sfax, Tunisia

    Ahmed Cheikhrouhou, Yousri Kessentini & Slim Kanoun

  2. Centre de Recherche en Numérique de Sfax, Sfax, Tunisia

    Ahmed Cheikhrouhou & Yousri Kessentini

  3. LITIS Laboratory EA 4108, University of Rouen, St Etienne du Rouvray, France

    Yousri Kessentini

Corresponding author

Correspondence to Ahmed Cheikhrouhou.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Cheikhrouhou, A., Kessentini, Y. & Kanoun, S. Hybrid HMM/BLSTM system for multi-script keyword spotting in printed and handwritten documents with identification stage. Neural Comput & Applic 32, 9201–9215 (2020). https://doi.org/10.1007/s00521-019-04429-w
