Part of the book series: Lecture Notes in Computer Science (LNIP, volume 12824)
Abstract
An objective and fair evaluation metric is fundamental to scene text detection and recognition research. Existing metrics cannot properly handle the one-to-many and many-to-one matchings that arise naturally when ground truth and detection boxes are annotated at inconsistent granularities. They also rely on thresholds to match ground truth and detection boxes, which leads to unstable matching results. In this paper, we propose a novel End-to-end Evaluation Metric (EEM) to tackle these problems. EEM handles one-to-many and many-to-one matching cases more reasonably and is threshold-free. We design a simple yet effective method to find matching groups among the ground truth and detection boxes in an image. We further employ a label merging method and use normalized scores to evaluate the performance of end-to-end text recognition methods more fairly. We conduct extensive experiments on the ICDAR2015 and RCTW datasets and on a new general OCR dataset covering 17 categories of real-life scenes. Experimental results demonstrate the effectiveness and fairness of the proposed evaluation metric.
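The abstract does not say how matching groups are found. As a minimal sketch of one plausible reading, the code below treats every ground truth (GT) and detection (DT) box as a node, links a GT box to a DT box whenever they share any area (so no overlap threshold is needed), and takes the connected components as matching groups. The function names, the axis-aligned `(x1, y1, x2, y2)` box format, and the any-overlap rule are all assumptions made for illustration, not the authors' published procedure.

```python
from collections import defaultdict


def overlaps(a, b):
    """Return True if two axis-aligned boxes (x1, y1, x2, y2) share positive area."""
    return min(a[2], b[2]) > max(a[0], b[0]) and min(a[3], b[3]) > max(a[1], b[1])


def matching_groups(gt_boxes, dt_boxes):
    """Group GT and DT box indices into connected components of the overlap graph.

    Hypothetical helper: the any-overlap linking rule is an assumption.
    Returns a list of dicts of the form {"gt": [indices], "dt": [indices]}.
    """
    nodes = [("gt", i) for i in range(len(gt_boxes))]
    nodes += [("dt", j) for j in range(len(dt_boxes))]
    parent = {n: n for n in nodes}  # union-find forest, one tree per group

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    # Link every overlapping GT/DT pair.
    for i, g in enumerate(gt_boxes):
        for j, d in enumerate(dt_boxes):
            if overlaps(g, d):
                parent[find(("gt", i))] = find(("dt", j))

    groups = defaultdict(lambda: {"gt": [], "dt": []})
    for kind, idx in nodes:
        groups[find((kind, idx))][kind].append(idx)
    return list(groups.values())


# One GT text line detected as two word boxes, plus one spurious detection.
gt = [(0, 0, 10, 2)]
dt = [(0, 0, 4, 2), (5, 0, 10, 2), (20, 20, 25, 22)]
example_groups = matching_groups(gt, dt)
```

In this example the single GT line and its two word-level detections end up in one one-to-many group, while the spurious detection forms a group with no GT box.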
Notes
1. If there is no GT or DT box in a matching group, the GT or DT label is an empty string.
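The empty-string rule in this note can be illustrated with a short sketch. The page gives no formula for label merging or for the normalized score, so the code below makes two labeled assumptions: labels in a group are concatenated in left-to-right order without a separator, and the group score is one minus the edit distance divided by the longer label length. All function names are hypothetical.

```python
def edit_distance(a, b):
    """Levenshtein distance between two strings (row-by-row dynamic programming)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                 # deletion
                           cur[j - 1] + 1,              # insertion
                           prev[j - 1] + (ca != cb)))   # substitution or match
        prev = cur
    return prev[-1]


def merge_labels(boxes, labels, indices):
    """Concatenate the labels of the given boxes, ordered by left edge (assumed order).

    With no indices the result is "", matching the note's empty-label rule.
    """
    ordered = sorted(indices, key=lambda i: boxes[i][0])
    return "".join(labels[i] for i in ordered)


def group_score(gt_label, dt_label):
    """Assumed normalized score in [0, 1]: 1 - edit distance / longer label length."""
    longest = max(len(gt_label), len(dt_label))
    if longest == 0:
        return 1.0  # both labels empty: nothing to get wrong
    return 1.0 - edit_distance(gt_label, dt_label) / longest


# A missed GT line: the group has a GT box but no DT box.
gt_boxes, gt_labels = [(0, 0, 10, 2)], ["hello"]
dt_boxes, dt_labels = [], []
gt_text = merge_labels(gt_boxes, gt_labels, [0])
dt_text = merge_labels(dt_boxes, dt_labels, [])
missed_score = group_score(gt_text, dt_text)
```

Under these assumptions a group with no detection merges to an empty DT label and scores 0, and the same holds for a spurious detection with no GT box, so misses and false alarms are penalized without a matching threshold.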
Author information
Authors and Affiliations
VIVO AI Lab, Shenzhen, China
Jiedong Hao, Yafei Wen, Jie Deng, Jun Gan, Shuai Ren, Hui Tan & Xiaoxin Chen
Editor information
Editors and Affiliations
Universitat Autònoma de Barcelona, Barcelona, Spain
Josep Lladós
Lehigh University, Bethlehem, PA, USA
Daniel Lopresti
Kyushu University, Fukuoka-shi, Japan
Seiichi Uchida
Copyright information
© 2021 Springer Nature Switzerland AG
About this paper
Cite this paper
Hao, J. et al. (2021). EEM: An End-to-end Evaluation Metric for Scene Text Detection and Recognition. In: Lladós, J., Lopresti, D., Uchida, S. (eds) Document Analysis and Recognition – ICDAR 2021. ICDAR 2021. Lecture Notes in Computer Science, vol 12824. Springer, Cham. https://doi.org/10.1007/978-3-030-86337-1_7
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-86336-4
Online ISBN: 978-3-030-86337-1
eBook Packages: Computer Science, Computer Science (R0)