Deep Learning Based CAPTCHA Recognition Network with Grouping Strategy
Abstract
1. Introduction
- We present a novel CAPTCHA recognition algorithm that utilizes deep learning and binary images with character grouping, eliminating the requirement for segmenting CAPTCHA into individual characters.
- We implement a CAPTCHA recognition model with an adjustable number of softmax layers, which yields several versions of the model; the version with the highest performance is then selected.
- By implementing the character grouping algorithm, we have achieved a substantial reduction in the storage requirements of our model. Additionally, this has led to a simplification of the architecture within the CNN module, resulting in a more streamlined and efficient system.
- We conducted a series of experiments on datasets from various CAPTCHA schemes to evaluate the performance of our proposed CRNGS network. The results demonstrate that the network recognizes CAPTCHA characters accurately and efficiently.
2. Related Work
3. Proposed Method
3.1. Conceptualization of Proposed Recognition Method
3.2. Procedure of Proposed CAPTCHAs’ Recognition Algorithm
1. First, for a CAPTCHA with k characters, we arbitrarily determine the number of softmax layers t ().
2. We generate n copies of the CAPTCHA input image according to Equation (1).
3. We define n binary images (ABIs) so that each binary image is given a unique number between 1 and n. For instance, the first binary image is given the number 1, the second binary image is given the number 2, and so on.
4. Each binary image (ABI) is attached to one of the n copies of the CAPTCHA, yielding n ABICCs.
5. To train the proposed CAPTCHA recognition algorithm, a label is added to each ABICC. If we assume t = 2, the ABICC whose ABI has number 1 is labeled with the first character group (the first and second characters of the CAPTCHA), the ABICC whose ABI has number 2 is labeled with the second character group (the third and fourth characters of the CAPTCHA), and so on.
6. The labeled ABICCs are used to train a CNN to classify and recognize CAPTCHA characters. If we assume t = 2, the ABI of an ABICC represents the order (location) of two characters, and the label represents the classes of those two characters. This CNN classifies the characters of the input CAPTCHA copy into m different classes.
7. At recognition time, the trained CNN processes the ABICCs serially, one by one. Every softmax layer classifies one character, so that the desired output (the individual characters of the CAPTCHA) is obtained. Figure 5 shows the whole procedure of the proposed CAPTCHA recognition algorithm when the number of softmax layers t is set to 2.
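
The labeling in steps 1 to 5 can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: Equation (1) is not reproduced in this text, so the sketch assumes n = ceil(k / t), and the function names `group_labels` and `assemble` are hypothetical. How the method treats a final group shorter than t (when t does not divide k) is not stated here; the sketch simply leaves that group short.

```python
from math import ceil
from typing import List, Tuple


def group_labels(captcha_text: str, t: int) -> List[Tuple[int, str]]:
    """Return (ABI number, character-group label) pairs for one CAPTCHA.

    captcha_text: the k ground-truth characters of the CAPTCHA.
    t: number of softmax layers, i.e. characters predicted per ABICC.
    """
    k = len(captcha_text)
    if not 1 <= t <= k:
        raise ValueError("t must be between 1 and the CAPTCHA length")
    # Assumption standing in for Equation (1): one copy per character group.
    n = ceil(k / t)
    # ABI numbers start at 1; group i covers characters i*t .. (i+1)*t - 1.
    return [(i + 1, captcha_text[i * t:(i + 1) * t]) for i in range(n)]


def assemble(groups: List[str]) -> str:
    """Concatenate per-group predictions, in ABI order, into the full text."""
    return "".join(groups)


pairs = group_labels("AB12", t=2)
print(pairs)  # [(1, 'AB'), (2, '12')]
print(assemble([label for _, label in pairs]))  # AB12
```

With t = 2 and a four-character CAPTCHA, this reproduces the labeling described in step 5: the ABICC with ABI number 1 carries the first two characters and the ABICC with ABI number 2 carries the last two.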
3.3. Structure of the Proposed CRNGS Approach
4. Experiments and Results
4.1. Used Dataset and Labeling Description
4.1.1. Bank of China (BoC) CAPTCHA Scheme
4.1.2. Weibo CAPTCHA Scheme
4.1.3. Captcha 0.3 CAPTCHA Scheme
4.1.4. Gregwar CAPTCHA Scheme
4.1.5. Preparing Dataset Images and Labeling Description
4.2. Structure and Parameters of Proposed CRNGS Model
4.3. Accuracies of Proposed Models
4.4. Comparison Results
5. Discussion
6. Conclusions
Author Contributions
Funding
Data Availability Statement
Acknowledgments
Conflicts of Interest
References
1. Von Ahn, L.; Blum, M.; Hopper, N.J.; Langford, J. CAPTCHA: Using hard AI problems for security. In Proceedings of the Eurocrypt, Warsaw, Poland, 4–8 May 2003; Springer: Berlin/Heidelberg, Germany, 2003; Volume 2656, pp. 294–311.
2. Kumar, M.; Jindal, M.; Kumar, M. A systematic survey on CAPTCHA recognition: Types, creation and breaking techniques. Arch. Comput. Methods Eng. 2022, 29, 1107–1136.
3. Singh, V.P.; Pal, P. Survey of different types of CAPTCHA. Int. J. Comput. Sci. Inf. Technol. 2014, 5, 2242–2245.
4. Lupkowski, P.; Urbanski, M. SemCAPTCHA—User-friendly alternative for OCR-based CAPTCHA systems. In Proceedings of the 2008 International Multiconference on Computer Science and Information Technology, Wisla, Poland, 20–22 October 2008; IEEE: Piscataway, NJ, USA, 2008; pp. 325–329.
5. Golle, P.; Ducheneaut, N. Keeping bots out of online games. In Proceedings of the 2005 ACM SIGCHI International Conference on Advances in Computer Entertainment Technology, Valencia, Spain, 15–17 June 2005; pp. 262–265.
6. Krizhevsky, A.; Sutskever, I.; Hinton, G.E. Imagenet classification with deep convolutional neural networks. Adv. Neural Inf. Process. Syst. 2012, 25, 84–90.
7. He, K.; Zhang, X.; Ren, S.; Sun, J. Deep residual learning for image recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA, 27–30 June 2016; pp. 770–778.
8. Li, J.; Chen, H.; Zhou, T.; Li, X. Tailings pond risk prediction using long short-term memory networks. IEEE Access 2019, 7, 182527–182537.
9. Zhou, L.; Wang, J.; Lu, W.; Yang, F.; Zhang, R.; Zhang, L. Captcha recognition based on deep learning. In Proceedings of the 4th International Conference on Big Data Research, Tokyo, Japan, 27–29 November 2020; pp. 89–93.
10. Redmon, J.; Farhadi, A. YOLO9000: Better, faster, stronger. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA, 21–26 July 2017; pp. 7263–7271.
11. Ren, S.; He, K.; Girshick, R.; Sun, J. Faster r-cnn: Towards real-time object detection with region proposal networks. arXiv 2015, arXiv:1506.01497.
12. Liu, Y. An improved faster R-CNN for object detection. In Proceedings of the 2018 11th International Symposium on Computational Intelligence and Design (ISCID), Hangzhou, China, 8–9 December 2018; IEEE: Piscataway, NJ, USA, 2018; Volume 2, pp. 119–123.
13. Malik, S.; Soundararajan, R. Llrnet: A multiscale subband learning approach for low light image restoration. In Proceedings of the 2019 IEEE International Conference on Image Processing (ICIP), Taipei, Taiwan, 22–25 September 2019; IEEE: Piscataway, NJ, USA, 2019; pp. 779–783.
14. Jin, Z.; Iqbal, M.Z.; Bobkov, D.; Zou, W.; Li, X.; Steinbach, E. A flexible deep CNN framework for image restoration. IEEE Trans. Multimed. 2019, 22, 1055–1068.
15. Dong, W.; Wang, P.; Yin, W.; Shi, G.; Wu, F.; Lu, X. Denoising prior driven deep neural network for image restoration. IEEE Trans. Pattern Anal. Mach. Intell. 2018, 41, 2305–2318.
16. Zhang, L.; Xie, Y.; Luan, X.; He, J. Captcha automatic segmentation and recognition based on improved vertical projection. In Proceedings of the 2017 IEEE 9th International Conference on Communication Software and Networks (ICCSN), Guangzhou, China, 6–8 May 2017; IEEE: Piscataway, NJ, USA, 2017; pp. 1167–1172.
17. Chen, C.J.; Wang, Y.W.; Fang, W.P. A study on captcha recognition. In Proceedings of the 2014 Tenth International Conference on Intelligent Information Hiding and Multimedia Signal Processing, Kitakyushu, Japan, 27–29 August 2014; IEEE: Piscataway, NJ, USA, 2014; pp. 395–398.
18. Anagnostopoulos, C.N.E.; Anagnostopoulos, I.E.; Psoroulas, I.D.; Loumos, V.; Kayafas, E. License plate recognition from still images and video sequences: A survey. IEEE Trans. Intell. Transp. Syst. 2008, 9, 377–391.
19. Wang, Q. License plate recognition via convolutional neural networks. In Proceedings of the 2017 8th IEEE International Conference on Software Engineering and Service Science (ICSESS), Beijing, China, 24–26 November 2017; IEEE: Piscataway, NJ, USA, 2017; pp. 926–929.
20. Chellapilla, K.; Simard, P. Using machine learning to break visual human interaction proofs (HIPs). Adv. Neural Inf. Process. Syst. 2004, 17.
21. Saleem, N.; Muazzam, H.; Tahir, H.; Farooq, U. Automatic license plate recognition using extracted features. In Proceedings of the 2016 4th International Symposium on Computational and Business Intelligence (ISCBI), Olten, Switzerland, 5–7 September 2016; IEEE: Piscataway, NJ, USA, 2016; pp. 221–225.
22. Sasi, A.; Sharma, S.; Cheeran, A.N. Automatic car number plate recognition. In Proceedings of the 2017 International Conference on Innovations in Information, Embedded and Communication Systems (ICIIECS), Coimbatore, India, 17–18 March 2017; IEEE: Piscataway, NJ, USA, 2017; pp. 1–6.
23. Hussain, R.; Gao, H.; Shaikh, R.A.; Soomro, S.P. Recognition based segmentation of connected characters in text based CAPTCHAs. In Proceedings of the 2016 8th IEEE International Conference on Communication Software and Networks (ICCSN), Beijing, China, 4–6 June 2016; IEEE: Piscataway, NJ, USA, 2016; pp. 673–676.
24. Sakkatos, P.; Theerayut, W.; Nuttapol, V.; Surapong, P. Analysis of text-based CAPTCHA images using Template Matching Correlation technique. In Proceedings of the 4th Joint International Conference on Information and Communication Technology, Electronic and Electrical Engineering (JICTEE), Chiang Rai, Thailand, 5–8 March 2014; IEEE: Piscataway, NJ, USA, 2014; pp. 1–5.
25. Wu, C.; On, L.C.; Weng, C.H.; Kuan, T.S.; Ng, K. A Macao license plate recognition system. In Proceedings of the 2005 International Conference on Machine Learning and Cybernetics, Guangzhou, China, 18–21 August 2005; IEEE: Piscataway, NJ, USA, 2005; Volume 7, pp. 4506–4510.
26. Baten, R.A.; Omair, Z.; Sikder, U. Bangla license plate reader for metropolitan cities of Bangladesh using template matching. In Proceedings of the 8th International Conference on Electrical and Computer Engineering, Dhaka, Bangladesh, 20–22 December 2014; IEEE: Piscataway, NJ, USA, 2014; pp. 776–779.
27. Qing, K.; Zhang, R. A multi-label neural network approach to solving connected CAPTCHAs. In Proceedings of the 2017 14th IAPR International Conference on Document Analysis and Recognition (ICDAR), Kyoto, Japan, 9–15 November 2017; IEEE: Piscataway, NJ, USA, 2017; Volume 1, pp. 1313–1317.
28. Wang, Z.; Shi, P. CAPTCHA recognition method based on CNN with focal loss. Complexity 2021, 2021, 1–10.
29. Zi, Y.; Gao, H.; Cheng, Z.; Liu, Y. An end-to-end attack on text captchas. IEEE Trans. Inf. Forensics Secur. 2019, 15, 753–766.
30. Wang, P.; Gao, H.; Shi, Z.; Yuan, Z.; Hu, J. Simple and easy: Transfer learning-based attacks to text CAPTCHA. IEEE Access 2020, 8, 59044–59058.
31. Yu, N.; Darling, K. A low-cost approach to crack python CAPTCHAs using AI-based chosen-plaintext attack. Appl. Sci. 2019, 9, 2010.
32. Kumar, M.; Jindal, M.; Kumar, M. An efficient technique for breaking of coloured Hindi CAPTCHA. Soft Comput. 2023, 27, 11661–11686.
33. Kumar, M.; Jindal, M.K.; Kumar, M. Design of innovative CAPTCHA for hindi language. Neural Comput. Appl. 2022, 34, 4957–4992.
34. Ray, P.; Bera, A.; Giri, D.; Bhattacharjee, D. Style matching CAPTCHA: Match neural transferred styles to thwart intelligent attacks. Multimed. Syst. 2023, 29, 1865–1895.
35. Thobhani, A.; Gao, M.; Hawbani, A.; Ali, S.T.M.; Abdussalam, A. CAPTCHA recognition using deep learning with attached binary images. Electronics 2020, 9, 1522.
36. Trong, N.D.; Huong, T.H.; Hoang, V.T. New Cognitive Deep-Learning CAPTCHA. Sensors 2023, 23, 2338.
37. Hajyan, M.; Hosseni, A.; Toosi, R.; Akhaee, M.A. Farsi CAPTCHA Recognition Using Attention-Based Convolutional Neural Network. In Proceedings of the 2023 9th International Conference on Web Research (ICWR), Tehran, Iran, 3–4 May 2023; IEEE: Piscataway, NJ, USA, 2023; pp. 221–226.
38. Hoang, D.C.; Nguyen, C.V.; Kharraz, A. EnSolver: Uncertainty-aware Captcha solver using deep ensembles. arXiv 2023, arXiv:2307.15180.
39. Fu, M.; Chen, N.; Hou, X.; Sun, H.; Abdussalam, A.; Sun, S. Real-time vehicle license plate recognition using deep learning. In Signal and Information Processing, Networking and Computers: Proceedings of the 4th International Conference on Signal and Information Processing, Networking and Computers (ICSINC), 4th ed.; Springer: Berlin/Heidelberg, Germany, 2019; pp. 35–41.
40. Graves, A.; Fernández, S.; Gomez, F.; Schmidhuber, J. Connectionist temporal classification: Labelling unsegmented sequence data with recurrent neural networks. In Proceedings of the 23rd International Conference on Machine Learning, Pittsburgh, PA, USA, 25–29 June 2006; pp. 369–376.

BoC CAPTCHA Scheme | CRABI (CRNGS_1s) | CRNGS_2s | CRNGS_3s | Multi-Label (CRNGS_4s)
---|---|---|---|---
1st Character Accuracy | 98.78% (9878/10,000) | 98.61% (9861/10,000) | 98.49% (9849/10,000) | 98.92% (9892/10,000)
2nd Character Accuracy | 98.73% (9873/10,000) | 98.57% (9857/10,000) | 98.96% (9896/10,000) | 99.02% (9902/10,000)
3rd Character Accuracy | 97.72% (9772/10,000) | 99.09% (9909/10,000) | 99.01% (9901/10,000) | 99.88% (9988/10,000)
4th Character Accuracy | 98.56% (9856/10,000) | 98.86% (9886/10,000) | 99.06% (9906/10,000) | 99.12% (9912/10,000)
Total Character Accuracy | 98.44% (39,379/40,000) | 98.78% (39,513/40,000) | 98.88% (39,552/40,000) | 99.03% (39,614/40,000)
Overall CAPTCHA Accuracy | 94.33% (9433/10,000) | 95.41% (9541/10,000) | 95.82% (9582/10,000) | 96.39% (9639/10,000)

Weibo CAPTCHA Scheme | CRABI (CRNGS_1s) [35] | CRNGS_2s | CRNGS_3s | Multi-Label (CRNGS_4s)
---|---|---|---|---
1st Character Accuracy | 98.70% (9870/10,000) | 98.01% (9801/10,000) | 82.81% (8281/10,000) | 98.26% (9826/10,000)
2nd Character Accuracy | 98.35% (9835/10,000) | 94.00% (9400/10,000) | 81.00% (8100/10,000) | 94.36% (9436/10,000)
3rd Character Accuracy | 95.83% (9583/10,000) | 96.65% (9665/10,000) | 80.13% (8013/10,000) | 93.85% (9385/10,000)
4th Character Accuracy | 98.68% (9868/10,000) | 98.21% (9821/10,000) | 83.98% (8398/10,000) | 97.64% (9764/10,000)
Total Character Accuracy | 97.89% (39,156/40,000) | 96.72% (38,687/40,000) | 81.98% (32,792/40,000) | 96.02% (38,411/40,000)
Overall CAPTCHA Accuracy | 92.68% (9268/10,000) | 88.75% (8875/10,000) | 85.93% (8593/10,000) | 86.24% (8624/10,000)

Captcha 0.3 CAPTCHA Scheme | CRABI (CRNGS_1s) | CRNGS_2s | CRNGS_3s | Multi-Label (CRNGS_4s)
---|---|---|---|---
1st Character Accuracy | 99.92% (49,958/50,000) | 98.49% (9849/10,000) | 98.81% (9881/10,000) | 99.46% (9946/10,000)
2nd Character Accuracy | 99.90% (49,951/50,000) | 95.26% (9526/10,000) | 96.87% (9687/10,000) | 98.16% (9816/10,000)
3rd Character Accuracy | 98.66% (49,331/50,000) | 95.70% (9570/10,000) | 96.15% (9615/10,000) | 97.91% (9791/10,000)
4th Character Accuracy | 99.28% (49,642/50,000) | 98.39% (9839/10,000) | 98.89% (9889/10,000) | 99.32% (9932/10,000)
Total Character Accuracy | 96.11% (38,444/40,000) | 96.96% (38,784/40,000) | 97.68% (39,072/40,000) | 98.71% (39,485/40,000)
Overall CAPTCHA Accuracy | 85.93% (8593/10,000) | 89.16% (8916/10,000) | 91.62% (9162/10,000) | 95.33% (9533/10,000)

Gregwar CAPTCHA Scheme | CRABI (CRNGS_1s) [35] | CRNGS_2s | CRNGS_3s | Multi-Label (CRNGS_4s)
---|---|---|---|---
1st Character Accuracy | 93.12% (9312/10,000) | 88.83% (8883/10,000) | 87.05% (8705/10,000) | 88.69% (8869/10,000)
2nd Character Accuracy | 85.28% (8528/10,000) | 74.17% (7417/10,000) | 79.59% (7959/10,000) | 78.43% (7843/10,000)
3rd Character Accuracy | 74.03% (7403/10,000) | 80.65% (8065/10,000) | 78.70% (7870/10,000) | 78.17% (7817/10,000)
4th Character Accuracy | 88.68% (8868/10,000) | 88.37% (8837/10,000) | 87.79% (8779/10,000) | 87.93% (8793/10,000)
Total Character Accuracy | 85.28% (34,111/40,000) | 83.00% (33,202/40,000) | 83.28% (33,313/40,000) | 83.30% (33,322/40,000)
Overall CAPTCHA Accuracy | 54.20% (5420/10,000) | 49.78% (4978/10,000) | 50.77% (5077/10,000) | 51.23% (5123/10,000)
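
The "Total Character Accuracy" rows can be checked against the per-position rows: the total count is the sum of the four per-position counts, and the denominator is the number of test images times the number of characters per CAPTCHA. The sketch below does this for the CRABI (CRNGS_1s) column of the Gregwar table; the function name `total_character_accuracy` is hypothetical and used only for illustration.

```python
from typing import Sequence, Tuple


def total_character_accuracy(
    correct_per_position: Sequence[int], num_images: int
) -> Tuple[int, int, float]:
    """Aggregate per-position correct counts into a total character accuracy.

    Returns (correct characters, total characters, accuracy in percent).
    """
    total_correct = sum(correct_per_position)
    total_chars = num_images * len(correct_per_position)
    return total_correct, total_chars, 100.0 * total_correct / total_chars


# Gregwar scheme, CRABI (CRNGS_1s) column: 1st to 4th character counts.
correct, total, percent = total_character_accuracy([9312, 8528, 7403, 8868], 10_000)
print(correct, total, round(percent, 2))  # 34111 40000 85.28
```

Overall CAPTCHA accuracy cannot be derived from these counts, because it requires every character of the same image to be correct; it can be no higher than the lowest per-position accuracy.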

BoC CAPTCHA Scheme | CRABI (CRNGS_1s) | Multilabel (CRNGS_4s) | CRNN | CRNGS_2s | CRNGS_3s
---|---|---|---|---|---
Testing Total Character Accuracy | 98.44% (39,379/40,000) | 99.03% (39,614/40,000) | - | 98.78% (39,513/40,000) | 98.88% (39,552/40,000)
Testing Overall CAPTCHA Accuracy | 94.33% (9433/10,000) | 96.39% (9639/10,000) | 96.47% (9647/10,000) | 95.41% (9541/10,000) | 95.82% (9582/10,000)
Non-trainable and Trainable Parameters | 6,640,090 | 7,838,248 | 10,478,875 | 7,039,476 | 7,484,945
Size of Weights on Hard Disk | 79.9 MB | 94.1 MB | 125.1 MB | 84.7 MB | 90.0 MB

Weibo CAPTCHA Scheme | CRABI (CRNGS_1s) [35] | Multilabel (CRNGS_4s) [35] | CRNN [35] | CRNGS_2s | CRNGS_3s
---|---|---|---|---|---
Testing Total Character Accuracy | 97.89% (39,156/40,000) | 96.03% (38,411/40,000) | - | 96.72% (38,687/40,000) | 81.98% (32,792/40,000)
Testing Overall CAPTCHA Accuracy | 92.68% (9268/10,000) | 86.24% (8624/10,000) | 91.05% (9105/10,000) | 88.75% (8875/10,000) | 85.93% (8593/10,000)
Non-trainable and Trainable Parameters | 6,670,812 | 7,961,136 | 10,477,853 | 7,100,920 | 7,577,111
Size of Weights on Hard Disk | 25.5 MB | 30.4 MB | 40 MB | 28.5 MB | 28.9 MB

Captcha 0.3 CAPTCHA Scheme | CRABI (CRNGS_1s) | Multilabel (CRNGS_4s) | CRNN | CRNGS_2s | CRNGS_3s
---|---|---|---|---|---
Testing Total Character Accuracy | 96.11% (38,444/40,000) | 98.71% (39,485/40,000) | - | 96.96% (38,784/40,000) | 97.68% (39,072/40,000)
Testing Overall CAPTCHA Accuracy | 85.93% (8593/10,000) | 95.33% (9533/10,000) | 83.57% (8357/10,000) | 89.16% (8916/10,000) | 91.62% (9162/10,000)
Non-trainable and Trainable Parameters | 7,193,086 | 10,050,232 | 10,486,591 | 8,145,468 | 9,143,933
Size of Weights on Hard Disk | 27.5 MB | 38.4 MB | 40.1 MB | 31.8 MB | 34.9 MB

Gregwar CAPTCHA Scheme | CRABI (CRNGS_1s) [35] | Multilabel (CRNGS_4s) [35] | CRNN [35] | CRNGS_2s | CRNGS_3s
---|---|---|---|---|---
Testing Total Character Accuracy | 85.28% (34,111/40,000) | 83.31% (33,322/40,000) | - | 83.00% (33,202/40,000) | 83.28% (33,313/40,000)
Testing Overall CAPTCHA Accuracy | 54.20% (5420/10,000) | 51.23% (5123/10,000) | 49.98% (4998/10,000) | 49.78% (4978/10,000) | 50.77% (5077/10,000)
Non-trainable and Trainable Parameters | 7,193,086 | 10,050,232 | 10,486,591 | 8,145,468 | 9,143,933
Size of Weights on Hard Disk | 27.5 MB | 38.4 MB | 40.1 MB | 32.7 MB | 34.9 MB
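
To make the parameter comparison concrete, the following sketch computes the relative reduction in parameter count of each model with respect to the CRNN baseline, using the BoC figures from the comparison table. The helper name `relative_reduction` is hypothetical, and the resulting percentages are derived here for illustration rather than reported in the source.

```python
# Parameter counts (non-trainable + trainable) from the BoC comparison table.
BOC_PARAMS = {
    "CRABI (CRNGS_1s)": 6_640_090,
    "CRNGS_2s": 7_039_476,
    "CRNGS_3s": 7_484_945,
    "Multilabel (CRNGS_4s)": 7_838_248,
}
BOC_CRNN_PARAMS = 10_478_875


def relative_reduction(params: int, baseline: int) -> float:
    """Percentage by which `params` is smaller than `baseline`."""
    return 100.0 * (baseline - params) / baseline


for name, params in BOC_PARAMS.items():
    saving = relative_reduction(params, BOC_CRNN_PARAMS)
    print(f"{name}: {saving:.1f}% fewer parameters than CRNN")
```

The same helper can be applied to the "Size of Weights on Hard Disk" row, since both rows compare each model against the same CRNN column.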
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Derea, Z.; Zou, B.; Al-Shargabi, A.A.; Thobhani, A.; Abdussalam, A. Deep Learning Based CAPTCHA Recognition Network with Grouping Strategy. Sensors 2023, 23, 9487. https://doi.org/10.3390/s23239487