- Sheikh Muhammad Saqib¹,
- Aamir Aftab¹,
- Tehseen Mazhar²,
- Muhammad Iqbal¹,
- Tariq Shahazad³,
- Ahmad Almogren⁴ &
- …
- Habib Hamam⁵˒⁶˒⁷˒⁸
Abstract
Demand has recently grown for methods that automatically generate text summaries of images containing many objects. We propose an approach that combines the YOLO object detection model with the WordNet dictionary: YOLO locates the objects in an image, WordNet supplies their meanings, and our process then produces a summary for each detected object. The technique could benefit both computer vision and natural language processing by making complex images that contain many objects easier to understand. We tested the approach on 1381 images collected with the Google Image search engine. Object detection reached 72% accuracy, with 85% precision, 72% recall, and a 74% F1-score.
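The detect-then-define pipeline described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the `(label, confidence)` detection format, the `min_conf` threshold, the `summarize_detections` helper, and the small `GLOSSES` dictionary are all hypothetical stand-ins. In a real pipeline the labels would come from a YOLO model and the definitions from WordNet, for example through NLTK's `wordnet.synsets(label)[0].definition()`.

```python
from collections import Counter

# Hypothetical stand-in for WordNet definitions, so the sketch runs without
# downloading any corpus. These glosses are illustrative, not WordNet text.
GLOSSES = {
    "dog": "a domesticated canine kept as a pet or working animal",
    "bicycle": "a two-wheeled vehicle propelled by foot pedals",
}


def summarize_detections(detections, glosses, min_conf=0.5):
    """Return one summary line per distinct detected label.

    detections: list of (label, confidence) pairs, the assumed shape of
    what an object detector such as YOLO reports for one image.
    glosses: mapping from label to a dictionary definition.
    min_conf: detections below this confidence are ignored (assumed value).
    """
    kept = [label for label, conf in detections if conf >= min_conf]
    counts = Counter(kept)  # keeps first-seen order of labels
    lines = []
    for label, n in counts.items():
        gloss = glosses.get(label, "no definition available")
        lines.append(f"{label} (x{n}): {gloss}")
    return lines


if __name__ == "__main__":
    sample = [("dog", 0.91), ("bicycle", 0.84), ("dog", 0.66), ("cat", 0.30)]
    for line in summarize_detections(sample, GLOSSES):
        print(line)
```

Counting repeated labels before looking up the definition keeps the summary to one line per object type, which is one plausible reading of "a summary for each object found"; a per-instance summary would simply skip the `Counter` step.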
Data availability
Data associated with this work are available from the corresponding author upon formal request.
Acknowledgements
This work was supported by King Saud University, Riyadh, Saudi Arabia, through Researchers Supporting Project number (RSP2024R184).
Author information
Authors and Affiliations
Department of Computing and Information Technology, Gomal University, Dera Ismail Khan, 29050, Pakistan
Sheikh Muhammad Saqib, Aamir Aftab & Muhammad Iqbal
Department of Computer Science, Virtual University of Pakistan, Lahore, 51000, Pakistan
Tehseen Mazhar
Department of Computer Science, COMSATS University Islamabad, Sahiwal Campus, Sahiwal, 57000, Pakistan
Tariq Shahazad
Department of Computer Science, College of Computer and Information Sciences, King Saud University, 11633, Riyadh, Saudi Arabia
Ahmad Almogren
School of Electrical Engineering, Dept of Electrical and Electronic Eng. Science, University of Johannesburg, Johannesburg, 2006, South Africa
Habib Hamam
Faculty of Engineering, Uni de Moncton, Moncton, NB, E1A3E9, Canada
Habib Hamam
Hodmas University College, Taleh Area, Mogadishu, Somalia
Habib Hamam
Bridges for Academic Excellence, Tunis, Tunisia
Habib Hamam
Contributions
All authors contributed equally.
Corresponding authors
Correspondence to Tehseen Mazhar or Muhammad Iqbal.
Ethics declarations
Conflict of interest
The authors declare no competing interests.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Saqib, S.M., Aftab, A., Mazhar, T. et al. Integrating YOLO and WordNet for automated image object summarization. SIViP 18, 9465–9481 (2024). https://doi.org/10.1007/s11760-024-03560-z