Integrating YOLO and WordNet for automated image object summarization

Abstract

Demand has recently grown for methods that automatically generate text summaries of images containing many objects. Our research introduces an approach that combines the YOLO object detection model with the WordNet lexical database. YOLO locates the objects in an image, WordNet supplies their meanings, and our process then produces a summary for each detected object. This technique could benefit both computer vision and natural language processing by making complex images that contain many objects easier to interpret. To test our approach, we used 1381 images collected from the Google Image search engine. Object detection reached 72% accuracy, with 85% precision, 72% recall, and a 74% F1-score.
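The detect, look up, and summarize flow described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `summarize_objects` and `toy_gloss`, the `TOY_GLOSSES` table, and the output wording are all hypothetical, and the detected labels are hard-coded where a YOLO model would normally supply them. The small gloss table stands in for WordNet so that the example runs without any downloads.

```python
from collections import Counter
from typing import Callable, Iterable, List, Optional

# Hypothetical stand-in for WordNet so the sketch runs offline.
# These glosses are paraphrases written for this example, not WordNet text.
# With NLTK and its WordNet corpus installed, a lookup could instead return
# wordnet.synsets(label)[0].definition() when a synset exists.
TOY_GLOSSES = {
    "dog": "a domesticated carnivorous mammal",
    "bicycle": "a two-wheeled vehicle moved by foot pedals",
}


def toy_gloss(label: str) -> Optional[str]:
    """Return a short definition for a label, or None if it is unknown."""
    return TOY_GLOSSES.get(label)


def summarize_objects(
    labels: Iterable[str],
    gloss_lookup: Callable[[str], Optional[str]] = toy_gloss,
) -> List[str]:
    """Build one summary line per distinct detected object.

    `labels` is assumed to be the class names produced by an object
    detector, one entry per detected instance.
    """
    counts = Counter(labels)  # keeps first-seen order of labels
    summaries = []
    for label, n in counts.items():
        gloss = gloss_lookup(label)
        meaning = gloss if gloss else "no definition found"
        summaries.append(f"{label} (x{n}): {meaning}")
    return summaries


if __name__ == "__main__":
    # Assumed detector output for one image.
    detected = ["dog", "bicycle", "dog", "frisbee"]
    for line in summarize_objects(detected):
        print(line)
```

Passing the lookup in as a function keeps the detector and the dictionary independent, so either the toy table or a real WordNet query can be used without changing the summarization step.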

Data availability

Data associated with this work can be obtained from the corresponding author upon formal request.

Acknowledgements

This work was supported by King Saud University, Riyadh, Saudi Arabia, through Researchers Supporting Project number (RSP2024R184).

Author information

Authors and Affiliations

Department of Computing and Information Technology, Gomal University, Dera Ismail Khan, 29050, Pakistan
Sheikh Muhammad Saqib, Aamir Aftab & Muhammad Iqbal
Department of Computer Science, Virtual University of Pakistan, Lahore, 51000, Pakistan
Tehseen Mazhar
Department of Computer Science, COMSATS University Islamabad, Sahiwal Campus, Sahiwal, 57000, Pakistan
Tariq Shahazad
Department of Computer Science, College of Computer and Information Sciences, King Saud University, 11633, Riyadh, Saudi Arabia
Ahmad Almogren
School of Electrical Engineering, Dept of Electrical and Electronic Eng. Science, University of Johannesburg, Johannesburg, 2006, South Africa
Habib Hamam
Faculty of Engineering, Uni de Moncton, Moncton, NB, E1A3E9, Canada
Habib Hamam
Hodmas University College, Taleh Area, Mogadishu, Somalia
Habib Hamam
Bridges for Academic Excellence, Tunis, Tunisia
Habib Hamam

Authors

Sheikh Muhammad Saqib
Aamir Aftab
Tehseen Mazhar
Muhammad Iqbal
Tariq Shahazad
Ahmad Almogren
Habib Hamam

Contributions

All authors contributed equally.

Corresponding authors

Correspondence to Tehseen Mazhar or Muhammad Iqbal.

Ethics declarations

Conflict of interest

The authors declare no competing interests.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Saqib, S.M., Aftab, A., Mazhar, T. et al. Integrating YOLO and WordNet for automated image object summarization. SIViP 18, 9465–9481 (2024). https://doi.org/10.1007/s11760-024-03560-z

Received: 12 June 2024
Revised: 14 August 2024
Accepted: 30 August 2024
Published: 28 September 2024
Issue Date: December 2024
DOI: https://doi.org/10.1007/s11760-024-03560-z
