Part of the book series: Lecture Notes in Computer Science (LNCS, volume 14403)
Abstract
In this paper, we propose an improved YOLOv8-based Kiwifruit detection method using Swin Transformer, aiming to address the challenges posed by significant scale variation and by inaccuracies in multiscale object detection. Specifically, our approach embeds the encoder from Swin Transformer, based on its shifted-window design, into the YOLOv8 architecture to capture contextual information and global dependencies of the detected objects at multiple scales, facilitating the learning of semantic features. In comparative experiments with state-of-the-art object detection algorithms on our collected dataset, the proposed method detects objects at different scales efficiently, significantly reducing false negatives while improving precision. Moreover, the method proves versatile in detecting objects of various sizes in different environmental settings, fulfilling the real-time requirements of complex and unknown Kiwifruit cultivation scenarios. The results highlight the potential practical applications of the proposed approach in the Kiwifruit industry and its suitability for addressing real-world challenges and complexities.
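The window mechanism that the abstract relies on can be illustrated with a small sketch. It is not the authors' implementation: it only shows, in plain Python on a toy 4x4 grid, how Swin Transformer (Liu et al., 2021) alternates between a regular window partition and a partition taken after a cyclic shift, so that cells from neighbouring windows end up grouped together. The function names `window_partition` and `cyclic_shift` are hypothetical, and the attention computation and the masking of wrapped-around cells are omitted.

```python
def window_partition(grid, window):
    """Split an H x W grid (a list of rows) into non-overlapping window x window tiles."""
    height, width = len(grid), len(grid[0])
    assert height % window == 0 and width % window == 0, "grid must tile evenly"
    tiles = []
    for top in range(0, height, window):
        for left in range(0, width, window):
            tiles.append([row[left:left + window] for row in grid[top:top + window]])
    return tiles


def cyclic_shift(grid, shift):
    """Roll the grid up and to the left by `shift` cells, wrapping around the borders."""
    rolled_rows = grid[shift:] + grid[:shift]
    return [row[shift:] + row[:shift] for row in rolled_rows]


# Toy 4x4 "feature map" whose cells are labelled 0..15 (illustrative data only).
feature_map = [[r * 4 + c for c in range(4)] for r in range(4)]

# Regular partition: attention would be computed inside each 2x2 tile.
regular = window_partition(feature_map, 2)

# Shifted partition: shift by half the window size, then partition again.
shifted = window_partition(cyclic_shift(feature_map, 1), 2)

print(regular[0])  # [[0, 1], [4, 5]]
print(shifted[0])  # [[5, 6], [9, 10]]
```

In the regular partition, cells 5, 6, 9 and 10 fall into four different windows; after the shift they share one window, which is how information crosses window borders from one layer to the next.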
Author information
Authors and Affiliations
Auckland University of Technology, 1010, Auckland, New Zealand
Yi Xia, Minh Nguyen, Raymond Lutui & Wei Qi Yan
Corresponding author
Correspondence to Yi Xia.
Editor information
Editors and Affiliations
Auckland University of Technology, Auckland, New Zealand
Wei Qi Yan
Auckland University of Technology, Auckland, New Zealand
Minh Nguyen
Auckland University of Technology, Auckland, New Zealand
Parma Nand
Auckland University of Technology, Auckland, New Zealand
Xuejun Li
Copyright information
© 2024 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.
About this paper
Cite this paper
Xia, Y., Nguyen, M., Lutui, R., Yan, W.Q. (2024). Multiscale Kiwifruit Detection from Digital Images. In: Yan, W.Q., Nguyen, M., Nand, P., Li, X. (eds) Image and Video Technology. PSIVT 2023. Lecture Notes in Computer Science, vol 14403. Springer, Singapore. https://doi.org/10.1007/978-981-97-0376-0_7
Publisher Name: Springer, Singapore
Print ISBN: 978-981-97-0375-3
Online ISBN: 978-981-97-0376-0
eBook Packages: Computer Science, Computer Science (R0)