Part of the book series: Lecture Notes in Computer Science (LNCS, volume 14486)
Abstract
In recent years, computer-aided diagnosis systems have shown great potential in assisting radiologists with accurate and efficient medical image analysis. This paper presents a novel approach to bone pathology localization and classification in wrist X-ray images that combines YOLO (You Only Look Once) with the Shifted Window Transformer (Swin) and a newly proposed block. The methodology addresses two critical challenges in wrist X-ray analysis: accurate localization of bone pathologies and precise classification of abnormalities. The YOLO framework detects and localizes bone pathologies, leveraging its real-time object detection capabilities. In addition, Swin, a transformer-based module, extracts contextual information from the localized regions of interest (ROIs) for accurate classification.
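The localize-then-classify flow described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the names `crop_roi` and `localize_then_classify`, the `(x1, y1, x2, y2, score)` box layout, and the `min_score` threshold are all assumptions made for illustration, and the detector and classifier are plain callables standing in for a YOLO-style detector and a Swin-based classifier. Images are nested lists here to keep the example dependency-free; a real pipeline would more likely operate on arrays or tensors.

```python
from typing import Callable, List, Sequence, Tuple

Image = List[List[int]]                 # rows of pixel intensities (hypothetical layout)
Box = Tuple[int, int, int, int, float]  # (x1, y1, x2, y2, score), assumed convention


def crop_roi(image: Image, box: Box) -> Image:
    """Return the sub-image inside `box`, clipped to the image bounds."""
    height, width = len(image), len(image[0])
    x1, y1, x2, y2, _ = box
    x1, x2 = max(0, x1), min(width, x2)
    y1, y2 = max(0, y1), min(height, y2)
    return [row[x1:x2] for row in image[y1:y2]]


def localize_then_classify(
    image: Image,
    detector: Callable[[Image], Sequence[Box]],
    classifier: Callable[[Image], object],
    min_score: float = 0.5,
) -> List[Tuple[Box, object]]:
    """Run the detector, crop each confident box, and classify each crop."""
    results = []
    for box in detector(image):
        if box[4] < min_score:
            continue  # drop low-confidence detections
        roi = crop_roi(image, box)
        if roi and roi[0]:  # skip boxes that fall entirely outside the image
            results.append((box, classifier(roi)))
    return results


# Stand-in components (placeholders, not real models).
def toy_detector(image: Image) -> List[Box]:
    return [(2, 3, 6, 8, 0.9), (0, 0, 4, 4, 0.2), (8, 8, 15, 15, 0.7)]


def toy_classifier(roi: Image) -> str:
    area = len(roi) * len(roi[0])
    return "large" if area >= 10 else "small"


if __name__ == "__main__":
    image = [[r * 10 + c for c in range(10)] for r in range(10)]
    for box, label in localize_then_classify(image, toy_detector, toy_classifier):
        print(box, label)
```

Passing the detector and classifier as callables keeps the cropping and thresholding logic independent of any particular model, so either stage could be swapped without touching the glue code.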
Acknowledgements
The work was supported by the Ministry of Science and Higher Education, grant No. 075-10-2021-068.
Author information
Authors and Affiliations
Skolkovo Institute of Science and Technology, Moscow, Russia
Razan Dibo, Andrey Galichin, Dmitry V. Dylov & Oleg Y. Rogov
Artificial Intelligence Research Institute (AIRI), Moscow, Russia
Oleg Y. Rogov
Pirogov National Medical and Surgical Center, Moscow, Russia
Pavel Astashev
Corresponding author
Correspondence to Oleg Y. Rogov.
Editor information
Editors and Affiliations
National Research University Higher School of Economics, Moscow, Russia
Dmitry I. Ignatov
Krasovskii Institute of Mathematics and Mechanics of Russian Academy of Sciences, Yekaterinburg, Russia
Michael Khachay
University of Oslo, Oslo, Norway
Andrey Kutuzov
American University of Armenia, Yerevan, Armenia
Habet Madoyan
Artificial Intelligence Research Institute, Moscow, Russia
Ilya Makarov
University of Hamburg, Hamburg, Germany
Irina Nikishina
Skolkovo Institute of Science and Technology, Moscow, Russia
Alexander Panchenko
Mohamed bin Zayed University of Artificial Intelligence, Abu Dhabi, United Arab Emirates
Maxim Panov
University of Florida, Gainesville, FL, USA
Panos M. Pardalos
National Research University Higher School of Economics, Nizhny Novgorod, Russia
Andrey V. Savchenko
Apptek, Aachen, Germany
Evgenii Tsymbalov
Kazan Federal University, Kazan, Russia
Elena Tutubalina
MTS AI, Moscow, Russia
Sergey Zagoruyko
Copyright information
© 2024 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Dibo, R., Galichin, A., Astashev, P., Dylov, D.V., Rogov, O.Y. (2024). DeepLOC: Deep Learning-Based Bone Pathology Localization and Classification in Wrist X-Ray Images. In: Ignatov, D.I., et al. Analysis of Images, Social Networks and Texts. AIST 2023. Lecture Notes in Computer Science, vol 14486. Springer, Cham. https://doi.org/10.1007/978-3-031-54534-4_14
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-54533-7
Online ISBN: 978-3-031-54534-4
eBook Packages: Computer Science, Computer Science (R0)