Abstract
This paper addresses road safety concerns by investigating low-cost solutions for sound event detection (SED) tailored to driving scenarios. While advanced technologies like deep learning hold promise for improving road safety, their practical implementation often involves expensive sensors and hardware. Distractions, a major cause of accidents, require effective detection and mitigation. This study concentrates on auditory distractions and utilizes SED with low-cost edge devices to identify and timestamp relevant audio events, providing valuable insights into the driving environment. We evaluate state-of-the-art deep learning models on various edge devices, including the 2023 DCASE baseline with convolutional recurrent neural networks (CRNN) and an adapted YOLO vision model for audio spectrograms. Our analysis spans different hardware options, including single-board computers (SBCs) and desktop equipment, offering guidance on cost-effective hardware selection for in-vehicle SED applications. This research aims to contribute to affordable SED solutions in the context of driving safety, with the ultimate goal of advancing road safety efforts worldwide.
Data availability
No datasets were generated or analyzed during the current study.
Acknowledgements
This work has been supported by Grant TED2021-131003B-C21 funded by MCIN/AEI/10.13039/501100011033 and by the “European Union NextGenerationEU/PRTR”, as well as by Grant PID2022-137048OB-C41 funded by MICIU/AEI/10.13039/501100011033 and “ERDF A way of making Europe”. The authors would also like to thank the Generalitat Valenciana-Santiago Grisolía program for financing this work (GRISOLIAP/2021/060, CPI-21-232). Finally, the authors acknowledge the Artemisa computer resources funded by the EU ERDF and Comunitat Valenciana, and the technical support of IFIC (CSIC-UV).
Author information
Authors and Affiliations
Computer Science Department, Universitat de València, Burjassot, Spain
Carlos Castorena, Jesus Lopez-Ballester, Juan A. De Rus, Maximo Cobos & Francesc J. Ferri
Contributions
Carlos Castorena conceived the idea, conducted the experiments, and wrote the main manuscript. Jesus Lopez-Ballester provided technical and experimental assistance. Juan Antonio de Rus contributed to the planning and execution of the experiments. Maximo Cobos and Francesc J. Ferri provided technical supervision and assisted in the writing and analysis of the results. All authors reviewed and approved the final manuscript.
Corresponding author
Correspondence to Maximo Cobos.
Ethics declarations
Conflict of interests
The authors declare no competing interests.
About this article
Cite this article
Castorena, C., Lopez-Ballester, J., De Rus, J.A. et al. Edge computing for driving safety: evaluating deep learning models for cost-effective sound event detection. J Supercomput 81, 288 (2025). https://doi.org/10.1007/s11227-024-06796-1