Technical Field
The present invention belongs to the technical fields of meteorology, computer audition, and deep learning, and in particular relates to a deep-learning method and system for wind-speed measurement based on audio data from surveillance cameras.
Background Art
Wind plays an important role in the turbulent exchange of mass, energy, and momentum between the land/water surface and the atmosphere. Sensing surface wind speed at high spatio-temporal resolution is crucial for understanding urban microclimate, weather, and pollutant transport, for improving the efficiency of wind-energy utilization, and for promoting the low-altitude economy. The basic means of surface wind observation, however, have limitations. Meteorological stations provide point wind-speed measurements at high temporal resolution, but because they are spatially sparse they can hardly capture the spatial variability of wind speed over complex terrain (e.g., mountainous or urban areas). Weather radar can retrieve the three-dimensional structure of the wind field and provide continuous, high-spatial-resolution regional wind data, but radar observation in urban areas suffers from near-surface blind zones, so its results deviate to some extent from the actual surface wind.
Statistics show that China has far more than 500 million surveillance cameras, most of which are deployed in urban areas. The audio collected by these densely distributed cameras can record changes in surface wind speed continuously and dynamically, making wind-speed measurement at high spatio-temporal resolution possible. How to detect wind sound in surveillance audio and use it to measure wind speed is therefore the focus of the present invention. However, owing to human activity, surveillance audio often contains complex ambient sounds (e.g., vehicle horns, conversations, traffic noise), which makes accurate wind-speed measurement difficult.
To this end, the present invention adopts a deep-learning approach with a dual attention mechanism. First, the audio data are separated from the surveillance stream; then the audio characteristics of wind sound at different speeds are analyzed to build a surveillance-audio feature model of wind sound; finally, with the wind-sound audio features as input, the time- and frequency-domain characteristics of wind sound in the surveillance audio are mined on the basis of convolutional neural networks and long short-term memory networks, interference from ambient environmental sounds is suppressed, and a deep-learning model for wind-speed regression is constructed to measure surface wind speed accurately. The invention is easy to deploy and implement: installed surveillance cameras need not be dismantled or calibrated, which facilitates adoption by meteorological, transportation, urban-management, and other departments. Compared with the patent (Jiangsu Meteorological Service Center. A method and system for identifying road surface meteorological conditions based on road-noise audio analysis [P]. Chinese Patent CN115762565A, 2023-03-07), which targets qualitative discrimination of road-surface states (e.g., standing water, snow, ice), the present invention starts from the audio collected by surveillance cameras and uses deep learning to estimate wind speed quantitatively rather than qualitatively, giving it broader application scenarios and practical value.
Through the above analysis, the problems and defects of the prior art are as follows:
(1) Meteorological stations are spatially sparse, so they can hardly capture the spatial variability of wind speed over complex terrain (e.g., mountainous and urban areas);
(2) Weather-radar observation in urban areas suffers from near-surface blind zones, so the observations deviate to some extent from the actual surface wind;
(3) Owing to human activity, the audio data collected by surveillance cameras often contain complex ambient sounds (e.g., vehicle horns, conversations, traffic noise), which makes accurate wind-speed measurement difficult.
Summary of the Invention
In view of the problems in the prior art, the present invention provides a deep-learning method and system for wind-speed measurement based on surveillance-camera audio data.
The present invention is implemented as follows. A deep-learning method for wind-speed measurement based on surveillance-camera audio data comprises:
Step 1: separation and acquisition of the surveillance audio data;
Step 2: extraction of the root-mean-square (RMS) energy and the Mel spectrogram of the monitored wind sound as its time- and frequency-domain features;
Step 3: introduction of a dual attention mechanism to build an SA-CNN deep-learning model for frequency-domain wind-sound feature extraction and a CA-CNN deep-learning model for time-domain wind-sound feature extraction, which together form the wind-speed measurement deep-learning model.
Further, step 1 comprises:
(1-1) obtaining the coordinates of the surveillance camera and registering the camera in the system;
(1-2) separating the surveillance video and audio data and converting the audio to .wav format.
Further, step 2, which extracts the root-mean-square energy and Mel spectrogram of the monitored wind sound as its time- and frequency-domain features, specifically comprises the following steps:
(2-1) dividing the surveillance audio into frames and applying a Hamming window to each frame, which smooths the signal while reducing information loss and preserving continuity between adjacent frames;
(2-2) applying the discrete Fourier transform to each frame to convert the time-domain waveform signal to the frequency domain and obtain the short-time energy spectrum E(k) as a function of frequency (in Hz);
(2-3) passing the energy spectrum E(k) through a set of M Mel-scale triangular filters, which maps the linear spectrum of the raw signal onto the auditory-perception-based Mel nonlinear spectrum S(m), and taking the logarithm to obtain the Log-Mel features:

$$C_n=\sum_{m=1}^{M}\log\big(S(m)\big)\cos\!\left(\frac{\pi n\,(m-0.5)}{M}\right),\quad n=1,2,\dots,L\qquad(1)$$

where C_n is the n-th Log-Mel spectral coefficient, L is the preset number of cepstral features, n is the cepstral-coefficient index, and M is the dimensionality of the resulting features. This completes the extraction of the frequency-domain features of the monitored wind sound; the frequency-domain feature at time t is denoted V_mel(t);
(2-4) plotting the amplitude envelope of the root-mean-square energy of the sound, with time on the horizontal axis and amplitude on the vertical axis, to visualize how the sound varies along the time dimension; this envelope serves as the RMS time-domain feature of the wind sound, and the time-domain feature at time t is denoted V_rms(t).
Further, building the SA-CNN deep-learning model for frequency-domain wind-sound feature extraction in step 3 specifically comprises the following steps:
(3-1) introducing a spatial attention mechanism and combining it with CNN layers to form a frequency-domain attention module;
(3-2) connecting five frequency-domain attention modules to form the SA-CNN network, deepening the network so that high-dimensional frequency-domain features can be extracted;
(3-3) feeding the frequency-domain Mel features of the wind sound obtained in step (2-3) into the SA-CNN network; the output feature vector is denoted V_f.
Further, building the CA-CNN deep-learning model for time-domain wind-sound feature extraction in step 3 comprises the following steps:
(4-1) introducing a self-attention mechanism and combining it with CNN layers to form a time-domain attention module;
(4-2) connecting five time-domain attention modules to form the CA-CNN network, deepening the network so that high-dimensional time-domain features can be extracted;
(4-3) feeding the time-domain RMS features of the wind sound obtained in step (2-4) into the CA-CNN network; the output feature vector is denoted V_t.
Further, forming the wind-speed measurement deep-learning model in step 3 comprises:
(5-1) fusing the time- and frequency-domain wind-sound features extracted in steps (3-3) and (4-3) with a fully connected neural-network layer to obtain a feature vector describing the wind speed;
(5-2) computing the corresponding wind-speed value with a linear regression layer, completing the construction of the full deep-learning model.
Another object of the present invention is to provide a deep-learning system for wind-speed measurement based on surveillance-camera audio data, the system comprising:
a preprocessing module for separating and acquiring the surveillance audio data;
a frequency-domain module for modeling and mining frequency-domain features;
a time-domain module for modeling and mining time-domain features;
a prediction module for wind-speed prediction based on the surveillance audio.
Another object of the present invention is to provide a computer device comprising a memory and a processor, the memory storing a computer program that, when executed by the processor, causes the processor to perform the steps of the deep-learning method for wind-speed measurement based on surveillance-camera audio data.
Another object of the present invention is to provide a computer-readable storage medium storing a computer program that, when executed by a processor, causes the processor to perform the steps of the deep-learning method for wind-speed measurement based on surveillance-camera audio data.
Another object of the present invention is to provide an information/data processing terminal for implementing the deep-learning system for wind-speed measurement based on surveillance-camera audio data.
In combination with the above technical solutions and the technical problems they solve, the advantages and positive effects of the claimed invention are as follows:
First, by introducing the attention mechanism, the invention effectively suppresses interference from ambient environmental sounds and deeply mines the wind-speed information contained in the surveillance audio data.
Leveraging the generalization ability of deep-learning methods, it achieves accurate wind-speed measurement across complex and varied surveillance scenes, with good transferability and replicability.
The method requires neither dismantling of the surveillance cameras nor complex calibration, so it is easy to deploy and implement.
Carried by urban surveillance cameras, the method can be deployed on existing urban surveillance resources without additional hardware installation, keeping maintenance and operating costs low.
Compared with existing wind-speed measurement means, urban surveillance cameras offer the observational advantages of large numbers, dense distribution, fast transmission, and low cost. The method addresses the current problem of high-resolution measurement of near-surface wind speed and has important scientific significance and practical value for urban micrometeorological modeling, wind-energy development and utilization, and the low-altitude economy.
Second, from the perspective of urban surveillance, the invention uses deep learning and other artificial-intelligence methods to measure surface wind speed. Through coordinated observation by multiple cameras, it achieves high-resolution observation of surface wind, compensating for the limitations of existing wind-observation means (including radar and meteorological stations) in temporal and spatial resolution.
The present invention solves the following technical problems in the prior art and achieves significant technical progress:
Limitations of traditional wind-speed measurement: traditional measurement usually relies on dedicated anemometers or sensors, which are costly and complex to install; their accuracy suffers in severe weather or complex terrain; their real-time response is slow in environments where wind speed changes frequently; they fail to exploit the wind-sound characteristics in audio data, leaving the measurement means one-dimensional; and effective methods for extracting and fusing time- and frequency-domain features to improve data utilization have been lacking.
The significant technical advances achieved by the present invention are:
Improved accuracy and real-time performance: the SA-CNN and CA-CNN deep-learning models fully exploit the time- and frequency-domain features of the audio data, improving the accuracy and timeliness of wind-speed measurement; combining spatial attention with self-attention improves the effectiveness of feature extraction and the robustness of the model.
Lower equipment cost and wider applicability: the method relies only on audio data from existing surveillance cameras, reducing dependence on dedicated measurement equipment and lowering costs; through intelligent analysis of surveillance audio, it maintains high-precision measurement in complex environments and severe weather.
Improved data-utilization efficiency: fully mining the wind-sound characteristics in the audio data resolves the low data-utilization efficiency of traditional methods; fusing the time- and frequency-domain features through a fully connected layer yields an efficient feature vector describing the wind speed, further improving model accuracy.
Extended applicability: the method suits a variety of surveillance scenarios, including urban, traffic, and field-environment monitoring, and has broad application prospects; deep learning and attention mechanisms make wind-speed measurement intelligent and automated, greatly raising the technical level.
Third, the expected benefits and commercial value after transformation of the technical solution are as follows: with urban surveillance cameras as the platform, high-resolution measurement of urban surface wind is achieved, and the resulting surface-wind data are of major significance and application value for urban wind-energy development, route navigation of low-altitude aircraft, urban transportation, and the low-altitude economy. In addition, the invention extends the functional scope of urban surveillance cameras, and a surface-wind observation network composed of such cameras provides a useful supplement to existing wind-speed observation means.
The technical solution fills a technical gap in the industry at home and abroad: starting from the novel perspective of urban surveillance cameras, it achieves high-resolution sensing of surface wind speed, solving the meteorological problem of high-resolution observation of urban surface wind. A survey of domestic and international papers and other research materials found no comparable work. The problem of sensing urban surface wind at high spatio-temporal resolution is thereby solved.
Fourth, the deep-learning method for wind-speed measurement based on surveillance-camera audio data provided by the present invention separates and acquires the multi-sensor (video and audio) surveillance data, extracts the time- and frequency-domain features of the wind sound, and introduces a dual attention mechanism to construct a deep-learning model that measures wind speed accurately. Compared with traditional methods, it captures the time-frequency characteristics of wind sound more precisely, improving the accuracy and reliability of wind-speed measurement.
Specifically, the method separates and acquires the surveillance audio data, converts the audio to .wav format, and extracts the RMS energy and Mel spectrogram of the wind sound as time- and frequency-domain features. By introducing a dual attention mechanism, two deep-learning models, SA-CNN and CA-CNN, are built to extract the frequency- and time-domain features of the wind sound respectively, laying the foundation for wind-speed measurement.
In the SA-CNN model, a spatial attention mechanism is introduced and five frequency-domain attention modules are connected to form the SA-CNN network, which extracts high-dimensional frequency-domain features; in the CA-CNN model, a self-attention mechanism is introduced and five time-domain attention modules are connected to form the CA-CNN network, which extracts high-dimensional time-domain features. Together, the two models substantially strengthen the extraction of wind-sound features.
Finally, the method fuses the frequency- and time-domain features through a fully connected neural-network layer and computes the corresponding wind-speed value with a linear regression layer, completing the deep-learning model. This innovative approach represents significant technical progress in the field of wind-speed measurement, improving the accuracy and stability of the measurement, with broad application prospects.
Fifth, in the field of wind-speed measurement, traditional methods usually rely on direct measurement by physical sensors, which is costly to deploy and maintain and is constrained by geography and environment. With the spread of surveillance cameras and advances in deep learning, indirectly measuring wind speed from the audio captured by surveillance cameras has become a new approach. However, accurately extracting wind-speed-related features from complex audio data and building an effective deep-learning model for accurate measurement remained an urgent technical problem.
The present invention proposes a deep-learning method for wind-speed measurement based on surveillance-camera audio data. The method first separates and acquires the surveillance audio and innovatively extracts the RMS energy and Mel spectrogram of the monitored wind sound as its time- and frequency-domain features. It then introduces a dual attention mechanism to construct two deep-learning models, SA-CNN and CA-CNN, which extract high-dimensional frequency- and time-domain wind-sound features respectively. Finally, the features extracted by the two models are fused, and a linear regression layer computes the corresponding wind-speed value, achieving accurate measurement of wind speed.
The significant technical advances of the invention are: first, extracting the RMS energy and Mel spectrogram as features effectively captures the characteristics of wind sound in both the time and frequency domains; second, the dual attention mechanism sharpens the deep model's focus on key features, improving measurement accuracy; third, using surveillance-camera audio enables indirect wind-speed measurement, reducing cost and broadening the application scenarios; fourth, a complete deep-learning model converts audio data directly into wind-speed values, raising the automation and intelligence of wind-speed measurement.
In summary, through innovative feature extraction and deep-learning model design, the proposed method overcomes the high cost and environmental constraints of traditional wind-speed measurement and achieves accurate, efficient measurement, bringing new technological breakthroughs and development opportunities to the field.
Brief Description of the Drawings
FIG. 1 is a flowchart of the deep-learning method for wind-speed measurement based on surveillance-camera audio data provided by an embodiment of the present invention;
FIG. 2 is a structural diagram of the SA-CNN provided by an embodiment of the present invention;
FIG. 3 is a structural diagram of the CA-CNN provided by an embodiment of the present invention;
FIG. 4 compares the RMS of audio recorded at different wind speeds, as provided by an embodiment of the present invention;
FIG. 5 shows the Mel features of audio recorded at different wind speeds, as provided by an embodiment of the present invention;
FIG. 6 is a module diagram of the deep-learning system for wind-speed measurement based on surveillance-camera audio data provided by an embodiment of the present invention;
FIG. 7 is a schematic diagram of the operating interface of the surveillance-audio wind-speed measurement method provided by an embodiment of the present invention.
Detailed Description
To make the objects, technical solutions, and advantages of the present invention clearer, the invention is described in further detail below with reference to embodiments. It should be understood that the specific embodiments described here serve only to explain the invention and do not limit it.
Two specific application examples of embodiments of the present invention are as follows:
Embodiment 1: wind-speed measurement in urban road surveillance
In urban road surveillance, knowledge of real-time wind speed aids traffic management and emergency response. For example, when strong winds cause road closures or an increase in traffic accidents, early warning improves the efficiency and safety of urban traffic management.
Surveillance-camera audio acquisition: install surveillance cameras with audio capture in the urban road surveillance system, acquire the surveillance video and audio data, and convert the audio data to .wav format.
Audio feature extraction: divide the audio data into frames and apply a Hamming window; apply the discrete Fourier transform to each frame to obtain the short-time energy spectrum; map the linear spectrum onto the Mel nonlinear spectrum with a bank of Mel-scale triangular filters and take the logarithm to obtain the Log-Mel features; extract the amplitude envelope of the RMS energy of the sound as the time-domain feature of the wind sound.
Deep-learning model construction: introduce the spatial attention and self-attention mechanisms to build the SA-CNN and CA-CNN networks, which extract the frequency- and time-domain features of the wind sound respectively; feed the extracted time- and frequency-domain features into a fully connected neural-network layer for fusion to obtain the feature vector describing the wind speed.
Wind-speed computation: compute the corresponding wind-speed value with the linear regression layer, and feed the measurement back to the urban traffic-management system for real-time monitoring and early warning.
Embodiment 2: wind-speed measurement in field-environment monitoring
In field-environment monitoring, real-time wind-speed measurement is vital for meteorological monitoring and ecological protection. In forest-fire early warning, for example, wind speed is a key factor, and real-time monitoring helps prevent and control fires.
Surveillance-camera audio acquisition: in the field environment, install surveillance cameras with audio capture (e.g., on watchtowers in a forest), acquire the surveillance video and audio data, and convert the audio data to .wav format.
Audio feature extraction: divide the audio data into frames and apply a Hamming window; apply the discrete Fourier transform to each frame to obtain the short-time energy spectrum; map the linear spectrum onto the Mel nonlinear spectrum with a bank of Mel-scale triangular filters and take the logarithm to obtain the Log-Mel features; extract the amplitude envelope of the RMS energy of the sound as the time-domain feature of the wind sound.
Deep-learning model construction: introduce the spatial attention and self-attention mechanisms to build the SA-CNN and CA-CNN networks, which extract the frequency- and time-domain features of the wind sound respectively; feed the extracted time- and frequency-domain features into a fully connected neural-network layer for fusion to obtain the feature vector describing the wind speed.
Wind-speed computation: compute the corresponding wind-speed value with the linear regression layer; the measurements are fed back to the ecological-environment monitoring system for real-time monitoring and early warning.
As shown in FIG. 1, the deep-learning method for wind-speed measurement based on surveillance-camera audio data provided by an embodiment of the present invention comprises the following steps:
S101: separating and acquiring the surveillance audio data;
S102: extracting the RMS energy and Mel spectrogram of the monitored wind sound as its time- and frequency-domain features;
S103: introducing the dual attention mechanism to build the SA-CNN frequency-domain and CA-CNN time-domain wind-sound feature-extraction models, which together form the wind-speed measurement deep-learning model.
Step 1 provided by the embodiment of the present invention comprises:
(1-1) obtaining the coordinates of the surveillance camera and registering the camera in the system;
(1-2) separating the surveillance video and audio data and converting the audio to .wav format.
Step 2 provided by the embodiment of the present invention, which extracts the root-mean-square energy and Mel spectrogram of the monitored wind sound as its time- and frequency-domain features, specifically comprises the following steps:
(2-1) dividing the surveillance audio into frames and applying a Hamming window to each frame, which smooths the signal while reducing information loss and preserving continuity between adjacent frames;
(2-2) applying the discrete Fourier transform (DFT) to each frame to convert the time-domain waveform signal to the frequency domain and obtain the short-time energy spectrum E(k) as a function of frequency (in Hz);
(2-3) passing the energy spectrum E(k) through a set of M Mel-scale triangular filters according to the rule described in equation (1), which maps the linear spectrum of the raw signal onto the auditory-perception-based Mel nonlinear spectrum S(m), and taking the logarithm to obtain the Log-Mel features:

$$C_n=\sum_{m=1}^{M}\log\big(S(m)\big)\cos\!\left(\frac{\pi n\,(m-0.5)}{M}\right),\quad n=1,2,\dots,L\qquad(1)$$

where C_n is the n-th Log-Mel spectral coefficient, L is the preset number of cepstral features, n is the cepstral-coefficient index, and M is the dimensionality of the resulting features. This completes the extraction of the frequency-domain features of the monitored wind sound; the frequency-domain feature at time t is denoted V_mel(t);
(2-4) plotting the amplitude envelope of the root-mean-square energy of the sound, with time on the horizontal axis and amplitude on the vertical axis, to visualize how the sound varies along the time dimension; this envelope serves as the RMS time-domain feature of the wind sound, and the time-domain feature at time t is denoted V_rms(t).
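As a concrete illustration of steps (2-1) through (2-4), the following minimal sketch extracts both features with librosa. The 0.25 s frame length, 75% frame overlap, 64 Mel bands, and the function name extract_wind_features are illustrative assumptions rather than values fixed by the invention.

import librosa

def extract_wind_features(wav_path, n_mels=64, frame_sec=0.25):
    # Load the clip at its native sampling rate
    y, sr = librosa.load(wav_path, sr=None)
    n_fft = int(frame_sec * sr)   # ~0.25 s analysis frames (assumed)
    hop = n_fft // 4              # 75% overlap between adjacent frames
    # Short-time energy spectrum with a Hamming window, Mel filtering, then log
    mel = librosa.feature.melspectrogram(
        y=y, sr=sr, n_fft=n_fft, hop_length=hop,
        n_mels=n_mels, window='hamming', power=2.0)
    log_mel = librosa.power_to_db(mel)   # Log-Mel features, shape (n_mels, T)
    # Amplitude envelope of the root-mean-square energy (time-domain feature)
    rms = librosa.feature.rms(y=y, frame_length=n_fft, hop_length=hop)  # shape (1, T)
    return log_mel, rms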
As shown in FIG. 2, building the SA-CNN deep-learning model for frequency-domain wind-sound feature extraction in step 3 of the embodiment specifically comprises the following steps:
(3-1) Construction of the frequency-domain attention module
A spatial attention mechanism is introduced: by assigning different weights (attention scores) to different parts of the input data, the model can identify the most important information. The attention mechanism is combined with CNN layers to form the frequency-domain attention module, implemented as follows:
An input feature map F (of dimension H×W×C, where H is the height, W the width, and C the number of channels) is obtained through the CNN network; the l-th convolutional layer is implemented as

$$F^{(l)}=\sigma\big(W^{(l)}*F^{(l-1)}+b^{(l)}\big)\qquad(2)$$

where σ is the activation function, W^(l) and b^(l) are the weights and bias respectively, and * denotes the convolution operation.
The channel-wise maximum at each position is extracted to form a new feature map F_max (of dimension H×W×1); the channel-wise mean at each position forms another feature map F_avg (of dimension H×W×1).
The two feature maps are stacked to give a feature map F_s (of dimension H×W×2);
a convolution operation reduces F_s to one channel, giving F_s' (of dimension H×W×1);
the sigmoid function normalizes the value at each position to between 0 and 1, yielding the weight map A (of dimension H×W×1);
the input feature map F is multiplied by the weights A to obtain the new feature map.
(3-2) SA-CNN network construction. As shown in FIG. 2, five frequency-domain attention modules are connected in series to form the SA-CNN network, deepening the network so that high-dimensional frequency-domain features can be extracted for mining wind-speed information.
(3-3) The frequency-domain Mel features of the wind sound obtained in step (2-3) are fed into the SA-CNN network, and the output feature vector is denoted V_f;
As shown in FIG. 3, building the CA-CNN deep-learning model for time-domain wind-sound feature extraction in step 3 of the embodiment comprises the following steps:
(4-1) Construction of the time-domain attention module
A self-attention mechanism is introduced: by assigning different weights (attention scores) to different parts of the input data, the model can identify the most important information. The attention mechanism is combined with CNN layers to form the time-domain attention module, implemented as follows:
An input feature map F (of dimension H×W×C) is obtained from the CNN processing; the l-th convolutional layer is implemented as in equation (2).
Global (max/average) pooling is applied to F to obtain a feature vector v (of dimension 1×1×C);
a fully connected operation reduces the dimension of v to C/r (r being the reduction ratio) and a ReLU activation layer is added, giving a new feature vector v';
a second fully connected operation restores the dimension to C, giving a new feature vector v'' (of dimension 1×1×C);
the sigmoid function normalizes v'' to between 0 and 1, yielding the attention weights A (of dimension 1×1×C), which are broadcast over all spatial positions;
the input feature map F is multiplied by the weights A to obtain the new feature map.
(4-2) CA-CNN network construction. As shown in FIG. 3, five time-domain attention modules are connected in series to form the CA-CNN network, deepening the network so that high-dimensional time-domain features can be extracted for wind-speed information extraction and discrimination.
(4-3) The time-domain RMS features of the wind sound obtained in step (2-4) are fed into the CA-CNN network, and the output feature vector is denoted V_t;
Forming the wind-speed measurement deep-learning model in step 3 of the embodiment comprises:
(5-1) fusing the time- and frequency-domain wind-sound features V_f and V_t extracted in steps (3-3) and (4-3) with a fully connected neural-network layer to obtain the feature vector describing the wind speed;
(5-2) computing the corresponding wind-speed value with a linear regression layer, completing the construction of the full deep-learning model. A minimal sketch of this fusion-and-regression head follows.
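The sketch below assumes, consistent with the branch assembly given later in this embodiment, that each branch outputs a 128-dimensional vector; the hidden width of 64 and the class name WindSpeedHead are our illustrative choices, not values fixed by the invention.

import torch
import torch.nn as nn

class WindSpeedHead(nn.Module):
    # Fuse V_f and V_t with a fully connected layer, then regress wind speed (steps 5-1, 5-2)
    def __init__(self, dim_f=128, dim_t=128, hidden=64):
        super().__init__()
        self.fuse = nn.Sequential(nn.Linear(dim_f + dim_t, hidden), nn.ReLU(inplace=True))
        self.reg = nn.Linear(hidden, 1)   # linear regression layer

    def forward(self, v_f, v_t):
        z = self.fuse(torch.cat([v_f, v_t], dim=1))   # fused wind-speed feature vector
        return self.reg(z).squeeze(1)                 # predicted wind-speed value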
FIG. 4 compares the RMS of audio at different wind speeds; FIG. 5 shows the Mel features of audio at different wind speeds.
As shown in FIG. 6, the deep-learning system for wind-speed measurement based on surveillance-camera audio data provided by an embodiment of the present invention comprises:
a preprocessing module for separating and acquiring the surveillance audio data;
a frequency-domain module for modeling and mining frequency-domain features;
a time-domain module for modeling and mining time-domain features;
a prediction module for wind-speed prediction based on the surveillance audio.
The deep-learning method for wind-speed measurement based on surveillance-camera audio data provided by the application embodiment is applied to a computer device comprising a memory and a processor, the memory storing a computer program that, when executed by the processor, causes the processor to perform the steps of the method.
The deep-learning method for wind-speed measurement based on surveillance-camera audio data provided by the application embodiment is applied to an information/data processing terminal, which is used to implement the deep-learning system for wind-speed measurement based on surveillance-camera audio data.
It should be noted that embodiments of the present invention may be implemented in hardware, software, or a combination of the two. The hardware part may be implemented with dedicated logic; the software part may be stored in memory and executed by a suitable instruction-execution system, such as a microprocessor or specially designed hardware. Those of ordinary skill in the art will appreciate that the above devices and methods may be implemented using computer-executable instructions and/or processor control code, provided, for example, on a carrier medium such as a disk, CD, or DVD-ROM, on a programmable memory such as read-only memory (firmware), or on a data carrier such as an optical or electronic signal carrier. The device and its modules of the present invention may be realized by hardware circuits such as very-large-scale integrated circuits or gate arrays, semiconductors such as logic chips and transistors, or programmable hardware devices such as field-programmable gate arrays and programmable logic devices; by software executed by various types of processors; or by a combination of the two, such as firmware.
The operating interface of the surveillance-audio wind-speed measurement method developed at Nanjing University is shown in FIG. 7. In the method shown in FIG. 7, taking a surveillance camera deployed at Nanjing Institute of Technology as an example (coordinates: 32°25′N, 119°049′E), suppose wind-speed data are to be obtained from this camera's data:
Step 1: determine the position of the surveillance camera. Surveillance camera 2 is located at the north gate, with actual coordinates (118.6128754, 32.077598).
Step 2: separate the data returned by the camera to obtain the surveillance video and audio data respectively. The core code of the audio-separation method is as follows:
import cv2
import moviepy.editor as mp

# Capture the surveillance video
my_clip = mp.VideoFileClip(video_path)
# Extract the audio (a .wav audio file is produced in the video path after running)
my_clip.audio.write_audiofile(f'{video_path}.wav')
# Open the captured video for playback
cap = cv2.VideoCapture(video_path)
# Use the cv2.WINDOW_NORMAL flag to create a window that can be closed with the mouse
cv2.namedWindow('video', cv2.WINDOW_NORMAL)
while cap.isOpened():
    ret, frame = cap.read()
    if not ret:  # stop when the video stream ends
        break
    cv2.imshow('video', frame)
    key = cv2.waitKey(25)
    if key == 27 or cv2.getWindowProperty('video', cv2.WND_PROP_VISIBLE) < 1:
        break
# Release hardware resources
cap.release()
cv2.destroyAllWindows()
Step 3: extract the Mel and RMS variations of the monitored wind sound, and aggregate the wind sound's RMS energy map and Mel map with the fully connected network. The core code of the Mel feature extraction is as follows:
import librosa
import librosa.display

# Load the extracted audio at its native sampling rate
y, sr = librosa.load(f'{video_path}.wav', sr=None)
VOICE_LEN = 32000  # fixed signal length after normalization
print("sr:", sr)
N_FFT = getNearestLen(0.25, sr)  # project helper: FFT length closest to 0.25 s of samples
print("N_FFT:", N_FFT)
y = normalizeVoiceLen(y, VOICE_LEN)  # project helper: pad/trim the signal to VOICE_LEN samples
print("y.shape:", y.shape)
librosa.display.waveshow(y, sr=sr)
# Mel features of the wind sound
mel_data = librosa.feature.melspectrogram(y=y, sr=sr, n_fft=N_FFT, hop_length=int(N_FFT / 4))
# RMS features of the wind sound
rms_data = librosa.feature.rms(y=y, frame_length=N_FFT, hop_length=int(N_FFT / 4))
Step 4: build the wind-speed measurement model with the parallel structure shown in FIG. 1 to serve the wind-speed computation. The core code of the wind-speed measurement model is as follows.
import torch
import torch.nn as nn

class ChannelAttention(nn.Module):  # CA module for time-domain feature extraction
    def __init__(self, channels, reduction=16):
        super(ChannelAttention, self).__init__()
        self.avg_pool = nn.AdaptiveAvgPool2d(1)
        self.fc1 = nn.Conv2d(channels, channels // reduction, kernel_size=1, stride=1, padding=0)
        self.relu = nn.ReLU(inplace=True)
        self.fc2 = nn.Conv2d(channels // reduction, channels, kernel_size=1, stride=1, padding=0)
        self.sigmoid = nn.Sigmoid()

    def forward(self, x):
        out = self.avg_pool(x)   # global average pooling: (B, C, 1, 1)
        out = self.fc1(out)      # reduce to C // reduction channels
        out = self.relu(out)
        out = self.fc2(out)      # restore to C channels
        out = self.sigmoid(out)  # per-channel weights in [0, 1]
        out = x * out            # reweight the input feature map
        return out

class SpatialAttention(nn.Module):  # SA module for frequency-domain feature extraction
    def __init__(self):
        super(SpatialAttention, self).__init__()
        self.conv = nn.Conv2d(2, 1, kernel_size=7, stride=1, padding=3)
        self.sigmoid = nn.Sigmoid()

    def forward(self, x):
        avg_out = torch.mean(x, dim=1, keepdim=True)    # channel-wise mean: (B, 1, H, W)
        max_out, _ = torch.max(x, dim=1, keepdim=True)  # channel-wise max:  (B, 1, H, W)
        out = torch.cat([avg_out, max_out], dim=1)      # stack to (B, 2, H, W)
        out = self.conv(out)                            # reduce to one channel
        out = self.sigmoid(out)                         # per-position weights in [0, 1]
        out = x * out                                   # reweight the input feature map
        return out
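The embodiment code above defines only the two attention modules. The sketch below, under our own assumptions about channel widths (the patent does not fix them), stacks five conv-plus-attention blocks per branch into the parallel network of FIG. 1 and tops it with the WindSpeedHead sketched earlier; it reuses the ChannelAttention and SpatialAttention classes just defined and is illustrative rather than the verbatim patent implementation.

def make_branch(kind, chans=(1, 16, 32, 64, 128, 128)):
    # One branch: five conv layers, each followed by an attention module
    blocks = []
    for i in range(5):
        att = SpatialAttention() if kind == 'freq' else ChannelAttention(chans[i + 1])
        blocks += [nn.Conv2d(chans[i], chans[i + 1], kernel_size=3, padding=1),
                   nn.ReLU(inplace=True), att]
    blocks += [nn.AdaptiveAvgPool2d(1), nn.Flatten()]  # pool to a feature vector
    return nn.Sequential(*blocks)

class WindSpeedModel(nn.Module):
    # Parallel SA-CNN / CA-CNN network topped by the fusion-and-regression head
    def __init__(self):
        super().__init__()
        self.freq_branch = make_branch('freq')   # SA-CNN over the Log-Mel map
        self.time_branch = make_branch('time')   # CA-CNN over the RMS envelope
        self.head = WindSpeedHead(dim_f=128, dim_t=128)

    def forward(self, mel, rms):                 # both inputs shaped (B, 1, H, W)
        return self.head(self.freq_branch(mel), self.time_branch(rms))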
Step 6: the other cameras perform the above operations in turn until all surveillance cameras have been traversed.
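This traversal could look like the following sketch; the camera registry, file names, and model weights are hypothetical, and extract_wind_features and WindSpeedModel refer to the illustrative sketches given above.

# Hypothetical registry of registered cameras: id -> (longitude, latitude, video path)
cameras = {
    2: (118.6128754, 32.077598, 'north_gate_video.mp4'),
    # ... remaining registered cameras
}

model = WindSpeedModel()
model.load_state_dict(torch.load('wind_speed_model.pt'))  # assumed trained weights
model.eval()

for cam_id, (lon, lat, video_path) in cameras.items():
    mp.VideoFileClip(video_path).audio.write_audiofile(f'{video_path}.wav')  # step 2
    log_mel, rms = extract_wind_features(f'{video_path}.wav')                # step 3
    mel_t = torch.tensor(log_mel).float()[None, None]   # (1, 1, n_mels, T)
    rms_t = torch.tensor(rms).float()[None, None]       # (1, 1, 1, T)
    with torch.no_grad():
        speed = model(mel_t, rms_t).item()              # step 4: regress wind speed
    print(f'camera {cam_id} at ({lon}, {lat}): wind speed {speed:.2f}')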
The above are only specific embodiments of the present invention, but the scope of protection of the invention is not limited thereto. Any modification, equivalent substitution, or improvement made within the technical scope disclosed herein and within the spirit and principles of the invention by any person skilled in the art shall fall within the scope of protection of the present invention.