CN118887789A


Info

Publication number: CN118887789A
Application number: CN202411012957.8A
Authority: CN
Inventors: 孙兵; 郭唐仪; 杨诚一
Original assignee: Jiangsu Jifan Transportation Technology Co ltd
Current assignee: Jiangsu Jifan Transportation Technology Co ltd
Priority date: 2024-07-26
Filing date: 2024-07-26
Publication date: 2024-11-01

Abstract

The invention discloses an active safety early warning system for road sections near water and cliffs, comprising a detection unit, a data processing unit, a communication unit and an early warning unit, wherein: the detection unit detects sound signals; the data processing unit comprises a data preprocessing module and a sound recognition module, wherein the preprocessing module preprocesses the detected sound signals, and the sound recognition module recognizes the preprocessed data and outputs a detection result; the communication unit transmits signal data; and the early warning unit receives the detection result of the data processing unit and triggers the early warning working state. The system is low in cost, flexible to deploy, and has strong applicability and high reliability in environments near water and cliffs. The sound-based detection technology has wide application in scenarios such as rural road sections near water and cliffs, curves with poor sight distance, factory and mining areas, construction areas, and border and coastal defense.

Description


一种临水临崖路段主动安全预警系统An active safety warning system for roads near water and cliffs

技术领域Technical Field

本发明属于道路安全预警装置领域，具体是一种临水临崖路段主动安全预警系统。The invention belongs to the field of road safety warning devices, in particular to an active safety warning system for road sections near water or cliffs.

背景技术Background Art

In 2022, road traffic accidents caused about 14,500 deaths in China. Rural roads have a low safety level and poor timeliness of accident detection, so rural road traffic safety has become a key issue in building a country with a strong transportation system. Sections near water and cliffs are the accident-prone sections of rural roads and are therefore the top priority for safety protection at this stage.

At present there are many protective measures for road sections near water and cliffs, such as installing guardrails, planting green belts, and installing cameras and street lights. Guardrails are not effective on rural roads at night and provide little protection for fast-moving vehicles when an accident occurs. Green belts are widely used on urban roads, but rural roads are generally only 3-5 meters wide, so the land a green belt occupies is a disadvantage. Data collected by cameras is strongly affected by weather and lighting, and accurate vehicle recognition at night is difficult. Street lights, radars and similar devices face considerable challenges in cost and power supply.

发明内容Summary of the invention

本发明针对背景技术中存在的问题，综合考虑全天时、全天候工作、设备占地面积、安装供电成本、预警时效性等，本申请采用基于声音的车辆检测方法，结合道钉、预警标志牌进行声光预警，达到车辆在夜间进入临水临崖路段时提前被预警、危险路段行驶过程中清晰地看到道路边界的效果，确保隐患治理到位。In response to the problems existing in the background technology, the present invention comprehensively considers all-day and all-weather operation, equipment footprint, installation and power supply costs, warning timeliness, etc. This application adopts a sound-based vehicle detection method, combined with road studs and warning signs for sound and light warning, so as to achieve the effect of early warning when vehicles enter sections near water or cliffs at night, and clearly see the road boundaries when driving in dangerous sections, thereby ensuring that hidden dangers are effectively addressed.

技术方案：Technical solution:

一种临水临崖路段主动安全预警系统，它包括探测单元，数据处理单元，通信单元和预警单元，其中：An active safety warning system for a road section near water or cliffs, comprising a detection unit, a data processing unit, a communication unit and a warning unit, wherein:

探测单元用于探测声音信号；The detection unit is used for detecting sound signals;

数据处理单元包括数据预处理模块、声音识别模块，其中：预处理模块对探测到的声音信号进行预处理；声音识别模块对预处理后的数据进行识别，并输出检测结果；The data processing unit includes a data preprocessing module and a sound recognition module, wherein: the preprocessing module preprocesses the detected sound signal; the sound recognition module recognizes the preprocessed data and outputs the detection result;

通信单元用于信号数据的传输；The communication unit is used for transmission of signal data;

预警单元接收数据处理单元的检测结果，触发预警工作状态。The early warning unit receives the detection result of the data processing unit and triggers the early warning working state.

优选的，所述探测单元为多个声音传感器组成。Preferably, the detection unit is composed of a plurality of sound sensors.

优选的，所述探测单元包括主动降噪模块，主动降噪模块执行以下步骤：Preferably, the detection unit includes an active noise reduction module, and the active noise reduction module performs the following steps:

1)通过探测单元中的麦克风阵列收集周围环境的声音信号，这些信号包括所需的不同车型的声音信号和各种噪声；1) Collect sound signals from the surrounding environment through the microphone array in the detection unit. These signals include the required sound signals of different car models and various noises;

2)把收集到的声音信号送入到主动降噪模块处理器中，该处理器利用适应滤波器算法生成一个与参考噪声频率相同但相位相反的信号，确保生成的声波能有效中和噪声，达到减噪。2) The collected sound signal is sent to the active noise reduction module processor, which uses an adaptive filter algorithm to generate a signal with the same frequency but opposite phase as the reference noise, ensuring that the generated sound wave can effectively neutralize the noise and achieve noise reduction.

Preferably, in the active noise reduction process, the adaptive filtering algorithm is implemented by the following basic mathematical expression:

$$y(n) = \sum_{i=0}^{M-1} w_i(n)\, x(n-i)$$

where y(n) is the output signal, i.e. the computed anti-phase sound wave; x(n) is the input signal, i.e. the original sound signal; w_i(n) is the filter weight at time n; and M is the order of the filter.

优选的，所述预处理模块对探测到的声音信号进行预处理，具体步骤为：Preferably, the preprocessing module preprocesses the detected sound signal, and the specific steps are:

1)为了方便信号处理和保证音质，采集的格式都设置为wav，采样频率设置为44.1kHz；1) In order to facilitate signal processing and ensure sound quality, the acquisition format is set to wav and the sampling frequency is set to 44.1kHz;

2) The collected vehicle sound data is preprocessed by applying a pre-emphasis filter to the original audio signal to enhance high-frequency information. The pre-emphasis filter is a first-order high-pass filter, with the following mathematical expression:

$$y(n) = x(n) - p\,x(n-1)$$

where p is the pre-emphasis coefficient, n is the discrete time index, y(n) is the audio amplitude after pre-emphasis, x(n) is the audio amplitude before pre-emphasis, and p is set to 0.97;

3) The sound data is divided into frames and windowed. A Hamming window is used throughout, with a frame length of 1024 and a frame shift of 512. The mathematical expression of the Hamming window is as follows:

$$w(n) = (1-\alpha) - \alpha\cos\!\left(\frac{2\pi n}{N-1}\right), \quad 0 \le n \le N-1$$

where N is the total number of data points in the window, n is the index of a data point within the window, and α is the weighting parameter, set to 0.46;

4)声音数据特征提取：4) Sound data feature extraction:

The feature parameters of the signal are extracted with an MFCC feature extraction function. The preprocessed sound data is passed through a fast Fourier transform (FFT) to obtain the amplitude spectrum of the signal; the resulting spectrum is passed through a Mel filter bank; the logarithm of the energy output by each Mel filter is taken to obtain the log energy spectrum; finally, a discrete cosine transform (DCT) is applied to the log energy spectrum to obtain the 12th-order MFCC parameters.

In the DCT expression, i is the index of the sound signal frame; n is the spectral line after the discrete cosine transform; M is the number of filters in the Mel filter bank; S(i,m) is the log energy at the output of the Mel filter; and m is the index of the filter.

5)得到12维静态MFCC参数，做为声音识别模块的输入。5) Obtain 12-dimensional static MFCC parameters as input to the sound recognition module.
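The preprocessing steps 2) to 5) above can be sketched end to end in NumPy. This is a minimal illustration under stated assumptions, not the patent's implementation: the pre-emphasis form y(n) = x(n) - p·x(n-1), the 26-filter bank, the mel mapping 2595·log10(1 + f/700), the use of the power spectrum for filter energies, and the unnormalized DCT-II with coefficients 1 to 12 are all common conventions chosen here for illustration. Function names are hypothetical.

```python
import numpy as np

def pre_emphasis(x, p=0.97):
    """First-order high-pass pre-emphasis (assumed form y[n] = x[n] - p*x[n-1])."""
    x = np.asarray(x, dtype=float)
    if x.size == 0:
        return x.copy()
    # The first sample has no predecessor and is passed through unchanged.
    return np.concatenate(([x[0]], x[1:] - p * x[:-1]))

def hamming(N, alpha=0.46):
    """Hamming window with weighting parameter alpha (requires N > 1)."""
    n = np.arange(N)
    return (1.0 - alpha) - alpha * np.cos(2.0 * np.pi * n / (N - 1))

def frame_signal(x, frame_len=1024, hop=512, alpha=0.46):
    """Split x into overlapping frames and apply the Hamming window to each."""
    x = np.asarray(x, dtype=float)
    if x.size < frame_len:
        x = np.pad(x, (0, frame_len - x.size))  # zero-pad very short clips
    n_frames = 1 + (x.size - frame_len) // hop
    idx = np.arange(frame_len)[None, :] + hop * np.arange(n_frames)[:, None]
    return x[idx] * hamming(frame_len, alpha)

def hz_to_mel(f):
    # Assumed mel mapping (a common convention, not stated in the source).
    return 2595.0 * np.log10(1.0 + np.asarray(f, dtype=float) / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (np.asarray(m, dtype=float) / 2595.0) - 1.0)

def mel_filterbank(n_filters, n_fft, sr):
    """Triangular filters equally spaced in mel; shape (n_filters, n_fft//2 + 1)."""
    edges = np.linspace(hz_to_mel(0.0), hz_to_mel(sr / 2.0), n_filters + 2)
    bins = np.floor((n_fft + 1) * mel_to_hz(edges) / sr).astype(int)
    fb = np.zeros((n_filters, n_fft // 2 + 1))
    for m in range(1, n_filters + 1):
        left, center, right = bins[m - 1], bins[m], bins[m + 1]
        for k in range(left, center):          # rising slope
            fb[m - 1, k] = (k - left) / (center - left)
        for k in range(center, right):         # falling slope
            fb[m - 1, k] = (right - k) / (right - center)
    return fb

def mfcc_12(signal, sr=44100, n_filters=26, n_coeffs=12):
    """Return an array of shape (n_frames, n_coeffs) of static MFCC features."""
    frames = frame_signal(pre_emphasis(signal))
    n_fft = frames.shape[1]
    mag = np.abs(np.fft.rfft(frames, axis=1))             # amplitude spectrum
    energy = (mag ** 2) @ mel_filterbank(n_filters, n_fft, sr).T
    log_energy = np.log(energy + 1e-10)                   # S(i, m)
    m = np.arange(n_filters)
    n = np.arange(1, n_coeffs + 1)                        # skip the 0th coefficient
    dct = np.cos(np.pi * n[:, None] * (2 * m[None, :] + 1) / (2.0 * n_filters))
    return log_energy @ dct.T
```

For a one-second clip at 44.1 kHz, `mfcc_12` yields one 12-dimensional vector per 1024-sample frame, which would then be normalized before being fed to the recognition module.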

优选的，所述声音识别模块，基于预处理后的声音数据，使用声音模型进行识别，并输出检测结果，声音模型构建的具体步骤为：Preferably, the sound recognition module uses a sound model to perform recognition based on the preprocessed sound data and outputs a detection result. The specific steps of constructing the sound model are:

1)构建三层BP神经网络，其网络模型拓扑结构包括输入层、隐含层和输出层；将前面提取的音频特征参数归一化后作为神经网络的输入向量；1) Construct a three-layer BP neural network, whose network model topology includes an input layer, a hidden layer, and an output layer; normalize the audio feature parameters extracted previously as the input vector of the neural network;

2) In constructing the BP neural network model, the number of hidden layer nodes must be chosen so as to improve algorithm performance and avoid overfitting. The number of hidden layer nodes is currently determined by the following formula:

$$n_k = \sqrt{n_i + n_0} + a$$

where n_k is the number of hidden layer units, n_i is the number of input layer units, n_0 is the number of output layer units, and a is a constant in [0,10];

3) In model training, the number of iterations is set to 50, the learning rate to 0.05, the momentum coefficient to 0.9, the number of training runs to 100, and the training goal to 0.001; 80% of the data is used as the training set and the remaining 20% as the test set. An output label of 0 indicates that a vehicle is passing, and an output label of 1 indicates no vehicle.
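The model-building steps above can be illustrated with a small three-layer network trained by backpropagation with momentum. This is a minimal sketch under assumptions: the hidden-layer rule is taken to be the common empirical form n_k = sqrt(n_i + n_0) + a, the activations are sigmoids, the loss is mean squared error, and the function names and toy data are hypothetical. The learning rate, momentum, epoch count, and goal mirror the values given in the text; label 0 stands for "vehicle passing" and 1 for "no vehicle".

```python
import math
import numpy as np

def hidden_units(n_in, n_out, a):
    """Empirical hidden-layer size sqrt(n_in + n_out) + a, rounded to an integer."""
    if not 0 <= a <= 10:
        raise ValueError("a is expected to lie in [0, 10]")
    return int(round(math.sqrt(n_in + n_out) + a))

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def train_bp(X, y, n_hidden, lr=0.05, momentum=0.9, epochs=100, goal=1e-3, seed=0):
    """Train an input-hidden-output network; returns (params, per-epoch MSE)."""
    rng = np.random.default_rng(seed)
    params = [rng.normal(0.0, 0.5, (X.shape[1], n_hidden)), np.zeros(n_hidden),
              rng.normal(0.0, 0.5, (n_hidden, 1)), np.zeros(1)]
    velocity = [np.zeros_like(p) for p in params]
    t = np.asarray(y, dtype=float).reshape(-1, 1)
    history = []
    for _ in range(epochs):
        W1, b1, W2, b2 = params
        h = sigmoid(X @ W1 + b1)                 # hidden layer
        out = sigmoid(h @ W2 + b2)               # output layer
        err = out - t
        history.append(float(np.mean(err ** 2)))
        if history[-1] <= goal:                  # stop once the goal is met
            break
        # Backpropagate the MSE gradient through both sigmoid layers.
        d_out = (2.0 / len(X)) * err * out * (1.0 - out)
        d_h = (d_out @ W2.T) * h * (1.0 - h)
        grads = [X.T @ d_h, d_h.sum(axis=0), h.T @ d_out, d_out.sum(axis=0)]
        for p, v, g in zip(params, velocity, grads):
            v *= momentum                        # momentum update, in place
            v -= lr * g
            p += v
    return params, history

def predict(params, X):
    """Return 0 (vehicle passing) or 1 (no vehicle) for each row of X."""
    W1, b1, W2, b2 = params
    return (sigmoid(sigmoid(X @ W1 + b1) @ W2 + b2) >= 0.5).astype(int).ravel()
```

With 12 MFCC inputs, one output, and a = 4, `hidden_units(12, 1, 4)` gives 8 hidden nodes; an 80/20 train/test split would be applied to the feature matrix before calling `train_bp`.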

Preferably, the early warning unit is an LED luminous signboard. During a warning, the warning signboard stays lit and the reminder signboard flashes; in the normal state they are off.

优选的，所述预警单元为太阳能道钉，预警工作时，道钉发出红色光线，多个道钉同步闪烁。Preferably, the warning unit is a solar road stud. When the warning is working, the road stud emits red light and multiple road studs flash synchronously.

本发明的有益效果Beneficial Effects of the Invention

本申请通过基于声音的多类别、定向交通目标检测技术，实现车辆等特定声音的精准目标识别，在临水临崖路段检测来车。该方法成本低廉、部署灵活，在临水临崖环境中的适用性强、可靠性高。基于声音的检测技术，在农村公路临水临崖路段、弯道视距不良路段、厂矿区域、施工区域、边海防等诸多场景，均具有广阔应用。This application uses multi-category, directional traffic target detection technology based on sound to achieve accurate target recognition of specific sounds such as vehicles, and detect oncoming vehicles in sections near water and cliffs. This method is low-cost, flexible to deploy, and has strong applicability and high reliability in environments near water and cliffs. Sound-based detection technology has broad applications in many scenarios such as sections of rural roads near water and cliffs, sections with poor visibility on curved roads, factory and mining areas, construction areas, border and coastal defense, etc.

本申请可以及时地对驾驶员发出警示，降低了事故发生的可能性，可以避免或减轻严重交通事故带来的人员伤亡和车辆损毁等情况，从而减少了事故损失成本。This application can issue warnings to drivers in a timely manner, reduce the possibility of accidents, avoid or reduce casualties and vehicle damage caused by serious traffic accidents, and thus reduce the cost of accident losses.

本申请中使用声探装置进行车辆检测，相比于一般预警系统中使用的雷达、摄像头等装置，成本更加低廉，相关配套设施的搭建简便、投入量少，能够实现低成本的全天候车辆检测和预警。In this application, an acoustic detection device is used for vehicle detection. Compared with radars, cameras and other devices used in general early warning systems, the cost is lower, the construction of related supporting facilities is simple and the investment is small, and low-cost all-weather vehicle detection and early warning can be achieved.

本申请能够帮助驾驶员意识到潜在的危险，保障特殊路段上的行车安全，减少交通事故的发生，大大降低交通事故的伤亡和死亡人数，保护行车人员的生命和财产安全。从而提高道路安全水平和出行的舒适度与信心。This application can help drivers realize potential dangers, ensure driving safety on special roads, reduce the occurrence of traffic accidents, greatly reduce the number of casualties and deaths in traffic accidents, and protect the lives and property safety of drivers, thereby improving the level of road safety and the comfort and confidence of travel.

附图说明BRIEF DESCRIPTION OF THE DRAWINGS

图1为本发明的结构示意图FIG. 1 is a schematic diagram of the structure of the present invention.

FIG. 2 is a diagram of the composite control structure based on the FxLMS algorithm in the embodiment.

FIG. 3 is an FxLMS simulation plot (result) in the embodiment.

FIG. 4 is an FxLMS simulation plot (error) in the embodiment.

图5为实施例中广义互相关时延估计算法原理图FIG. 5 is a schematic diagram of a generalized cross-correlation delay estimation algorithm in an embodiment

图6为实施例中声信号频谱图FIG. 6 is a spectrum diagram of an acoustic signal in an embodiment

图7为实施例中声信号共振峰连线图FIG. 7 is a diagram showing the connection of the acoustic signal resonance peaks in an embodiment.

图8为实施例中声信号包络图FIG. 8 is an envelope diagram of an acoustic signal in an embodiment

FIG. 9 is a schematic diagram of the inverse Fourier transform of the acoustic signal envelope in an embodiment.

图10为实施例中原始频率还原示意图FIG. 10 is a schematic diagram of original frequency restoration in an embodiment

图11为实施例中倒谱计算过程图FIG. 11 is a diagram of the cepstrum calculation process in an embodiment

图12为实施例中梅尔滤波器组特性示意图FIG. 12 is a schematic diagram of the characteristics of the Mel filter bank in the embodiment

图13为实施例中MFCC示意图FIG. 13 is a schematic diagram of MFCC in an embodiment

图14为本发明的预警识别算法流程图FIG. 14 is a flowchart of the early warning recognition algorithm of the present invention.

具体实施方式DETAILED DESCRIPTION

下面结合实施例对本发明作进一步说明，但本发明的保护范围不限于此：The present invention will be further described below in conjunction with embodiments, but the protection scope of the present invention is not limited thereto:

结合图1和图14，一种临水临崖路段主动安全预警系统及预警方法，它包括探测单元，数据处理单元，通信单元和预警单元。其中：In conjunction with FIG. 1 and FIG. 14 , an active safety warning system and warning method for a road section near water or cliffs includes a detection unit, a data processing unit, a communication unit and a warning unit. Among them:

(1)探测单元(声探装置)布局方案(1) Detection unit (acoustic detection device) layout plan

The acoustic detection and identification device has two layout schemes: an 8-element acoustic array and a 5-element acoustic array. The 8-element array has two layers: the upper layer is a 4-element array forming a disk 40 mm in diameter, and the lower layer is a 4-element array forming a disk 80 to 120 mm in diameter. The height difference between the two disks is 40 to 100 mm. In the 5-element array, the upper layer has a single acoustic sensor and the lower layer has 4 acoustic sensors forming a disk 40 to 120 mm in diameter, with a height difference of 40 to 100 mm between the layers. Distributing the sensors over two layers provides sound resolution in the vertical direction, while the 4-element lower array is responsible for determining the direction of the target in the horizontal plane.
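The two layouts can be expressed as microphone coordinates, which is the form a direction-finding routine needs. This is a minimal sketch: the source gives only diameters and height differences, so the uniform 90° angular spacing, the choice of origin at the center of the lower disk, and the default dimensions picked from within the stated ranges are assumptions, and the function names are hypothetical.

```python
import numpy as np

def ring(n_mics, diameter_mm, z_mm, phase_deg=0.0):
    """Coordinates (x, y, z) in mm of n_mics evenly spaced on a circle."""
    angles = np.deg2rad(phase_deg) + 2.0 * np.pi * np.arange(n_mics) / n_mics
    r = diameter_mm / 2.0
    return np.column_stack((r * np.cos(angles), r * np.sin(angles),
                            np.full(n_mics, float(z_mm))))

def eight_element_array(upper_d=40.0, lower_d=100.0, height=60.0):
    """Upper 4-mic ring (40 mm) above a lower 4-mic ring (80 to 120 mm)."""
    return np.vstack((ring(4, upper_d, height), ring(4, lower_d, 0.0)))

def five_element_array(lower_d=80.0, height=60.0):
    """Single upper microphone on the axis above a lower 4-mic ring."""
    return np.vstack(([[0.0, 0.0, height]], ring(4, lower_d, 0.0)))
```

Each function returns one row per microphone, so pairwise distances for time-delay calculations follow directly from the rows.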

(2)声音传感器芯片选型(2) Selection of sound sensor chip

声音传感器芯片采用MSM26S4030H0芯片，这是一款全向、底部端口、I2S数字输出MEMS麦克风。它具有高性能和高可靠性。MSM26S4030H0有4毫米×3毫米×1.0毫米金属帽LGA封装。它与SMT兼容，没有灵敏度下降。The sound sensor chip uses the MSM26S4030H0 chip, which is an omnidirectional, bottom-port, I2S digital output MEMS microphone. It has high performance and high reliability. The MSM26S4030H0 has a 4mm × 3mm × 1.0mm metal cap LGA package. It is compatible with SMT and has no sensitivity drop.

(3)数据预处理模块(环境噪声处理)(3) Data preprocessing module (environmental noise processing)

Analysis of sound characteristics shows that every sound is made up of a corresponding spectrum and energy. The basic idea of noise processing is to find a sound with the same spectrum as the target noise but opposite phase and superimpose it on the noise. In practice, "active noise elimination", i.e. Active Noise Control (ANC), is used. The principle is as follows: the microphones carried inside the prototype first collect the various noises; the noise signal is then passed to the noise reduction circuit, where electroacoustic processing computes the anti-phase wave. The built-in audio processing circuit amplifies and replays this wave, so that the sounds of opposite phase superimpose and the amplitude of the noise is greatly attenuated, thereby reducing the noise.

The core algorithm of the ANC implementation is FxLMS (filtered-x least mean squares). FIG. 2 shows a typical composite control structure based on the FxLMS algorithm. The transfer function of the primary (forward) path is P(z), and the transfer function of the secondary acoustic path is S(z). During active noise reduction, the input signals of the system are the reference microphone input x(n) and the error microphone input e(n), and the output of the system is the desired output y'(n). All algorithm parameters should be determined from the characteristics of these two input signals; as long as the parameters are matched to those characteristics, a good noise reduction effect can be obtained.

下面对使用FxLMS的ANC算法进行如下仿真，仿真过程如下：The following simulation is performed on the ANC algorithm using FxLMS. The simulation process is as follows:

生成一正弦波信号作为原始信号 Generate a sine wave signal as the original signal

生成一随机白噪声信号 Generate a random white noise signal

Filter with the FxLMS algorithm; the filtering result and the output error are shown in FIG. 3 and FIG. 4, respectively.

输入信号为正弦信号加噪声的混合信号，可见正弦信号受噪声影响失真较大；实验输出信号失真较小，噪声信号已经很小，验证了基于FxLMS的ANC算法的降噪能力。The input signal is a mixed signal of a sinusoidal signal and noise. It can be seen that the sinusoidal signal is greatly distorted by the noise. The experimental output signal has less distortion and the noise signal is already very small, which verifies the noise reduction capability of the ANC algorithm based on FxLMS.
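A simulation of this kind can be sketched as follows. This is an illustrative setup under assumptions, not the patent's simulation: the primary path P(z) and secondary path S(z) are short FIR filters with made-up coefficients, the secondary-path estimate is taken to be exact, and the step size and filter length are arbitrary. The error microphone observes the sine plus the path-filtered noise; the controller, driven by the white-noise reference, cancels the noise component so that the residual approaches the sine.

```python
import numpy as np

def fxlms(x, observed, s_path, s_hat, n_taps=16, mu=0.002):
    """Filtered-x LMS: returns the error signal e(n) and the final weights.

    x        reference microphone signal (noise reference)
    observed signal at the error microphone before cancellation
    s_path   FIR model of the true secondary path S(z)
    s_hat    FIR estimate of S(z) used to filter the reference
    """
    w = np.zeros(n_taps)
    x_buf = np.zeros(n_taps)          # reference history, newest first
    fx_buf = np.zeros(n_taps)         # filtered-reference history
    xs_buf = np.zeros(len(s_hat))     # reference history feeding s_hat
    y_buf = np.zeros(len(s_path))     # controller output history feeding S(z)
    e = np.zeros(len(x))
    for n in range(len(x)):
        x_buf = np.roll(x_buf, 1)
        x_buf[0] = x[n]
        y_buf = np.roll(y_buf, 1)
        y_buf[0] = w @ x_buf                      # controller output y(n)
        e[n] = observed[n] - s_path @ y_buf       # residual at the error mic
        xs_buf = np.roll(xs_buf, 1)
        xs_buf[0] = x[n]
        fx_buf = np.roll(fx_buf, 1)
        fx_buf[0] = s_hat @ xs_buf                # filtered reference x'(n)
        w += mu * e[n] * fx_buf                   # LMS update on x'(n)
    return e, w

# Hypothetical simulation: sine as the original signal, white noise as interference.
rng = np.random.default_rng(0)
n = np.arange(8000)
sine = np.sin(2.0 * np.pi * 0.01 * n)
x = rng.standard_normal(n.size)
p_path = np.array([0.0, 0.0, 0.8, 0.4, 0.2])      # assumed primary path P(z)
s_path = np.array([0.0, 0.9, 0.3])                # assumed secondary path S(z)
noise = np.convolve(x, p_path)[:n.size]
e, w = fxlms(x, sine + noise, s_path, s_hat=s_path)
```

Comparing `e` with `sine` over the last samples shows how much of the noise remains after the weights have adapted.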

(4)微阵列多频段声目标测向与定向(4) Microarray multi-band acoustic target direction finding and orientation

Sound source localization methods based on microphone arrays currently fall roughly into three categories: steerable beamforming based on maximum output power, high-resolution spectral estimation, and sound source localization based on time-delay estimation (TDE).

基于TDE的算法核心在于对传播时延的准确估计，一般通过对麦克风间信号做互相关处理，并可以通过简单的延时求和、几何计算或是直接利用互相关结果进行可控功率响应搜索等方法进一步获得声源位置信息。这类算法实现相对简单，运算量小，便于实时处理，因此在实际中运用最广，也比较适合本研究中小型化、低功率的要求。The core of the TDE-based algorithm is to accurately estimate the propagation delay. Generally, it performs cross-correlation processing on the signals between microphones, and can further obtain the sound source location information through simple delay summation, geometric calculation, or directly use the cross-correlation results to search for controllable power response. This type of algorithm is relatively simple to implement, has a small amount of calculation, and is easy to process in real time. Therefore, it is the most widely used in practice and is more suitable for the miniaturization and low power requirements of this study.

(4-1)广义时延算法(4-1) Generalized Delay Algorithm

四元立体声阵列可以使估计方位角精度提高，但在估计距离时会产生一定估计误差。因此，时延估计算法的选取非常重要。时延估计算法是利用时延估计来完成目标的联合测向和测距，其中时延就是声源到达各麦克风的时间差。可以根据广义互相关时延估计法进行时延估计。The four-element stereo array can improve the accuracy of the estimated azimuth, but it will produce a certain estimation error when estimating the distance. Therefore, the selection of the delay estimation algorithm is very important. The delay estimation algorithm uses delay estimation to complete the joint direction finding and distance measurement of the target, where the delay is the time difference between the sound source reaching each microphone. The delay estimation can be performed according to the generalized cross-correlation delay estimation method.

广义互相关法以基本互相关为理论基础，通过求两信号之间的互功率谱，并在功率谱域内给予一定加权，再反变换到时域得到两信号之间的互相关函数，最终估计出两信号之间的时延。原理图如图5所示。The generalized cross-correlation method takes the basic cross-correlation as its theoretical basis. It obtains the cross-power spectrum between the two signals, gives a certain weight in the power spectrum domain, and then inversely transforms to the time domain to obtain the cross-correlation function between the two signals, and finally estimates the time delay between the two signals. The principle diagram is shown in Figure 5.

Taking the Fourier transform of the cross-correlation function of the two acoustic signals gives the cross-power spectrum of the signals received by the two microphones:

$$\varphi_{12}(\omega) = \int_{-\infty}^{\infty} R_{12}(\tau)\, e^{-j\omega\tau}\, d\tau$$

where φ₁₂(ω) and φ_ss(ω) are the power spectra corresponding to the cross-correlation function R₁₂(τ) and the source autocorrelation function R_ss(τ), respectively. Weighting the cross-power spectrum and taking the inverse Fourier transform gives the generalized cross-correlation function:

$$R_{12}^{g}(\tau) = \frac{1}{2\pi}\int_{-\infty}^{\infty} \Psi_{12}(\omega)\, \varphi_{12}(\omega)\, e^{j\omega\tau}\, d\omega$$

where Ψ₁₂(ω) is the generalized weighting function; in practice, different Ψ₁₂(ω) can be chosen for different noise and reverberation conditions.

Finally, the time delay is determined by peak detection on the generalized cross-correlation function:

$$\hat{\tau}_{12} = \arg\max_{\tau} R_{12}^{g}(\tau)$$

(4-2) Weighted correlation function, with reference to Table 1

表1频域加权函数表Table 1 Frequency domain weighting function table

经过PHAT加权的互功率谱近似于单位冲激响应的表达式，突出了时延峰值，能够有效抑制混响噪声，提高时延估计的精度和准确度。The cross-power spectrum weighted by PHAT is close to the expression of unit impulse response, which highlights the delay peak, can effectively suppress reverberation noise, and improve the precision and accuracy of delay estimation.
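Time-delay estimation with PHAT weighting can be sketched as below. This is a minimal illustration assuming the standard PHAT weight, which divides the cross-power spectrum by its magnitude; the function name, the zero-padding length, and the sign convention (positive delay means the first signal lags the second) are choices made here, not taken from the source.

```python
import numpy as np

def gcc_phat(sig1, sig2, fs, max_tau=None):
    """Estimate the delay in seconds of sig1 relative to sig2 using GCC-PHAT."""
    n = len(sig1) + len(sig2)                 # zero-pad to avoid circular wrap
    X1 = np.fft.rfft(sig1, n=n)
    X2 = np.fft.rfft(sig2, n=n)
    cross = X1 * np.conj(X2)                  # cross-power spectrum
    cross /= np.abs(cross) + 1e-12            # PHAT weighting: keep phase only
    cc = np.fft.irfft(cross, n=n)             # generalized cross-correlation
    max_shift = n // 2
    if max_tau is not None:
        max_shift = max(1, min(int(fs * max_tau), max_shift))
    # Reorder so that lags run from -max_shift to +max_shift.
    cc = np.concatenate((cc[-max_shift:], cc[:max_shift + 1]))
    shift = int(np.argmax(np.abs(cc))) - max_shift
    return shift / fs
```

For microphones a few centimeters apart, the physically possible delay is only a handful of samples at 44.1 kHz, so passing `max_tau` (spacing divided by the speed of sound) restricts the peak search to plausible lags.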

(5)声音识别模块(声音模型)的学习与训练(5) Learning and training of sound recognition module (sound model)

为了实现深度神经网络对弱目标一维声、震信号的特征学习与识别，需要将一维声、震信号向量特征转化为二维图像特征。利用一维声、震信号的时频特征向量，结合梅尔倒谱算法，实现图像特征的提取。再将这些倒谱图像特征，作为深度神经网络的输入，实现深度神经网络对一维声、震信号的分类和识别。In order to realize the feature learning and recognition of weak target one-dimensional sound and seismic signals by deep neural network, it is necessary to convert the one-dimensional sound and seismic signal vector features into two-dimensional image features. The image features are extracted by using the time-frequency feature vector of one-dimensional sound and seismic signals combined with the Mel cepstrum algorithm. These cepstrum image features are then used as the input of the deep neural network to realize the classification and recognition of one-dimensional sound and seismic signals by the deep neural network.

(5-1)倒谱计算(5-1) Cepstrum calculation

基于梅尔倒谱模型的声信号特征提取方法，如图6所示是一个声信号的频谱图，峰值就表示声音的主要频率成分，这些峰值称为共振峰(formants)，共振峰携带了声音的辨识属性，可以用来识别不同的声音。图7提取的是频谱的包络(Spectral Envelope)，即一条连接这些共振峰点的平滑曲线。The acoustic signal feature extraction method based on the Mel-Cepstrum model is shown in Figure 6, which is a spectrum of an acoustic signal. The peaks represent the main frequency components of the sound. These peaks are called formants. The formants carry the identification properties of the sound and can be used to identify different sounds. Figure 7 extracts the spectral envelope, which is a smooth curve connecting these formants.

可以看出频谱由两部分组成：包络和频谱的细节，把这两部分分离开，就可以得到包络如图8所示。It can be seen that the spectrum consists of two parts: the envelope and the details of the spectrum. By separating these two parts, we can get the envelope as shown in Figure 8.

在频谱上做傅里叶变换就相当于逆傅里叶变换Inverse FFT(IFFT)。在对数频谱上面做IFFT就相当于在一个伪频率(pseudo-frequency)坐标轴上面描述信号。Performing a Fourier transform on a spectrum is equivalent to an inverse Fourier transform (IFFT). Performing an IFFT on a logarithmic spectrum is equivalent to describing the signal on a pseudo-frequency axis.

由图9所示，包络主要包括低频成分，可将其看成一个每秒4个周期的正弦信号。在伪坐标轴上面的4Hz处设定一个峰值。而频谱的细节部分主要是高频。将其看作一个每秒100个周期的正弦信号。在伪坐标轴上面的100Hz处设定一个峰值。把两信号叠加即得到初始频谱信号如图10所示。As shown in Figure 9, the envelope mainly includes low-frequency components, which can be regarded as a sine signal with 4 cycles per second. A peak is set at 4Hz on the pseudo-coordinate axis. The detailed part of the spectrum is mainly high frequency. It can be regarded as a sine signal with 100 cycles per second. A peak is set at 100Hz on the pseudo-coordinate axis. The initial spectrum signal is obtained by superimposing the two signals as shown in Figure 10.

总结倒谱分析，其过程具体如下：To summarize the cepstrum analysis, the process is as follows:

1) Apply the Fourier transform to the original speech signal to obtain the spectrum X[k] = H[k]E[k]; considering only the magnitude, |X[k]| = |H[k]||E[k]|;

2) Take the logarithm: log|X[k]| = log|H[k]| + log|E[k]|;

3) Take the inverse Fourier transform to obtain: x[k] = h[k] + e[k].

The above steps are also called homomorphic signal processing, whose purpose is to convert a nonlinear problem into a linear one. The first step uses the Fourier transform to turn the convolutional sound signal into a multiplicative signal (convolution in the time domain corresponds to multiplication in the frequency domain). The second step takes the logarithm to turn the multiplicative signal into an additive signal. The third step applies the inverse transform, returning the additive components to a time-like domain. Although the sequences before and after are both time-domain sequences, the discrete time domains they lie in are clearly different, so the latter is called the cepstral domain. The cepstrum is therefore the spectrum obtained by taking the Fourier transform of a signal, applying the logarithm, and then applying the inverse Fourier transform. The calculation process is shown in FIG. 11.
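The three steps can be written directly with NumPy's FFT routines. This is a minimal sketch of the real cepstrum, with a simple low-quefrency "lifter" added to illustrate separating the envelope from the spectral detail; the small constant inside the logarithm and the `n_keep` cutoff are illustrative assumptions, and the function names are hypothetical.

```python
import numpy as np

def real_cepstrum(x):
    """Real cepstrum: inverse FFT of the log magnitude spectrum."""
    spectrum = np.fft.fft(x)                       # step 1: Fourier transform
    log_mag = np.log(np.abs(spectrum) + 1e-12)     # step 2: log of the magnitude
    return np.fft.ifft(log_mag).real               # step 3: inverse transform

def log_envelope(x, n_keep=30):
    """Smoothed log spectrum obtained by keeping only low-quefrency terms."""
    if n_keep < 2:
        raise ValueError("n_keep must be at least 2")
    c = real_cepstrum(x)
    lifter = np.zeros_like(c)
    lifter[:n_keep] = 1.0
    lifter[-(n_keep - 1):] = 1.0                   # symmetric counterpart
    return np.fft.fft(c * lifter).real
```

Keeping only the first few cepstral coefficients retains the slowly varying envelope h[k], while the discarded high-quefrency part corresponds to the spectral detail e[k].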

(5-2)频率分析(5-2) Frequency analysis

Mel频率分析(Mel-Frequency Analysis)是一种基于人类听觉感知的频率分析。实验观测发现人耳就像一个滤波器组一样，它只关注某些特定的频率分量，只让某些频率的信号通过。这些滤波器在频率坐标轴上并不是统一分布的，在低频区域有很多密集分布的滤波器，高频区域滤波器就分布得比较稀疏。Mel-Frequency Analysis is a frequency analysis based on human auditory perception. Experimental observations have found that the human ear is like a filter bank, which only focuses on certain frequency components and only allows signals of certain frequencies to pass. These filters are not uniformly distributed on the frequency coordinate axis. There are many densely distributed filters in the low-frequency area, and the filters in the high-frequency area are sparsely distributed.

人的听觉系统是一个特殊的非线性系统，它响应不同频率信号的灵敏度是不同的。所以在语音特征的提取上，人类听觉系统做得非常好，它不仅能提取出语义信息,而且能提取出说话人的个人特征，这些都是现有的语音识别系统所望尘莫及的。如果在语音识别系统中能模拟人类听觉感知处理特点，就可以提高声音的识别率。The human auditory system is a special nonlinear system, and its sensitivity to different frequency signals is different. Therefore, in terms of speech feature extraction, the human auditory system does a very good job. It can not only extract semantic information, but also extract the speaker's personal characteristics, which are beyond the reach of existing speech recognition systems. If the human auditory perception and processing characteristics can be simulated in the speech recognition system, the recognition rate of sound can be improved.

(5-3)梅尔倒谱算法(5-3) Mel Cepstrum Algorithm

梅尔频率倒谱系数(Mel Frequency Cepstrum Coefficient,MFCC)考虑到了人类的听觉特征，先将线性频谱映射到基于听觉感知的Mel非线性频谱中，然后转换到倒谱上。由图12所示，将不统一的频率转化为统一的频率，也就是统一的滤波器组。Mel Frequency Cepstrum Coefficient (MFCC) takes into account the human auditory characteristics, first mapping the linear spectrum to the Mel nonlinear spectrum based on auditory perception, and then converting it to the cepstrum. As shown in Figure 12, the non-uniform frequencies are converted to uniform frequencies, that is, a unified filter bank.

In the Mel frequency domain, perceived pitch is linear in Mel frequency. For example, if the Mel frequencies of two speech segments differ by a factor of two, the human ear perceives their pitches as differing by a factor of two as well. Passing the spectrum through a set of Mel filters gives the Mel spectrum; the cepstral coefficients h[k] obtained from the Mel spectrum are called Mel-frequency cepstral coefficients (MFCC), as shown in FIG. 13.

(6)样本集的构建(6) Construction of sample set

数据已经成为限制算法能力提升的重要瓶颈，充足的样本数据是深度学习模型能够达到理想效果的重要保障。在日常生活场景方面，ImageNet、COCO等大数据集的提出极大的提高了算法在这一场景的目标检测准确度，在近地面目标声、震信号方面，目前还没有针对该领域的大型标注数据库存在，因此，标注数据的缺乏是近地面目标检测任务亟待解决的问题之一。针对这一问题，为了多方面地验证所提方法的有效性，实验将采用两个公共数据集：ESC50和UrbanSound8K。Data has become an important bottleneck that limits the improvement of algorithm capabilities. Sufficient sample data is an important guarantee for deep learning models to achieve ideal results. In terms of daily life scenarios, the introduction of large data sets such as ImageNet and COCO has greatly improved the accuracy of target detection in this scenario. In terms of near-ground target sound and vibration signals, there is currently no large-scale annotated database for this field. Therefore, the lack of annotated data is one of the urgent problems to be solved in the near-ground target detection task. To address this issue, in order to verify the effectiveness of the proposed method in many aspects, the experiment will use two public data sets: ESC50 and UrbanSound8K.

ESC50数据集由2000个带标签的环境声音组成，这些音频被平衡地分为了50个类别(每个类别40个音频片段，每个音频片段的时长是5秒)。整体上它们又被分为5个简单定义的大类(每个大类又分为10个小类)：动物的声音、自然场景声音、人(非语音)声音、室内/家庭声音、室外/城市噪音。The ESC50 dataset consists of 2,000 labeled environmental sounds, which are balanced and divided into 50 categories (40 audio clips per category, each audio clip is 5 seconds long). Overall, they are divided into 5 simply defined categories (each category is divided into 10 subcategories): animal sounds, natural scene sounds, human (non-speech) sounds, indoor/home sounds, and outdoor/urban noises.

UrbanSound8K contains 8,732 short audio clips (each no longer than 4 seconds). The dataset covers 10 classes of environmental sound events with unbalanced class sizes: air conditioner, car horn, children playing, dog bark, drilling, engine idling, gunshot, jackhammer, siren, and street music. The dataset is pre-arranged into 10 folds, so this scheme uses 10-fold cross-validation on it to obtain more accurate results. Because clip durations vary, clips shorter than 4 seconds were padded using the original audio signal so that the network input has a consistent length.
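The padding and cross-validation steps described above can be sketched as follows. This is one possible reading, not the applicants' code: it assumes that "padding with the original audio signal" means repeating the clip's own samples until the 4-second target is reached, and that each clip carries a predefined fold label from 1 to 10. The function names and the clip identifiers are hypothetical.

```python
from typing import Dict, Iterator, List, Sequence, Tuple


def pad_by_repetition(samples: Sequence[float], sample_rate: int,
                      target_seconds: float = 4.0) -> List[float]:
    """Extend a clip to the target length by repeating its own samples.

    Clips that already reach the target length are trimmed to it.
    """
    if len(samples) == 0:
        raise ValueError("cannot pad an empty clip")
    target_len = int(round(sample_rate * target_seconds))
    out = list(samples)
    while len(out) < target_len:
        out.extend(samples)          # append another copy of the original clip
    return out[:target_len]


def fold_splits(fold_of_clip: Dict[str, int], n_folds: int = 10
                ) -> Iterator[Tuple[int, List[str], List[str]]]:
    """Yield (held_out_fold, train_ids, test_ids) using predefined fold labels."""
    for k in range(1, n_folds + 1):
        test_ids = sorted(c for c, f in fold_of_clip.items() if f == k)
        train_ids = sorted(c for c, f in fold_of_clip.items() if f != k)
        yield k, train_ids, test_ids


if __name__ == "__main__":
    clip = [0.1, -0.2, 0.3]                      # toy 3-sample clip
    print(len(pad_by_repetition(clip, sample_rate=22050)))
    labels = {"clip_a": 1, "clip_b": 2, "clip_c": 2}   # hypothetical fold labels
    for fold, train, test in fold_splits(labels, n_folds=2):
        print(fold, train, test)
```

Keeping the predefined folds, rather than reshuffling clips, matters because a random split could place excerpts of the same recording in both the training and test sets.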

Active safety warning system for road sections near water and cliffs

This application discloses an active safety warning system for road sections near water. The detection unit first acquires the acoustic target; the signal is compared with a self-built sound database and combined with a deep network to detect vehicles. The communication unit transmits the detection result to the early warning module, which consists of an LED active luminous sign, luminous road studs, a vehicle detector and a control box, as shown in the figure below. On receiving the signal, the module issues an audible and visual warning, so that drivers are alerted by the warning sign before entering a dangerous section near water or a cliff and can see the "last line of defense" formed by the road studs throughout the section.

In the embodiment, several road sections near water on rural roads in Lianyungang were selected as test sections. The equipment required for the experiment was installed at various positions along these sections. Site selection, sensor selection, sensor layout and installation, and data collection were carried out to verify the feasibility of the system, and the detection data were compared with actual conditions so that the computational model and calibration parameters could be adjusted at any time.

With reference to Figure 1, the system in the embodiment consists of a sound-based traffic detection unit 1, an early warning unit (solar road studs 2 or LED luminous sign 3), a power supply unit 4, and a communication unit. These units are integrated into a single multi-function warning device. The system works as follows: the vehicle detector is preconfigured with a sound library and a detection distance, and determines whether a sound within the detection range is produced by a vehicle. If the sound is judged to be a vehicle, the detector sends a signal to the main control system, which lights the LED luminous sign and sends a signal to the road studs by wireless communication; the road studs then flash synchronously, providing an early warning.
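The control flow just described (detector result, main control system, LED sign, road studs) can be sketched as a small state holder. This is an illustrative model only: the class names, the `on_detection` method and the `hold_seconds` hold time are assumptions introduced here, since the text does not say how long the warning stays active after the last vehicle is heard.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class RoadStud:
    stud_id: int
    flashing: bool = False


@dataclass
class WarningController:
    """Main control system: drives the LED sign and relays commands to the studs."""
    studs: List[RoadStud] = field(default_factory=list)
    hold_seconds: float = 10.0            # hypothetical hold time, not from the source
    sign_on: bool = False
    _last_vehicle_time: float = float("-inf")

    def on_detection(self, is_vehicle: bool, now: float) -> None:
        """Update the warning state from one detector result at time `now` (seconds)."""
        if is_vehicle:
            self._last_vehicle_time = now
        active = (now - self._last_vehicle_time) <= self.hold_seconds
        self.sign_on = active             # LED sign lit only while a warning is active
        for stud in self.studs:           # all studs receive the same command (synchronous)
            stud.flashing = active


if __name__ == "__main__":
    controller = WarningController(studs=[RoadStud(i) for i in range(5)])
    for t, heard_vehicle in [(0.0, False), (1.0, True), (5.0, False), (20.0, False)]:
        controller.on_detection(heard_vehicle, now=t)
        print(t, controller.sign_on, [s.flashing for s in controller.studs])
```

In a deployed system the stud command would travel over the wireless link rather than a direct attribute write; the loop here only stands in for that broadcast.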

Detection unit: a sound-based traffic detection device that detects the arrival of a target vehicle by sound.

Early warning unit: consists of the LED luminous sign and the road studs, and serves to warn the target vehicle.

Power supply unit: an 80 W solar panel powers the detection unit, the early warning unit and the communication unit.

Communication unit: sends the signal from the detection unit to the early warning unit by wireless transmission.

Control box: houses the main control system, the solar control system, the battery and the control wiring.

LED luminous sign: consists of a fully translucent LED triangular warning sign and a dot-matrix luminous text reminder sign. During a warning, the warning sign stays lit and the reminder sign flashes; both are off otherwise. Operating voltage 12 V; power consumption not less than 20 W.

Vehicle detector (sound-based): sound detection distance < 30 m, false detection rate < 5%, missed detection rate < 5%, power consumption 0.3 W.
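When field data are compared against these detector targets, the two rates might be computed as in the sketch below. The definitions are assumptions, since the text does not define them: the false detection rate is taken as false alarms divided by all reported detections, and the missed detection rate as missed vehicles divided by all vehicles that actually passed. The counts in the example are made up for illustration.

```python
from typing import Tuple


def detection_rates(true_detections: int, false_detections: int,
                    missed_vehicles: int) -> Tuple[float, float]:
    """Return (false_detection_rate, missed_detection_rate) under the assumed definitions."""
    reported = true_detections + false_detections    # everything the detector flagged
    actual = true_detections + missed_vehicles       # every vehicle that really passed
    false_rate = false_detections / reported if reported else 0.0
    missed_rate = missed_vehicles / actual if actual else 0.0
    return false_rate, missed_rate


def meets_spec(false_rate: float, missed_rate: float, limit: float = 0.05) -> bool:
    """Check both rates against the < 5% targets listed for the detector."""
    return false_rate < limit and missed_rate < limit


if __name__ == "__main__":
    # Hypothetical field counts, not measured results.
    fr, mr = detection_rates(true_detections=96, false_detections=4, missed_vehicles=4)
    print(f"false detection rate {fr:.1%}, missed detection rate {mr:.1%}, "
          f"within spec: {meets_spec(fr, mr)}")
```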

Solar road studs: each stud has an independent power supply, emits red light when a vehicle approaches, and flashes in sync with the other studs. Studs are installed at 2 m spacing. Design dimensions: diameter 1200, thickness 25.

Power supply system: designed for 6 hours of operation per day and continuous operation through 7 consecutive overcast or rainy days; 80 W solar panel, 100 Ah battery.
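A rough energy budget shows how these design figures relate. The sketch below rests on several assumptions that the text does not state: the battery is a nominal 12 V unit (matching the sign's operating voltage), the sign draws its nominal 20 W only during the 6 working hours, the detector draws 0.3 W around the clock, and no solar input arrives during the 7 days. Conversion losses, depth-of-discharge limits, and the consumption of the controller and communication unit are ignored; the road studs are excluded because each has its own supply.

```python
from typing import Tuple


def autonomy_budget(sign_w: float = 20.0, detector_w: float = 0.3,
                    work_hours: float = 6.0, detector_hours: float = 24.0,
                    rainy_days: int = 7, battery_ah: float = 100.0,
                    battery_v: float = 12.0) -> Tuple[float, float, float]:
    """Return (daily_wh, required_wh, capacity_wh) for the no-sun design case."""
    daily_wh = sign_w * work_hours + detector_w * detector_hours
    required_wh = daily_wh * rainy_days
    capacity_wh = battery_ah * battery_v      # assumed 12 V nominal battery
    return daily_wh, required_wh, capacity_wh


if __name__ == "__main__":
    daily, required, capacity = autonomy_budget()
    print(f"daily load     : {daily:.1f} Wh")
    print(f"7-day demand   : {required:.1f} Wh")
    print(f"battery energy : {capacity:.1f} Wh")
    print(f"demand / capacity = {required / capacity:.0%}")
```

Under these assumptions the 7-day demand is about 890 Wh against a nominal 1,200 Wh of stored energy, so the stated battery size leaves some margin for the loads and losses left out of the sketch.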

Economic benefit evaluation

The active safety warning system for road sections near water and cliffs warns drivers in time and reduces the likelihood of accidents. It can prevent or mitigate the casualties and vehicle damage caused by serious traffic accidents, thereby reducing accident loss costs.

This embodiment uses an acoustic detection device for vehicle detection. Compared with the radar, cameras and similar devices used in typical warning systems, it costs less, and the supporting facilities are simple to build and require little investment, enabling low-cost, all-weather vehicle detection and warning.

Social benefit evaluation

The active safety warning system for road sections near water and cliffs helps drivers recognize potential dangers, safeguards driving on these special sections, reduces the occurrence of traffic accidents and the resulting injuries and deaths, and protects the lives and property of road users, thereby improving road safety and the comfort and confidence of travel.

The specific embodiments described herein merely illustrate the spirit of the present invention. Those skilled in the art to which the invention belongs may make various modifications or additions to the described embodiments, or substitute them in similar ways, without departing from the spirit of the invention or exceeding the scope defined by the appended claims.

Claims


1. An active safety warning system for road sections near water and cliffs, characterized in that it comprises a detection unit, a data processing unit, a communication unit and an early warning unit, wherein:

2. The system according to claim 1, characterized in that the detection unit consists of a plurality of sound sensors.

3. The system according to claim 1, characterized in that the detection unit comprises an active noise reduction module, and the active noise reduction module performs the following steps:

4. The system according to claim 3, characterized in that, in the active noise reduction process, the adaptive filtering algorithm is implemented by the following basic mathematical expression:

5. The system according to claim 1, characterized in that the preprocessing module preprocesses the detected sound signal in the following specific steps:

4) Sound data feature extraction:

6. The system according to claim 1, characterized in that the sound recognition module performs recognition on the preprocessed sound data using a sound model and outputs a detection result, the sound model being constructed in the following specific steps:

7. The system according to claim 1, characterized in that the early warning unit is an LED luminous sign; during a warning, the warning sign stays lit and the reminder sign flashes, and both are off otherwise.

8. The system according to claim 1, characterized in that the early warning unit is a solar road stud; during a warning, the road stud emits red light and multiple road studs flash synchronously.