CN116115239A - Awkward working posture recognition method for construction workers based on multimodal data fusion - Google Patents

Awkward working posture recognition method for construction workers based on multimodal data fusion

Info

Publication number
CN116115239A
CN116115239A (application CN202211474969.3A)
Authority
CN
China
Prior art keywords
data
features
posture
multimodal
fusion
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202211474969.3A
Other languages
Chinese (zh)
Other versions
CN116115239B (en)
Inventor
夏侯遐迩
李子睿
夏吉康
李启明
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Southeast University
Original Assignee
Southeast University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Southeast University
Priority to CN202211474969.3A
Publication of CN116115239A
Application granted
Publication of CN116115239B
Legal status: Active (current)
Anticipated expiration

Abstract

Translated from Chinese

本发明公开了一种基于多模态数据融合的建筑工人尴尬工作姿势识别方法,包括采集被监测者的原始脑电数据、原始行为数据以及原始姿势图像;对原始脑电数据进行预处理,提取时域、频域以及非线性特征;对原始行为数据进行标准化操作,并提取均值作为行为数据特征;从被监测者姿势图像中提取人体主要点位的空间坐标,作为姿势状态特征;基于前期融合策略对提取的数据特征进行融合;将融合后特征数据集输入至已训练好的BP神经网络,输出被监测者的尴尬姿势工作类别,从而通过提取被监测者的脑电数据、行为数据以及姿势图像的多模态数据融合特征,实现尴尬工作姿势自动识别,改善了基于单模态数据识别姿势而存在的准确率不足等局限。


The invention discloses a method for recognizing awkward working postures of construction workers based on multimodal data fusion, which includes collecting the raw EEG data, raw behavior data and raw posture images of the monitored person; preprocessing the raw EEG data and extracting time-domain, frequency-domain and nonlinear features; standardizing the raw behavior data and extracting the mean values as behavior data features; extracting the spatial coordinates of the main points of the human body from the posture images of the monitored person as posture state features; fusing the extracted data features based on an early fusion strategy; and inputting the fused feature data set into a trained BP neural network, which outputs the awkward working posture category of the monitored person. By extracting fused multimodal features from the monitored person's EEG data, behavior data and posture images, the method realizes automatic recognition of awkward working postures and overcomes the limited accuracy of posture recognition based on single-modal data.


Description

Translated from Chinese
基于多模态数据融合的建筑工人尴尬工作姿势识别方法Awkward working posture recognition method for construction workers based on multimodal data fusion

技术领域Technical Field

本发明涉及建筑安全健康管理技术领域,特别是涉及一种基于多模态数据融合的建筑工人尴尬工作姿势识别方法。The present invention relates to the technical field of building safety and health management, and in particular to a method for identifying awkward working postures of construction workers based on multimodal data fusion.

背景技术Background Art

当前我国建筑业安全形势十分严峻，其事故率是各行业平均事故率的3倍，据统计，建筑工人不安全行为所导致的事故占建筑业安全事故总数的80%以上，其中，尴尬工作姿势占工人作业时所涉及不安全行为总数的13.1%以上，从而被视为最重要的类别，并且，建筑工人普遍长期处于尴尬工作姿势执行任务，会引发一系列肌肉骨骼疾病，并直接威胁其职业健康安全，致使建筑业也被列为肌肉骨骼疾病和伤害最危险的行业之一，对此，亟待引入前沿技术来识别和预警现场建筑工人的尴尬工作姿势，完善建筑工人职业健康安全管理体系。The current safety situation in China's construction industry is severe, with an accident rate three times the all-industry average. According to statistics, accidents caused by unsafe behaviors of construction workers account for more than 80% of all safety accidents in the construction industry, and awkward working postures account for more than 13.1% of the unsafe behaviors involved in workers' operations, making them the most important category. Moreover, construction workers generally perform tasks in awkward working postures for long periods, which can cause a series of musculoskeletal disorders and directly threaten their occupational health and safety, so the construction industry is also listed as one of the industries at highest risk of musculoskeletal disorders and injuries. It is therefore urgent to introduce cutting-edge technologies to identify and warn of awkward working postures of on-site construction workers and to improve the occupational health and safety management system for construction workers.

目前,已有大量计算机科学以及自动化领域的前沿技术应用于识别与预警建筑工人的尴尬工作姿势,从活动角度,计算机视觉技术结合深度学习算法被应用于直接姿势、动作以及行为的识别,与此同时,多种运动传感器被用于提取人体行为数据,从而实现尴尬工作姿势的实时识别与预警;从心理角度,多种可穿戴设备(脑电头盔、智能腕带以及眼动仪等)被用于测定心电、脑电、皮电以及眼动轨迹等反映认知状态和情绪状态的生理数据,间接实现对尴尬工作姿势的监测。At present, a large number of cutting-edge technologies in computer science and automation have been applied to identify and warn construction workers of awkward working postures. From the perspective of activities, computer vision technology combined with deep learning algorithms are applied to the direct recognition of postures, movements and behaviors. At the same time, a variety of motion sensors are used to extract human behavior data, thereby realizing real-time recognition and warning of awkward working postures; from a psychological perspective, a variety of wearable devices (EEG helmets, smart wristbands and eye trackers, etc.) are used to measure physiological data such as electrocardiogram, electroencephalogram, skin electricity and eye movement trajectories that reflect cognitive and emotional states, thereby indirectly realizing the monitoring of awkward working postures.

当前已有识别技术倾向于从单一模态数据实现解释和分类的功能,而工人的动作与认知存在复杂性与交互作用,仅从单一模态数据难以客观和准确地识别姿势,多模态融合技术基于多种异构模态数据协同推理,通过多项技术采集与融合多模态数据,能够更为准确地识别尴尬工作姿势;鉴于当前建筑领域缺乏基于多模态融合的尴尬工作姿势识别应用框架,本发明提供了一种基于多模态数据融合的建筑工人尴尬工作姿势识别方法。Current recognition technologies tend to implement interpretation and classification functions from single modal data, but workers' movements and cognition are complex and interactive, and it is difficult to objectively and accurately identify postures from single modal data alone. Multimodal fusion technology is based on collaborative reasoning of multiple heterogeneous modal data. It collects and fuses multimodal data through multiple technologies, and can more accurately identify awkward working postures. In view of the lack of an awkward working posture recognition application framework based on multimodal fusion in the current construction field, the present invention provides a construction worker's awkward working posture recognition method based on multimodal data fusion.

发明内容Summary of the invention

为了解决以上技术问题,本发明提供一种基于多模态数据融合的建筑工人尴尬工作姿势识别方法,包括以下步骤In order to solve the above technical problems, the present invention provides a construction worker's awkward working posture recognition method based on multimodal data fusion, comprising the following steps:

S1、采集被监测者的多模态数据,多模态数据包括原始脑电数据、原始行为数据以及姿势图像;S1. Collect multimodal data of the monitored person, where the multimodal data includes original EEG data, original behavior data, and posture images;

S2、对原始脑电数据进行伪迹消除以及特征提取处理,得到脑电数据特征;S2, performing artifact elimination and feature extraction processing on the original EEG data to obtain EEG data features;

S3、对原始行为数据进行标准化、窗口划分以及均值特征提取处理,得到行为数据特征;S3, standardizing, windowing, and extracting mean features on the original behavior data to obtain behavior data features;

S4、对姿势图像进行人体点位识别与空间坐标提取,得到姿势状态特征;S4, performing human body point recognition and spatial coordinate extraction on the posture image to obtain posture state features;

S5、基于前期融合策略对脑电数据特征、行为数据特征以及姿势状态特征进行特征融合,得到多模态数据融合特征;S5, based on the previous fusion strategy, feature fusion is performed on the EEG data features, the behavior data features and the posture state features to obtain multimodal data fusion features;

S6、将多模态数据融合特征输入至训练好的BP神经网络;S6, inputting the multimodal data fusion features into the trained BP neural network;

S7、输出被监测者的尴尬工作姿势类别。S7. Output the category of the monitored person's awkward working posture.

本发明进一步限定的技术方案是:The technical solution further defined in the present invention is:

进一步的,步骤S2中,对原始脑电数据进行伪迹去除包括外部伪迹消除和内部伪迹消除,通过有限脉冲响应带通滤波器进行外部伪迹消除;通过独立成分分析对原始脑电数据中的内部伪迹进行判别与筛除。Furthermore, in step S2, artifact removal is performed on the original EEG data, including external artifact removal and internal artifact removal. External artifact removal is performed by a finite impulse response bandpass filter; and internal artifacts in the original EEG data are identified and screened out by independent component analysis.

前所述的一种基于多模态数据融合的建筑工人尴尬工作姿势识别方法,步骤S2中,通过固定窗口划分法提取被监测者的多维度脑电特征数据,多维度脑电特征数据包括时域特征、频域特征以及非线性特征,时域特征包括标准差、波动指数以及峰度;频域特征包括Delta频带功率谱密度、Theta频带功率谱密度、Alpha频带功率谱密度、Beta频带功率谱密度以及Gamma频带功率谱密度;非线性特征包括近似熵、模糊熵以及赫斯特指数。In the above-mentioned method for identifying awkward working postures of construction workers based on multimodal data fusion, in step S2, multi-dimensional EEG feature data of the monitored person is extracted by a fixed window division method, and the multi-dimensional EEG feature data includes time domain features, frequency domain features and nonlinear features. The time domain features include standard deviation, fluctuation index and kurtosis; the frequency domain features include Delta band power spectral density, Theta band power spectral density, Alpha band power spectral density, Beta band power spectral density and Gamma band power spectral density; the non-linear features include approximate entropy, fuzzy entropy and Hurst exponent.

前所述的一种基于多模态数据融合的建筑工人尴尬工作姿势识别方法,时域特征中标准差的求取算法如下,In the above-mentioned method for identifying awkward working postures of construction workers based on multimodal data fusion, the algorithm for obtaining the standard deviation in the time domain features is as follows:

Std = sqrt( (1/n) · Σ_{i=1}^{n} ( x_i − x̄ )^2 )

其中，n表示该通道下所采集数据点的总数，x_i表示该通道下所采集的第i个数据点，x̄表示该通道下所采集n个数据点的平均值；Where n represents the total number of data points collected under the channel, x_i represents the i-th data point collected under the channel, and x̄ represents the average value of the n data points collected under the channel;

时域特征中波动指数的求取算法如下,The algorithm for obtaining the volatility index in the time domain characteristics is as follows:

VI = (1/(n−1)) · Σ_{i=1}^{n−1} | x(i+1) − x(i) |

其中,n表示该通道下所采集数据点的总数,x(i)表示该通道下所采集的第i个数据点,x(i+1)表示该通道下所采集的第i+1个数据点;Where n represents the total number of data points collected under the channel, x(i) represents the i-th data point collected under the channel, and x(i+1) represents the i+1-th data point collected under the channel;

时域特征中峰度的求取算法如下,The algorithm for obtaining the kurtosis in the time domain features is as follows:

Kurtosis = (1/n) · Σ_{i=1}^{n} ( x_i − x̄ )^4 / s^4 − 3

其中，n表示该通道下所采集数据点的总数，s表示该通道下所采集n个数据点的标准差，x_i表示该通道下所采集的第i个数据点，x̄表示该通道下所采集n个数据点的平均值。Where n represents the total number of data points collected under the channel, s represents the standard deviation of the n data points collected under the channel, x_i represents the i-th data point collected under the channel, and x̄ represents the average value of the n data points collected under the channel.

前所述的一种基于多模态数据融合的建筑工人尴尬工作姿势识别方法,非线性特征中近似熵的计算包括以下步骤In the above-mentioned method for identifying awkward working postures of construction workers based on multimodal data fusion, the calculation of approximate entropy in nonlinear features includes the following steps:

S2.1.1、设原始脑电数据为x(1),x(2),……,x(n),共n个点,按序号排列顺序组成m维向量X(i)=[x(i),x(i+1),……,x(i+m-1)],其中i=1,2,……,n-m+1;S2.1.1, suppose the original EEG data are x(1), x(2), ..., x(n), a total of n points, arranged in order to form an m-dimensional vector X(i) = [x(i), x(i+1), ..., x(i+m-1)], where i = 1, 2, ..., n-m+1;

S2.1.2、定义第i个向量X(i)与第j个向量X(j)之间的距离dijS2.1.2, define the distance dij between the i-th vector X(i) and the j-th vector X(j),

d_ij = max[ | x(i+k) − x(j+k) | ]，0 ≤ k ≤ m−1

S2.1.3、给定阈值r，对每个向量X(i)，统计满足d_ij≤r*Std条件的次数，其中Std为序列数据的标准差，并求出统计次数与距离总数n−m的比值，记作C_i^m(r)；S2.1.3. Given a threshold r, for each vector X(i), count the number of times the condition d_ij ≤ r*Std is satisfied, where Std is the standard deviation of the sequence data, and compute the ratio of this count to the total number of distances n−m, denoted C_i^m(r);

S2.1.4、将C_i^m(r)取对数，再对所有的i取平均值，记为φ^m(r)，S2.1.4. Take the logarithm of C_i^m(r) and then average over all i, denoting the result φ^m(r),

φ^m(r) = (1/(n−m+1)) · Σ_{i=1}^{n−m+1} ln C_i^m(r)

S2.1.5、m值加1，以m+1维度重复步骤S2.1.1至步骤S2.1.4，得到C_i^{m+1}(r)和φ^{m+1}(r)，并求出近似熵为S2.1.5. Increase m by 1 and repeat steps S2.1.1 to S2.1.4 in dimension m+1 to obtain C_i^{m+1}(r) and φ^{m+1}(r); the approximate entropy is then

ApEn = lim_{n→∞} [ φ^m(r) − φ^{m+1}(r) ]

其中,ApEn表示近似熵。Wherein, ApEn represents approximate entropy.

前所述的一种基于多模态数据融合的建筑工人尴尬工作姿势识别方法,非线性特征中模糊熵的算法包括以下步骤The aforementioned method for identifying awkward working postures of construction workers based on multimodal data fusion, the algorithm of fuzzy entropy in nonlinear features includes the following steps

S2.2.1、设原始脑电数据为x(1),x(2),……,x(n),共n个点;S2.2.1, let the original EEG data be x(1), x(2), ..., x(n), with a total of n points;

S2.2.2、定义嵌入维数m与相似容忍度r，重构相空间，生成一组m维向量X(i)=[x(i),x(i+1),…,x(i+m-1)]-x_0(i)，其表示自x(i)起的m个连续数据点所组成的向量，其中i=1,2,…,n-m+1，S2.2.2. Define the embedding dimension m and the similarity tolerance r, reconstruct the phase space, and generate a set of m-dimensional vectors X(i) = [x(i), x(i+1), ..., x(i+m-1)] - x_0(i), each representing the vector composed of m consecutive data points starting from x(i), where i = 1, 2, ..., n-m+1,

x_0(i) = (1/m) · Σ_{k=0}^{m−1} x(i+k)

其中,x0(i)表示该m个数据点的均值;Wherein, x0 (i) represents the mean of the m data points;

S2.2.3、定义模糊隶属函数A(x),S2.2.3, define the fuzzy membership function A(x),

A(x) = 1，x = 0；A(x) = exp( −ln(2) · (x/r)^2 )，x > 0

其中,r表示相似容忍度;Among them, r represents similarity tolerance;

S2.2.4、根据A(x)表达式将其变形为S2.2.4, according to the expression A(x), it is transformed into

D_ij^m = A( d_ij^m ) = exp( −ln(2) · ( d_ij^m / r )^2 )

其中，j=1,2,…,n-m+1，且j与i不相等，d_ij^m表示窗口向量X(i)与X(j)之间的最大绝对距离，计算如下所示，Where j = 1, 2, ..., n-m+1 and j ≠ i, and d_ij^m denotes the maximum absolute distance between the window vectors X(i) and X(j), computed as follows:

d_ij^m = max_{k=0,…,m−1} | ( x(i+k) − x_0(i) ) − ( x(j+k) − x_0(j) ) |

S2.2.5、定义函数φ^m(r)，S2.2.5. Define the function φ^m(r),

φ^m(r) = (1/(n−m+1)) · Σ_{i=1}^{n−m+1} [ (1/(n−m)) · Σ_{j=1,j≠i}^{n−m+1} D_ij^m ]

S2.2.6、重复步骤S2.2.1至步骤S2.2.5，按照序列顺序重构生成m+1维矢量，定义函数φ^{m+1}(r)，S2.2.6. Repeat steps S2.2.1 to S2.2.5, reconstructing (m+1)-dimensional vectors in sequence order, and define the function φ^{m+1}(r),

φ^{m+1}(r) = (1/(n−m)) · Σ_{i=1}^{n−m} [ (1/(n−m−1)) · Σ_{j=1,j≠i}^{n−m} D_ij^{m+1} ]

S2.2.7、在步骤S2.2.5的基础上,定义模糊熵为S2.2.7, based on step S2.2.5, define the fuzzy entropy as

FuzzyEn(m,r) = lim_{n→∞} [ ln φ^m(r) − ln φ^{m+1}(r) ]

对于由N个数据点所组成的有限时间序列数据,模糊熵最终可以表示为For a finite time series data consisting of N data points, the fuzzy entropy can finally be expressed as

FuzzyEn(m,r,N) = ln φ^m(r) − ln φ^{m+1}(r)

其中,FuzzyEn(m,r,N)表示模糊熵。Among them, FuzzyEn(m,r,N) represents fuzzy entropy.

前所述的一种基于多模态数据融合的建筑工人尴尬工作姿势识别方法,非线性特征中赫斯特指数的计算包括以下步骤The aforementioned method for identifying awkward working postures of construction workers based on multimodal data fusion, the calculation of the Hurst exponent in the nonlinear feature includes the following steps

S2.3.1、根据由n个数据点所组成的原始脑电数据序列x(1),x(2),……,x(n),计算平均值为S2.3.1. Based on the original EEG data sequence consisting of n data points x(1), x(2), ..., x(n), calculate the average value as

m = (1/n) · Σ_{i=1}^{n} x(i)

S2.3.2、计算出前t个数与平均值m的偏差之和wtS2.3.2. Calculate the sum of the deviations of the first t numbers from the average value m as wt :

w_t = Σ_{i=1}^{t} ( x(i) − m )，t = 1, 2, …, n

S2.3.3、计算偏差之和的最大值和最小值的差R(n)为S2.3.3. Calculate the difference between the maximum and minimum values of the sum of deviations R(n) as

R(n) = max(0, w_1, w_2, …, w_n) − min(0, w_1, w_2, …, w_n)

S2.3.4、计算赫斯特指数H为S2.3.4. Calculate the Hurst index H as

H = log( R(n) / S(n) ) / log(n)

其中,S(n)为原始脑电数据序列的标准差。Among them, S(n) is the standard deviation of the original EEG data sequence.

前所述的一种基于多模态数据融合的建筑工人尴尬工作姿势识别方法,步骤S4中,得到姿势状态特征的方法包括以下步骤In the above-mentioned method for identifying awkward working postures of construction workers based on multimodal data fusion, in step S4, the method for obtaining posture state features includes the following steps:

S4.1、对于每个时间窗口所对应的姿势图像,识别人体的33个主要点位;S4.1. For each posture image corresponding to each time window, identify 33 main points of the human body;

S4.2、识别人体边缘,并沿边缘剪裁图片尺寸;S4.2, identifying the edge of the human body and cropping the image size along the edge;

S4.3、定位人体中心点为形心,选取包含所有33个主要点位的最小正方形框,将裁剪后的姿势图像定位于三维空间中;S4.3, locate the center point of the human body as the centroid, select the smallest square box containing all 33 main points, and locate the cropped posture image in three-dimensional space;

S4.4、以最小正方形框的右上角点为原点建立三维直角坐标系;S4.4. Establish a three-dimensional rectangular coordinate system with the upper right corner of the smallest square frame as the origin;

S4.5、进一步确定33个主要点位的空间坐标(x,y,z)与可见性坐标(v),将空间坐标作为被监测者的姿势状态特征。S4.5. Further determine the spatial coordinates (x, y, z) and visibility coordinates (v) of 33 main points, and use the spatial coordinates as the posture state features of the monitored person.

前所述的一种基于多模态数据融合的建筑工人尴尬工作姿势识别方法,步骤S6中,BP神经网络的训练方法包括模型训练数据集的准备方法和BP神经网络的搭建与训练方法,其中模型训练数据集的准备方法包括以下步骤In the above-mentioned method for identifying awkward working postures of construction workers based on multimodal data fusion, in step S6, the training method of the BP neural network includes a method for preparing a model training data set and a method for building and training the BP neural network, wherein the method for preparing a model training data set includes the following steps:

S6.1.1、召集施工现场建筑工人,为建筑工人佩戴上各项用于采集多模态数据的设备以及传感器;S6.1.1. Gather construction workers at the construction site and equip them with various devices and sensors for collecting multimodal data;

S6.1.2、施工工人在不同的尴尬工作姿势下进行模拟施工作业;S6.1.2, Construction workers perform simulated construction work in different awkward working postures;

S6.1.3、各项用于采集多模态数据的设备以及传感器对多模态数据进行采集和同步导出；S6.1.3. The devices and sensors used for collecting the multimodal data acquire the multimodal data and export it synchronously;

S6.1.4、对已采集的多模态数据,实施相应预处理、特征提取以及融合,最终获得处于尴尬工作姿势下建筑工人的多模态数据融合特征;S6.1.4. Perform corresponding preprocessing, feature extraction and fusion on the collected multimodal data, and finally obtain the multimodal data fusion features of the construction workers in awkward working postures;

S6.1.5、获得用于训练BP神经网络的训练数据集。S6.1.5. Obtain a training data set for training the BP neural network.

前所述的一种基于多模态数据融合的建筑工人尴尬工作姿势识别方法,BP神经网络的搭建与训练方法包括以下步骤The aforementioned method for identifying awkward working postures of construction workers based on multimodal data fusion, the construction and training method of the BP neural network includes the following steps:

S6.2.1、将多模态数据融合特征作为输入层,隐藏层的神经元数量设置为500,设置各个尴尬工作姿势所对应的标签;S6.2.1. Use the multimodal data fusion features as the input layer, set the number of neurons in the hidden layer to 500, and set the labels corresponding to each awkward working posture;

S6.2.2、初始化BP神经网络的超参数,各层之间的单向全链接的权值设置为[-1,1]之间的随机数;S6.2.2. Initialize the hyperparameters of the BP neural network. The weights of the one-way full links between each layer are set to random numbers between [-1, 1].

S6.2.3、将已标记完成的不同尴尬工作姿势下所对应的训练数据输入至BP神经网络;S6.2.3, inputting the marked training data corresponding to different awkward working postures into the BP neural network;

S6.2.4、对神经网络的学习率α和各个超参数进行不断调整,直至得到最小的误差值,获得最小误差值所对应的学习率;S6.2.4. Continuously adjust the learning rate α and various hyperparameters of the neural network until the minimum error value is obtained, and obtain the learning rate corresponding to the minimum error value;

S6.2.5、最终获得已训练完成的适用于尴尬工作姿势识别的BP神经网络。S6.2.5. Finally, a trained BP neural network suitable for awkward working posture recognition is obtained.

本发明的有益效果是:The beneficial effects of the present invention are:

本发明中，通过脑电仪采集被监测者的脑电数据，通过陀螺仪加速度计和压力传感器采集被监测者的行为数据，其包括加速度、角速度以及角度，同时通过计算机视觉技术采集被监测者的姿势图像；再对采集的脑电数据进行伪迹去除与特征提取处理，对采集的行为数据进行标准化、窗口划分与均值特征提取，以及对采集的姿势图像进行人体点位识别与空间坐标提取，得到了脑电数据特征、行为数据特征以及姿势状态特征；基于前期融合策略，将提取数据进行特征融合，再输入至已训练好的BP神经网络，输出被监测者的尴尬姿势工作类别，能够对现场建筑工人尴尬工作姿势实施既准确又客观的自动化识别。In the present invention, the EEG data of the monitored person are collected by an electroencephalograph, the behavior data of the monitored person, including acceleration, angular velocity and angle, are collected by a gyroscope-accelerometer and pressure sensors, and the posture images of the monitored person are collected by computer vision technology; artifact removal and feature extraction are then performed on the collected EEG data, standardization, window division and mean feature extraction are performed on the collected behavior data, and human body point recognition and spatial coordinate extraction are performed on the collected posture images, yielding the EEG data features, behavior data features and posture state features; based on the early fusion strategy, the extracted features are fused and input into the trained BP neural network, which outputs the awkward working posture category of the monitored person, enabling accurate and objective automated recognition of the awkward working postures of on-site construction workers.

附图说明BRIEF DESCRIPTION OF THE DRAWINGS

图1为本发明实施例中识别方法的整体流程图;FIG1 is an overall flow chart of an identification method according to an embodiment of the present invention;

图2为本发明实施例中原始脑电数据的处理流程图;FIG2 is a flowchart of processing raw EEG data in an embodiment of the present invention;

图3为本发明实施例中姿势图像的处理流程图;FIG3 is a flowchart of processing a posture image in an embodiment of the present invention;

图4为本发明实施例中基于多模态数据融合的尴尬工作姿势识别流程图;FIG4 is a flowchart of awkward working posture recognition based on multimodal data fusion in an embodiment of the present invention;

图5为本发明实施例中多模态数据特征的融合流程图;FIG5 is a flowchart of the fusion of multimodal data features in an embodiment of the present invention;

图6为本发明实施例中BP神经网络的搭建与训练流程图。FIG6 is a flowchart of building and training a BP neural network in an embodiment of the present invention.

具体实施方式DETAILED DESCRIPTION

本实施例提供的一种基于多模态数据融合的建筑工人尴尬工作姿势识别方法,如图1所示,包括以下步骤This embodiment provides a method for identifying awkward working postures of construction workers based on multimodal data fusion, as shown in FIG1 , comprising the following steps:

S1、采集被监测者的多模态数据,多模态数据包括原始脑电数据、原始行为数据以及姿势图像;S1. Collect multimodal data of the monitored person, where the multimodal data includes original EEG data, original behavior data, and posture images;

通过搭载14个电极(AF3、F3、F7、FC5、T7、P7、O1、O2、P8、T8、FC6、F8、F4、AF4)的EmotivEpoc X脑电仪,按照128Hz的采样率对14个电极所对应通道下的脑电数据进行采集,并搭配Emotiv Pro软件实现对脑电数据的实时记录、存储与导出,获得被监测者的原始脑电数据。The EmotivEpoc X electroencephalogram (EEG) device is equipped with 14 electrodes (AF3, F3, F7, FC5, T7, P7, O1, O2, P8, T8, FC6, F8, F4, and AF4). The EEG data of the channels corresponding to the 14 electrodes are collected at a sampling rate of 128 Hz, and the Emotiv Pro software is used to record, store, and export the EEG data in real time to obtain the original EEG data of the monitored person.

基于阵列分布式柔性薄膜压力传感器的足底压力传感鞋垫被用于实时测定脚底压力分布情况,按照每0.5秒平均2次的频率记录每只鞋垫所分布16个感应点的读数,以反映被监测者脚底实时压力分布与波动情况,并实时导出所记录的压力数据,获得被监测者的原始足底压力数据。Plantar pressure sensing insoles based on array distributed flexible thin film pressure sensors are used to measure the plantar pressure distribution in real time. The readings of 16 sensing points distributed on each insole are recorded at an average frequency of 2 times every 0.5 seconds to reflect the real-time pressure distribution and fluctuation of the monitored person's sole, and the recorded pressure data are exported in real time to obtain the original plantar pressure data of the monitored person.

基于六轴陀螺仪加速度计,按照100Hz采样率,对被监测者的加速度、角速度以及角度在三维空间中X、Y、Z坐标轴的分量进行实时记录与导出,获得被监测者的加速度、角速度以及角度数据,并与原始足底压力数据合并、整理为原始行为数据。Based on the six-axis gyro accelerometer, the acceleration, angular velocity and angle components of the monitored person in the X, Y, and Z coordinate axes in three-dimensional space are recorded and exported in real time at a sampling rate of 100 Hz to obtain the monitored person's acceleration, angular velocity and angle data, which are then merged with the original plantar pressure data and organized into original behavioral data.

基于现场布设的摄像头,按照每0.5秒1张的记录频率对被监测者所处姿势进行拍摄图像记录,并实时导出与上传,获得被监测者的姿势图像。Based on the cameras deployed on site, the posture of the monitored person is recorded at a frequency of 1 image every 0.5 seconds, and the images are exported and uploaded in real time to obtain the posture images of the monitored person.

S2、对原始脑电数据进行伪迹消除以及特征提取处理,得到脑电数据特征;S2, performing artifact elimination and feature extraction processing on the original EEG data to obtain EEG data features;

如图2所示,对采集的被监测者原始脑电数据进行预处理操作,对其中所包含的内部伪迹和外部伪迹进行消除,以确保消除伪迹后的脑电数据能够准确、客观地反映大脑的真实活动情况;再以0.5s大小的固定窗对脑电数据进行划分;再对消除伪迹后的脑电数据提取时域特征、频域特征与非线性特征,作为脑电数据信息性质与可测量性质的度量。As shown in Figure 2, the collected original EEG data of the monitored person are preprocessed to eliminate the internal and external artifacts contained therein to ensure that the EEG data after artifact elimination can accurately and objectively reflect the real activity of the brain; the EEG data are then divided into a fixed window of 0.5s; and the time domain features, frequency domain features and nonlinear features are extracted from the EEG data after artifact elimination as a measure of the information properties and measurable properties of the EEG data.

外部伪迹消除：脑电原始数据存在的外部伪迹由环境与物理两方面因素所引起，本发明采用有限脉冲响应(FIR)带通滤波器，将高低截止频率分别设置为50Hz与1Hz，以去除高于50Hz与低于1Hz的外部伪迹，以避免脑电数据的快速漂移与缓慢漂移，实现对脑电数据外部伪迹的消除。External artifact elimination: the external artifacts in the raw EEG data are caused by environmental and physical factors. The present invention adopts a finite impulse response (FIR) bandpass filter with the high and low cutoff frequencies set to 50 Hz and 1 Hz, respectively, to remove external artifacts above 50 Hz and below 1 Hz, thereby avoiding fast and slow drift of the EEG data and eliminating external artifacts from the EEG data.

其中,高截止频率的设置考虑了奈奎斯特频率这一上限值,该频率等于采样率的一半,即64Hz,因此将高截止频率设置为50Hz;另外,低截止频率的设置考虑了所采集频率最低的Delta波的频率范围,即1-4Hz,因此将低截止频率设置为1Hz。The setting of the high cutoff frequency takes into account the upper limit of the Nyquist frequency, which is equal to half of the sampling rate, that is, 64Hz, so the high cutoff frequency is set to 50Hz; in addition, the setting of the low cutoff frequency takes into account the frequency range of the lowest frequency Delta wave collected, that is, 1-4Hz, so the low cutoff frequency is set to 1Hz.

内部伪迹消除:脑电数据的内部伪迹由受试者眨眼和肌肉运动等生理活动所产生的电信号引起,本发明通过对脑电数据进行独立成分分析(ICA),并进一步对所判别出的眼动和肌肉等伪迹成分进行筛除,实现对脑电数据内部伪迹的消除,并获得被监测者消除内外部伪迹后的脑电数据。Internal artifact elimination: The internal artifacts of EEG data are caused by the electrical signals generated by physiological activities such as blinking and muscle movement of the subject. The present invention eliminates the internal artifacts of EEG data by performing independent component analysis (ICA) on the EEG data and further screening out the artifact components such as eye movement and muscle movement, thereby obtaining the EEG data of the monitored person after eliminating internal and external artifacts.
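A minimal Python sketch of this pre-processing stage (FIR band-pass filtering, ICA-based artifact rejection, and 0.5 s fixed windows), assuming the 14-channel recording has been exported to a file readable by MNE-Python; the file name and the indices of the rejected ICA components are illustrative only.

```python
# Sketch of the EEG pre-processing stage: band-pass filter, ICA, fixed windows.
import mne

raw = mne.io.read_raw_edf("worker_eeg.edf", preload=True)  # hypothetical export file
raw.filter(l_freq=1.0, h_freq=50.0, fir_design="firwin")   # FIR band-pass, 1-50 Hz

ica = mne.preprocessing.ICA(n_components=14, random_state=42)
ica.fit(raw)
ica.exclude = [0, 3]           # indices of ocular/muscle components found by inspection
clean = ica.apply(raw.copy())  # artifact-free EEG used for feature extraction

# 0.5-second fixed windows (128 Hz sampling rate -> 64 samples per window)
epochs = mne.make_fixed_length_epochs(clean, duration=0.5, preload=True)
```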

为保证所提取的脑电数据特征具备足够的判别力,以充分反映和度量被监测者的认知和情绪状态,综合考虑作为非线性、非平稳时间序列的脑电数据所具备的混沌性,选择提取时域、频域与非线性三类脑电特征。In order to ensure that the extracted EEG data features have sufficient discriminative power to fully reflect and measure the cognitive and emotional state of the monitored person, the chaos of EEG data as a nonlinear and non-stationary time series is comprehensively considered, and three types of EEG features, namely time domain, frequency domain and nonlinear, are selected for extraction.

本发明选择固定窗口划分法,并以0.5秒的窗口大小,提取被监测者的154维度脑电特征数据,作为0.5秒每张姿势图像的记录频率下被监测者实时的认知和情绪状态的量化指标,表1为被监测者的154维度脑电特征数据。The present invention selects a fixed window division method and extracts the 154-dimensional EEG feature data of the monitored person with a window size of 0.5 seconds as a quantitative indicator of the real-time cognitive and emotional state of the monitored person at a recording frequency of 0.5 seconds per posture image. Table 1 shows the 154-dimensional EEG feature data of the monitored person.

表1提取的脑电数据特征汇总Table 1 Summary of extracted EEG data features

时域特征：标准差、波动指数、峰度（14通道×3项=42维）；频域特征：Delta、Theta、Alpha、Beta、Gamma频带功率谱密度（14通道×5项=70维）；非线性特征：近似熵、模糊熵、赫斯特指数（14通道×3项=42维）；共计154维。Time-domain features: standard deviation, volatility index and kurtosis (14 channels × 3 = 42 dimensions); frequency-domain features: power spectral density of the Delta, Theta, Alpha, Beta and Gamma bands (14 channels × 5 = 70 dimensions); nonlinear features: approximate entropy, fuzzy entropy and Hurst exponent (14 channels × 3 = 42 dimensions); 154 dimensions in total.

本发明采用时域分析法,提取标准差、波动指数以及峰度三项数学统计参数作为脑电数据的时域特征。The present invention adopts the time domain analysis method to extract three mathematical statistical parameters, namely standard deviation, fluctuation index and kurtosis, as the time domain characteristics of EEG data.

标准差(Standard deviation,Std)为方差(Variance)的平方根,是序列中各变量值与其平均数离差平方的算术平均数的平方根,可以较好地反映数据的离散程度,是广泛应用于测度脑电时域离散程度的特征值,标准差的求取算法如下,Standard deviation (Std) is the square root of variance, which is the square root of the arithmetic mean of the squares of the deviations between the values of each variable in the sequence and its mean. It can better reflect the degree of discreteness of the data and is a characteristic value widely used to measure the degree of discreteness of EEG time domain. The algorithm for calculating the standard deviation is as follows:

Std = sqrt( (1/n) · Σ_{i=1}^{n} ( x_i − x̄ )^2 )

其中，n表示该通道下所采集数据点的总数，x_i表示该通道下所采集的第i个数据点，x̄表示该通道下所采集n个数据点的平均值。Where n represents the total number of data points collected under the channel, x_i represents the i-th data point collected under the channel, and x̄ represents the average value of the n data points collected under the channel.

波动指数(Volatility index)广泛地应用于脑电和心电信号处理,用信号相邻之间的差值总和的平均数来表示时间序列数据的波动强度,波动指数的求取算法如下,The volatility index is widely used in EEG and ECG signal processing. The average of the sum of the differences between adjacent signals is used to represent the volatility intensity of time series data. The algorithm for calculating the volatility index is as follows:

VI = (1/(n−1)) · Σ_{i=1}^{n−1} | x(i+1) − x(i) |

其中，n表示该通道下所采集数据点的总数，x_i表示该通道下所采集的第i个数据点，x̄表示该通道下所采集n个数据点的平均值。Where n represents the total number of data points collected under the channel, x_i represents the i-th data point collected under the channel, and x̄ represents the average value of the n data points collected under the channel.

峰度即峰态系数(Kurtosis),是对数据分布平峰或尖峰程度的测度,峰度为0表示该序列数据的分布与正态分布的陡缓程度相同;峰度大于0表示数据分布相较于正态分布更陡峭;峰度小于0表示数据分布相较于正态分布更平缓,峰度的求取算法如下,Kurtosis is a measure of the degree of peak or peak in data distribution. A kurtosis of 0 means that the distribution of the sequence data is as steep as the normal distribution; a kurtosis greater than 0 means that the data distribution is steeper than the normal distribution; a kurtosis less than 0 means that the data distribution is flatter than the normal distribution. The algorithm for calculating kurtosis is as follows:

Kurtosis = (1/n) · Σ_{i=1}^{n} ( x_i − x̄ )^4 / s^4 − 3

其中，n表示该通道下所采集数据点的总数，s表示该通道下所采集n个数据点的标准差，x_i表示该通道下所采集的第i个数据点，x̄表示该通道下所采集n个数据点的平均值。Where n represents the total number of data points collected under the channel, s represents the standard deviation of the n data points collected under the channel, x_i represents the i-th data point collected under the channel, and x̄ represents the average value of the n data points collected under the channel.
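A short NumPy sketch of the three time-domain features for a single channel window, following the formulas above (excess-kurtosis convention, so a normal distribution gives 0); the function name is illustrative.

```python
# Time-domain features (standard deviation, volatility index, kurtosis) for one window x.
import numpy as np

def time_domain_features(x: np.ndarray) -> dict:
    n = len(x)
    mean = x.mean()
    std = np.sqrt(((x - mean) ** 2).mean())            # standard deviation
    vi = np.abs(np.diff(x)).sum() / (n - 1)            # volatility index
    kurt = ((x - mean) ** 4).mean() / std ** 4 - 3.0   # excess kurtosis
    return {"std": std, "volatility_index": vi, "kurtosis": kurt}
```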

本发明采用频谱分析法,提取了功率谱密度作为被监测者脑电数据的频域特征,通过傅里叶变化将时间序列的脑电数据转化为一系列正余弦函数的加和,实现从时域转换至频域,从而反映脑电数据(功率)随频率分布的模式。The present invention adopts spectrum analysis method to extract power spectrum density as the frequency domain feature of the EEG data of the monitored person, and converts the time series EEG data into the sum of a series of sine and cosine functions through Fourier transform, thereby realizing the conversion from time domain to frequency domain, thereby reflecting the pattern of EEG data (power) distribution with frequency.

本发明选择了加窗平均周期法(Welch法)对序列功率谱密度进行估计,并计算了14个通道的Delta波(1-3Hz)、Theta波(4-8Hz)、Alpha波(9-13Hz)、Beta波(14-30Hz)以及Gamma波(>30Hz)的5个频带的功率谱密度作为脑电数据频域特征。The present invention selects the windowed average period method (Welch method) to estimate the power spectral density of the sequence, and calculates the power spectral density of 5 frequency bands of Delta wave (1-3Hz), Theta wave (4-8Hz), Alpha wave (9-13Hz), Beta wave (14-30Hz) and Gamma wave (>30Hz) of 14 channels as the frequency domain features of EEG data.
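A hedged sketch of the Welch band-power computation for one channel window; the band edges follow the ranges above, the Gamma band is capped here at the 50 Hz filter cutoff, and nperseg is chosen for the 64-sample (0.5 s at 128 Hz) windows.

```python
# Band-limited power spectral density features estimated with Welch's method.
import numpy as np
from scipy.signal import welch

BANDS = {"delta": (1, 3), "theta": (4, 8), "alpha": (9, 13), "beta": (14, 30), "gamma": (30, 50)}

def band_powers(x: np.ndarray, fs: float = 128.0) -> dict:
    freqs, psd = welch(x, fs=fs, nperseg=min(len(x), 64))
    out = {}
    for name, (lo, hi) in BANDS.items():
        mask = (freqs >= lo) & (freqs <= hi)
        out[name] = np.trapz(psd[mask], freqs[mask])  # integrated power in the band
    return out
```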

本发明采用非线性动力算法,提取近似熵、模糊熵与赫斯特指数作为脑电数据的非线性特征,从而反映建筑工人大脑活动的复杂度水平。The present invention adopts a nonlinear dynamic algorithm to extract approximate entropy, fuzzy entropy and Hurst exponent as nonlinear features of EEG data, thereby reflecting the complexity level of brain activities of construction workers.

近似熵用以量化时间序列信号波动的规律性和不可预测性的非线性动力学参数,是用以反映信号整体特征的指标,该参数通过非负数来表示时间序列的复杂性,反映了时间序列中新信息发生的可能性,本发明将其作为信号规律的非线性统计特征,反映被监测者脑电数据的变动情况,作为机器学习分类器的输入信息,近似熵的计算包括以下步骤Approximate entropy is a nonlinear dynamic parameter used to quantify the regularity and unpredictability of time series signal fluctuations. It is an indicator used to reflect the overall characteristics of the signal. This parameter represents the complexity of the time series through a non-negative number and reflects the possibility of new information in the time series. The present invention uses it as a nonlinear statistical feature of the signal law to reflect the changes in the EEG data of the monitored person and as the input information of the machine learning classifier. The calculation of approximate entropy includes the following steps:

S2.1.1、设原始脑电数据为x(1),x(2),……,x(n),共n个点,按序号排列顺序组成m维向量X(i)=[x(i),x(i+1),……,x(i+m-1)],其中i=1,2,……,n-m+1;S2.1.1, suppose the original EEG data are x(1), x(2), ..., x(n), a total of n points, arranged in order to form an m-dimensional vector X(i) = [x(i), x(i+1), ..., x(i+m-1)], where i = 1, 2, ..., n-m+1;

S2.1.2、定义第i个向量X(i)与第j个向量X(j)之间的距离dijS2.1.2, define the distance dij between the i-th vector X(i) and the j-th vector X(j),

d_ij = max[ | x(i+k) − x(j+k) | ]，0 ≤ k ≤ m−1

S2.1.3、给定阈值r，对每个向量X(i)，统计满足d_ij≤r*Std条件的次数，其中Std为序列数据的标准差，并求出统计次数与距离总数n−m的比值，记作C_i^m(r)；S2.1.3. Given a threshold r, for each vector X(i), count the number of times the condition d_ij ≤ r*Std is satisfied, where Std is the standard deviation of the sequence data, and compute the ratio of this count to the total number of distances n−m, denoted C_i^m(r);

S2.1.4、将C_i^m(r)取对数，再对所有的i取平均值，记为φ^m(r)，S2.1.4. Take the logarithm of C_i^m(r) and then average over all i, denoting the result φ^m(r),

φ^m(r) = (1/(n−m+1)) · Σ_{i=1}^{n−m+1} ln C_i^m(r)

S2.1.5、m值加1，以m+1维度重复步骤S2.1.1至步骤S2.1.4，得到C_i^{m+1}(r)和φ^{m+1}(r)，并求出近似熵为S2.1.5. Increase m by 1 and repeat steps S2.1.1 to S2.1.4 in dimension m+1 to obtain C_i^{m+1}(r) and φ^{m+1}(r); the approximate entropy is then

ApEn = lim_{n→∞} [ φ^m(r) − φ^{m+1}(r) ]

其中,ApEn表示近似熵。Wherein, ApEn represents approximate entropy.

一般情况下,序列长度n为有限值,按照步骤S2.1.1至步骤S2.1.5得出的结果是序列长度为数值n时近似熵的估计值。Generally, the sequence length n is a finite value, and the results obtained from steps S2.1.1 to S2.1.5 are estimates of the approximate entropy when the sequence length is the value n.
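A compact NumPy sketch of the approximate-entropy estimate for a finite window, following steps S2.1.1 to S2.1.5; the defaults m = 2 and r = 0.2 are commonly used values and are assumptions rather than parameters stated above.

```python
# Approximate entropy of a finite EEG window x.
import numpy as np

def approximate_entropy(x: np.ndarray, m: int = 2, r: float = 0.2) -> float:
    n = len(x)
    tol = r * x.std()

    def phi(dim: int) -> float:
        # all overlapping dim-dimensional windows of the series
        vecs = np.array([x[i:i + dim] for i in range(n - dim + 1)])
        # Chebyshev (maximum absolute) distance between every pair of windows
        dists = np.max(np.abs(vecs[:, None, :] - vecs[None, :, :]), axis=2)
        c = (dists <= tol).mean(axis=1)   # C_i^m(r) for each i
        return np.log(c).mean()           # phi^m(r)

    return phi(m) - phi(m + 1)
```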

模糊熵(Fuzzy entropy)与近似熵功能相近,是用以准确分析混沌序列复杂性的度量算法,序列的复杂性影响着信号的随机性与可恢复性,相较于近似熵与样本熵等算法,在复杂性测度上有效性更强,对参数敏感性和依赖性更低,并且具备鲁棒性与测度连续性,因此模糊熵算法被广泛应用于脑电等生理信号分析领域,本发明选择提取14通道序列数据的模糊熵作为脑电数据的非线性特征,模糊熵的算法包括以下步骤Fuzzy entropy is similar to approximate entropy in function. It is a measurement algorithm used to accurately analyze the complexity of chaotic sequences. The complexity of the sequence affects the randomness and recoverability of the signal. Compared with algorithms such as approximate entropy and sample entropy, it is more effective in complexity measurement, less sensitive and dependent on parameters, and has robustness and measurement continuity. Therefore, the fuzzy entropy algorithm is widely used in the field of physiological signal analysis such as EEG. The present invention selects the fuzzy entropy of 14-channel sequence data as the nonlinear feature of EEG data. The fuzzy entropy algorithm includes the following steps:

S2.2.1、设原始脑电数据为x(1),x(2),……,x(n),共n个点;S2.2.1, let the original EEG data be x(1), x(2), ..., x(n), with a total of n points;

S2.2.2、定义嵌入维数m与相似容忍度r，重构相空间，生成一组m维向量X(i)=[x(i),x(i+1),…,x(i+m-1)]-x_0(i)，其表示自x(i)起的m个连续数据点所组成的向量，其中i=1,2,…,n-m+1，S2.2.2. Define the embedding dimension m and the similarity tolerance r, reconstruct the phase space, and generate a set of m-dimensional vectors X(i) = [x(i), x(i+1), ..., x(i+m-1)] - x_0(i), each representing the vector composed of m consecutive data points starting from x(i), where i = 1, 2, ..., n-m+1,

x_0(i) = (1/m) · Σ_{k=0}^{m−1} x(i+k)

其中,x0(i)表示该m个数据点的均值;Wherein, x0 (i) represents the mean of the m data points;

S2.2.3、定义模糊隶属函数A(x),S2.2.3, define the fuzzy membership function A(x),

A(x) = 1，x = 0；A(x) = exp( −ln(2) · (x/r)^2 )，x > 0

其中,r表示相似容忍度;Among them, r represents similarity tolerance;

S2.2.4、根据A(x)表达式将其变形为S2.2.4, according to the expression A(x), it is transformed into

D_ij^m = A( d_ij^m ) = exp( −ln(2) · ( d_ij^m / r )^2 )

其中，j=1,2,…,n-m+1，且j与i不相等，d_ij^m表示窗口向量X(i)与X(j)之间的最大绝对距离，计算如下所示，Where j = 1, 2, ..., n-m+1 and j ≠ i, and d_ij^m denotes the maximum absolute distance between the window vectors X(i) and X(j), computed as follows:

d_ij^m = max_{k=0,…,m−1} | ( x(i+k) − x_0(i) ) − ( x(j+k) − x_0(j) ) |

S2.2.5、定义函数φ^m(r)，S2.2.5. Define the function φ^m(r),

φ^m(r) = (1/(n−m+1)) · Σ_{i=1}^{n−m+1} [ (1/(n−m)) · Σ_{j=1,j≠i}^{n−m+1} D_ij^m ]

S2.2.6、重复步骤S2.2.1至步骤S2.2.5，按照序列顺序重构生成m+1维矢量，定义函数φ^{m+1}(r)，S2.2.6. Repeat steps S2.2.1 to S2.2.5, reconstructing (m+1)-dimensional vectors in sequence order, and define the function φ^{m+1}(r),

φ^{m+1}(r) = (1/(n−m)) · Σ_{i=1}^{n−m} [ (1/(n−m−1)) · Σ_{j=1,j≠i}^{n−m} D_ij^{m+1} ]

S2.2.7、在步骤S2.2.5的基础上,定义模糊熵为S2.2.7, based on step S2.2.5, define the fuzzy entropy as

FuzzyEn(m,r) = lim_{n→∞} [ ln φ^m(r) − ln φ^{m+1}(r) ]

对于由N个数据点所组成的有限时间序列数据,模糊熵最终可以表示为For a finite time series data consisting of N data points, the fuzzy entropy can finally be expressed as

FuzzyEn(m,r,N) = ln φ^m(r) − ln φ^{m+1}(r)

其中,FuzzyEn(m,r,N)表示模糊熵。Among them, FuzzyEn(m,r,N) represents fuzzy entropy.
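A sketch of a standard fuzzy-entropy implementation for one window; the exponential membership function exp(−ln(2)·(d/r)^2) and the defaults m = 2, r = 0.2 are assumptions, since the exact expressions are only given as images above.

```python
# Fuzzy entropy of a finite EEG window x (common exponential-membership formulation).
import numpy as np

def fuzzy_entropy(x: np.ndarray, m: int = 2, r: float = 0.2) -> float:
    tol = r * x.std()
    n = len(x)

    def phi(dim: int) -> float:
        vecs = np.array([x[i:i + dim] for i in range(n - dim + 1)])
        vecs = vecs - vecs.mean(axis=1, keepdims=True)      # subtract window mean x_0(i)
        d = np.max(np.abs(vecs[:, None, :] - vecs[None, :, :]), axis=2)
        sim = np.exp(-np.log(2) * (d / tol) ** 2)           # fuzzy similarity D_ij
        np.fill_diagonal(sim, 0.0)                          # exclude j == i
        return sim.sum() / (len(vecs) * (len(vecs) - 1))

    return np.log(phi(m)) - np.log(phi(m + 1))
```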

赫斯特指数(Hurstindex)是一种时间序列分析方法,可以用于测量分形时间序列的平滑性,本发明中,赫斯特指数作为脑电数据的非线性特征,用于衡量所采集脑电数据的非平稳特征,从而表征受试者的心理波动情况,并作为输入机器学习分类器的结构化信息,赫斯特指数的计算包括以下步骤Hurst index is a time series analysis method that can be used to measure the smoothness of fractal time series. In the present invention, Hurst index is used as a nonlinear feature of EEG data to measure the non-stationary characteristics of the collected EEG data, thereby characterizing the psychological fluctuations of the subjects and serving as structured information for input into the machine learning classifier. The calculation of Hurst index includes the following steps:

S2.3.1、根据由n个数据点所组成的原始脑电数据序列x(1),x(2),……,x(n),计算平均值为S2.3.1. Based on the original EEG data sequence consisting of n data points x(1), x(2), ..., x(n), calculate the average value as

m = (1/n) · Σ_{i=1}^{n} x(i)

S2.3.2、计算出前t个数与平均值m的偏差之和wtS2.3.2. Calculate the sum of the deviations of the first t numbers from the average value m as wt :

w_t = Σ_{i=1}^{t} ( x(i) − m )，t = 1, 2, …, n

S2.3.3、计算偏差之和的最大值和最小值的差R(n)为S2.3.3. Calculate the difference between the maximum and minimum values of the sum of deviations R(n) as

R(n) = max(0, w_1, w_2, …, w_n) − min(0, w_1, w_2, …, w_n)

S2.3.4、计算赫斯特指数H为S2.3.4. Calculate the Hurst index H as

H = log( R(n) / S(n) ) / log(n)

其中,S(n)为原始脑电数据序列的标准差。Among them, S(n) is the standard deviation of the original EEG data sequence.
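A minimal rescaled-range sketch of steps S2.3.1 to S2.3.4; the single-scale estimate H = log(R(n)/S(n))/log(n) is an assumption consistent with the quantities defined above.

```python
# Hurst exponent of one EEG window x via the rescaled-range (R/S) statistic.
import numpy as np

def hurst_rs(x: np.ndarray) -> float:
    n = len(x)
    w = np.cumsum(x - x.mean())                       # cumulative deviations w_t
    r = np.max(np.append(w, 0)) - np.min(np.append(w, 0))   # range R(n)
    s = x.std()                                       # standard deviation S(n)
    return np.log(r / s) / np.log(n)
```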

S3、对原始行为数据进行标准化、窗口划分以及均值特征提取处理,得到行为数据特征;S3, standardizing, windowing, and extracting mean features on the original behavior data to obtain behavior data features;

在0.5秒大小的时间窗口中,对传感器所采集的时间序列数据(包括脚底压力数据、加速度、角速度以及角度数据)计算方差,如果处于误差允许范围内,即可认为该时间窗口所采集的行为数据具备有效性,进一步实施标准化处理并提取行为数据特征;否则将该时间窗口所采集数据剔除,选择对下一段0.5秒时间窗口所采集的行为数据进行方差计算与判别。In a 0.5-second time window, the variance of the time series data collected by the sensor (including plantar pressure data, acceleration, angular velocity, and angle data) is calculated. If it is within the allowable error range, the behavioral data collected in this time window can be considered valid, and further standardization processing is implemented and behavioral data features are extracted; otherwise, the data collected in this time window is discarded, and the variance calculation and judgment of the behavioral data collected in the next 0.5-second time window is selected.

根据足底压力传感鞋垫实时测定与输出的脚底压力数据(时间序列数据),按照脑电数据特征提取过程中所采用的0.5秒大小的时间窗口,分别对32个分布于两只鞋垫的压力感应点在每0.5秒中所采集的2个压力读数求取算数平均值,作为该时间窗口中的压力特征数据。According to the plantar pressure data (time series data) measured and output in real time by the plantar pressure sensing insole, and in accordance with the 0.5-second time window used in the EEG data feature extraction process, the arithmetic mean of the two pressure readings collected every 0.5 seconds at the 32 pressure sensing points distributed on the two insoles is calculated as the pressure feature data in this time window.

通过六轴陀螺仪加速度计实时测定与输出的加速度、角速度以及角度数据(时间序列数据)，六轴陀螺仪加速度计的采样率设置为100Hz，按照脑电数据特征提取过程中所采用的0.5秒大小的时间窗口，对每0.5秒中所采集的50次x方向的加速度x(g)、y方向的加速度y(g)、z方向的加速度z(g)、x方向的角速度wx、y方向的角速度wy、z方向的角速度wz、x方向的角度θx、y方向的角度θy以及z方向的角度θz共9个读数求取算数平均值，与压力特征数据合并为该时间窗口中的行为数据特征。The acceleration, angular velocity and angle data (time series data) are measured and output in real time by the six-axis gyroscope-accelerometer, whose sampling rate is set to 100 Hz. Following the 0.5-second time window used in the EEG feature extraction process, the arithmetic mean of the 50 samples collected in each 0.5 s is taken for each of 9 readings: the accelerations in the x, y and z directions x(g), y(g) and z(g), the angular velocities wx, wy and wz, and the angles θx, θy and θz; these means are merged with the pressure feature data to form the behavior data features for that time window.
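A pandas sketch of the 0.5-second windowed mean features for the behaviour data (32 insole pressure channels plus 9 IMU channels); the DataFrame layout, the variance-based validity threshold and the omission of the standardization step are illustrative assumptions.

```python
# Windowed mean features for the behaviour data; df rows are timestamped samples.
import pandas as pd

def behaviour_features(df: pd.DataFrame) -> pd.DataFrame:
    # df is indexed by a pandas DatetimeIndex; resample into 0.5 s windows
    win = df.resample("500ms")
    # crude validity check: keep windows whose per-channel variance is not excessive
    valid = win.var().le(df.var() * 3).all(axis=1)
    means = win.mean()
    return means[valid]   # one 41-dimensional feature row per valid window
```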

S4、对姿势图像进行人体点位识别与空间坐标提取,得到姿势状态特征;S4, performing human body point recognition and spatial coordinate extraction on the posture image to obtain posture state features;

采用基于MediaPipe框架的计算机视觉技术,通过BlazePose姿势估计模型对每0.5秒1张的记录频率所采集被监测者的姿势图像进行处理,识别每张图像中33个人体主要点位(包括鼻、眼、耳、臀、膝盖等),并提取各点位的空间坐标,作为反映被监测者姿势状态的特征指标,如图3所示,包括以下步骤S4.1、对于每个时间窗口所对应的姿势图像,识别人体的33个主要点位;Computer vision technology based on the MediaPipe framework is used to process the posture images of the monitored person collected at a recording frequency of one image every 0.5 seconds through the BlazePose posture estimation model, identify 33 main points of the human body (including nose, eyes, ears, hips, knees, etc.) in each image, and extract the spatial coordinates of each point as a characteristic indicator reflecting the posture state of the monitored person, as shown in Figure 3, including the following steps S4.1, for each posture image corresponding to the time window, identify 33 main points of the human body;

S4.2、识别人体边缘,并沿边缘剪裁图片尺寸;S4.2, identifying the edge of the human body and cropping the image size along the edge;

S4.3、定位人体中心点为形心,选取包含所有33个主要点位的最小正方形框,将裁剪后的姿势图像定位于三维空间中;S4.3, locate the center point of the human body as the centroid, select the smallest square box containing all 33 main points, and locate the cropped posture image in three-dimensional space;

对于每个0.5秒大小时间窗口所对应的姿势图像,通过BlazePose姿势检测模型估计人体姿势,并识别人体的33个主要点位(包括鼻、眼、耳、臀、膝盖等);进一步识别人体边缘并沿边缘剪裁图片尺寸;进一步定位人体中心点为形心,选取包含所有主要点位的最小正方形框,实现将剪裁后的姿势图像定位于三维空间,完成姿势图像的预处理,其中具体所涉及的代码解决方案如下所列:For each posture image corresponding to a 0.5 second time window, the BlazePose posture detection model is used to estimate the human posture and identify 33 main points of the human body (including nose, eyes, ears, hips, knees, etc.); the human body edge is further identified and the image size is cropped along the edge; the center point of the human body is further located as the centroid, and the smallest square frame containing all the main points is selected to locate the cropped posture image in three-dimensional space and complete the preprocessing of the posture image. The specific code solutions involved are listed as follows:

静态图像模式:如果设置为false,则解决方案将输入图像视为视频流,将尝试检测第一张图像中最突出的人像,并在成功检测后进一步定位姿势标志,在随后的图像中,该模式只是简单跟踪地标而不调用其他检测,直到它失去跟踪,以减少计算和延迟;如果设置为true,则人员检测会运行每个输入图像,从而更加适合于处理一批静态的、可能不相关的图像;默认设置静态图像模式为false。Static image mode: If set to false, the solution treats the input image as a video stream and will try to detect the most prominent person in the first image and further locate the pose landmark after successful detection. In subsequent images, the mode simply tracks the landmark without calling other detections until it loses tracking to reduce calculation and latency. If set to true, person detection runs for each input image, which is more suitable for processing a batch of static and possibly unrelated images. The default setting for static image mode is false.

模型复杂度:位置地标模型的复杂度包括0、1或2,地标准确性和推理延迟通常会随着模型的复杂性而增加;默认设置模型复杂度为1。Model complexity: The complexity of the location landmark model includes 0, 1, or 2. Landmark accuracy and inference latency generally increase with the complexity of the model; the default setting model complexity is 1.

平滑地标:如果平滑地标设置为true,则解决方案会过滤不同输入图像中的地标以减少抖动,但如果static_image_mode也设置为true,则忽略;默认设置平滑地标为true。smooth_landmarks: If smooth_landmarks is set to true, the solution filters landmarks in different input images to reduce jitter, but is ignored if static_image_mode is also set to true; the default setting is smooth_landmarks to true.

ENABLE_SEGMENTATION:如果设置为true,除了姿势界标之外,该解决方案还生成分割掩码;默认设置该参数为false。ENABLE_SEGMENTATION: If set to true, the solution generates segmentation masks in addition to pose landmarks; by default this parameter is set to false.

平滑分割:如果该参数设置为true,则解决方案会过滤不同输入图像的分割掩码以减少抖动;如果enable_segmentation参数设置为false或static_image_mode参数设置为true,则忽略;默认设置平滑分割为true。smooth_segmentation: If this parameter is set to true, the solution will filter the segmentation masks of different input images to reduce jitter; ignored if the enable_segmentation parameter is set to false or the static_image_mode parameter is set to true; the default setting is smooth_segmentation to true.

MIN_DETECTION_CONFIDENCE:人检测模型中的最小置信度值处于区间[0.0,1.0]时,被认定为成功的检测;默认设置该参数为0.5。MIN_DETECTION_CONFIDENCE: When the minimum confidence value in the human detection model is in the interval [0.0, 1.0], it is considered a successful detection; the default setting of this parameter is 0.5.

MIN_TRACKING_CONFIDENCE:来自标志跟踪模型的最小置信度值为[0.0,1.0],将其设置为更高的值可以提高解决方案的稳健性,但代价是更高的延迟;如果static_image_mode参数设置为true,则忽略,其中人员姿势监测仅在每个图像上运行;默认设置该参数为0.5。MIN_TRACKING_CONFIDENCE: Minimum confidence value from the landmark tracking model in [0.0, 1.0]. Setting it to higher values can improve the robustness of the solution at the expense of higher latency. Ignored if the static_image_mode parameter is set to true, where human pose detection is only run on each image. By default, this parameter is set to 0.5.

S4.4、以最小正方形框的右上角点为原点建立三维直角坐标系;S4.4. Establish a three-dimensional rectangular coordinate system with the upper right corner of the smallest square frame as the origin;

S4.5、进一步确定33个主要点位的空间坐标(x,y,z)与可见性坐标(v),将空间坐标作为被监测者的姿势状态特征;S4.5, further determining the spatial coordinates (x, y, z) and visibility coordinates (v) of 33 main points, and using the spatial coordinates as the posture state features of the monitored person;

在对每个被监测者姿势图像实施预处理的基础上,即已经将剪裁后的姿势图像定位于三维空间,接着以正方形右上角点为原点建立三维直角坐标系,并进一步确定33个点位的空间坐标(x,y,z)与可见性坐标(v),将空间坐标作为被监测者的姿势状态特征,其中具体所涉及的输出设置如下所列:On the basis of preprocessing the posture image of each monitored person, that is, the cropped posture image has been positioned in the three-dimensional space, and then a three-dimensional rectangular coordinate system is established with the upper right corner of the square as the origin, and the spatial coordinates (x, y, z) and visibility coordinates (v) of 33 points are further determined, and the spatial coordinates are used as the posture state features of the monitored person. The specific output settings involved are listed as follows:

姿势地标:设置了姿势坐标列表,x和y分别指图像宽度和高度归一化的地标坐标;z指以臀部中点深度为原点的地标坐标深度,数值越小,坐标距离相机越近,本发明所使用的规模z大致相同于x;v指该点位在图像中可见(存在且未被遮挡)的可能性的值,即可见性。Pose landmark: A list of pose coordinates is set. x and y refer to the landmark coordinates normalized to the image width and height, respectively; z refers to the depth of the landmark coordinate with the midpoint of the hip as the origin. The smaller the value, the closer the coordinate is to the camera. The scale z used in the present invention is roughly the same as x; v refers to the value of the possibility that the point is visible (existing and not obscured) in the image, that is, visibility.

姿势世界坐标:该参数指另一个世界坐标中的姿势地标列表,x、y以及z表示以米为单位的真实世界三维坐标,原点位于臀部之间的中心;v则与对应的pose_landmarks中定义的相同。pose_world_coordinates: This parameter refers to a list of pose landmarks in another world coordinate system, where x, y, and z represent real-world 3D coordinates in meters with the origin centered between the hips; v is the same as defined in the corresponding pose_landmarks.

本发明通过对被监测者的姿势图像进行分析,识别出重要点位,然后该模型使用OpenCV的cv2模块将结果通过实时预测窗口呈现到屏幕,同时该照片各个点位数据以包含四个坐标(x、y、z和可见性)的地标形式呈现,并且坐标以数字坐标(x、y、z和v)的形式输出至CSV文件,此处“v”表示特定点位在姿势图像上的可见性大小,取值范围为0到1。The present invention analyzes the posture image of the monitored person and identifies important points. Then the model uses the cv2 module of OpenCV to present the results to the screen through a real-time prediction window. At the same time, the data of each point in the photo is presented in the form of a landmark containing four coordinates (x, y, z and visibility), and the coordinates are output to a CSV file in the form of digital coordinates (x, y, z and v), where "v" represents the visibility of a specific point on the posture image, and the value range is 0 to 1.
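A short sketch of extracting the 33 BlazePose landmarks with MediaPipe and writing the (x, y, z, v) values to a CSV file, as described above; the image path and output file name are illustrative.

```python
# Extract the 33 pose landmarks of one frame with MediaPipe BlazePose and save to CSV.
import csv
import cv2
import mediapipe as mp

mp_pose = mp.solutions.pose

with mp_pose.Pose(static_image_mode=True, model_complexity=1,
                  min_detection_confidence=0.5) as pose, \
     open("pose_landmarks.csv", "w", newline="") as f:
    writer = csv.writer(f)
    image = cv2.imread("worker_frame.jpg")                        # one 0.5 s frame
    results = pose.process(cv2.cvtColor(image, cv2.COLOR_BGR2RGB))
    if results.pose_landmarks:
        for idx, lm in enumerate(results.pose_landmarks.landmark):
            writer.writerow([idx, lm.x, lm.y, lm.z, lm.visibility])  # x, y, z, v
```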

多模态数据分析过程，即被监测者尴尬工作姿势识别过程如图4所示，在已提取的脑电数据特征、行为数据特征以及姿势状态特征的基础上，通过特征融合获取了多模态数据融合特征，并将特征数据输入至已训练好的BP神经网络，随后输出被监测者的尴尬姿势识别结果，包括8种预设的尴尬工作姿势以及非尴尬工作姿势。The multimodal data analysis process, i.e. the process of recognizing the monitored person's awkward working posture, is shown in Figure 4. On the basis of the extracted EEG data features, behavior data features and posture state features, the multimodal data fusion features are obtained through feature fusion and input into the trained BP neural network, which then outputs the awkward posture recognition result for the monitored person, covering 8 preset awkward working postures and the non-awkward working posture.

S5、基于前期融合策略对脑电数据特征、行为数据特征以及姿势状态特征进行特征融合,得到多模态数据融合特征;S5. Based on the previous fusion strategy, feature fusion is performed on the EEG data features, behavior data features, and posture state features to obtain multimodal data fusion features;

如图5所示,本发明选用了决策前进行融合的前期融合策略,对已提取的三类多模态数据特征(脑电数据特征、行为数据特征以及姿势状态特征)进行融合,实现在早期利用和挖掘来自不同模态的多个数据特征之间的关联性。As shown in Figure 5, the present invention selects a pre-fusion strategy of fusing before decision-making, and fuses the three types of multimodal data features extracted (EEG data features, behavioral data features, and posture state features) to achieve early utilization and mining of the correlation between multiple data features from different modalities.
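A minimal sketch of the early (feature-level) fusion step: the three per-window feature vectors are simply concatenated into one fused vector; the dimensions follow the description (154 EEG + 41 behaviour + 99 posture = 294).

```python
# Early fusion: concatenate the per-window feature vectors from the three modalities.
import numpy as np

def fuse(eeg_feat: np.ndarray, behaviour_feat: np.ndarray, pose_feat: np.ndarray) -> np.ndarray:
    # eeg_feat: (154,), behaviour_feat: (41,), pose_feat: (99,) -> fused vector (294,)
    return np.concatenate([eeg_feat, behaviour_feat, pose_feat])
```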

S6、将多模态数据融合特征输入至训练好的BP神经网络;S6, inputting the multimodal data fusion features into the trained BP neural network;

根据已提取的三类多模态数据特征构建输入BP神经网络的多模态数据集包括：计算机视觉识别的33个主要点位的空间坐标x、y和z(99维度)、时域、频域和非线性脑电数据特征(154维度)、压力传感器采集的足底压力特征(32维度)和陀螺仪加速度计采集的加速度、角速度以及角度数据特征(9维度)，共计294维度，对于被监测者的未知姿势状态，将构建的多模态数据集{x(1),x(2),……,x(294)}输入至训练好的BP神经网络中，最后输出被监测者尴尬工作姿势识别结果，8种预设的尴尬工作姿势以及非尴尬工作姿势如表2所示。The multimodal data set constructed from the three types of extracted features and fed into the BP neural network includes: the spatial coordinates x, y and z of the 33 main points recognized by computer vision (99 dimensions), the time-domain, frequency-domain and nonlinear EEG features (154 dimensions), the plantar pressure features collected by the pressure sensors (32 dimensions), and the acceleration, angular velocity and angle features collected by the gyroscope-accelerometer (9 dimensions), for a total of 294 dimensions. For an unknown posture state of the monitored person, the constructed multimodal data set {x(1), x(2), ..., x(294)} is input into the trained BP neural network, which finally outputs the awkward working posture recognition result; the 8 preset awkward working postures and the non-awkward working posture are shown in Table 2.

表2尴尬工作姿势识别结果Table 2 Awkward working posture recognition results

识别结果类别包括8种预设尴尬工作姿势与非尴尬工作姿势：用手在头顶上方工作；弯曲膝盖或脚踝并以蹲姿工作；向后或向前弯曲脖颈；弯曲腰部；身体后仰作业；攀登梯子身体扭转进行作业；狭小空间身体过于折叠坐姿作业；单脚支撑往前触碰；非尴尬工作姿势。The recognition result categories comprise the 8 preset awkward working postures and the non-awkward working posture: working with the hands above the head; working in a squat with bent knees or ankles; bending the neck backward or forward; bending the waist; working with the body leaning backward; working with the body twisted while climbing a ladder; working seated with the body excessively folded in a confined space; reaching forward while supported on one foot; and the non-awkward working posture.

S7、输出被监测者的尴尬工作姿势类别。S7. Output the category of the monitored person's awkward working posture.

本发明提出的尴尬工作姿势识别方法是基于已训练好的BP神经网络所构建,BP神经网络的训练方法包括模型训练数据集的准备方法和BP神经网络的搭建与训练方法。The awkward working posture recognition method proposed in the present invention is constructed based on a trained BP neural network. The training method of the BP neural network includes a method for preparing a model training data set and a method for building and training the BP neural network.

如图6所示,模型训练数据集的准备方法包括以下步骤:召集施工现场建筑工人开展多模态数据采集实验,为工人佩戴EmotivEpoc X脑电采集仪、压力传感鞋垫以及陀螺仪加速度计,要求工人模拟8种不同尴尬工作姿势状态下的施工作业,尴尬工作姿势包括:用手在头顶上方工作(例如:抹灰操作)、弯曲膝盖或脚踝并以蹲姿工作、向后或向前弯曲脖颈、弯曲腰部、身体后仰作业、攀登梯子身体扭转进行作业、狭小空间身体过于折叠坐姿作业以及单脚支撑往前触碰。As shown in FIG6 , the method for preparing the model training data set includes the following steps: recruiting construction workers on the construction site to conduct a multimodal data collection experiment, and providing the workers with EmotivEpoc X electroencephalogram (EEG) collectors, pressure sensing insoles, and gyroscope accelerometers. The workers are required to simulate construction work in 8 different awkward working postures, including: working with hands above the head (e.g., plastering operation), bending the knees or ankles and squatting, bending the neck backward or forward, bending the waist, leaning back to work, climbing a ladder with the body twisted to work, sitting in a small space with the body too folded, and supporting one foot forward to touch.

在建筑工人以尴尬工作姿势模拟施工作业的同时,采集并同步导出受试者工人的脑电数据、行为数据(压力数据、加速度、角速度、角度数据),并通过摄像头拍摄建筑工人的姿势图像,最终获得了处于8种尴尬工作姿势下建筑工人的多模态数据。While the construction workers simulated construction work in awkward working postures, the EEG data and behavioral data (pressure data, acceleration, angular velocity, angle data) of the subject workers were collected and synchronously exported, and the posture images of the construction workers were captured by a camera. Finally, multimodal data of construction workers in 8 awkward working postures were obtained.

对已采集的多模态数据,实施相应预处理与特征提取,最终获得处于尴尬工作姿势下建筑工人的多模态数据特征,包括脑电数据特征、行为数据特征、姿势状态特征,进一步通过前期融合策略实现对多模态数据特征的融合,获得用于训练BP神经网络的训练数据。For the collected multimodal data, corresponding preprocessing and feature extraction are carried out, and finally the multimodal data characteristics of construction workers in awkward working postures are obtained, including EEG data characteristics, behavioral data characteristics, and posture state characteristics. The multimodal data features are further fused through the early fusion strategy to obtain training data for training the BP neural network.

如图6所示,BP神经网络的搭建与训练方法包括以下步骤:As shown in FIG6 , the construction and training method of the BP neural network includes the following steps:

BP神经网络的搭建:将多模态特征数据集{x(1),x(2),……,x(294)}中294个特征设置为输入层,设计隐藏层神经元数量为500,Y1-Y8为8个尴尬工作姿势所对应的标签;Construction of BP neural network: 294 features in the multimodal feature data set {x(1) , x(2) , …, x(294) } are set as the input layer, the number of hidden layer neurons is designed to be 500, and Y1-Y8 are the labels corresponding to the 8 awkward working postures;

初始化神经网络的超参数:各层之间的单向全链接的权值设置为[-1,1]之间的随机数;Initialize the hyperparameters of the neural network: the weights of the one-way full links between each layer are set to random numbers between [-1, 1];

输入训练样本:将已标记完成的不同尴尬工作姿势下所对应的多模态融合特征数据(训练数据)输入至BP神经网络模型,其中本发明按照8:2的比例划分训练集与测试集;Input training samples: input the multimodal fusion feature data (training data) corresponding to different awkward working postures that have been marked into the BP neural network model, wherein the present invention divides the training set and the test set into a ratio of 8:2;

对神经网络的学习率α进行调整,得到最小的误差值:经不断调整神经网络的学习率和各个超参数,最终获得最小误差值对应的学习率α=0.09,通过调整优化过程,获得已训练完成的适用于尴尬工作姿势识别的BP神经网络。The learning rate α of the neural network is adjusted to obtain the minimum error value: After continuously adjusting the learning rate and various hyperparameters of the neural network, the learning rate α=0.09 corresponding to the minimum error value is finally obtained. By adjusting the optimization process, a trained BP neural network suitable for awkward working posture recognition is obtained.
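For illustration only, a comparable training setup (294 input features, one hidden layer of 500 neurons, eight posture classes, an 8:2 train/test split and a learning rate of 0.09) could be reproduced with scikit-learn as sketched below; this stand-in does not replicate the patent's exact weight initialization in [-1, 1] or its error criterion, and the file names are hypothetical.

```python
# Rough stand-in for the described BP-network training using scikit-learn.
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

X = np.load("fused_features.npy")   # (n_samples, 294) fused multimodal features (hypothetical file)
y = np.load("posture_labels.npy")   # labels 0..7 for the eight awkward postures (hypothetical file)

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)

clf = MLPClassifier(hidden_layer_sizes=(500,),   # one hidden layer of 500 neurons
                    activation="logistic",       # sigmoid units, as in a classic BP network
                    solver="sgd",
                    learning_rate_init=0.09,     # learning rate reported in the text
                    max_iter=1000,
                    random_state=0)
clf.fit(X_train, y_train)
print("test accuracy:", clf.score(X_test, y_test))
```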

除上述实施例外,本发明还可以有其他实施方式。凡采用等同替换或等效变换形成的技术方案,均落在本发明要求的保护范围。In addition to the above embodiments, the present invention may also have other implementations. Any technical solution formed by equivalent replacement or equivalent transformation falls within the protection scope required by the present invention.

Claims (10)

1. A method for identifying awkward working postures of construction workers based on multimodal data fusion, characterized in that it comprises the following steps:
S1. Collecting multimodal data of the monitored person, the multimodal data including original EEG data, original behavior data and posture images;
S2. Performing artifact elimination and feature extraction on the original EEG data to obtain EEG data features;
S3. Standardizing the original behavior data, dividing it into windows and extracting mean features to obtain behavior data features;
S4. Performing human body point recognition and spatial coordinate extraction on the posture images to obtain posture state features;
S5. Fusing the EEG data features, the behavior data features and the posture state features based on an early fusion strategy to obtain multimodal data fusion features;
S6. Inputting the multimodal data fusion features into the trained BP neural network;
S7. Outputting the category of the monitored person's awkward working posture.

2. The method for identifying awkward working postures of construction workers based on multimodal data fusion according to claim 1, characterized in that: in step S2, artifact removal from the original EEG data comprises external artifact elimination and internal artifact elimination; external artifacts are removed with a finite impulse response band-pass filter, and internal artifacts in the original EEG data are identified and screened out by independent component analysis.

3. The method for identifying awkward working postures of construction workers based on multimodal data fusion according to claim 1, characterized in that: in step S2, multi-dimensional EEG feature data of the monitored person are extracted by a fixed-window division method; the multi-dimensional EEG feature data include time-domain features, frequency-domain features and nonlinear features; the time-domain features include standard deviation, fluctuation index and kurtosis; the frequency-domain features include the power spectral densities of the Delta, Theta, Alpha, Beta and Gamma bands; the nonlinear features include approximate entropy, fuzzy entropy and the Hurst exponent.

4. The method for identifying awkward working postures of construction workers based on multimodal data fusion according to claim 3, characterized in that: the standard deviation among the time-domain features is calculated as

s = \sqrt{ \frac{1}{n-1} \sum_{i=1}^{n} ( x_i - \bar{x} )^2 }

where n is the total number of data points collected on the channel, x_i is the i-th data point collected on the channel, and \bar{x} is the mean of the n data points collected on the channel;

the fluctuation index among the time-domain features is calculated as

FI = \frac{1}{n} \sum_{i=1}^{n-1} \left| x(i+1) - x(i) \right|

where n is the total number of data points collected on the channel, x(i) is the i-th data point and x(i+1) is the (i+1)-th data point collected on the channel;

the kurtosis among the time-domain features is calculated as

K = \frac{1}{n} \sum_{i=1}^{n} \left( \frac{ x_i - \bar{x} }{ s } \right)^{4}

where n is the total number of data points collected on the channel, s is the standard deviation of the n data points collected on the channel, x_i is the i-th data point and \bar{x} is the mean of the n data points collected on the channel.
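A minimal sketch of the three time-domain features of claim 4, computed per channel and per window, is given below; the normalization constants follow the formulas as reconstructed above and may differ slightly from the original filing.

```python
# Time-domain EEG features of claim 4: standard deviation, fluctuation index, kurtosis.
import numpy as np

def time_domain_features(x):
    """x: 1-D array of samples from one EEG channel within one analysis window."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    mean = x.mean()
    std = np.sqrt(np.sum((x - mean) ** 2) / (n - 1))   # standard deviation s
    fluctuation = np.sum(np.abs(np.diff(x))) / n       # fluctuation index: mean absolute first difference
    kurtosis = np.sum(((x - mean) / std) ** 4) / n     # kurtosis
    return std, fluctuation, kurtosis
```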
5. The method for identifying awkward working postures of construction workers based on multimodal data fusion according to claim 3, characterized in that the calculation of the approximate entropy among the nonlinear features comprises the following steps:
S2.1.1. Let the original EEG data be x(1), x(2), ..., x(n), n points in total, and arrange them in sequence to form the m-dimensional vectors X(i) = [x(i), x(i+1), ..., x(i+m-1)], where i = 1, 2, ..., n-m+1;
S2.1.2. Define the distance d_{ij} between the i-th vector X(i) and the j-th vector X(j) as

d_{ij} = \max_{0 \le k \le m-1} \left| x(i+k) - x(j+k) \right|;

S2.1.3. Given a threshold r, count for each vector X(i) the number of times the condition d_{ij} \le r \cdot Std is satisfied, where Std is the standard deviation of the sequence data, and compute the ratio of this count to the total number of distances n-m, denoted

C_i^m(r) = \frac{ \#\{ j : d_{ij} \le r \cdot Std \} }{ n - m };

S2.1.4. Take the logarithm of C_i^m(r) and average over all i, denoting the result \varphi^m(r),

\varphi^m(r) = \frac{1}{n-m+1} \sum_{i=1}^{n-m+1} \ln C_i^m(r);

S2.1.5. Increase m by 1 and repeat steps S2.1.1 to S2.1.4 in dimension m+1 to obtain C_i^{m+1}(r) and \varphi^{m+1}(r); the approximate entropy is then

ApEn = \lim_{n \to \infty} \left[ \varphi^m(r) - \varphi^{m+1}(r) \right],

where ApEn denotes the approximate entropy.
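The approximate-entropy computation of steps S2.1.1 to S2.1.5 can be sketched as follows; this version counts self-matches and normalizes by n-m+1, a common convention that may differ marginally from the n-m denominator stated in the claim.

```python
# Approximate entropy (ApEn) of a 1-D EEG sequence, following claim 5.
import numpy as np

def approximate_entropy(x, m=2, r=0.2):
    x = np.asarray(x, dtype=float)
    n = len(x)
    tol = r * x.std()                     # threshold r * Std, as in step S2.1.3

    def phi(dim):
        # all dim-dimensional windows X(i) = [x(i), ..., x(i+dim-1)]
        windows = np.array([x[i:i + dim] for i in range(n - dim + 1)])
        # Chebyshev (maximum-coordinate) distance between every pair of windows
        dist = np.max(np.abs(windows[:, None, :] - windows[None, :, :]), axis=2)
        # fraction of windows within the tolerance for each i (self-match included)
        c = (dist <= tol).sum(axis=1) / (n - dim + 1)
        return np.mean(np.log(c))         # phi^dim(r)

    return phi(m) - phi(m + 1)            # ApEn = phi^m(r) - phi^(m+1)(r)
```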
6. The method for identifying awkward working postures of construction workers based on multimodal data fusion according to claim 3, characterized in that the calculation of the fuzzy entropy among the nonlinear features comprises the following steps:
S2.2.1. Let the original EEG data be x(1), x(2), ..., x(n), n points in total;
S2.2.2. Define the embedding dimension m and the similarity tolerance r, reconstruct the phase space and generate a set of m-dimensional vectors X(i) = [x(i), x(i+1), ..., x(i+m-1)] - x_0(i), i.e. the vector formed by the m consecutive data points starting from x(i) with the window baseline removed, where i = 1, 2, ..., n-m+1 and

x_0(i) = \frac{1}{m} \sum_{k=0}^{m-1} x(i+k)

is the mean of these m data points;
S2.2.3. Define the fuzzy membership function A(x) as

A(x) = \exp\left( - \frac{x^2}{r} \right),

where r denotes the similarity tolerance;
S2.2.4. According to the expression of A(x), transform it into

D_{ij}^m = A(d_{ij}^m) = \exp\left( - \frac{ (d_{ij}^m)^2 }{ r } \right),

where j = 1, 2, ..., n-m+1 with j \ne i, and d_{ij}^m denotes the maximum absolute distance between the window vectors X(i) and X(j), calculated as

d_{ij}^m = \max_{0 \le k \le m-1} \left| \left( x(i+k) - x_0(i) \right) - \left( x(j+k) - x_0(j) \right) \right|;

S2.2.5. Define the function \phi^m(r) as

\phi^m(r) = \frac{1}{n-m+1} \sum_{i=1}^{n-m+1} \left( \frac{1}{n-m} \sum_{j=1, j \ne i}^{n-m+1} D_{ij}^m \right);

S2.2.6. Repeat steps S2.2.1 to S2.2.5, reconstructing the (m+1)-dimensional vectors in sequence order, and define the function \phi^{m+1}(r) as

\phi^{m+1}(r) = \frac{1}{n-m} \sum_{i=1}^{n-m} \left( \frac{1}{n-m-1} \sum_{j=1, j \ne i}^{n-m} D_{ij}^{m+1} \right);

S2.2.7. On the basis of step S2.2.5, define the fuzzy entropy as

FuzzyEn(m, r) = \lim_{N \to \infty} \left[ \ln \phi^m(r) - \ln \phi^{m+1}(r) \right];

for a finite time series consisting of N data points, the fuzzy entropy can finally be expressed as

FuzzyEn(m, r, N) = \ln \phi^m(r) - \ln \phi^{m+1}(r),

where FuzzyEn(m, r, N) denotes the fuzzy entropy.
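A compact sketch of the fuzzy-entropy computation of claim 6 follows; the exponential membership function exp(-d^2/r) is the form assumed in the reconstruction above, and other membership functions appear in the literature.

```python
# Fuzzy entropy (FuzzyEn) of a 1-D EEG sequence, following claim 6.
import numpy as np

def fuzzy_entropy(x, m=2, r=0.2):
    x = np.asarray(x, dtype=float)
    n = len(x)
    tol = r * x.std()                      # similarity tolerance

    def phi(dim):
        # baseline-removed windows X(i) = [x(i), ..., x(i+dim-1)] - x0(i)
        windows = np.array([x[i:i + dim] for i in range(n - dim + 1)])
        windows = windows - windows.mean(axis=1, keepdims=True)
        # maximum absolute distance d_ij between every pair of windows
        dist = np.max(np.abs(windows[:, None, :] - windows[None, :, :]), axis=2)
        sim = np.exp(-(dist ** 2) / tol)   # fuzzy similarity D_ij = A(d_ij)
        np.fill_diagonal(sim, 0.0)         # exclude j == i
        k = len(windows)
        return sim.sum() / (k * (k - 1))   # average similarity phi^dim(r)

    return np.log(phi(m)) - np.log(phi(m + 1))   # FuzzyEn = ln phi^m - ln phi^(m+1)
```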
7. The method for identifying awkward working postures of construction workers based on multimodal data fusion according to claim 3, characterized in that the calculation of the Hurst exponent among the nonlinear features comprises the following steps:
S2.3.1. From the original EEG data sequence x(1), x(2), ..., x(n) consisting of n data points, calculate the mean

m = \frac{1}{n} \sum_{i=1}^{n} x(i);

S2.3.2. Calculate the cumulative sum w_t of the deviations of the first t values from the mean m:

w_t = \sum_{i=1}^{t} \left( x(i) - m \right), \quad t = 1, 2, \ldots, n;

S2.3.3. Calculate the range R(n), i.e. the difference between the maximum and the minimum of the cumulative deviations:

R(n) = \max(0, w_1, w_2, \ldots, w_n) - \min(0, w_1, w_2, \ldots, w_n);

S2.3.4. Calculate the Hurst exponent H as

H = \frac{ \ln\left( R(n) / S(n) \right) }{ \ln n },

where S(n) is the standard deviation of the original EEG data sequence.
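The rescaled-range estimate of the Hurst exponent in claim 7 can be sketched as follows; the single-scale formula H = ln(R(n)/S(n)) / ln(n) matches the reconstruction above, whereas a full R/S analysis would fit the slope of log(R/S) against log(n) over several window lengths.

```python
# Single-scale rescaled-range (R/S) estimate of the Hurst exponent, following claim 7.
import numpy as np

def hurst_exponent(x):
    x = np.asarray(x, dtype=float)
    n = len(x)
    m = x.mean()                                   # S2.3.1: series mean
    w = np.cumsum(x - m)                           # S2.3.2: cumulative deviations w_t
    r = max(0.0, w.max()) - min(0.0, w.min())      # S2.3.3: range R(n)
    s = x.std()                                    # S(n): standard deviation of the series
    return np.log(r / s) / np.log(n)               # S2.3.4: Hurst index H
```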
8. The method for identifying awkward working postures of construction workers based on multimodal data fusion according to claim 1, characterized in that: in step S4, the method for obtaining the posture state features comprises the following steps:
S4.1. For the posture image corresponding to each time window, identify 33 main points of the human body;
S4.2. Identify the edge of the human body and crop the image along this edge;
S4.3. Locate the center point of the human body as the centroid, select the smallest square box containing all 33 main points, and position the cropped posture image in three-dimensional space;
S4.4. Establish a three-dimensional rectangular coordinate system with the upper-right corner of the smallest square box as the origin;
S4.5. Further determine the spatial coordinates (x, y, z) and the visibility coordinate (v) of the 33 main points, and use the spatial coordinates as the posture state features of the monitored person.

9. The method for identifying awkward working postures of construction workers based on multimodal data fusion according to claim 1, characterized in that: in step S6, the training method of the BP neural network comprises a method for preparing a model training data set and a method for building and training the BP neural network, wherein the method for preparing the model training data set comprises the following steps:
S6.1.1. Gather construction workers at the construction site and equip them with the devices and sensors used to collect multimodal data;
S6.1.2. The construction workers perform simulated construction work in different awkward working postures;
S6.1.3. The devices and sensors used to collect multimodal data collect the multimodal data and export them synchronously;
S6.1.4. Apply the corresponding preprocessing, feature extraction and fusion to the collected multimodal data to finally obtain the multimodal data fusion features of the construction workers in awkward working postures;
S6.1.5. Obtain the training data set for training the BP neural network.

10. The method for identifying awkward working postures of construction workers based on multimodal data fusion according to claim 9, characterized in that the method for building and training the BP neural network comprises the following steps:
S6.2.1. Use the multimodal data fusion features as the input layer, set the number of neurons in the hidden layer to 500, and set the labels corresponding to each awkward working posture;
S6.2.2. Initialize the hyperparameters of the BP neural network, setting the weights of the one-way full connections between the layers to random numbers in [-1, 1];
S6.2.3. Input the labeled training data corresponding to the different awkward working postures into the BP neural network;
S6.2.4. Continuously adjust the learning rate α and the other hyperparameters of the neural network until the minimum error value is obtained, and record the learning rate corresponding to the minimum error value;
S6.2.5. Finally obtain the trained BP neural network suitable for awkward working posture recognition.
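As an illustration of step S4 / claim 8, the 33 body landmarks with (x, y, z) coordinates and a visibility score can be obtained from an off-the-shelf pose estimator; the sketch below uses MediaPipe Pose, which outputs 33 landmarks per person, purely as an example. The claim does not prescribe a particular detector, and the edge cropping and coordinate re-origin of steps S4.2 to S4.4 are omitted here.

```python
# Extracting the 33 human-body landmarks used as posture state features (step S4).
import cv2
import mediapipe as mp

mp_pose = mp.solutions.pose

def pose_features(image_path):
    image = cv2.imread(image_path)
    with mp_pose.Pose(static_image_mode=True) as pose:
        result = pose.process(cv2.cvtColor(image, cv2.COLOR_BGR2RGB))
    if result.pose_landmarks is None:
        return None                                # no person detected in the image
    coords = []
    for lm in result.pose_landmarks.landmark:      # 33 landmarks with x, y, z, visibility
        coords.extend([lm.x, lm.y, lm.z])          # keep the spatial coordinates (99 values)
    return coords
```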
CN202211474969.3A | Priority date 2022-11-23 | Filing date 2022-11-23 | Embarrassing working gesture recognition method for construction workers based on multi-mode data fusion | Active | Granted as CN116115239B (en)

Priority Applications (1)

Application Number | Priority Date | Filing Date | Title
CN202211474969.3A (en) | 2022-11-23 | 2022-11-23 | Embarrassing working gesture recognition method for construction workers based on multi-mode data fusion


Publications (2)

Publication Number | Publication Date
CN116115239A | 2023-05-16
CN116115239B (en) | 2025-01-07

Family

ID=86296283

Family Applications (1)

Application Number | Title | Priority Date | Filing Date
CN202211474969.3A (Active; granted as CN116115239B) | Embarrassing working gesture recognition method for construction workers based on multi-mode data fusion | 2022-11-23 | 2022-11-23

Country Status (1)

Country | Link
CN (1) | CN116115239B (en)


Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number | Priority date | Publication date | Assignee | Title
KR20160119408A (en)* | 2015-04-03 | 2016-10-13 | 경북대학교 산학협력단 | Mornitoring system for near miss in workplace and Mornitoring method using thereof
CN108433728A (en)* | 2018-03-06 | 2018-08-24 | 大连理工大学 | A method of million accidents of danger are fallen based on smart mobile phone and ANN identification construction personnel
CN109919036A (en)* | 2019-01-18 | 2019-06-21 | 南京理工大学 | Worker pose classification method based on time domain analysis deep network
US20220156965A1 (en)* | 2020-11-16 | 2022-05-19 | Waymo Llc | Multi-modal 3-d pose estimation
CN114469024A (en)* | 2021-12-23 | 2022-05-13 | 广东建采网科技有限公司 | Construction worker safety early warning method and system based on smart band

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
JIAYU CHEN et al.: "Construction worker's awkward posture recognition through supervised motion tensor decomposition", Automation in Construction, 31 December 2017 (2017-12-31)*

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number | Priority date | Publication date | Assignee | Title
CN117095465A (en)* | 2023-10-19 | 2023-11-21 | 华夏天信智能物联(大连)有限公司 | Coal mine safety supervision method and system
CN117095465B (en)* | 2023-10-19 | 2024-02-06 | 华夏天信智能物联(大连)有限公司 | Coal mine safety supervision method and system
CN119337252A (en)* | 2024-12-20 | 2025-01-21 | 浙江工商大学 | A method and system for monitoring Internet addiction behavior based on multimodal data fusion

Also Published As

Publication number | Publication date
CN116115239B (en) | 2025-01-07

Similar Documents

PublicationPublication DateTitle
Mekruksavanich et al.Exercise activity recognition with surface electromyography sensor using machine learning approach
CN114676956B (en) Elderly fall risk warning system based on multi-dimensional data fusion
CN109009017B (en) An intelligent health monitoring system and data processing method thereof
Khodabandelou et al.Attention-based gated recurrent unit for gesture recognition
CN106022213A (en)Human body motion recognition method based on three-dimensional bone information
CN107506706A (en)A kind of tumble detection method for human body based on three-dimensional camera
CN104269025B (en)Wearable single node feature and the position choosing method of monitoring is fallen down towards open air
Jensen et al.Classification of kinematic swimming data with emphasis on resource consumption
CN116115239B (en)Embarrassing working gesture recognition method for construction workers based on multi-mode data fusion
CN116269355B (en)Safety monitoring system based on figure gesture recognition
CN117883074A (en)Parkinson's disease gait quantitative analysis method based on human body posture video
CN115316982B (en)Multi-mode sensing-based intelligent detection system and method for muscle deformation
CN111089604A (en)Body-building exercise identification method based on wearable sensor
Martínez-Villaseñor et al.Deep learning for multimodal fall detection
CN114983434A (en)System and method based on multi-mode brain function signal recognition
CN118822947A (en) Breathing pattern detection method, device and controller based on deep learning neural network
Li et al.A deep cybersickness predictor through kinematic data with encoded physiological representation
Ghobadi et al.A robust automatic gait monitoring approach using a single IMU for home-based applications
CN119851421A (en)Household safety monitoring and early warning system for old people
CN104331705B (en)Automatic detection method for gait cycle through fusion of spatiotemporal information
EP3922176A1 (en)Physical health condition image analysis device, method, and system
Mesanza et al.Machine learning based fall detector with a sensorized tip
CN114881079A (en) Method and system for abnormal detection of human motion intention for wearable sensor
Huang et al.Human behavior recognition based on motion data analysis
CN118606767A (en) A multimodal signal fatigue detection method based on 3D layered convolution fusion

Legal Events

Code | Title
PB01 | Publication
SE01 | Entry into force of request for substantive examination
GR01 | Patent grant
