CN116584955A - Brain electricity cognitive load assessment method and system based on multi-feature domain attention network - Google Patents

Brain electricity cognitive load assessment method and system based on multi-feature domain attention network

Info

Publication number
CN116584955A
Authority
CN
China
Prior art keywords
feature
eeg
convolution
layer
swin
Prior art date
Legal status (assumed; not a legal conclusion)
Pending
Application number
CN202310477389.8A
Other languages
Chinese (zh)
Inventor
闫镔
李中锐
童莉
曾颖
张融恺
Current Assignee (the listed assignees may be inaccurate)
PLA Information Engineering University
Original Assignee
PLA Information Engineering University
Priority date (assumed; not a legal conclusion)
Filing date
Publication date
Application filed by PLA Information Engineering University
Priority to CN202310477389.8A
Publication of CN116584955A
Status: Pending


Abstract

The invention belongs to the technical field of electroencephalogram (EEG) data processing and discloses an EEG cognitive load assessment method and system based on a multi-feature-domain attention network. Experimental results show that the constructed network achieves the highest classification accuracy compared with four other popular networks, effectively improving the performance of EEG cognitive load assessment.

Description

Translated from Chinese
An EEG Cognitive Load Assessment Method and System Based on a Multi-Feature-Domain Attention Network

Technical Field

The invention relates to the technical field of EEG data processing, and in particular to an EEG cognitive load assessment method and system based on a multi-feature-domain attention network.

Background

As a neural response, the EEG signal promptly reflects an operator's cognitive state while performing a task, and the level of cognitive load directly determines the operator's initiative and productivity. Assessing operator cognitive load so as to avoid both overload and underload is therefore essential for safe production. However, the weakness and non-stationarity of EEG signals limit the performance of cognitive load assessment. Most current mainstream feature extraction methods adopt a single fusion strategy and cannot extract high-level features. Improving feature fusion and representation capability from multiple perspectives is thus of great significance for the short-term responsiveness and generalization of cognitive load assessment.

Summary of the Invention

To address the problem that most current mainstream feature extraction methods adopt a single fusion strategy and cannot extract high-level features, the present invention proposes an EEG cognitive load assessment method and system based on a multi-feature-domain attention network. Based on the time-frequency-spatial structure of multi-channel EEG signals, a multi-feature-domain attention network for cognitive load assessment is proposed: following the neural mechanism of the cognitive task, the EEG data are integrated into a three-dimensional feature input; local and global features are extracted by multi-scale convolution and a cross-connected Swin-Transformer structure; finally, dimensionality is reduced by an average pooling operation and classification is performed by a fully connected layer.

To achieve the above object, the present invention adopts the following technical solutions:

In one aspect, the present invention proposes an EEG cognitive load assessment method based on a multi-feature-domain attention network, the network comprising multi-scale convolution, a cross-connected Swin-Transformer structure, an average pooling layer, and a fully connected layer. The method comprises:

Step 1: organize the raw EEG data into a three-dimensional feature input along the spatial, temporal, and frequency dimensions, following the mechanism by which the brain processes the task;

Step 2: extract local features over different receptive fields by multi-scale convolution, increasing the richness of the feature candidate set;

Step 3: combine low-level attention features with high-level attention features across layers through the cross-connected Swin-Transformer structure, so as to learn diverse attention features of different scales;

Step 4: reduce dimensionality through the average pooling layer and classify with the fully connected layer.

Further, before Step 1, the method also comprises:

Designing a simulated task: from the simulated images of the public SynISAR dataset, images with masking rates of 45%–55%, 65%–75%, and 85%–95% are selected as the low, medium, and high workload levels; the image rotation angles of different targets under the same masking rate are kept consistent; 6 aircraft models are selected for the formal experiment, with 45 images per model at each workload level, giving 45×6×3 masked images in total; EEG signals are acquired with a g.HIamp EEG acquisition system.

Further, Step 1 comprises:

Time-frequency maps of each lead are computed for the neural response within 1 s before the image-discrimination key press in the simulated task. First, a short-time Fourier transform converts the EEG data of each electrode channel into a time-frequency map. The spectrogram is obtained with MATLAB's built-in spectrogram function; with a Hamming window of length 16, an overlap of 13, and 256 Fourier transform points, a single-channel time-frequency map of size 70×81 is obtained, representing the 0–60 Hz frequency-domain energy and the 0–1000 ms time-domain information, respectively. The Hamming window reduces spectral leakage while maintaining good frequency resolution. The time-frequency maps of all channels are then stacked to construct a 62×70×81 three-dimensional input tensor.
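The time-frequency construction above can be sketched as follows. This is a minimal sketch using SciPy in place of MATLAB's spectrogram function; keeping the first 70 frequency rows to reach the stated 70×81 size is an assumption about how that size arises.

```python
import numpy as np
from scipy.signal import spectrogram

FS = 256          # sampling rate after downsampling (Hz)
N_CHANNELS = 62   # EEG channels
N_SAMPLES = FS    # 1 s of data before the key press

def channel_tf_map(x, n_freq_bins=70):
    """Single-channel time-frequency map via STFT: Hamming window of
    length 16, overlap 13, 256 FFT points, as stated in the text."""
    f, t, Sxx = spectrogram(x, fs=FS, window="hamming",
                            nperseg=16, noverlap=13, nfft=256)
    # (256 - 13) // (16 - 13) = 81 time frames; keep 70 frequency rows
    return Sxx[:n_freq_bins, :]

def build_input_tensor(eeg):
    """Stack per-channel maps into the 62x70x81 input tensor."""
    return np.stack([channel_tf_map(ch) for ch in eeg])

eeg = np.random.randn(N_CHANNELS, N_SAMPLES)
tensor = build_input_tensor(eeg)
print(tensor.shape)   # (62, 70, 81)
```

With these window parameters the 81 time frames span the 0–1000 ms epoch, matching the size stated in the description.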

Further, in Step 2, multi-scale convolution modules perform local feature extraction, and the convolution modules are concatenated in parallel to increase the richness of the feature candidate set. The multi-scale convolution pipeline comprises a 1×1 convolution layer, a 3×3 convolution layer, a 5×5 convolution layer, and a max pooling layer; each branch contains only one convolution kernel of a unique scale. To ensure consistent feature sizes after convolutions of different scales, dimensionality is first reduced through the 1×1 convolution layer, which reduces network parameters and integrates local correlations, and the corresponding three-dimensional tensor is then padded according to the kernel size.
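A shape-level sketch of the parallel branches, assuming "same" zero padding of (k − 1)/2 per side so every branch preserves the input's spatial size. The fixed averaging kernels and stride-1 max filter are stand-ins, not the patent's learned convolutions.

```python
import numpy as np
from scipy.signal import convolve2d
from scipy.ndimage import maximum_filter

def branch_conv(x, k):
    """Convolve with a k x k kernel, zero-padded so the output keeps
    the input's spatial size (pad = (k - 1) // 2 per side)."""
    kernel = np.full((k, k), 1.0 / (k * k))   # placeholder weights
    return convolve2d(x, kernel, mode="same")

def multi_scale_block(x):
    """Parallel 1x1 / 3x3 / 5x5 convolution branches plus a pooling
    branch, stacked along a new channel axis."""
    branches = [branch_conv(x, k) for k in (1, 3, 5)]
    branches.append(maximum_filter(x, size=3))  # stride-1 "pooling"
    return np.stack(branches)                   # (4, H, W)

feat = multi_scale_block(np.random.randn(70, 81))
print(feat.shape)   # (4, 70, 81)
```

All four branches share the input's H×W, so their outputs can be concatenated directly, which is the size-consistency requirement the text describes.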

Further, Step 3 comprises:

A cross-connected Swin-Transformer module is constructed to extract hierarchical global attention features. It consists of 3 stages, each with 2 Swin-Transformer blocks. In each stage, the input feature matrix is first split into non-overlapping patches by a patch partition layer, and the data of each channel are linearly transformed by linear embedding. For a given C×H×W input, the output feature sizes of the three stages are C×H/4×W/4, 2C×H/8×W/8, and 4C×H/16×W/16, respectively. Skip connections establish dense connections between earlier Swin-Transformer blocks and later layers. To ensure size consistency when upper- and lower-layer feature maps are concatenated, appropriate convolution sizes are chosen to convolve the earlier-layer feature maps, and the feature layers are batch-normalized before concatenation to avoid differences in data magnitude.
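The quoted stage sizes imply simple bookkeeping for the skip connections; the sketch below computes them, with a hypothetical helper for the downsampling stride a skip path needs (the channel change would be handled by the 1×1 convolution mentioned in the text).

```python
def stage_shapes(C, H, W):
    """Output sizes of the three Swin-Transformer stages for a
    C x H x W input, as given in the description."""
    return [(C,     H // 4,  W // 4),
            (2 * C, H // 8,  W // 8),
            (4 * C, H // 16, W // 16)]

def skip_stride(src, dst):
    """Hypothetical helper: spatial stride a skip-connection conv
    needs so an earlier stage's H matches a later stage's H."""
    return src[1] // dst[1]

shapes = stage_shapes(96, 224, 224)
print(shapes)  # [(96, 56, 56), (192, 28, 28), (384, 14, 14)]
print(skip_stride(shapes[0], shapes[2]))  # stage 1 -> stage 3: 4
```

Each stage halves H and W while doubling C, so a skip from stage n to stage n+1 always needs a stride-2 reduction, and a skip across two stages a stride-4 reduction.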

In another aspect, the present invention proposes an EEG cognitive load assessment system based on a multi-feature-domain attention network, the network comprising multi-scale convolution, a cross-connected Swin-Transformer structure, an average pooling layer, and a fully connected layer. The system comprises:

a three-dimensional feature matrix construction module, configured to organize the raw EEG data into a three-dimensional feature input along the spatial, temporal, and frequency dimensions, following the mechanism by which the brain processes the task;

a multi-scale convolution module, configured to extract local features over different receptive fields by multi-scale convolution, increasing the richness of the feature candidate set;

a cross-connected Swin-Transformer module, configured to combine low-level attention features with high-level attention features across layers through the cross-connected Swin-Transformer structure, so as to learn diverse attention features of different scales;

a classification module, configured to reduce dimensionality through an average pooling operation and classify with a fully connected layer.

Further, the system also comprises:

a simulated-task design module, configured to design the simulated task: from the simulated images of the public SynISAR dataset, images with masking rates of 45%–55%, 65%–75%, and 85%–95% are selected as the low, medium, and high workload levels; the image rotation angles of different targets under the same masking rate are kept consistent; 6 aircraft models are selected for the formal experiment, with 45 images per model at each workload level, giving 45×6×3 masked images in total; EEG signals are acquired with a g.HIamp EEG acquisition system.

Further, the three-dimensional feature matrix construction module is specifically configured to:

compute time-frequency maps of each lead for the neural response within 1 s before the image-discrimination key press in the simulated task: first, a short-time Fourier transform converts the EEG data of each electrode channel into a time-frequency map; the spectrogram is obtained with MATLAB's built-in spectrogram function, and with a Hamming window of length 16, an overlap of 13, and 256 Fourier transform points, a single-channel time-frequency map of size 70×81 is obtained, representing the 0–60 Hz frequency-domain energy and the 0–1000 ms time-domain information, respectively; the time-frequency maps of all channels are then stacked to construct a 62×70×81 three-dimensional input tensor.

Further, in the multi-scale convolution module, multi-scale convolution performs local feature extraction and the convolutions are concatenated in parallel; the multi-scale convolution pipeline comprises a 1×1 convolution layer, a 3×3 convolution layer, a 5×5 convolution layer, and a max pooling layer, and each branch contains only one convolution kernel of a unique scale.

Further, the cross-connected Swin-Transformer module is specifically configured to:

construct a cross-connected Swin-Transformer structure to extract hierarchical global attention features, consisting of 3 stages with 2 Swin-Transformer blocks each; in each stage the input feature matrix is first split into non-overlapping patches by a patch partition layer, and the data of each channel are linearly transformed by linear embedding; for a given C×H×W input, the output feature sizes of the three stages are C×H/4×W/4, 2C×H/8×W/8, and 4C×H/16×W/16, respectively, with skip connections establishing the cross-layer links.

Compared with the prior art, the present invention has the following beneficial effects:

To address the problem that most current mainstream feature extraction methods adopt a single fusion strategy and cannot extract high-level features, the present invention proposes an EEG cognitive load assessment method and system based on a multi-feature-domain attention network. Based on the time-frequency-spatial structure of multi-channel EEG signals, a multi-feature-domain attention network for cognitive load assessment is proposed: following the neural mechanism of the cognitive task, the EEG data are integrated into a three-dimensional feature input; local and global features are extracted by multi-scale convolution and a cross-connected Swin-Transformer structure; finally, dimensionality is reduced by an average pooling operation and classification is performed by a fully connected layer.

By combining low-level attention features with high-level attention features across layers, the present invention can learn diverse attention features of different scales, facilitating high-level feature combination in the network and enabling effective assessment of the low, medium, and high load levels.

Experimental results show that the network constructed by the present invention achieves the highest classification accuracy compared with four other popular networks, effectively improving EEG cognitive load assessment performance.

Description of the Drawings

Fig. 1 is a basic flowchart of an EEG cognitive load assessment method based on a multi-feature-domain attention network according to an embodiment of the present invention;

Fig. 2 is a schematic diagram of the overall framework of the multi-feature-domain attention network according to an embodiment of the present invention;

Fig. 3 shows the experimental paradigm of the simulated task according to an embodiment of the present invention;

Fig. 4 shows the data processing pipeline of an embodiment of the present invention;

Fig. 5 is a schematic diagram of the cross-connected Swin-Transformer architecture according to an embodiment of the present invention;

Fig. 6 shows the feature visualization analysis of an embodiment of the present invention;

Fig. 7 is a schematic diagram of the architecture of an EEG cognitive load assessment system based on a multi-feature-domain attention network according to an embodiment of the present invention.

Detailed Description

The present invention is further explained below with reference to the accompanying drawings and specific embodiments:

As shown in Fig. 1, an EEG cognitive load assessment method based on a multi-feature-domain attention network comprises:

Step S101: organize the raw EEG data into a three-dimensional feature input along the spatial, temporal, and frequency dimensions, following the mechanism by which the brain processes the task;

Step S102: extract local features over different receptive fields by multi-scale convolution, increasing the richness of the feature candidate set;

Step S103: combine low-level attention features with high-level attention features across layers through the cross-connected Swin-Transformer structure, so as to learn diverse attention features of different scales;

Step S104: reduce dimensionality through the average pooling layer and classify with the fully connected layer.

Fig. 2 shows the overall architecture of the multi-feature-domain attention network model, a serial fusion model based on CNN and Swin-Transformer. The CNN is included for its good local feature representation, and the Swin-Transformer for its hierarchical representation and within-window interaction. The model mainly comprises multi-scale convolution, a cross-connected Swin-Transformer structure, an average pooling layer, and a fully connected layer.

Further, before step S101, the method also includes designing the simulated task:

This embodiment designs a simulated image recognition task with three workload levels, as shown in Fig. 3. The original experimental stimuli come from the public SynISAR dataset, which contains 7 aircraft model images. To increase the realism of the simulated task, the images were further enlarged, pseudo-color filled, masked, and rotated. Images with masking rates of 45%–55%, 65%–75%, and 85%–95% were selected as the low, medium, and high workload levels; to keep task difficulty consistent, the image rotation angles of different targets under the same masking rate were kept the same. A total of 6 aircraft models were selected for the formal experiment, with 45 images per model at each workload level, giving 45×6×3 masked images in total. As shown in Fig. 3, the simulated experiment comprises three groups of image recognition tasks, one per workload level, each involving the discrimination of 45×6 images across 6 targets.
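The stimulus-set counts above reduce to simple arithmetic:

```python
# Masking-rate bands for the three workload levels (from the text)
levels = {"low": (45, 55), "medium": (65, 75), "high": (85, 95)}
n_models = 6                     # aircraft models in the formal experiment
imgs_per_model_per_level = 45    # images per model at each workload level

per_level = imgs_per_model_per_level * n_models   # 270 images per group
total = per_level * len(levels)                   # 45 x 6 x 3
print(per_level, total)   # 270 810
```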

The simulated image recognition task was carried out in the laboratory. During the task, EEG signals were acquired with a g.HIamp EEG acquisition system from g.tec (Austria), comprising 62 effective EEG signal channels. Electrode placement followed the international 10-20 standard system; the online sampling rate was 512 Hz, the band-pass filter was 0.01–100 Hz, and the notch frequency was 50 Hz.

Further, step S101 includes:

As shown in Fig. 4, a 0.1–60 Hz band-pass filter is applied offline in the preprocessing stage to remove slow drift and high-frequency noise. The band-pass filter consists of a low-pass filter and a high-pass Chebyshev filter: the low-pass filter (order 3; stopband start frequency 50 Hz; stopband cutoff frequency 60 Hz; passband attenuation 0.5 dB; stopband attenuation 5 dB) and the high-pass filter (order 1; stopband start frequency 0.01 Hz; stopband cutoff frequency 1 Hz; passband attenuation 1 dB; stopband attenuation 10 dB) are both obtained with MATLAB built-in functions. Average referencing is used for re-referencing, a method considered more accurate for source traceability and brain network analysis. To reduce baseline differences caused by data instability, the EEG signal in the 200 ms before stimulus presentation serves as the baseline. Data from damaged leads are replaced with the mean of adjacent leads. Independent component analysis removes ocular artifacts from the signal. The preprocessed EEG data are downsampled to 256 Hz to reduce data processing and computation. The EEG data in the 1 s between stimulus presentation and the key press serve as a single sample, with the task's workload level as the sample label; the final sample count is 25×3×6×45 (subjects × workload levels × targets × images per target).
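A rough sketch of this preprocessing chain under stated assumptions: Chebyshev type-I designs approximating the listed orders and edges, with the ICA artifact removal and bad-lead interpolation omitted for brevity.

```python
import numpy as np
from scipy.signal import cheby1, filtfilt, resample_poly

FS_RAW = 512   # online sampling rate (Hz)

def preprocess(eeg):
    """Band-pass filter, average re-reference, downsample to 256 Hz.

    `eeg` is (channels, samples). Filter orders follow the text;
    the exact edge frequencies and ripples are approximations.
    """
    nyq = FS_RAW / 2
    # order-1 high-pass and order-3 low-pass Chebyshev type-I filters
    b_hp, a_hp = cheby1(1, 1.0, 1.0 / nyq, btype="highpass")
    b_lp, a_lp = cheby1(3, 0.5, 60.0 / nyq, btype="lowpass")
    x = filtfilt(b_hp, a_hp, eeg, axis=1)
    x = filtfilt(b_lp, a_lp, x, axis=1)
    x = x - x.mean(axis=0, keepdims=True)   # average reference
    return resample_poly(x, 1, 2, axis=1)   # 512 -> 256 Hz

out = preprocess(np.random.randn(62, FS_RAW))
print(out.shape)   # (62, 256)
```

Zero-phase `filtfilt` avoids introducing latency shifts into the 1 s epochs, and the polyphase resampler handles the 2:1 downsampling with its own anti-aliasing filter.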

To make full use of the information in each dimension of the EEG signal and present more comprehensive and richer EEG features, a multi-domain feature representation over the spatial, frequency, and time domains is constructed, reflecting the neural activity in the cognitive task from multiple angles. As shown in Fig. 4, in the feature extraction stage a short-time Fourier transform first converts the EEG data of each electrode channel into a time-frequency map. The spectrogram is obtained with MATLAB's built-in spectrogram function; with a Hamming window of length 16, an overlap of 13, and 256 Fourier transform points, a single-channel time-frequency map of size 70×81 is obtained, representing the 0–60 Hz frequency-domain energy and the 0–1000 ms time-domain information, respectively. The Hamming window reduces spectral leakage while maintaining good frequency resolution. The time-frequency maps of all channels are then stacked into a 62×70×81 three-dimensional input tensor representing the spatial, frequency, and temporal characteristics of the EEG, which benefits multi-angle feature fusion during network training.

When performing cognitive load assessment, the present invention constructs a multi-angle EEG signal representation that effectively integrates the spatial, temporal, and frequency-domain information of EEG features, so as to learn the relationships among dynamic brain changes.

Further, step S102 includes:

To enlarge the feature receptive field in the network, this embodiment uses multi-scale convolution for feature extraction, with multiple convolutions concatenated in parallel to form the first half of the overall network architecture. The multi-scale convolution pipeline comprises a 1×1 convolution layer, a 3×3 convolution layer, a 5×5 convolution layer, and a max pooling layer; each branch contains only one convolution kernel of a unique scale. To ensure consistent feature sizes after convolutions of different scales, dimensionality is first reduced through the 1×1 convolution layer, which reduces network parameters and integrates local correlations, and the corresponding three-dimensional tensor is then padded according to the kernel size. Multi-branch convolution, an important structure for multi-scale feature extraction, helps capture EEG features from the macro level down to fine detail and extract diverse complementary information. The output of this step contains the comprehensive features extracted by the multiple convolution kernels, and the three-dimensional information is passed to the next stage.

Further, step S103 includes:

To increase the global feature information in the network while keeping computational cost as low as possible, the Swin-Transformer model is adopted: its shifted-window scheme restricts self-attention computation to non-overlapping local windows, and its hierarchical global attention mechanism brings higher network efficiency. As shown in Fig. 5, this embodiment constructs a cross-connected Swin-Transformer structure to extract hierarchical global attention features. It consists of 3 stages, each with 2 Swin-Transformer blocks. In each stage, the input feature matrix is first split into non-overlapping patches by a patch partition layer, and the data of each channel are linearly transformed by linear embedding. For a given C×H×W input, the output feature sizes of the three stages are C×H/4×W/4, 2C×H/8×W/8, and 4C×H/16×W/16, respectively; the features then pass through the corresponding Swin-Transformer blocks. To fully exploit the low-complexity attention features of the shallow layers and improve feature utilization, skip connections establish dense connections between earlier Swin-Transformer blocks and later layers. To ensure size consistency when upper- and lower-layer feature maps are concatenated, appropriate convolution sizes are chosen to convolve the earlier-layer feature maps, and the feature layers are batch-normalized before concatenation to avoid differences in data magnitude. Finally, a 1×1 convolution layer makes the merged feature layer's output size consistent with that of the current feature layer.

To verify the effect of the present invention, the following experiments were carried out:

(1) Comparison of the multi-feature-domain attention network with other baseline methods:

To verify the effectiveness of the multi-feature-domain attention network model, its performance was compared with four commonly used mainstream network models: a CNN model, a CNN-LSTM model, a Transformer model, and a Swin-Transformer model; the comparison results are shown in Table 1. The CNN model is a feedforward neural network with high recognition performance in EEG signal classification. CNN-LSTM is a hybrid of CNN and LSTM that exploits the representational power of convolutional layers and the ability of LSTM to capture temporal correlations to extract EEG features. The Transformer is a network model based on an encoder-decoder structure that extracts global EEG features through its characteristic attention mechanism. The Swin-Transformer model is a new visual Transformer proposed by Ze Liu et al. that improves the Transformer architecture's performance through hierarchy, locality, and translation invariance. The results show that, compared with the other mainstream networks, the multi-feature-domain attention network model performs best on all three workload assessments.

Table 1. Comparison of the methods

A limitation of the other networks is their monolithic structure, which ignores correlations between features across dimensions and regions. Because brain regions lie in a non-Euclidean space, flattening EEG channels distributed on an irregular grid into a two-dimensional representation with a regular grid cannot adequately reflect the spatial relationships of the signals. The multi-feature-domain attention network model used here adds attention features of different scales across brain regions through its hierarchical structure, and the windowing operation also makes the network itself more efficient. The skip connections bring new interactions to the spatial dimension of the attention mechanism. The proposed method therefore outperforms the other baselines, exceeding the Swin-Transformer and CNN models in accuracy by 11.29% and 24.72%, and the Transformer and CNN-LSTM models by 18.87% and 20.96%, respectively.

(2) Ablation study of the multi-feature-domain attention network:

To quantify each module's contribution to model performance, ablation experiments were conducted on the dataset created in this work; the results are shown in Table 2. First, the kernel-combination strategy of the multi-scale convolution module was ablated, comparing single-scale (SSC), dual-scale (DSC) and multi-scale (MSC) kernel configurations. Among the single-scale kernels, the 1×1 kernel outperformed the 3×3 and 5×5 kernels; among the dual-scale configurations, the 1×1 + 3×3 combination outperformed both the 1×1 + 5×5 and the 3×3 + 5×5 combinations. The multi-scale combination performed best of all, reaching a workload-classification accuracy of 87.4±0.78.

Table 2. Ablation experiments

To test the effect of the short-circuit (skip) connections, the cross-connected Swin-Transformer structure was evaluated on its own. Its performance gain is smaller than that of the convolution module. For the module as a whole, however, the strategy of fusing local and global features reflects the differences between workload levels more accurately, confirming the performance advantage of the proposed model.

(3) Model visualization analysis:

To further analyze how each module extracts cognitive-load features, we visualized the time-frequency features at different layers of the model: the attention intensity of each layer's output feature map was z-score normalized and rendered as a heat map. Figure 6 shows the visualization of the feature-extraction process. For the multi-scale convolution layer, kernels of every scale focus on capturing low-frequency signal features, while for mid- and high-frequency signals the attention regions of the different kernels are diverse and complementary. The multi-scale parallel convolution thus both effectively preserves the original EEG features and captures their diversity and complementarity. The attention regions produced by the cross-connected Swin-Transformer structure converge from the upper and lower edges toward the center; whereas the convolution module attends to local features, this structure demonstrates the advantage of integrating global contextual features.
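The per-map z-score normalization applied before drawing the heat maps can be sketched as follows (a minimal illustration, not the authors' plotting code):

```python
import numpy as np

def zscore_map(feat):
    """Z-score a feature map over all its elements: (x - mean) / std.

    Used here to put attention intensities on a common scale before
    rendering them as heat maps.
    """
    return (feat - feat.mean()) / feat.std()

m = np.array([[1.0, 2.0], [3.0, 4.0]])
z = zscore_map(m)   # zero mean, unit standard deviation
```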

Building on the above embodiments, as shown in Figure 7, the invention also provides an EEG cognitive-load assessment system based on the multi-feature-domain attention network, the network comprising multi-scale convolution, a cross-connected Swin-Transformer structure, an average-pooling layer and a fully connected layer. The system includes:

a three-dimensional feature-matrix construction module, configured to arrange the raw EEG data, following the mechanism by which the brain processes tasks, into a three-dimensional feature input over the spatial, temporal and frequency dimensions;

a multi-scale convolution module, configured to extract local features over different receptive fields through multi-scale convolution, enriching the feature candidate set;

a cross-connected Swin-Transformer module, configured to combine low-level attention features directly with high-level attention features through the cross-connected Swin-Transformer structure, so as to learn diverse attention features at different scales;

a classification module, configured to reduce dimensionality through average pooling and to perform classification with a fully connected layer.
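The classification head described above can be sketched as global average pooling followed by a fully connected softmax layer. This is a minimal illustration: `W` and `b` are random placeholder parameters, not trained weights, and the three outputs stand for the low/medium/high workload levels.

```python
import numpy as np

def classify(features, W, b):
    """Global average pooling over the spatial axes, then a fully
    connected softmax layer.

    features: (C, H, W) feature tensor -> (n_classes,) class probabilities.
    """
    pooled = features.mean(axis=(1, 2))   # (C,) global average pooling
    logits = W @ pooled + b               # fully connected layer
    e = np.exp(logits - logits.max())     # numerically stable softmax
    return e / e.sum()

rng = np.random.default_rng(0)
feat = rng.standard_normal((52, 4, 5))    # illustrative feature-map size
W, b = rng.standard_normal((3, 52)), np.zeros(3)
p = classify(feat, W, b)                  # probabilities over 3 load levels
```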

Further, the system also includes:

a simulation-task design module, configured to design the simulation task, including: selecting, from the simulated images of the public SynISAR dataset, images with masking rates of 45%-55%, 65%-75% and 85%-95% as the low, medium and high workload levels, with the image rotation angles of different targets kept consistent at the same masking rate; selecting six aircraft models for the formal experiment, with 45 images per model at each workload level, giving 45×6×3 masked images in total; and acquiring EEG signals with a g.HIamp EEG acquisition system.

Further, the three-dimensional feature-matrix construction module is specifically configured to:

plot, for each electrode lead, the time-frequency map of the neural response in the 1 s before the image-discrimination key press in the simulation task. A short-time Fourier transform first converts each electrode channel's EEG data into a time-frequency map, computed with MATLAB's built-in spectrogram function using a Hamming window of length 16, an overlap of 13 samples and 256 Fourier-transform points. This yields a 70×81 single-channel time-frequency map, representing frequency-domain energy over 0-60 Hz and temporal information over 0-1000 ms. The time-frequency maps of all channels are then stacked into a 62×70×81 three-dimensional input tensor.
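The tensor-construction step above can be sketched in Python using SciPy's STFT in place of MATLAB's spectrogram function. Note that the exact 70×81 map size also depends on the sampling rate, which is not stated here; the 250 Hz rate below is an assumption, so the resulting shapes are illustrative rather than a reproduction of the described pipeline.

```python
import numpy as np
from scipy.signal import stft

def eeg_to_tf_tensor(eeg, fs, fmax=60.0, nperseg=16, noverlap=13, nfft=256):
    """Stack per-channel STFT magnitude maps into a (channels, freq, time) tensor.

    eeg: (n_channels, n_samples) array for the 1 s window before the key press.
    Window length 16, overlap 13 and 256 FFT points follow the described
    parameter settings; frequencies above fmax (60 Hz) are discarded.
    """
    maps = []
    for ch in eeg:
        f, t, Z = stft(ch, fs=fs, window="hamming", nperseg=nperseg,
                       noverlap=noverlap, nfft=nfft)
        keep = f <= fmax                  # retain the 0-60 Hz band
        maps.append(np.abs(Z[keep]))
    return np.stack(maps, axis=0)         # (n_channels, n_freq, n_frames)

# Synthetic 62-channel, 1 s EEG segment at an assumed 250 Hz sampling rate
rng = np.random.default_rng(0)
fs = 250
x = rng.standard_normal((62, fs))
tensor = eeg_to_tf_tensor(x, fs)
print(tensor.shape)
```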

Further, in the multi-scale convolution module, multi-scale convolution is used for local feature extraction and the individual convolutions are concatenated in parallel. The multi-scale convolution pipeline comprises a 1×1 convolution layer, a 3×3 convolution layer, a 5×5 convolution layer and a max-pooling layer; each branch contains only one convolution kernel of a single scale.
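A minimal sketch of this parallel branch-and-concatenate topology is given below. The kernels are random placeholders standing in for learned weights, and the input is a single-channel map, so this only illustrates the branch structure, not the trained module.

```python
import numpy as np
from scipy.signal import convolve2d
from scipy.ndimage import maximum_filter

def multiscale_block(x, n_filters=4, rng=None):
    """Parallel 1x1 / 3x3 / 5x5 convolution branches plus a max-pooling
    branch, concatenated along the channel axis ('same' padding keeps H x W).

    x: (H, W) single-channel feature map.
    """
    rng = rng or np.random.default_rng(0)
    branches = []
    for k in (1, 3, 5):                        # one kernel scale per branch
        kernels = rng.standard_normal((n_filters, k, k))
        branches.append(np.stack([convolve2d(x, w, mode="same")
                                  for w in kernels]))
    # Max-pooling branch: 3x3 window, stride 1, spatial size preserved
    branches.append(maximum_filter(x, size=3)[None])
    return np.concatenate(branches, axis=0)    # (3*n_filters + 1, H, W)

x = np.arange(70 * 81, dtype=float).reshape(70, 81)  # time-frequency map size
out = multiscale_block(x)
print(out.shape)   # (13, 70, 81)
```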

Further, the cross-connected Swin-Transformer module is specifically configured to:

build a cross-connected Swin-Transformer structure to extract hierarchical global attention features, organized into three stages with two Swin-Transformer blocks each. In each stage, the input feature matrix is first split into non-overlapping patches by a patch-partition layer, and the data of each channel are linearly transformed by a linear-embedding layer. For a given C×H×W input, the output feature sizes of the three stages are C×H/4×W/4, 2C×H/8×W/8 and 4C×H/16×W/16, and the stages are connected by short-circuit connections.
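The stage output sizes stated above can be checked with a small helper; the input dimensions below are illustrative, not taken from the text.

```python
def swin_stage_shapes(C, H, W):
    """Output (channels, height, width) of the three stages described above
    for a C x H x W input: C x H/4 x W/4, 2C x H/8 x W/8, 4C x H/16 x W/16.
    """
    return [(C * 2**i, H // 4 // 2**i, W // 4 // 2**i) for i in range(3)]

print(swin_stage_shapes(13, 64, 80))
# [(13, 16, 20), (26, 8, 10), (52, 4, 5)]
```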

In summary, to address the problem that most current mainstream feature-extraction methods rely on a single fusion strategy and therefore cannot extract high-level features, the invention proposes an EEG cognitive-load assessment method and system based on a multi-feature-domain attention network. Building on the time-frequency-spatial structure of multi-channel EEG signals, a multi-feature-domain attention network for cognitive-load assessment is proposed: following the neural mechanism of the cognitive task, the EEG data are integrated into a three-dimensional feature input; local and global features are extracted by multi-scale convolution and a cross-connected Swin-Transformer structure; and finally, dimensionality is reduced by average pooling and classification is performed by a fully connected layer. By combining low-level attention features directly with high-level attention features, the invention learns diverse attention features at different scales, facilitates the network's combination of high-level features, and achieves effective assessment of the low, medium and high load levels. Experimental results show that the proposed network achieves the highest classification accuracy compared with the four other popular networks, effectively improving EEG cognitive-load assessment performance.

The above is only a preferred embodiment of the invention. It should be noted that those of ordinary skill in the art may make further improvements and refinements without departing from the principles of the invention, and such improvements and refinements shall also fall within the scope of protection of the invention.

Claims (10)

1. An EEG cognitive-load assessment method based on a multi-feature-domain attention network, characterized in that the multi-feature-domain attention network comprises multi-scale convolution, a cross-connected Swin-Transformer structure, an average-pooling layer and a fully connected layer, the method comprising:

Step 1: arranging the raw EEG data, following the mechanism by which the brain processes tasks, into a three-dimensional feature input over the spatial, temporal and frequency dimensions;

Step 2: extracting local features over different receptive fields through multi-scale convolution, enriching the feature candidate set;

Step 3: combining low-level attention features directly with high-level attention features through the cross-connected Swin-Transformer structure, so as to learn diverse attention features at different scales;

Step 4: reducing dimensionality through the average-pooling layer and performing classification with the fully connected layer.

2. The EEG cognitive-load assessment method based on a multi-feature-domain attention network according to claim 1, characterized in that, before step 1, the method further comprises: designing a simulation task, including: selecting, from the simulated images of the public SynISAR dataset, images with masking rates of 45%-55%, 65%-75% and 85%-95% as the low, medium and high workload levels, with the image rotation angles of different targets kept consistent at the same masking rate; selecting six aircraft models for the formal experiment, with 45 images per model at each workload level, giving 45×6×3 masked images in total; and acquiring EEG signals with a g.HIamp EEG acquisition system.

3. The EEG cognitive-load assessment method based on a multi-feature-domain attention network according to claim 2, characterized in that step 1 comprises: plotting, for each electrode lead, the time-frequency map of the neural response in the 1 s before the image-discrimination key press in the simulation task; first converting each electrode channel's EEG data into a time-frequency map with a short-time Fourier transform, computed with MATLAB's built-in spectrogram function using a Hamming window of length 16, an overlap of 13 samples and 256 Fourier-transform points, yielding a 70×81 single-channel time-frequency map representing frequency-domain energy over 0-60 Hz and temporal information over 0-1000 ms; and then stacking the time-frequency maps of all channels into a 62×70×81 three-dimensional input tensor.

4. The EEG cognitive-load assessment method based on a multi-feature-domain attention network according to claim 1, characterized in that, in step 2, multi-scale convolution is used for local feature extraction and the individual convolutions are concatenated in parallel; the multi-scale convolution pipeline comprises a 1×1 convolution layer, a 3×3 convolution layer, a 5×5 convolution layer and a max-pooling layer, each branch containing only one convolution kernel of a single scale.

5. The EEG cognitive-load assessment method based on a multi-feature-domain attention network according to claim 1, characterized in that step 3 comprises: building a cross-connected Swin-Transformer structure to extract hierarchical global attention features, organized into three stages with two Swin-Transformer blocks each; in each stage, first splitting the input feature matrix into non-overlapping patches by a patch-partition layer and linearly transforming the data of each channel by linear embedding; for a given C×H×W input, the output feature sizes of the three stages being C×H/4×W/4, 2C×H/8×W/8 and 4C×H/16×W/16, with the stages connected by short-circuit connections.

6. An EEG cognitive-load assessment system based on a multi-feature-domain attention network, characterized in that the multi-feature-domain attention network comprises multi-scale convolution, a cross-connected Swin-Transformer structure, an average-pooling layer and a fully connected layer, the system comprising: a three-dimensional feature-matrix construction module, configured to arrange the raw EEG data, following the mechanism by which the brain processes tasks, into a three-dimensional feature input over the spatial, temporal and frequency dimensions; a multi-scale convolution module, configured to extract local features over different receptive fields through multi-scale convolution, enriching the feature candidate set; a cross-connected Swin-Transformer module, configured to combine low-level attention features directly with high-level attention features through the cross-connected Swin-Transformer structure, so as to learn diverse attention features at different scales; and a classification module, configured to reduce dimensionality through average pooling and to perform classification with a fully connected layer.

7. The EEG cognitive-load assessment system based on a multi-feature-domain attention network according to claim 6, characterized by further comprising: a simulation-task design module, configured to design a simulation task, including: selecting, from the simulated images of the public SynISAR dataset, images with masking rates of 45%-55%, 65%-75% and 85%-95% as the low, medium and high workload levels, with the image rotation angles of different targets kept consistent at the same masking rate; selecting six aircraft models for the formal experiment, with 45 images per model at each workload level, giving 45×6×3 masked images in total; and acquiring EEG signals with a g.HIamp EEG acquisition system.

8. The EEG cognitive-load assessment system based on a multi-feature-domain attention network according to claim 7, characterized in that the three-dimensional feature-matrix construction module is specifically configured to: plot, for each electrode lead, the time-frequency map of the neural response in the 1 s before the image-discrimination key press in the simulation task; first convert each electrode channel's EEG data into a time-frequency map with a short-time Fourier transform, computed with MATLAB's built-in spectrogram function using a Hamming window of length 16, an overlap of 13 samples and 256 Fourier-transform points, yielding a 70×81 single-channel time-frequency map representing frequency-domain energy over 0-60 Hz and temporal information over 0-1000 ms; and then stack the time-frequency maps of all channels into a 62×70×81 three-dimensional input tensor.

9. The EEG cognitive-load assessment system based on a multi-feature-domain attention network according to claim 6, characterized in that, in the multi-scale convolution module, multi-scale convolution is used for local feature extraction and the individual convolutions are concatenated in parallel; the multi-scale convolution pipeline comprises a 1×1 convolution layer, a 3×3 convolution layer, a 5×5 convolution layer and a max-pooling layer, each branch containing only one convolution kernel of a single scale.

10. The EEG cognitive-load assessment system based on a multi-feature-domain attention network according to claim 6, characterized in that the cross-connected Swin-Transformer module is specifically configured to: build a cross-connected Swin-Transformer structure to extract hierarchical global attention features, organized into three stages with two Swin-Transformer blocks each; in each stage, first split the input feature matrix into non-overlapping patches by a patch-partition layer and linearly transform the data of each channel by linear embedding; for a given C×H×W input, the output feature sizes of the three stages being C×H/4×W/4, 2C×H/8×W/8 and 4C×H/16×W/16, with the stages connected by short-circuit connections.
CN202310477389.8A (filed 2023-04-27) — Brain electricity cognitive load assessment method and system based on multi-feature domain attention network — status: Pending

Publication: CN116584955A, published 2023-08-15



Legal Events

PB01 — Publication
SE01 — Entry into force of request for substantive examination
CB02 — Change of applicant information: applicant changed from Information Engineering University of Strategic Support Force, PLA (No. 62 Science Avenue, High-tech Zone, Zhengzhou City, Henan Province, China) to Information Engineering University of the Chinese People's Liberation Army Cyberspace Force (450000, Science Avenue 62, Zhengzhou High-tech Zone, Henan Province, China)
