Disclosure of Invention
In view of the above, the present invention aims to provide a depressive disorder recognition system based on multi-modal domain adaptation. By collecting multi-modal data including electroencephalogram (EEG), electrodermal (skin conductance) and electrocardiogram (ECG) signals and combining them with multi-modal domain adaptation techniques, the system improves both the fusion of different data modalities and the alignment of features across different domains, thereby improving the accuracy and robustness of depressive disorder recognition. The method uses a deep transfer learning algorithm to optimize the multi-modal feature space, enhances the recognition of depressive disorder, adapts to the physiological differences between individuals, and effectively supports real-time analysis and diagnosis.
A depressive disorder recognition system based on multi-modal domain adaptation comprises a data acquisition module, a feature extraction module, a multi-modal domain adaptation module, a neural network classification module and a model training and depression detection module;
The data acquisition module is used for acquiring signals of three modalities, namely electroencephalogram, electrodermal and electrocardiogram signals, in the resting state and under audio stimulation;
the feature extraction module is used for extracting features from the electroencephalogram, electrodermal and electrocardiogram signals;
The multi-modal domain adaptation module is configured to:
dividing the signal data of the three modalities into source domain data and target domain data;
for the source domain data and the target domain data respectively, composing the electroencephalogram signal features, electrodermal signal features and electrocardiogram signal features into a tensor and performing CP decomposition on it, specifically expressed as:

$$S = \sum_{r=1}^{R} \lambda_r^{S}\, a_r^{S} \circ b_r^{S} \circ c_r^{S}, \qquad T = \sum_{r=1}^{R} \lambda_r^{T}\, a_r^{T} \circ b_r^{T} \circ c_r^{T}$$

where S denotes the source domain feature tensor, T denotes the target domain feature tensor, $R$ is the decomposition rank, i.e. the number of components used in the decomposition, and $\circ$ denotes the outer product; $\lambda_r^{S}$ are the scalar weights of the source domain decomposition, and $a_r^{S}$, $b_r^{S}$ and $c_r^{S}$ respectively represent the interaction features of the electroencephalogram, electrodermal and electrocardiogram signals at rank $r$ of the source domain decomposition; correspondingly, $\lambda_r^{T}$ are the scalar weights of the target domain decomposition, and $a_r^{T}$, $b_r^{T}$ and $c_r^{T}$ respectively represent the interaction features of the three signals at rank $r$ of the target domain decomposition. All of these are parameters to be learned;
after the tensor decomposition, the first alignment loss is calculated:

$$L_{mmd1} = \left\| \frac{1}{n_S}\sum_{i=1}^{n_S}\phi(S_i) - \frac{1}{n_T}\sum_{j=1}^{n_T}\phi(T_j) \right\|_{\mathcal{H}}^{2}$$

where $\phi(\cdot)$ is a feature mapping function that maps the original feature space into a higher-dimensional space, $S_i$ denotes the $i$-th sample feature in the source domain features S, $T_j$ denotes the $j$-th sample feature in the target domain features T, and $n_S$ and $n_T$ are the numbers of source domain and target domain samples, respectively;
relevant information between a source domain and a target domain is extracted through an attention mechanism, and the method comprises the following specific steps:
using the attention mechanism, information in the source domain S that is relevant to the target domain T is extracted:

$$A = \mathrm{Attention}(Q_T, K_S, V_S) = \mathrm{softmax}\!\left(\frac{Q_T K_S^{\top}}{\sqrt{d_k}}\right) V_S$$

and a new target domain feature $T' = T + A$ containing the commonality information is constructed;

where Q is the query vector used to look up relevant information, K is the key vector that is matched against the query vector to find the information related to it, V is the value vector carrying the matched information, which is the information of ultimate interest, and $d_k$ is the dimension of the keys;
For the input source domain features S and target domain features T, the corresponding query, key and value vectors are obtained through linear transformation:

$$Q_S = S W_Q,\quad K_S = S W_K,\quad V_S = S W_V;$$
$$Q_T = T W_Q,\quad K_T = T W_K,\quad V_T = T W_V;$$

where $W_Q$, $W_K$ and $W_V$ are weight matrices to be learned;
using the attention mechanism, information in the target domain T that is relevant to the source domain S is extracted:

$$B = \mathrm{Attention}(Q_S, K_T, V_T) = \mathrm{softmax}\!\left(\frac{Q_S K_T^{\top}}{\sqrt{d_k}}\right) V_T$$

and a new source domain feature $S' = S + B$ containing the commonality information is constructed;
The second alignment loss is calculated and expressed as:

$$L_{mmd2} = \left\| \frac{1}{n_{S'}}\sum_{i=1}^{n_{S'}}\phi(S'_i) - \frac{1}{n_{T'}}\sum_{j=1}^{n_{T'}}\phi(T'_j) \right\|_{\mathcal{H}}^{2}$$

where $S'_i$ denotes the $i$-th sample feature in the new source domain features $S'$, $T'_j$ denotes the $j$-th sample feature in the new target domain features $T'$, and $n_{S'}$ and $n_{T'}$ are the numbers of new source domain and target domain samples, respectively;
the neural network classification module is used for classifying the input characteristic data to obtain a classification result;
The model training and depression detection module is used for:
1) Model training:
the source domain data is input to the neural network classification module, and the classification cross-entropy loss is calculated from the obtained classification results:

$$L_{cls} = -\frac{1}{n_S}\sum_{i=1}^{n_S}\Big[y_i\log\hat{y}_i + (1-y_i)\log(1-\hat{y}_i)\Big]$$

where $y_i$ is the actual label of the feature data and $\hat{y}_i$ is the predicted label;
finally, a total loss function combining dynamic fusion and cross-domain alignment is calculated to optimize the whole model:

$$L = \lambda_1 L_{mmd1} + \lambda_2 L_{mmd2} + L_{cls}$$

where $\lambda_1$ and $\lambda_2$ are hyper-parameters used to balance the contributions of the different loss terms;
Training a neural network classification module and parameters to be learned based on the loss function L combining dynamic fusion and cross-domain alignment, and updating model weights through a back propagation algorithm to finally obtain an optimized depression detection system.
2) Depression detection:
in practical application, the newly acquired multi-modal signals of a subject are input into the feature extraction module for preprocessing and feature extraction, and then into the trained neural network classification module to generate the classification result.
Further, the model training and depression detection module also includes testing of the neural network classification module:
the classification performance of the neural network classification module is evaluated using the source domain data and the corresponding labels.
Further, the system also comprises a data preprocessing module, which performs power-frequency denoising, band-pass filtering, artifact removal and downsampling on the electroencephalogram signals, low-pass filtering and smoothing on the electrodermal signals, and power-frequency denoising, band-pass filtering and smoothing on the electrocardiogram signals, so as to improve signal quality.
Preferably, the feature extraction module first performs manual feature extraction based on prior knowledge on the electroencephalogram, electrodermal and electrocardiogram signals respectively, and then extracts depth features through a Transformer encoder.
Further, the feature extraction module is further configured to:
Three types of linear features and three types of nonlinear features are extracted from the electroencephalogram signals: the linear features comprise the mean value, power spectral density and center frequency, and the nonlinear features comprise sample entropy, approximate entropy and the Lyapunov exponent.
Further, the feature extraction module is further configured to:
Heart rate variability, frequency ratio and fractal dimension features are extracted from the electrocardiogram signals.
Further, the feature extraction module is further configured to:
Skin conductance level, peak count and variance features are extracted from the electrodermal signals.
Preferably, the neural network classification module is composed of a plurality of fully connected layers, each of which is followed by an activation function.
Furthermore, the neural network classification module also introduces batch normalization and Dropout technologies to accelerate convergence and stabilize the training process.
The invention has the following beneficial effects:
The invention aims to address two challenges in the detection of depressive disorder: the low recognition rate achieved with single-modality physiological signals, and individual differences. These challenges are particularly pronounced in depressive disorder recognition because physiological signal characteristics differ significantly between individuals. A depressive disorder recognition method based on multi-modal domain adaptation is therefore proposed. In the present invention, a domain refers to the physiological signal feature space of an individual. By aligning the feature spaces of different individuals, multi-modal domain adaptation overcomes the influence of individual differences on the recognition result, thereby improving the accuracy and robustness of depressive disorder recognition (see fig. 1). By integrating multi-modal data such as electroencephalogram, electrodermal and electrocardiogram signals, effective fusion between signals is achieved, and the complementary information of the different modalities is exploited to improve recognition accuracy. In addition, domain adaptation techniques are introduced to address the data distribution shift caused by individual differences, thereby enhancing the generalization ability and stability of the model. The method provides more efficient and accurate technical support for clinical application.
In the depressive disorder recognition method based on multi-modal domain adaptation, the interaction relations among the electroencephalogram, electrodermal and electrocardiogram signals are modeled inside a deep learning framework, realizing dynamic fusion and joint parameter optimization of the signals. The method makes full use of the complementary information of the multi-modal signals, so that the model can extract more discriminative features from complex physiological data, effectively improving the accuracy and robustness of depressive disorder recognition. In addition, cross-domain commonality features are mined by drawing the source domain and target domain data distributions together twice: the first alignment uses the maximum mean discrepancy to ensure the consistency of the source and target domains in the feature space, and the second alignment further mines and fuses information features related to the target through an attention mechanism to construct a more accurate feature representation. This cross-domain alignment strategy effectively improves the adaptability of the model across different data domains and significantly reduces the negative influence of domain shift. Overall, the proposed recognition method has clear advantages in improving the detection accuracy of depressive disorder and enhancing the generalization ability of the model. It not only demonstrates the effectiveness of multi-modal fusion and domain adaptation, but also provides an innovative solution for the automated detection of depressive disorder in practice.
Detailed Description
The invention will now be described in detail by way of example with reference to the accompanying drawings.
The invention provides a depressive disorder recognition system based on multi-modal domain adaptation. First, the multi-modal electrophysiological signals recorded in the resting state and under audio stimulation are preprocessed and the corresponding features are extracted; the extracted features are the manual features commonly used for each modality together with depth features. Then, the dynamic interaction relations of the electroencephalogram, electrodermal and electrocardiogram signals are modeled through CP tensor synthesis and decomposition. Next, cross-domain commonality features are mined through two maximum mean discrepancy alignments and an attention mechanism to train the model. Finally, a test set is used for depression detection to obtain the expected depression detection recognition rate. As shown in fig. 1, the system includes the following modules:
(1) The data acquisition module collects the subject's information such as ID number, name, sex and age, and sequentially acquires the subject's electroencephalogram, electrodermal and electrocardiogram signals in the resting state and under audio stimulation. The acquisition positions of the electroencephalogram, electrodermal and electrocardiogram electrodes are shown in fig. 2.
In this embodiment, the electroencephalogram signals are collected by a self-developed three-lead electroencephalogram device (see the left of fig. 2) and its software, and the electrodermal and electrocardiogram signals are collected by secondarily developed software; the two pieces of software record timestamps simultaneously to ensure that the data are aligned in time. All subjects were university students with normal hearing and intelligence and no previous psychiatric history. No psychotropic drugs were taken before data acquisition, and all recordings were conducted under the same laboratory conditions. The sampling rate of the electroencephalogram device is 250 Hz, and the sampling rate of the electrodermal and electrocardiogram devices is 50 Hz.
(2) The data preprocessing module performs power-frequency denoising, band-pass filtering, artifact removal and downsampling on the electroencephalogram signals, low-pass filtering and smoothing on the electrodermal signals, and power-frequency denoising, band-pass filtering and smoothing on the electrocardiogram signals, so as to improve signal quality.
In this embodiment, a notch filter is first used to remove the 50 Hz power-line interference and thus eliminate power-frequency noise. Then, the electroencephalogram, electrodermal and electrocardiogram signals are band-pass filtered with pass-bands of 0.5-50 Hz, 0.5-2 Hz and 0.5-40 Hz respectively, to retain the effective frequency bands. Next, artifact removal is performed to detect and discard abnormal data caused by motion or ocular artifacts, improving signal quality. To reduce random fluctuations in the signals, moving-average smoothing is applied to enhance data stability. Finally, the electroencephalogram sampling rate is reduced from 250 Hz to 100 Hz by downsampling, reducing the data volume while preserving the key information.
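As a non-limiting illustration of this preprocessing chain, the following Python sketch (using NumPy and SciPy) shows one possible realization of the notch filtering, band-pass filtering, moving-average smoothing and downsampling described above. The filter orders, the notch quality factor, the smoothing window width and the dummy demonstration signals are assumptions of the sketch, and the artifact-removal step is omitted for brevity.

```python
import numpy as np
from scipy.signal import iirnotch, butter, filtfilt, resample_poly

def notch_50hz(x, fs):
    """Suppress 50 Hz power-line interference (requires fs > 100 Hz)."""
    b, a = iirnotch(w0=50.0, Q=30.0, fs=fs)
    return filtfilt(b, a, x)

def bandpass(x, low, high, fs, order=4):
    """Zero-phase Butterworth band-pass; band edges must lie below fs / 2."""
    b, a = butter(order, [low, high], btype="bandpass", fs=fs)
    return filtfilt(b, a, x)

def smooth(x, width=5):
    """Moving-average smoothing to reduce random fluctuations."""
    return np.convolve(x, np.ones(width) / width, mode="same")

# EEG: 50 Hz notch, 0.5-50 Hz band-pass, downsample 250 Hz -> 100 Hz.
eeg = np.random.randn(250 * 60)                       # one minute of dummy EEG
eeg = bandpass(notch_50hz(eeg, fs=250), 0.5, 50.0, fs=250)
eeg = resample_poly(eeg, up=2, down=5)                # 250 Hz * 2/5 = 100 Hz

# Electrodermal signal: 0.5-2 Hz band-pass and smoothing (fs = 50 Hz here).
gsr = np.random.randn(50 * 60)
gsr = smooth(bandpass(gsr, 0.5, 2.0, fs=50))

# ECG: the same notch / band-pass / smoothing chain applies; note that the
# 0.5-40 Hz band requires a sampling rate above 80 Hz, so fs is kept generic.
```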
(3) The feature extraction module performs manual feature extraction based on prior knowledge on the preprocessed electroencephalogram, electrodermal and electrocardiogram signals respectively, and extracts depth features through a Transformer encoder. From the electroencephalogram signals, three types of linear features and three types of nonlinear features are extracted: the linear features include the mean value, power spectral density and center frequency, and the nonlinear features include sample entropy, approximate entropy and the Lyapunov exponent. From the electrocardiogram signals, features such as heart rate variability, frequency ratio and fractal dimension are extracted. From the electrodermal signals, skin conductance level, peak count and variance features are extracted.
(4) The multi-modal domain adaptation module models the dynamic interaction between the multi-modal physiological signals through tensor synthesis and decomposition, and embeds this dynamic interaction into the deep learning training process to optimize the model parameters. It then mines the commonality information of the source and target domains and performs cross-domain alignment using the maximum mean discrepancy, so that a model that performs well on the target domain is trained and the source domain features are effectively transferred to the target domain.
(5) The neural network classification module consists of a plurality of fully connected layers, each of which is immediately followed by an activation function used to introduce nonlinearity and avoid the vanishing-gradient problem. In addition, to enhance the generalization ability of the network and prevent overfitting, regularization techniques such as batch normalization and Dropout are also introduced to accelerate convergence and stabilize the training process.
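For illustration only, the classification module described above may be sketched in PyTorch as follows; the number of layers, hidden widths, dropout rate and choice of ReLU are assumptions of the sketch rather than requirements of the invention.

```python
import torch.nn as nn

class DepressionClassifier(nn.Module):
    """Stack of fully connected layers, each followed by batch normalization,
    an activation function and Dropout, ending in two class logits."""
    def __init__(self, in_dim, hidden=(256, 64), num_classes=2, p_drop=0.5):
        super().__init__()
        layers, prev = [], in_dim
        for width in hidden:
            layers += [nn.Linear(prev, width), nn.BatchNorm1d(width),
                       nn.ReLU(), nn.Dropout(p_drop)]
            prev = width
        layers.append(nn.Linear(prev, num_classes))
        self.net = nn.Sequential(*layers)

    def forward(self, x):              # x: (batch, in_dim) feature vectors
        return self.net(x)             # (batch, num_classes) logits
```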
(6) The model training and depression detection module (see figure 3) uses the trained model and the neural network classification module to detect depression of the newly acquired data so as to realize efficient classification and identification.
The data acquisition module comprises the following steps:
1) Experimental design: first, the subject's multi-modal physiological signals are collected in the eyes-closed resting state. An auditory stimulation experiment is then carried out using auditory stimuli of several different emotional attributes (e.g., 2 positive, 2 neutral and 2 negative stimuli), each auditory stimulus lasting for a set period of time. After each auditory stimulus, the subject takes a brief rest. Throughout the procedure, the subject's electroencephalogram, electrodermal and electrocardiogram signals are continuously collected in the eyes-closed state, finally yielding the complete data.
2) Electroencephalogram acquisition: electroencephalogram signals are acquired with a device conforming to the internationally used 10-20 system electrode placement standard.
3) Multi-modal electrophysiological acquisition: the other physiological signals are acquired synchronously and stably through the multi-modal acquisition device, whose optimized design ensures the accuracy and stability of signal acquisition.
In the data preprocessing stage, power-frequency denoising is implemented with a notch filter that removes the 50 Hz power-line interference.
The band-pass filtering ranges of the electroencephalogram signal, the skin electric signal and the electrocardiosignal in the data preprocessing stage are respectively 0.5-50Hz, 0.5-2Hz and 0.5-40Hz.
And artifact removal in the data preprocessing stage is to detect and remove data anomalies caused by motion artifacts or eye electrical artifacts so as to improve the quality of signals.
The smoothing in the data preprocessing stage is to reduce random fluctuation of signals by a moving average technology so as to improve the stability of data.
The downsampling in the data preprocessing stage is to reduce the sampling rate from 250Hz to 100Hz.
The feature extraction module performs manual feature extraction based on prior knowledge on the preprocessed electroencephalogram, electrodermal and electrocardiogram signals respectively, and extracts depth features through a Transformer encoder. For the electroencephalogram signals, three types of linear features (mean value, power spectral density, center frequency) and three types of nonlinear features (sample entropy, approximate entropy, Lyapunov exponent) are extracted. For the electrocardiogram signals, features such as heart rate variability, frequency ratio and fractal dimension are extracted. For the electrodermal signals, features such as skin conductance level, peak count and variance are extracted. Depth feature extraction is then performed by the Transformer encoder, whose self-attention mechanism automatically learns the complex temporal patterns and global dependencies in the signals.
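A minimal PyTorch sketch of such a depth-feature extractor is given below; the embedding size, the number of attention heads and layers, and the mean-pooling readout are assumptions of the sketch. The hand-crafted features named above are defined by the formulas that follow.

```python
import torch
import torch.nn as nn

class DepthFeatureEncoder(nn.Module):
    """Transformer encoder that maps a windowed signal sequence of shape
    (batch, seq_len, feat_dim) to one depth-feature vector per window."""
    def __init__(self, feat_dim, d_model=64, n_heads=4, n_layers=2):
        super().__init__()
        self.proj = nn.Linear(feat_dim, d_model)
        layer = nn.TransformerEncoderLayer(d_model=d_model, nhead=n_heads,
                                           batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=n_layers)

    def forward(self, x):                       # x: (batch, seq_len, feat_dim)
        h = self.encoder(self.proj(x))          # self-attention over time
        return h.mean(dim=1)                    # mean-pool into a depth feature
```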
Power spectral density:

$$PSD(f) = \lim_{T\to\infty} \frac{1}{T}\left| \int_{0}^{T} x(t)\, e^{-j 2\pi f t}\, dt \right|^{2}$$

where $x(t)$ is the time-domain signal and $f$ is the frequency.
Center frequency:

$$f_c = \frac{\int_{f_1}^{f_2} f\, P(f)\, df}{\int_{f_1}^{f_2} P(f)\, df}$$

where $P(f)$ is the power spectral density, $f_1$ is the lower frequency bound and $f_2$ is the upper frequency bound.
Sample entropy:

$$SampEn(m, r, N) = -\ln\!\left(\frac{\sum_i A_i}{\sum_i B_i}\right)$$

where $A_i$ is the number of sequence pairs matching at length $m+1$, $B_i$ is the number of sequence pairs matching at length $m$, $r$ is the tolerance, and $N$ is the signal length.
Approximate entropy:

$$ApEn(m, r, N) = \phi^{m}(r) - \phi^{m+1}(r), \qquad \phi^{m}(r) = \frac{1}{N-m+1}\sum_{i=1}^{N-m+1}\ln C_i^{m}(r)$$

where $C_i^{m}(r)$ is the fraction of length-$m$ sequences that are similar to the $i$-th template sequence within tolerance $r$.
Lyapunov exponent:

$$\lambda = \lim_{n\to\infty}\frac{1}{n}\sum_{i=1}^{n}\ln\left|\frac{\delta x_{i+1}}{\delta x_i}\right|$$

where $\delta x_i$ is a small perturbation at time $i$.
Heart rate variability:

$$HRV = \sqrt{\frac{1}{N-1}\sum_{i=1}^{N}\left(RR_i - \overline{RR}\right)^{2}}$$

where $RR_i$ is the RR interval between adjacent heartbeats, $\overline{RR}$ is the mean of the RR intervals, and $N$ is the number of RR intervals.
Frequency ratio:

$$FR = \frac{\int_{LFB} P(f)\, df}{\int_{HFB} P(f)\, df}$$

where $P(f)$ is the power spectral density of the RR interval series, $LFB$ denotes the low-frequency band range and $HFB$ denotes the high-frequency band range.
Fractal dimension:

$$D = \lim_{\varepsilon\to 0}\frac{\ln N(\varepsilon)}{\ln(1/\varepsilon)}$$

where $\varepsilon$ is the scale parameter of the fractal structure and $N(\varepsilon)$ is the number of elements of size $\varepsilon$ needed to cover the signal.
Skin conductance level:

$$SCL = \frac{1}{N}\sum_{i=1}^{N} SC_i$$

where $SC_i$ is the skin conductance value at the $i$-th time point and $N$ is the number of time points.
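By way of non-limiting illustration, several of the above hand-crafted features can be computed as in the following Python sketch; the Welch segment length, the sample-entropy tolerance factor and the use of SDNN as the heart rate variability index are assumptions of the sketch, and the remaining features follow their formulas analogously.

```python
import numpy as np
from scipy.signal import welch

def center_frequency(x, fs, f1=0.5, f2=50.0):
    """Spectral centroid of the band [f1, f2] from a Welch PSD estimate."""
    f, p = welch(x, fs=fs, nperseg=min(len(x), 256))
    band = (f >= f1) & (f <= f2)
    return np.sum(f[band] * p[band]) / np.sum(p[band])

def sample_entropy(x, m=2, r_factor=0.2):
    """SampEn = -ln(A / B), with tolerance r = r_factor * std(x)."""
    x = np.asarray(x, dtype=float)
    r = r_factor * np.std(x)

    def match_count(length):
        templates = np.array([x[i:i + length] for i in range(len(x) - length)])
        count = 0
        for i in range(len(templates)):
            dist = np.max(np.abs(templates - templates[i]), axis=1)
            count += np.sum(dist <= r) - 1      # exclude the self-match
        return count

    a, b = match_count(m + 1), match_count(m)
    return -np.log(a / b) if a > 0 and b > 0 else np.inf

def hrv_sdnn(rr_intervals):
    """Standard deviation of RR intervals, a common HRV index."""
    rr = np.asarray(rr_intervals, dtype=float)
    return float(np.sqrt(np.sum((rr - rr.mean()) ** 2) / (len(rr) - 1)))

def skin_conductance_level(sc):
    """Mean skin conductance over the analysis window."""
    return float(np.mean(sc))
```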
The multi-modal domain adaptation module performs the following method steps:
1) Dynamic fusion:
First, a tensor $X \in \mathbb{R}^{I\times J\times K}$ is composed from the electroencephalogram features (feature dimension $I$), the electrodermal features (feature dimension $J$) and the electrocardiogram features (feature dimension $K$). The composed tensor $X$ captures the joint action of the three signals across their respective feature dimensions. In order to extract meaningful mutual information from this composed tensor, CP decomposition is used, which decomposes the high-dimensional tensor into a combination of multiple low-dimensional factor vectors that express the intrinsic structure of the tensor, specifically expressed as:

$$X \approx \sum_{r=1}^{R} \lambda_r\, a_r \circ b_r \circ c_r$$

where $R$ is the decomposition rank, i.e. the number of components used in the decomposition, $\lambda_r$ is a scalar weight representing the importance of each component, and $a_r$, $b_r$ and $c_r$ are the corresponding factor vectors, taken from the factor matrices of the electroencephalogram, electrodermal and electrocardiogram features, respectively. $\circ$ denotes the outer product, and $a_r$, $b_r$ and $c_r$ represent the interaction characteristics of the electroencephalogram, electrodermal and electrocardiogram signals at rank $r$. $\lambda_r$, $a_r$, $b_r$ and $c_r$ are parameters that are updated within the deep learning training.
In this embodiment, the subjects' multi-modal data are divided into 10 parts using ten-fold cross-validation, where 9 folds constitute the source domain data and the remaining fold constitutes the target domain data. During the training phase, only the source domain data with their labels and the unlabeled target domain data are used. The held-out fold, including the target domain data and its labels, is used during the test phase.
The source domain features S and the target domain features T are constructed as follows:

$$S = \sum_{r=1}^{R} \lambda_r^{S}\, a_r^{S} \circ b_r^{S} \circ c_r^{S}, \qquad T = \sum_{r=1}^{R} \lambda_r^{T}\, a_r^{T} \circ b_r^{T} \circ c_r^{T}$$

where $\lambda_r^{S}$ are the scalar weights of the source domain decomposition, and $a_r^{S}$, $b_r^{S}$ and $c_r^{S}$ respectively represent the interaction characteristics of the electroencephalogram, electrodermal and electrocardiogram signals at rank $r$ of the source domain decomposition and are parameters to be learned; correspondingly, $\lambda_r^{T}$ are the scalar weights of the target domain decomposition, and $a_r^{T}$, $b_r^{T}$ and $c_r^{T}$ respectively represent the interaction characteristics of the three signals at rank $r$ of the target domain decomposition, likewise parameters to be learned.
The decomposition step extracts a low-dimensional representation of the multi-modal data that captures the complex interactions between the signals, and the synthesis step reassembles the low-dimensional factors into a new feature representation that is more semantic and easier to align. X denotes multi-modal data: both the source domain and the target domain consist of multi-modal data and both require these two steps, so the source domain features S are the decomposed and re-synthesized source domain data, and the target domain data are treated in the same way.
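A minimal PyTorch sketch of this dynamic-fusion step is given below. It treats the CP factors λ_r, a_r, b_r and c_r of one domain as learnable parameters embedded in the deep learning model and re-synthesizes the rank-R tensor; the rank value, the initialization, the example feature dimensions and the way the re-synthesized tensor is subsequently turned into per-sample features are assumptions of the sketch.

```python
import torch
import torch.nn as nn

class CPFusion(nn.Module):
    """Learnable rank-R CP factors for one domain.  The re-synthesized tensor
    is sum_r lambda_r * a_r (outer) b_r (outer) c_r, where a_r, b_r, c_r come
    from the EEG, electrodermal and ECG factor matrices respectively."""
    def __init__(self, I, J, K, rank=8):
        super().__init__()
        self.lam = nn.Parameter(torch.ones(rank))          # lambda_r weights
        self.A = nn.Parameter(0.1 * torch.randn(I, rank))  # EEG factors a_r
        self.B = nn.Parameter(0.1 * torch.randn(J, rank))  # electrodermal b_r
        self.C = nn.Parameter(0.1 * torch.randn(K, rank))  # ECG factors c_r

    def forward(self):
        # einsum realizes sum_r lam_r * A[:, r] o B[:, r] o C[:, r].
        return torch.einsum("r,ir,jr,kr->ijk", self.lam, self.A, self.B, self.C)

# One CPFusion instance per domain; its factors can be fitted to the observed
# multi-modal tensor X while being trained jointly with the alignment and
# classification losses described below.
source_fusion = CPFusion(I=6, J=3, K=3)     # feature dimensions are examples
S_tensor = source_fusion()                  # (I, J, K) source-domain tensor
```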
2) Cross-domain alignment:
The source domain and the target domain both contain electroencephalogram, electrodermal and electrocardiogram signals. The present invention uses the maximum mean discrepancy for alignment. After the tensor decomposition, the source domain features are denoted S and the target domain features are denoted T. The first alignment loss can be expressed as:

$$L_{mmd1} = \left\| \frac{1}{n_S}\sum_{i=1}^{n_S}\phi(S_i) - \frac{1}{n_T}\sum_{j=1}^{n_T}\phi(T_j) \right\|_{\mathcal{H}}^{2}$$

where $\phi(\cdot)$ is a feature mapping function that maps the original feature space into a higher-dimensional space (a reproducing kernel Hilbert space), $S_i$ denotes the $i$-th sample feature in the source domain features S, $T_j$ denotes the $j$-th sample feature in the target domain features T, and $n_S$ and $n_T$ are the numbers of source domain and target domain samples, respectively.
In practice, the feature mapping is realized through a kernel function, for example a Gaussian (RBF) kernel:

$$k(s_i, t_j) = \exp\!\left(-\frac{\| s_i - t_j \|^{2}}{2\sigma^{2}}\right)$$

where $\sigma$ is the bandwidth hyper-parameter of the kernel function, $s_i \in S$ and $t_j \in T$.
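The alignment losses L_mmd1 and L_mmd2 with this Gaussian kernel can be computed as in the following PyTorch sketch; the single fixed bandwidth and the biased empirical estimator are simplifying assumptions of the sketch.

```python
import torch

def rbf_kernel(x, y, sigma=1.0):
    """Gaussian kernel matrix k(x_i, y_j) = exp(-||x_i - y_j||^2 / (2 sigma^2))."""
    sq_dist = torch.cdist(x, y, p=2) ** 2
    return torch.exp(-sq_dist / (2.0 * sigma ** 2))

def mmd_loss(source, target, sigma=1.0):
    """Squared MMD between source (n_S, d) and target (n_T, d) feature matrices."""
    k_ss = rbf_kernel(source, source, sigma).mean()
    k_tt = rbf_kernel(target, target, sigma).mean()
    k_st = rbf_kernel(source, target, sigma).mean()
    return k_ss + k_tt - 2.0 * k_st
```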
Relevant information between a source domain and a target domain is extracted through an attention mechanism, and the method comprises the following specific steps:
Using the attention mechanism, information in the source domain S that is relevant to the target domain T is extracted:

$$A = \mathrm{Attention}(Q_T, K_S, V_S) = \mathrm{softmax}\!\left(\frac{Q_T K_S^{\top}}{\sqrt{d_k}}\right) V_S$$

and a new target domain feature $T' = T + A$ containing the commonality information is constructed. Q is the query vector used to look up relevant information, K is the key vector that is matched against the query vector to find the information related to it, and V is the value vector carrying the matched information, which is the information of ultimate interest. For the source domain, the target domain features serve as queries (Q) that are matched against the source domain features (K), thereby extracting the information in the source domain that is related to the target domain, and vice versa. For the input source and target domain features (S, T), Q, K and V are obtained by linear transformation:
$$Q_S = S W_Q,\quad K_S = S W_K,\quad V_S = S W_V$$

where $W_Q$, $W_K$ and $W_V$ are weight matrices to be learned and $d_k$ is the dimension of the keys; likewise for the target domain, namely:

$$Q_T = T W_Q,\quad K_T = T W_K,\quad V_T = T W_V;$$
Using the attention mechanism, information in the target domain T that is relevant to the source domain S is extracted:

$$B = \mathrm{Attention}(Q_S, K_T, V_T) = \mathrm{softmax}\!\left(\frac{Q_S K_T^{\top}}{\sqrt{d_k}}\right) V_T$$

and a new source domain feature $S' = S + B$ containing the commonality information is constructed.
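The bidirectional commonality extraction described above may be sketched in PyTorch as follows; sharing the projection matrices W_Q, W_K and W_V between the two domains and using a single attention head are assumptions of the sketch.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class CrossDomainAttention(nn.Module):
    """Computes A = softmax(Q_T K_S^T / sqrt(d_k)) V_S and
    B = softmax(Q_S K_T^T / sqrt(d_k)) V_T, then returns
    S' = S + B and T' = T + A."""
    def __init__(self, feat_dim, d_k=64):
        super().__init__()
        self.W_Q = nn.Linear(feat_dim, d_k, bias=False)
        self.W_K = nn.Linear(feat_dim, d_k, bias=False)
        self.W_V = nn.Linear(feat_dim, feat_dim, bias=False)
        self.d_k = d_k

    def attend(self, queries, keys, values):
        scores = queries @ keys.transpose(-2, -1) / self.d_k ** 0.5
        return F.softmax(scores, dim=-1) @ values

    def forward(self, S, T):                    # S: (n_S, d), T: (n_T, d)
        A = self.attend(self.W_Q(T), self.W_K(S), self.W_V(S))
        B = self.attend(self.W_Q(S), self.W_K(T), self.W_V(T))
        return S + B, T + A                     # new features S', T'
```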
The second alignment loss is expressed as:

$$L_{mmd2} = \left\| \frac{1}{n_{S'}}\sum_{i=1}^{n_{S'}}\phi(S'_i) - \frac{1}{n_{T'}}\sum_{j=1}^{n_{T'}}\phi(T'_j) \right\|_{\mathcal{H}}^{2}$$

where $S'_i$ denotes the $i$-th sample feature in the new source domain features $S'$, $T'_j$ denotes the $j$-th sample feature in the new target domain features $T'$, and $n_{S'}$ and $n_{T'}$ are the numbers of new source domain and target domain samples, respectively.
The neural network classification module classifies the input new source domain features S' and new target domain features T' to obtain classification results;
The model training and depression detection module is used for:
1) Model training and testing:
The model is trained with the source domain data, and the source domain is classified using a binary cross-entropy loss:

$$L_{cls} = -\frac{1}{n_S}\sum_{i=1}^{n_S}\Big[y_i\log\hat{y}_i + (1-y_i)\log(1-\hat{y}_i)\Big]$$

where $y_i$ is the actual label and $\hat{y}_i$ is the predicted label.
Finally, the dynamic-fusion and cross-domain-alignment loss functions are combined to optimize the whole model:

$$L = \lambda_1 L_{mmd1} + \lambda_2 L_{mmd2} + L_{cls}$$

where $\lambda_1$ and $\lambda_2$ are hyper-parameters used to balance the contributions of the different loss terms.
Training a neural network classification module and parameters to be learned based on the loss function L combining dynamic fusion and cross-domain alignment, and updating model weights through a back propagation algorithm to finally obtain an optimized depression detection system.
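For illustration, one optimization step over this combined loss may look like the following sketch, which reuses the hypothetical mmd_loss, CrossDomainAttention and DepressionClassifier components sketched earlier; the values of λ1 and λ2, the optimizer and the learning rate are assumptions.

```python
import torch
import torch.nn.functional as F

def train_step(S, T, source_labels, cross_attn, classifier, optimizer,
               lambda1=0.5, lambda2=0.5):
    """One update of L = lambda1 * L_mmd1 + lambda2 * L_mmd2 + L_cls."""
    loss_mmd1 = mmd_loss(S, T)                    # first alignment (CP features)
    S_new, T_new = cross_attn(S, T)               # commonality extraction
    loss_mmd2 = mmd_loss(S_new, T_new)            # second alignment
    loss_cls = F.cross_entropy(classifier(S_new), source_labels)
    loss = lambda1 * loss_mmd1 + lambda2 * loss_mmd2 + loss_cls
    optimizer.zero_grad()
    loss.backward()                               # back propagation
    optimizer.step()                              # update model weights
    return loss.item()

# Typical wiring (hyper-parameters are assumptions):
#   optimizer = torch.optim.Adam(list(cross_attn.parameters())
#                                + list(classifier.parameters()), lr=1e-3)
#   for epoch in range(num_epochs):
#       train_step(S, T, source_labels, cross_attn, classifier, optimizer)
```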
At this stage, ten-fold cross-validation is used to train the model and improve its generalization ability. The data set is divided into ten equal parts; each time, one part is selected as the test set and the remaining nine parts are used as the training set, and the validation is repeated several times. In each round, the currently held-out data are input into the neural network classifier to obtain the model's predictions. The model output is compared with the true labels, and classification performance indicators including accuracy, sensitivity and specificity are calculated to comprehensively evaluate the detection performance of the model. The entire procedure is carried out over multiple subjects.
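A sketch of the ten-fold, subject-wise split described here is shown below; the number of subjects and the random seed are assumptions of the sketch.

```python
import numpy as np
from sklearn.model_selection import KFold

subject_ids = np.arange(30)                    # assumed number of subjects
kfold = KFold(n_splits=10, shuffle=True, random_state=0)

for fold, (source_idx, target_idx) in enumerate(kfold.split(subject_ids)):
    source_subjects = subject_ids[source_idx]  # 9 folds: labelled source domain
    target_subjects = subject_ids[target_idx]  # held-out fold: target domain
    # Train on source_subjects (labels available) plus unlabeled target data,
    # then compute accuracy, sensitivity and specificity on target_subjects.
```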
2) Depression detection:
In practical application, the newly acquired multi-modal signals of a subject (e.g., electroencephalogram, electrodermal and electrocardiogram signals) are input into the feature extraction module for preprocessing and feature extraction, and then into the trained neural network model to generate a prediction result. By analyzing the model output, efficient depression classification and recognition is performed to assist clinical diagnosis and treatment decisions.
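At the detection stage, a trained classifier (here the hypothetical DepressionClassifier sketched above) could be applied to a new subject's fused features as follows; averaging the window-level probabilities into a single subject-level decision is an assumption of the sketch.

```python
import torch

@torch.no_grad()
def detect_depression(features, classifier):
    """Classify the preprocessed, feature-extracted multi-modal data of a new
    subject; `features` is an (n_windows, d) tensor of fused feature vectors."""
    classifier.eval()
    probs = torch.softmax(classifier(features), dim=-1)
    return int(probs.mean(dim=0).argmax())     # subject-level class decision
```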
In summary, the above embodiments are only preferred embodiments of the present invention, and are not intended to limit the scope of the present invention. Any modification, equivalent replacement, improvement, etc. made within the spirit and principle of the present invention should be included in the protection scope of the present invention.