Single-channel electroencephalogram automatic sleep staging method based on feature reconstruction
Technical Field
The invention relates to the field of electroencephalogram classification using deep learning, and in particular to a single-channel electroencephalogram automatic sleep staging method based on feature reconstruction.
Background
Throughout a night of sleep, the activity of the brain, muscles, heart and other parts of the human body changes dynamically, which allows the sleep state to be classified. According to the standard of the American Academy of Sleep Medicine, based on a polysomnogram composed of signals such as the electroencephalogram, electrocardiogram and electromyogram, every thirty seconds of sleep can be assigned to one of five stages: wakefulness, rapid eye movement sleep, and non-rapid eye movement sleep stages one, two and three.
Each sleep stage plays an essential role in human health, and thus sleep staging is an important means of sleep quality assessment and sleep-related disease detection.
Sleep staging is primarily done by sleep professionals with years of experience, but this approach has the following limitations:
(1) Staging a whole night of sleep is extremely time-consuming: a sleep expert must assign every thirty seconds of multi-channel signals to a sleep stage, while a whole-night recording often lasts up to eight hours, so the process consumes a great deal of the expert's time and the patient must wait a long time for the result;
(2) Sleep staging is very difficult: only sleep experts with years of experience can perform it, and owing to working hours and subjective experience, the staging results given by two sleep experts often differ. Accordingly, many researchers have begun to focus on how to implement automatic sleep staging.
Early researchers built automatic sleep staging systems based on pattern recognition. However, the patterns corresponding to sleep stages are complex and difficult to identify, which limits the performance of such systems. Later, researchers built automatic sleep staging models with machine learning: various features are first manually extracted from each thirty-second sleep segment and then fed into machine learning models such as support vector machines and random forests for classification. This approach has the following drawbacks: (1) classification performance depends heavily on which manually extracted features are chosen, and there is no clear standard for this choice, which relies only on the researcher's experience; and (2) sleep stage changes follow certain transition rules, and features extracted from a single thirty-second signal segment cannot accurately reflect the transition relationship between sleep stages.
With the increase in computing power in recent years, deep learning has attracted more and more researchers and has achieved excellent results in tasks such as computer vision and natural language processing. Many sleep researchers have therefore begun to attempt automatic sleep staging with deep learning. Unlike machine learning, which requires manual feature extraction, deep learning can use convolutional neural networks to extract features automatically, enabling an end-to-end automatic sleep staging system.
Sleep staging systems based on sleep experts, pattern recognition or machine learning always require a whole-night polysomnogram. However, the polysomnogram consists of multi-channel signals, requires large professional instruments, is troublesome to measure, and is uncomfortable for the patient. Because different sleep stages exhibit sleep brain waves in different frequency bands, most deep-learning-based automatic sleep staging models can complete sleep staging using only a single-channel electroencephalogram, which can be measured with a small device, making measurement simpler and more comfortable for the patient.
Unlike computer vision tasks, electroencephalogram-based automatic sleep staging faces the following technical problems: (1) the information density of electroencephalogram data is low. To avoid signal distortion, electroencephalograms are usually acquired at a relatively high sampling rate (usually greater than 100 Hz), so the data dimension is high, resulting in a low information density and far less information than an image of the same dimension; (2) there is a transition relationship between sleep stages. The sleep staging standard requires that labeling take into account not only the corresponding electroencephalogram segment but also the adjacent segments, and the evolution of sleep stages over the whole night follows certain rules; (3) the electroencephalogram contains a large amount of noise. The electroencephalogram is a weak potential signal, and noise caused by body movement, measurement errors and the like is inevitably introduced during measurement. An electroencephalogram-based automatic sleep staging model must take these problems into account. Existing models can be divided, according to their structure, into pure convolutional neural network models and hybrid convolutional-recurrent neural network models.
A pure convolutional neural network model is a neural network containing only convolutional layers, such as the U-Time model. Convolutional layers are often used to process images because of their translation invariance and locality. As a one-dimensional data sequence, an electroencephalogram signal can be regarded as a special kind of image. In the automatic sleep staging task, the model receives a continuous sequence of electroencephalogram segments of a certain length and outputs the corresponding sequence of sleep stages. As the number of convolutional layers increases, the receptive field of the model gradually expands, so features can be extracted from adjacent continuous electroencephalogram segments and the sleep transition rules are implicitly encoded. However, as mentioned above, thirty seconds of electroencephalogram data is high-dimensional, and extracting features from a continuous sequence of sleep segments places high demands on the receptive field, leading to a more complex model, slower inference and a larger number of parameters.
A hybrid convolutional-recurrent neural network model introduces a recurrent neural network after the convolutional neural network, such as TinySleepNet. The recurrent neural network allows the model to capture temporal relationships: the convolutional part extracts features from individual electroencephalogram segments, while the following recurrent part learns the transition relationships between adjacent sleep segments. Although such models reduce the receptive field requirement, the shallower convolutional part produces output features with a larger time dimension, and the introduction of a recurrent neural network increases the difficulty of training.
Disclosure of Invention
Aiming at the defects in the prior art, the invention aims to provide a single-channel electroencephalogram automatic sleep staging method based on feature reconstruction.
According to one aspect of the present invention, there is provided a single-channel electroencephalogram automatic sleep staging method based on feature reconstruction, including:
acquiring and labeling a single-channel electroencephalogram of overnight sleep to obtain a data set for training a model;
preprocessing the data set, including data standardization and data augmentation;
establishing and initializing a model, and performing unsupervised pre-training and supervised training on the model using the preprocessed data set;
and labeling sleep stages using the trained model.
Preferably, the acquiring and labeling of the overnight-sleep electroencephalogram to obtain a data set for training a model comprises: measuring the Fpz-Cz single-channel electroencephalogram with acquisition equipment, and removing segments containing noise caused by body movement of the subject as well as segments that do not belong to any sleep stage.
Preferably, the data preprocessing includes:
normalizing each whole-night sleep electroencephalogram recording to a mean of 0 and a variance of 1, or to a median of 0 and an interquartile range of 1;
and randomly selecting a number of thirty-second sleep segments in each training round and translating and inverting them to realize data augmentation.
Preferably, the built model includes:
The convolution reconstruction module adopts a second convolutional neural network comprising a forward part and a reverse part, wherein the forward part extracts features from its input and the reverse part reconstructs the output features of the first layer of the forward part;
and the global max and average pooling module reduces the dimensionality of the features output by the convolution reconstruction module and extracts time-invariant features.
Preferably, the built model further comprises:
The low-level feature extraction module adopts a first convolutional neural network to perform feature extraction and dimensionality reduction on the input electroencephalogram to obtain low-level features, which serve as the input of the convolution reconstruction module;
the recurrent neural network module adopts a first recurrent neural network, receives as inputs its own output on the previous electroencephalogram segment and the output of the global max and average pooling module, learns the transition relationship between sleep segments, and outputs the sleep staging result.
Preferably, the first convolutional neural network comprises 1 convolutional layer, 1 max-pooling layer and a dropout layer;
Preferably, the second convolutional neural network comprises 3 forward convolutional layers, 2 reverse convolutional layers and 1 max-pooling layer;
the global max and average pooling module comprises a global max pooling layer, a global average pooling layer, a concatenation layer and a dropout layer, wherein the global max pooling layer and the global average pooling layer each perform global pooling on the output of the convolution reconstruction module, and the concatenation layer concatenates the two resulting features along the channel dimension.
Preferably, the recurrent neural network module comprises 1 long short-term memory network layer and 1 linear layer;
Preferably, the unsupervised pre-training, which pre-trains the low-level feature extraction module and the convolution reconstruction module, includes:
The loss function measuring the reconstruction quality of the convolution reconstruction module is the mean squared error loss, defined as:
Lrec = (1/N) * Σi (F1,i − Ri)^2
wherein F1 denotes the output of the first forward convolutional layer, R denotes the output of the reverse convolutional layers, and N is the number of elements in the feature;
The supervised training, which trains the overall model with the classification and reconstruction losses, includes:
the loss function used adds a weighted cross-entropy classification loss Lcls, and the total loss function is set as the classification loss plus the weighted reconstruction loss, namely:
Ltotal=Lcls+α*Lrec;
where α is the weight of the reconstruction loss, set to 1e-5.
Compared with the prior art, the invention has the following beneficial effects:
According to the invention, accurate sleep staging can be performed using only a single-channel electroencephalogram signal, reducing patient discomfort and the difficulty of signal acquisition.
The invention provides a neural network model based on feature reconstruction with a small number of parameters, which better extracts the features required for sleep staging from the electroencephalogram signal and requires less training time to achieve sleep staging results comparable to or even better than those of existing methods.
Drawings
Other features, objects and advantages of the present invention will become more apparent upon reading the detailed description of non-limiting embodiments with reference to the accompanying drawings, in which:
Fig. 1 is a flowchart of the single-channel electroencephalogram automatic sleep staging method based on feature reconstruction according to an embodiment of the present invention.
Fig. 2 is an overall architecture diagram of the single-channel electroencephalogram automatic sleep staging model based on feature reconstruction in a preferred embodiment provided by the present invention.
Detailed Description
The present invention will be described in detail below with reference to specific embodiments. The following embodiments will help those skilled in the art to further understand the present invention, but do not limit the invention in any way. It should be noted that those skilled in the art can make variations and modifications without departing from the inventive concept; these all fall within the scope of the present invention.
As shown in fig. 1, the single-channel electroencephalogram automatic sleep staging method based on feature reconstruction according to an embodiment of the present invention includes:
S1, acquiring and labeling a single-channel electroencephalogram of overnight sleep to obtain a data set for training a model;
S2, preprocessing the data set, including data standardization and data augmentation;
S3, establishing and initializing a model, and performing unsupervised pre-training and supervised training on the model using the preprocessed data set;
S4, labeling sleep stages using the trained model.
The invention provides a preferred embodiment for performing S1, i.e. acquiring the single-channel electroencephalogram. The Fpz-Cz channel electroencephalogram signal of the subject over a whole night is acquired with a wearable device or an electroencephalograph and labeled with sleep stages, thereby generating a data set for model training. Alternatively, a public data set with sleep stage annotations, such as those provided by PhysioNet, may be used. Segments with large noise introduced by large body movements of the subject must be removed directly from the data set, as must segments for which the sleep cycle cannot be confirmed.
The present invention provides a preferred embodiment for performing S2, preprocessing the data set: each whole-night sleep electroencephalogram signal is normalized to aid model training. As a preferred embodiment, each whole-night electroencephalogram signal is shifted and scaled to a mean of 0 and a variance of 1, or to a median of 0 and an interquartile range of 1; the latter normalization is more robust.
In deep learning, the learning ability of a model is often positively correlated with the scale of the data. Therefore, this embodiment performs data augmentation on a portion of the thirty-second electroencephalogram segments by translation of 5% to 10% of their length (i.e., 150 to 300 sample points at a sampling rate of 100 Hz) or by inversion, thereby generating more diverse electroencephalogram data and further expanding the data volume.
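By way of illustration only, a minimal preprocessing sketch is given below. Python with NumPy is assumed (the description names no framework), and the exact forms of "translation" and "inversion" are not fixed by the text, so a circular time shift and an amplitude sign flip are used as one possible reading.

    import numpy as np

    def normalize_recording(x, robust=False):
        """Normalize one whole-night EEG recording to zero mean / unit variance,
        or (robust=True) to zero median / unit interquartile range."""
        if robust:
            med = np.median(x)
            iqr = np.percentile(x, 75) - np.percentile(x, 25)
            return (x - med) / (iqr + 1e-8)
        return (x - x.mean()) / (x.std() + 1e-8)

    def augment_segment(seg, rng, fs=100):
        """Shift a 30-second segment by 5%-10% of its length (150-300 samples at 100 Hz)
        and/or flip it; both operations are one interpretation, not prescribed by the text."""
        shift = int(rng.integers(int(0.05 * 30 * fs), int(0.10 * 30 * fs) + 1))
        if rng.random() < 0.5:
            shift = -shift
        seg = np.roll(seg, shift)   # circular time shift as one reading of "translation"
        if rng.random() < 0.5:
            seg = -seg              # amplitude inversion as one reading of "inversion"
        return seg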
To overcome the defects of the prior art, increase training speed and improve staging accuracy, the invention provides a preferred embodiment for performing S3, i.e. establishing the overall model. In this embodiment, the overall model comprises four modules:
a low-level feature extraction module, a convolution reconstruction module, a global max and average pooling module, and a recurrent neural network module.
The low-level feature extraction module adopts a first convolutional neural network to perform an initial round of feature extraction and dimensionality reduction on the input electroencephalogram, obtaining low-level features of the original signal.
The convolution reconstruction module adopts a second convolutional neural network and consists of a forward part and a reverse part. The forward part further processes the features extracted by the low-level feature extraction module to obtain higher-level, more abstract features, while the reverse part reconstructs the output features of the first layer of the forward part as faithfully as possible.
The global max and average pooling module further processes the features output by the convolution reconstruction module, extracting time-invariant features while reducing the data dimensionality.
The recurrent neural network module adopts a first recurrent neural network, receives as inputs its own output on the previous electroencephalogram segment and the output of the global max and average pooling module, learns the transition relationship between sleep segments, and finally outputs the sleep staging result.
Further, fig. 2 shows the overall structure of the automatic sleep staging model based on the deep neural network in this embodiment. As shown in the figure, the established model parameters include:
The first convolutional neural network adopted in the low-level feature extraction module comprises 1 convolutional layer, 1 max-pooling layer and a dropout layer. The convolutional layer comprises a convolution operation and a ReLU activation function, with a kernel size of 50, a stride of 6 and 128 output channels; the max-pooling layer has a size of 8 and a stride of 8; the dropout probability is 0.5.
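As a non-authoritative illustration, one possible PyTorch rendering of this module follows (the framework and the absence of padding are assumptions; only the hyper-parameters above come from the description).

    import torch.nn as nn

    class LowLevelFeatureExtractor(nn.Module):
        """First convolutional neural network: 1 conv layer + ReLU, 1 max-pooling layer, dropout."""
        def __init__(self, dropout=0.5):
            super().__init__()
            self.net = nn.Sequential(
                nn.Conv1d(1, 128, kernel_size=50, stride=6),  # kernel 50, stride 6, 128 output channels
                nn.ReLU(),
                nn.MaxPool1d(kernel_size=8, stride=8),        # pooling size 8, stride 8
                nn.Dropout(p=dropout),                        # dropout probability 0.5
            )

        def forward(self, x):        # x: (batch, 1, 3000) for one 30-second segment at 100 Hz
            return self.net(x)       # -> (batch, 128, T) low-level features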
The second convolutional neural network adopted in the convolution reconstruction module comprises 3 forward convolutional layers, 2 reverse convolutional layers and 1 max-pooling layer. Each convolutional layer comprises a convolution operation and a ReLU activation function, with a kernel size of 8, a stride of 1 and 128 output channels; the max-pooling layer has a size of 4 and a stride of 4.
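A sketch of this module is given below (PyTorch assumed). Using 'same' padding and placing the max-pooling layer after the forward branch are assumptions made so that the reconstruction target and the reverse branch's output share the same shape; the text does not fix either choice.

    import torch.nn as nn

    def _conv_block(channels=128):
        # kernel 8, stride 1, 128 channels; 'same' padding keeps the time dimension fixed (assumption)
        return nn.Sequential(
            nn.Conv1d(channels, channels, kernel_size=8, stride=1, padding="same"),
            nn.ReLU())

    class ConvReconstruction(nn.Module):
        """Second convolutional neural network: 3 forward conv layers, 2 reverse conv layers, 1 max-pooling layer."""
        def __init__(self, channels=128):
            super().__init__()
            self.fwd1, self.fwd2, self.fwd3 = _conv_block(channels), _conv_block(channels), _conv_block(channels)
            self.rev1, self.rev2 = _conv_block(channels), _conv_block(channels)
            self.pool = nn.MaxPool1d(kernel_size=4, stride=4)   # pooling size 4, stride 4

        def forward(self, x):                     # x: low-level features (batch, 128, T)
            f1 = self.fwd1(x)                     # output of the first forward layer (reconstruction target)
            f3 = self.fwd3(self.fwd2(f1))         # higher-level forward features
            recon = self.rev2(self.rev1(f3))      # reverse branch: reconstruction of f1
            return self.pool(f3), f1, recon       # pooled features, target and reconstruction for Lrec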
The global max and average pooling module comprises a global max pooling layer, a global average pooling layer, a concatenation layer and a dropout layer. The global max pooling layer and the global average pooling layer each perform global pooling on the output of the convolution reconstruction module, and the concatenation layer concatenates the two resulting features along the channel dimension.
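One way to realize this module is sketched below (PyTorch assumed; the dropout probability of this module is not given in the text, so 0.5 is used as an assumption).

    import torch
    import torch.nn as nn

    class GlobalMaxAvgPool(nn.Module):
        """Global max pooling and global average pooling over time, concatenated along channels."""
        def __init__(self, dropout=0.5):
            super().__init__()
            self.dropout = nn.Dropout(p=dropout)

        def forward(self, x):                      # x: (batch, C, T)
            max_feat = x.amax(dim=-1)              # global max pooling     -> (batch, C)
            avg_feat = x.mean(dim=-1)              # global average pooling -> (batch, C)
            return self.dropout(torch.cat([max_feat, avg_feat], dim=1))   # -> (batch, 2C)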
The first recurrent neural network adopted by the recurrent neural network module comprises 1 long short-term memory (LSTM) layer and 1 linear layer. The LSTM has 128 hidden units, the linear layer has 5 neurons, and the activation function is the Softmax function. The LSTM accepts only one-dimensional feature vectors as input, whereas the output of the convolution reconstruction module is a feature of shape C × T, where C is the number of channels and T is the time dimension. Without the global max and average pooling module, the feature would have to be flattened into a vector of length C·T before being fed into the LSTM. However, since the gates in an LSTM resemble linear layers, directly flattening the feature in this way multiplies the input dimension and causes a tremendous increase in the number of model parameters. In addition, according to the sleep staging standard, sleep stages are mainly distinguished by the presence and proportion of the brain waves corresponding to each stage rather than by their exact position within the segment, and direct flattening breaks this property. After the global max and average pooling module is added, the output feature has length 2C, the feature dimension is markedly reduced, and the number of model parameters is greatly reduced.
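The recurrent part can be sketched as follows (PyTorch assumed). The input dimension 2C = 256 follows from the 128-channel features and the pooling module above; applying the Softmax only at inference and feeding raw logits to the training loss is a common convention adopted here, not something the text prescribes.

    import torch.nn as nn

    class SleepRNN(nn.Module):
        """1 LSTM layer (128 hidden units) + 1 linear layer (5 output neurons)."""
        def __init__(self, in_dim=256, hidden=128, n_stages=5):
            super().__init__()
            self.lstm = nn.LSTM(in_dim, hidden, num_layers=1, batch_first=True)
            self.fc = nn.Linear(hidden, n_stages)

        def forward(self, feats, state=None):      # feats: (batch, seq_len, 2C)
            out, state = self.lstm(feats, state)   # `state` carries context between consecutive chunks
            return self.fc(out), state             # logits: (batch, seq_len, 5)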
The present invention provides a preferred embodiment for performing the unsupervised pre-training. Specifically, the low-level feature extraction module and the convolution reconstruction module are pre-trained with an unsupervised reconstruction loss. A reverse feature-reconstruction convolutional branch is added outside the forward convolutional neural network to aid the feature extraction of the forward structure; its goal is to reconstruct intermediate features of the forward process as well as possible. A large amount of noise is inevitably introduced during electroencephalogram acquisition, so the features close to the input of the model contain more noise; reconstructing the input electroencephalogram signal or the output of the low-level feature extraction module is therefore very difficult and may even interfere with the forward feature extraction, reducing model performance. Accordingly, the reverse convolutional layers take the output of the forward convolutional layers as input, and their reconstruction target is the output of the first forward convolutional layer. The loss function measuring the reconstruction quality is the mean squared error loss, defined as:
Lrec = (1/N) * Σi (F1,i − Ri)^2
wherein F1 denotes the output of the first forward convolutional layer, R denotes the output of the reverse convolutional layers, and N is the number of elements in the feature.
Further, before the model is formally trained with the sleep stage labels, the low-level feature extraction module and the convolution reconstruction module are pre-trained for 10 rounds without using the sleep stage labels.
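An illustrative pre-training loop is shown below. Assumptions: PyTorch, the module sketches above, a hypothetical data loader `pretrain_loader` yielding unlabeled 30-second segments, the Adam optimizer and its learning rate; whether the reconstruction target is detached from the computation graph is not specified in the text and the two branches are simply trained jointly here.

    import torch
    import torch.nn.functional as F

    def pretrain(low_level, conv_recon, pretrain_loader, epochs=10, lr=1e-3, device="cpu"):
        """Unsupervised pre-training of the low-level and convolution reconstruction modules with Lrec."""
        params = list(low_level.parameters()) + list(conv_recon.parameters())
        opt = torch.optim.Adam(params, lr=lr)
        for _ in range(epochs):                          # 10 pre-training rounds
            for x in pretrain_loader:                    # x: (batch, 1, 3000), no sleep stage labels used
                x = x.to(device)
                _, target, recon = conv_recon(low_level(x))
                loss = F.mse_loss(recon, target)         # Lrec: mean squared reconstruction error
                opt.zero_grad()
                loss.backward()
                opt.step()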
The present invention provides a preferred embodiment for performing the supervised training, i.e. training the whole model with the classification loss and the reconstruction loss in a supervised manner. Unlike the loss function used in pre-training, the loss function used in formal training adds a weighted cross-entropy classification loss Lcls; to address the class imbalance in the data set, the class weight of non-rapid eye movement sleep stage 1 is set to 1.5, while wakefulness, rapid eye movement sleep and non-rapid eye movement sleep stages 2 and 3 are weighted 1. The total loss function is set as the classification loss plus the weighted reconstruction loss, namely:
Ltotal=Lcls+α*Lrec
where α is the weight of the reconstruction loss, set to 1e-5.
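For clarity, the combined loss can be written as follows (PyTorch assumed; the stage order [wake, REM, N1, N2, N3] and the reading that only stage N1 receives the 1.5 weight are assumptions drawn from the description above).

    import torch
    import torch.nn.functional as F

    # assumed stage order: [Wake, REM, N1, N2, N3]; only N1 is up-weighted
    CLASS_WEIGHTS = torch.tensor([1.0, 1.0, 1.5, 1.0, 1.0])

    def total_loss(logits, labels, recon, recon_target, alpha=1e-5):
        """Ltotal = Lcls + alpha * Lrec, with a weighted cross-entropy classification loss."""
        l_cls = F.cross_entropy(logits, labels, weight=CLASS_WEIGHTS)   # weighted classification loss
        l_rec = F.mse_loss(recon, recon_target)                         # reconstruction loss
        return l_cls + alpha * l_rec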
Unlike a fully convolutional neural network, the long short-term memory network needs to preserve the state of its internal neurons for each sleep segment, which then serves as input for the next sleep segment. The electroencephalogram signals in each batch come from different recordings and are aligned in length by zero padding, so that the model can process EEG recordings in batches and the training process is accelerated; the loss at padded positions is not used to optimize the model, since it is meaningless for optimization. At the beginning of each batch, the internal neurons of the long short-term memory network are zero-initialized, and remain so until the electroencephalogram signals of the whole batch have been fed into the model. For example, in one round, for a data set Dn containing n whole-night electroencephalogram signals and a batch size of B, the data of each batch is DB, where the longest signal contains L thirty-second electroencephalogram segments and the remaining signals are zero-padded at the end to length L, i.e. DB has shape B × L × 3000 (thirty seconds at a 100 Hz sampling rate). A padding matrix M of shape B × L is introduced, in which each value indicates whether the corresponding position is original signal or padding: padded positions take the value 0 and original signal positions take the value 1. Considering the limitations of GPU memory and main memory, only a continuous chunk of the electroencephalogram signals can be fed in at a time; if the chunk length is l, each input of a batch has shape B × l × 3000 with a corresponding padding matrix of shape B × l. In the convolutional neural network part, this input can be processed as (B·l) × 3000; after the global max and average pooling module, the feature has shape (B·l) × 2C. The sleep staging results can thus be computed one by one along the l direction, yielding the per-segment losses Li.
The final classification loss is:
Lcls = Σ(M ⊙ Lseg) / ΣM
wherein ⊙ denotes element-wise multiplication and Lseg is the matrix of per-segment losses, so that the loss is averaged over the non-padded segments only. In this way the model realizes batch processing of EEG recordings, whose purpose is to accelerate the training process.
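A sketch of this masked loss is given below (PyTorch assumed; labels at padded positions may hold any valid class index because their losses are multiplied by zero).

    import torch.nn.functional as F

    def masked_cls_loss(logits, labels, mask, class_weights):
        """Weighted cross-entropy averaged over non-padded segments only.
        logits: (B, l, 5); labels: (B, l); mask: (B, l) float, 1 for real segments, 0 for padding."""
        per_item = F.cross_entropy(
            logits.reshape(-1, logits.size(-1)), labels.reshape(-1),
            weight=class_weights, reduction="none").reshape(labels.shape)
        return (per_item * mask).sum() / mask.sum().clamp(min=1.0)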
The neuron state of the long short-term memory network at the last sleep segment is saved and used when computing the first segment of the next adjacent chunk of length l. After the whole batch has been computed, the long short-term memory network neurons are zero-initialized again. After all batches have been trained, this step is repeated for the next round. The total number of rounds should be inversely related to the amount of training data, i.e. the more data, the fewer training rounds are needed.
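The chunk-by-chunk processing described above can be sketched as follows (PyTorch assumed). The module and function names come from the earlier sketches; detaching the LSTM state at chunk boundaries and averaging the per-chunk classification losses are simplifying assumptions, not details fixed by the text.

    def run_batch(low_level, conv_recon, pool, rnn, chunks, labels, masks, class_weights):
        """Process one batch as consecutive length-l chunks, carrying the LSTM state between them."""
        state = None                                   # zero-initialized LSTM state at the start of a batch
        losses = []
        for x, y, m in zip(chunks, labels, masks):     # x: (B, l, 3000) consecutive 30-second segments
            B, l, n = x.shape
            feats, _, _ = conv_recon(low_level(x.reshape(B * l, 1, n)))   # treat as (B*l) segments
            feats = pool(feats).reshape(B, l, -1)      # (B, l, 2C) after global max & average pooling
            logits, state = rnn(feats, state)          # the saved state links this chunk to the next
            state = tuple(s.detach() for s in state)   # truncate backpropagation at chunk boundaries
            losses.append(masked_cls_loss(logits, y, m, class_weights))
        return sum(losses) / len(losses)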
To better label sleep stages, the invention provides a preferred embodiment for performing S4. In this embodiment, when the trained model is used in practice to quickly label the sleep stages of multiple whole-night electroencephalogram recordings, the test procedure is the same as the training procedure, and shorter electroencephalogram signals are likewise zero-padded. The padded electroencephalogram data is fed into the trained model, and the Softmax layer outputs a vector p = (p1, p2, p3, p4, p5), whose components sum to 1 and respectively represent the predicted probabilities that the sleep segment belongs to wakefulness, rapid eye movement sleep, and non-rapid eye movement sleep stages one, two and three; the stage with the maximum probability is selected as the model's staging result for the segment.
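Inference can then be sketched as follows (PyTorch assumed; `model` stands for a hypothetical wrapper that chains the four modules and returns per-segment logits, and the stage order matches the assumption above).

    import torch

    @torch.no_grad()
    def predict(model, x, mask):
        """Return the predicted stage index (0=W, 1=REM, 2=N1, 3=N2, 4=N3, assumed order) per real segment."""
        logits, _ = model(x)                       # (B, L, 5) logits for the zero-padded batch
        probs = torch.softmax(logits, dim=-1)      # Softmax probabilities p1..p5 (sum to 1)
        preds = probs.argmax(dim=-1)               # stage with the maximum probability
        return [p[m.bool()] for p, m in zip(preds, mask)]   # drop padded positions per recording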
In this embodiment, only a single-channel electroencephalogram signal needs to be collected to train the model used. Compared with existing methods, the model has a small number of parameters and can achieve single-channel sleep staging performance comparable to, or even better than, that of existing methods with less training time.
The foregoing describes specific embodiments of the present invention. It is to be understood that the invention is not limited to the particular embodiments described above, and that various changes and modifications may be made by those skilled in the art within the scope of the claims without affecting the spirit of the invention. The above-described preferred features may be used in any combination without conflict.