Idea typing method based on visual evoked broadband response and brain-computer interface system
Technical Field
The invention relates to the field of brain-computer interface research, and in particular to a method and a brain-computer interface system that use a neural network decoder to decode a visual stimulus stream designed on the basis of the visual evoked broadband response, thereby realizing direct interaction between the user's intent and a keyboard, with closed-loop feedback adjustment.
Background
A brain-computer interface (Brain-Computer Interface, BCI) establishes a direct communication link between the human brain and external equipment through steps such as signal acquisition, signal processing, feature engineering and equipment control, and forms a closed biological-signal loop of brain-computer interaction. As equipment for acquiring electromagnetic signals of the human brain continues to develop towards miniaturization and portability, brain-computer interfaces have become a hot spot of international scientific research and industrial deployment. Developing more practical clinical applications, more accurate recognition algorithms and more portable signal acquisition equipment for brain-computer interfaces has become an important area in which countries compete in brain science.
Among the many applications of brain-computer interfaces, idea typing has been studied extensively. Specifically, the user's typing intent is identified by decoding electromagnetic signals of the human brain and is fed back to a screen, forming a decoding-feedback closed loop. Idea typing does not decode intent from the electromagnetic signals of the brain alone; rather, it decodes the typing intent from the interaction between the brain and an on-screen keyboard, relying on established principles of cognitive neuroscience.
Traditional idea typing designs adopt two paradigms:
The first is the idea typing paradigm based on the P300 event-related potential (Event-Related Potential, ERP). P300 is a positive waveform of relatively large amplitude that appears in the electroencephalogram/magnetoencephalogram signal about 300 ms after stimulus onset, and is related to the brain's processing of novel or unexpected stimuli. Based on this principle, the classical P300 idea typing paradigm is designed as follows. A 6 × 6 grid is presented on the screen, with one letter or number in each cell. The cells of each row and each column are lit in turn while the electroencephalogram/magnetoencephalogram signals are recorded. When the character the user has in mind is lit, a larger P300 component is evoked in the signals. By lighting the rows and columns in turn and comparing the amplitudes of the P300 component, the character in the user's mind can be located. The method is simple to operate, but it has several drawbacks: the nature of the experimental paradigm leads to a low information transmission rate; the signal-to-noise ratio is low because the electromagnetic signals of the brain are decoded at the single-trial level; and P300 is an endogenous response to the stimulus, so the rich visual attributes of the stimulus are not fully utilized.
The second is the idea typing paradigm based on steady-state visual evoked potentials (Steady-State Visual Evoked Potentials, SSVEP). When a person looks at a visual stimulus flickering at a fixed frequency, the visual cortex produces a continuous response related to the stimulus frequency (at the fundamental frequency of the stimulus or its multiples), namely the SSVEP. One essential difference between the SSVEP and the P300 potential is that the former is an exogenous visual evoked potential. The SSVEP response can therefore be controlled by changing the visual properties of the external stimulus, and the individual letters of the on-screen keyboard can be encoded with different flicker frequencies to achieve idea typing. This paradigm is currently widely used in brain-computer interfaces, but it has considerable limitations.
This paradigm encodes each letter on the keyboard by a periodic luminance change of the visual stimulus, and extracts and identifies the SSVEP signal through Fourier analysis and wavelet analysis. Its limitations are as follows. First, the choice of SSVEP frequencies is limited by the display refresh rate: the refresh rate of computer displays used in practice is low (60 Hz), and the refresh rate must be an integer multiple of the SSVEP frequency, otherwise dropped frames occur and affect subsequent signal processing. Second, the response of the human visual cortex to periodic flicker has a high signal-to-noise ratio only in the 11-19 Hz range, which is relatively narrow and limits the number of letters on the keyboard. Third, the paradigm requires spectral analysis over a period of electroencephalogram/magnetoencephalogram signal and does not fully utilize the time-domain information of the signal, which increases the delay in identifying characters and reduces recognition speed. Fourth, crosstalk easily arises between the different frequencies assigned to the letters on the keyboard, causing recognition failures. Fifth, in practical operation, stimulation below 15 Hz gives the user a strong sensation of flicker, which reduces comfort to some extent. Sixth, conventional SSVEP data analysis uses only basic signal processing techniques such as Fourier analysis and wavelet analysis, and fails to make full use of artificial intelligence techniques in the brain-computer interface.
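The refresh-rate constraint can be made concrete with a small sketch. It assumes, as the text states, that a flicker frequency is usable only when the display refresh rate is an integer multiple of it, and takes the 60 Hz display and the 11-19 Hz band from the text. The function name is hypothetical and serves only as an illustration.

```python
def usable_ssvep_frequencies(refresh_hz, low_hz, high_hz):
    """Flicker frequencies f with refresh_hz = k * f (k an integer) inside [low_hz, high_hz]."""
    usable = []
    # frames_per_cycle is the number of display frames in one flicker cycle
    for frames_per_cycle in range(1, refresh_hz + 1):
        freq = refresh_hz / frames_per_cycle
        if low_hz <= freq <= high_hz:
            usable.append(freq)
    return usable


print(usable_ssvep_frequencies(60, 11, 19))  # [15.0, 12.0]
```

Under this strict assumption only 12 Hz and 15 Hz remain, far fewer than the number of keys on a keyboard, which illustrates the first two limitations listed above.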
In addition to these two idea typing paradigms and their variations, there are idea typing brain-computer interface paradigms designed on principles such as motor imagery (Motor Imagery), but because of their limitations in speed and accuracy they are not widely used in the market and are not described further here.
In summary, the field of brain-computer interface idea typing needs a new design paradigm that overcomes, as far as possible, the poor signal-to-noise ratio, low information transmission rate and poor user experience of the traditional paradigms, and that brings the latest achievements at the frontier of artificial intelligence to the design of idea typing brain-computer interfaces.
Disclosure of Invention
To overcome the problems in the prior art, the inventors provide an idea typing method and a brain-computer interface interaction system based on the visual evoked broadband response (Visual Evoked Spread-Spectrum Response, VESR), which can extract and process multichannel, high-throughput electromagnetic signals of whole-brain neural activity in real time, and decode and feed back the user's idea typing process quickly and accurately.
The technical scheme adopted by the invention is as follows:
An idea typing method based on visual evoked broadband response, comprising the following steps:
displaying on a display device a keyboard for idea typing, each character on the keyboard having two opposite presentation states;
randomly generating, for each character on the keyboard, a luminance sequence corresponding to the two opposite presentation states, each character switching between the two opposite presentation states according to its own luminance sequence;
having a user observe the keyboard while collecting and preprocessing the electroencephalogram or magnetoencephalogram signals of the user;
inputting the preprocessed electroencephalogram or magnetoencephalogram signals to a decoder based on a convolutional neural network for processing, outputting a predicted stimulation sequence, comparing the predicted stimulation sequence with the luminance sequences of all characters on the keyboard, and taking the character whose luminance sequence is most similar as the character predicted by the decoder.
Preferably, the two opposite presentation states are either a bright state and a dark state or a display and a disappearance state.
Preferably, the luminance sequence is generated by random sampling by a pseudo-random algorithm or the like, and is represented as a 0-1 sequence.
Preferably, the generated luminance sequence is converted into frequency domain information by fourier transform, the power of each frequency band is normalized in the frequency domain, and then the luminance sequence for stimulation is converted back by inverse fourier transform.
Preferably, the preprocessing comprises removing power-line noise from, band-pass filtering and downsampling the electroencephalogram or magnetoencephalogram signals of each channel.
Preferably, the high-frequency cutoff of the band-pass filter equals the refresh rate of the display device, and the sampling rate after downsampling is an integer multiple of the refresh rate of the display device.
Preferably, the predicted stimulation sequence is compared with the luminance sequences of all characters on the keyboard by calculating the Pearson correlation coefficient between the predicted stimulation sequence and the luminance sequence of each character, and the luminance sequence with the largest correlation coefficient is taken as the most relevant one.
An idea typing brain-computer interface system based on visual evoked broadband response, comprising:
a stimulus presentation system for displaying on a display device a keyboard for idea typing, each character on the keyboard having two opposite presentation states, and for randomly generating, for each character on the keyboard, a luminance sequence corresponding to said two opposite presentation states, each character switching between said two opposite presentation states according to its own luminance sequence;
signal acquisition equipment for collecting the electroencephalogram or magnetoencephalogram signals of a user;
a real-time analysis workstation for preprocessing the collected electroencephalogram or magnetoencephalogram signals, inputting the preprocessed signals into a decoder based on a convolutional neural network for processing, outputting a predicted stimulation sequence, comparing the predicted stimulation sequence with the luminance sequences of all characters on the keyboard, and taking the character whose luminance sequence is most relevant as the character predicted by the decoder;
an external feedback system for feeding back the predicted character to the stimulus presentation system.
Preferably, the display device of the stimulus presentation system is a liquid crystal display screen or an LED array whose brightness and flicker frequency can be modulated, the signal acquisition equipment is a magnetoencephalography cap, and the external feedback system comprises one or more of a feedback circuit, a voice synthesizer, a screen and a prosthesis.
Preferably, the real-time analysis workstation employs GPU-accelerated optimization.
The invention has the advantages that:
1) The stimulus modulated by a broadband luminance sequence appears on the screen as flicker whose luminance varies over time like random white noise. Such stimulation makes it possible to detect the brain's response to stimulus changes in all frequency bands simultaneously, without being limited to a particular frequency.
2) The stimulus presentation is no longer limited by the refresh rate of the display: the brightness of each letter is refreshed according to its coded sequence, so the visual attributes of the stimulus are fully utilized, the electroencephalogram/magnetoencephalogram signal contains more information related to the stimulus code, and intent recognition by the artificial intelligence decoder is facilitated.
3) The maximum change frequency of the stimulation sequence is 60 Hz, which exceeds the human eye's perception threshold for flicker. In practical use, the user perceives far weaker screen flicker than with traditional SSVEP-based stimulation sequence designs, and the user experience is good.
4) The artificial intelligence decoder can decode, frame by frame and in real time, the luminance sequence of the letter the user has in mind. In actual operation, the user's typing intent can be decoded from 250 ms of data after stimulus presentation, so the information transmission rate is high, the delay is small and the decoding accuracy is high.
Drawings
FIG. 1 is a schematic diagram of an idea typing brain-computer interface system based on visual evoked broadband response according to the present invention;
FIG. 2 is a schematic diagram of a keyboard displayed by a stimulus presentation system of the present invention;
FIG. 3 is a schematic diagram of a luminance sequence for stimulation generated by a stimulus presentation system according to the present invention;
FIG. 4 is a flow chart of an idea typing method based on visual evoked broadband response according to the present invention;
FIG. 5 is a schematic diagram of the decoding effect of an idea typing brain-computer interface system based on visual evoked broadband response according to the present invention.
Detailed Description
In the following description, the brain-computer interface system of the present invention is further described in terms of specific embodiments to facilitate a more thorough understanding of the features and advantages of the present invention by those skilled in the art. It should be noted that the following description is merely representative of one exemplary application. It will be apparent that the invention is not limited to any particular structure, function, device, and method described herein, but may have other embodiments, or combinations of other embodiments. The software/hardware modules depicted in the present invention or as shown in the accompanying drawings may also be flexibly adapted as desired.
The present embodiment provides an idea typing brain-computer interface system based on visual evoked broadband response, the hardware configuration of which is shown in FIG. 1 and described in detail below.
The stimulus presentation system comprises a display device for presenting a keyboard for idea typing, wherein the brightness of each character on the keyboard is modulated by a random broadband luminance sequence. The display device of the stimulus presentation system may be a liquid crystal display screen or an LED array whose brightness and flicker frequency can be modulated.
The signal acquisition equipment may specifically be a magnetoencephalography cap worn on the head of the user, which acquires the magnetoencephalogram signals of the user in real time and transmits them to the real-time analysis workstation.
The real-time analysis workstation preprocesses the received electroencephalogram/magnetoencephalogram signals. A signal decoder based on a convolutional neural network is built into the workstation to decode the user's intent in real time. The workstation adopts GPU acceleration, so that artificial-intelligence-based analysis of the electroencephalogram/magnetoencephalogram signals and real-time decoding of the user's intent can be realized.
The external feedback system feeds back the user's intent to the stimulus presentation system according to the decoding result. It comprises all external devices that receive the result of the real-time analysis workstation's decoding of the electromagnetic signals of the brain, such as a voice synthesizer, a screen and a prosthesis.
The processing performed by the system is roughly divided into two parts: first, the stimulus presentation equipment provides luminance sequences for stimulation based on the visual evoked broadband response; second, the real-time analysis workstation preprocesses and decodes in real time the received electroencephalogram/magnetoencephalogram signals. The whole process is shown in FIG. 4 and is described in detail below.
S1, a keyboard for idea typing (shown in FIG. 2) is arranged on a display device, with each character presented in a square. Each character has two presentation states, "bright" and "dark" (or "displayed" and "hidden"). For convenience of description, the "bright" or "displayed" state is recorded as 1, and the "dark" or "hidden" state is recorded as 0.
S2, a random 0-1 luminance sequence is generated for each character on the keyboard by a pseudo-random algorithm, for example by repeatedly sampling from the values 0 and 1. Preferably, the luminance sequences of all characters are independent of each other; for example, by exhaustive search or another suitable method, the sequences are chosen so that the correlation coefficient between the luminance sequences of any two characters is less than 0.01 and the significance level (p-value) of the correlation is greater than 0.05, meaning that there is no significant correlation between the luminance sequences and they can be regarded as mutually independent. Fig. 2 shows a possible stimulus sequence within 1 s, where the keyboard has 38 keys in total and the screen refresh rate is 60 Hz, so the screen is refreshed for 60 frames within 1 s and each frame corresponds to one presentation state of a character. Preferably, after a random luminance sequence is generated, it is converted into the frequency domain by Fourier transform, the power of each frequency band is normalized in the frequency domain, and the result is converted back by inverse Fourier transform into the luminance sequence used for stimulation. This ensures that the luminance sequence used for stimulation has equal energy at the different frequencies and is therefore broadband, as shown in Fig. 3.
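Step S2 can be sketched in a few lines, assuming NumPy. The function names are hypothetical, and two details are assumptions because the text does not specify them: "normalizing the power of each frequency band" is read as giving every frequency bin unit magnitude while keeping its phase, and the real-valued result of the inverse transform is linearly rescaled to the range [0, 1] so that it can be shown as a brightness.

```python
import numpy as np


def random_binary_sequences(n_chars, n_frames, seed=0):
    """One pseudo-random 0-1 luminance sequence per character (one row each)."""
    rng = np.random.default_rng(seed)
    return rng.integers(0, 2, size=(n_chars, n_frames))


def flatten_spectrum(seq):
    """Give every frequency bin unit magnitude while keeping its phase."""
    spectrum = np.fft.rfft(seq)
    magnitude = np.abs(spectrum)
    # A bin that is exactly zero has no phase; assign it phase 0 (assumption).
    safe = np.where(magnitude > 0, magnitude, 1.0)
    unit = np.where(magnitude > 0, spectrum / safe, 1.0)
    return np.fft.irfft(unit, n=len(seq))


def to_luminance(seq):
    """Rescale a real-valued sequence to [0, 1] for display (assumption)."""
    return (seq - seq.min()) / (seq.max() - seq.min())


# 38 keys and 60 frames (1 s at 60 Hz), as in the example in the text
codes = random_binary_sequences(n_chars=38, n_frames=60)
flat = flatten_spectrum(codes[0])
stimulus = to_luminance(flat)

# Largest absolute pairwise correlation between different characters
max_abs_corr = np.abs(np.corrcoef(codes) - np.eye(len(codes))).max()
print(max_abs_corr)
```

The printed value can be compared with whatever independence threshold is chosen; sequences that fail the check would be redrawn, in line with the search mentioned in the text.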
S3, the user observes the keyboard, and the collected electroencephalogram/magnetoencephalogram signals are preprocessed. For the signal of each channel, the basic steps are removing 50 Hz power-line noise, band-pass filtering (low cutoff: 1 Hz, high cutoff: 60 Hz) and downsampling to 600 Hz. These frequency parameters may be adjusted according to the practical situation, with the high-frequency cutoff of the band-pass filter equal to the display refresh rate and the sampling rate after downsampling an integer multiple of the display refresh rate (e.g. 600 Hz); such settings generally achieve good results in practical applications.
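The preprocessing chain of step S3 might look as follows, assuming NumPy and SciPy. The 50 Hz notch, the 1-60 Hz pass band and the 600 Hz target rate come from the text; the acquisition rate of 1000 Hz, the notch quality factor, the filter order and the use of zero-phase filtering are assumptions made only for this sketch.

```python
import numpy as np
from scipy import signal


def preprocess(raw, fs_in, fs_out=600, line_hz=50.0, band=(1.0, 60.0)):
    """raw: array (n_channels, n_samples) sampled at fs_in Hz (integer)."""
    # 1) Remove power-line noise with a narrow notch filter (Q is an assumption)
    b, a = signal.iirnotch(line_hz, Q=30.0, fs=fs_in)
    x = signal.filtfilt(b, a, raw, axis=-1)
    # 2) Band-pass filter; the upper edge equals the display refresh rate
    sos = signal.butter(4, band, btype="bandpass", fs=fs_in, output="sos")
    x = signal.sosfiltfilt(sos, x, axis=-1)
    # 3) Resample to fs_out, an integer multiple of the refresh rate
    return signal.resample_poly(x, up=fs_out, down=fs_in, axis=-1)


# 25 channels, 5 s of synthetic data at a hypothetical 1000 Hz acquisition rate
raw = np.random.default_rng(0).standard_normal((25, 5000))
clean = preprocess(raw, fs_in=1000)
print(clean.shape)  # (25, 3000)
```

With a 600 Hz output rate and a 60 Hz display, each display frame corresponds to exactly 10 signal samples, which keeps the signal and the stimulation sequence aligned.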
S4, a decoder based on a convolutional neural network is trained and validated with the preprocessed electroencephalogram/magnetoencephalogram signals as input and the stimulation sequences as output. Specifically, in the test stage, the electroencephalogram/magnetoencephalogram signals of the user are passed into the decoder, which outputs a predicted stimulation sequence. The Pearson correlation coefficient is then calculated between the predicted stimulation sequence and the real luminance sequence of each character, and the character whose luminance sequence has the largest correlation coefficient is taken as the character predicted by the decoder, as shown in FIG. 5.
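The matching stage of step S4 can be illustrated as follows, assuming NumPy. The convolutional network itself is not reproduced; its output is simulated here as the luminance sequence of one character plus noise, and all names and the small 8-character keyboard are hypothetical.

```python
import numpy as np


def decode_character(predicted, luminance, characters):
    """Return the character whose luminance sequence correlates best with the prediction."""
    predicted = np.asarray(predicted, dtype=float)
    luminance = np.asarray(luminance, dtype=float)
    # Pearson r between the prediction and every row of `luminance`
    p = predicted - predicted.mean()
    centered = luminance - luminance.mean(axis=1, keepdims=True)
    r = (centered @ p) / (np.linalg.norm(centered, axis=1) * np.linalg.norm(p))
    return characters[int(np.argmax(r))], r


rng = np.random.default_rng(1)
characters = list("ABCDEFGH")
luminance = rng.integers(0, 2, size=(len(characters), 150)).astype(float)
# Simulated decoder output: the sequence of "C" plus Gaussian noise
predicted = luminance[2] + 0.3 * rng.standard_normal(150)
char, r = decode_character(predicted, luminance, characters)
print(char)
```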
The following is a specific application example. The system must be trained before use and can be used normally once training is complete, so operation is generally divided into a training phase and a practical application phase.
Training phase
The head movement of the user is recorded in real time through position coils on the magnetoencephalography cap, so that the influence of head movement on the magnetoencephalogram signal can be corrected and compensated.
An idea typing keyboard modulated by broadband luminance sequences is presented on the stimulus presentation system together with a fixation point. The fixation point is presented at the centre of each key in turn: within each key it is shown for 5 s and then disappears for 1.5 s, after which it is presented in the next key. During this process the user must keep looking at the fixation point, and must complete any blinking and quickly move the gaze to the next key within the 1.5 s in which the fixation point is absent. One traversal of all keys by the fixation point constitutes one training run, and the run is repeated 4 times.
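As a worked example of the schedule just described, the total calibration time follows directly from the numbers in the text. The sketch below takes the 38-key keyboard of this embodiment; the function name is hypothetical.

```python
def calibration_seconds(n_keys, fixation_s, gap_s, n_repeats):
    """Total duration of the fixation-point traversal over all repetitions."""
    return n_keys * (fixation_s + gap_s) * n_repeats


total = calibration_seconds(n_keys=38, fixation_s=5.0, gap_s=1.5, n_repeats=4)
print(total, round(total / 60, 1))  # 988.0 seconds, about 16.5 minutes
```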
The magnetoencephalography system actually has 306 channels; to reduce computational cost, data from 25 magnetometers located over the occipital lobe may be selected for training. Taking as an example 25 magnetoencephalography channels, a presentation time of 5 s per key, 38 keyboard characters, 4 training repetitions, a decoding time window of 0.25 s, a screen refresh rate of 60 Hz and a magnetoencephalogram sampling rate of 600 Hz, one obtains 25 × (5 × 600) × 38 × 4 data points and a (5 × 60) × 38 × 4 stimulation sequence matrix. After preprocessing, the magnetoencephalogram signals are reorganized with a 0.25 s time window into a matrix of size 25 × (5 × 600 × 38 × 4) × (0.25 × 600), and the stimulation sequence is up-sampled to a matrix of size 5 × 600 × 38 × 4. The former is used as the input and the latter as the output for training the convolutional neural network. During training, cross-validation can be used to check the performance of the neural network.
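The matrix sizes above can be checked with a short sketch, assuming NumPy. Two details are assumptions, since the text does not specify them: each window starts at the sample whose stimulus value it predicts, and the end of each trial is zero-padded so that every sample gets a full window. The function names are hypothetical.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def make_windows(data, window):
    """data: (channels, samples) -> (channels, samples, window), one window per sample."""
    # Assumption: zero-pad the end so the last samples still get a full window.
    padded = np.pad(data, ((0, 0), (0, window - 1)))
    return sliding_window_view(padded, window, axis=-1)


def upsample_stimulus(frames, factor):
    """Repeat each display frame `factor` times (60 Hz -> 600 Hz with factor 10)."""
    return np.repeat(frames, factor)


# Sizes for the example in the text
channels, fs, refresh, seconds, keys, repeats = 25, 600, 60, 5, 38, 4
window = int(0.25 * fs)                     # 150 samples
n_samples = seconds * fs * keys * repeats   # 456000
print((channels, n_samples, window))        # (25, 456000, 150)

# Demonstration on a single synthetic 5 s trial
trial = np.random.default_rng(0).standard_normal((channels, seconds * fs))
X = make_windows(trial, window)             # shape (25, 3000, 150)
frames = np.random.default_rng(1).integers(0, 2, seconds * refresh)
y = upsample_stimulus(frames, fs // refresh)  # shape (3000,)
```

Because `sliding_window_view` returns a view rather than a copy, the windowed array does not multiply memory use by the window length; a materialized copy of the full training matrix would be very large.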
The purpose of the training phase is to provide training data to the artificial intelligence decoder so that the decoder learns parameters appropriate for the particular user. Once trained, the decoder can be used directly for that user's idea typing.
Practical application phase
In actual use, a user who wants to type a letter looks at that letter on the screen. If the decoding time window is set to 0.25 s, the minimum time for which the user must look at a letter is 0.5 s. After preprocessing, the user's electroencephalogram/magnetoencephalogram signals are passed into the trained neural network to obtain a prediction of the stimulation pattern in the period from 0.25 s to 0.5 s. With a screen refresh rate of 60 Hz, this period spans 0.25 × 60 = 15 display frames; at the 600 Hz rate to which the stimulation sequence is up-sampled for training, the decoder outputs 0.25 × 600 = 150 predicted luminance values. The Pearson correlation (or an AND operation) is then computed between the predicted sequence and the luminance sequence of each letter over 0.25 s to 0.5 s, and the character with the largest correlation coefficient (or the largest AND-operation sum) is taken as the predicted character in the user's mind.
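The AND-operation alternative mentioned above can be sketched in plain Python. The text does not say how the real-valued prediction is binarized, so thresholding at 0.5 is an assumption; the sequences and the decoder output below are made-up illustrative values.

```python
def and_score(predicted, luminance, threshold=0.5):
    """Count samples where the binarized prediction and the 0-1 luminance are both 1."""
    return sum(int(p > threshold) & int(s) for p, s in zip(predicted, luminance))


def decode_by_and(predicted, sequences):
    """sequences: dict mapping character -> 0-1 luminance list; returns the best character."""
    return max(sequences, key=lambda ch: and_score(predicted, sequences[ch]))


sequences = {
    "A": [1, 0, 1, 1, 0, 0, 1, 0],
    "B": [0, 1, 0, 0, 1, 1, 0, 1],
    "C": [1, 1, 0, 0, 1, 0, 0, 1],
}
predicted = [0.9, 0.2, 0.8, 0.7, 0.1, 0.3, 0.6, 0.4]  # hypothetical decoder output
print(decode_by_and(predicted, sequences))  # A
```

Because the AND sum only counts coinciding bright samples, it is comparable across characters mainly when their sequences contain similar numbers of bright frames; the Pearson coefficient does not have this dependence.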
At present, improved SSVEP-based idea typing techniques on the market can reach an accuracy (accuracy, ACC) of 100%, with an information transfer rate (information transfer rate, ITR) of 150 to 200 bits/min. In practical application of the present invention, the idea typing accuracy can reach 100% (with individual differences between users) and the information transfer rate can reach 210 bits/min, giving advantages in both speed and accuracy over the idea typing paradigms common on the market.
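For orientation, the sketch below computes an information transfer rate with the Wolpaw formula that is commonly used for brain-computer interface spellers. The text does not state which ITR definition or which per-character selection time underlies its figures, so both the formula choice and the 1.5 s selection time are assumptions for illustration only.

```python
import math


def itr_bits_per_min(n_targets, accuracy, seconds_per_selection):
    """Wolpaw information transfer rate in bits per minute."""
    p, n = accuracy, n_targets
    bits = math.log2(n)
    if p > 0:
        bits += p * math.log2(p)
    if p < 1:
        bits += (1 - p) * math.log2((1 - p) / (n - 1))
    return bits * 60.0 / seconds_per_selection


# 38 keys and 100% accuracy from the text; 1.5 s per selection is hypothetical
print(round(itr_bits_per_min(38, 1.0, 1.5), 1))  # 209.9
```

Under these assumptions, a 38-key keyboard at 100% accuracy carries about 5.25 bits per selection, so a rate near 210 bits/min would correspond to roughly 1.5 s per character including any pauses between selections.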
The above embodiments are merely illustrative of the principles of the present invention and its effectiveness, and are not intended to limit the invention. Modifications and variations may be made to the above-described embodiments by those skilled in the art without departing from the spirit and scope of the invention. Accordingly, the scope of the invention is to be indicated by the appended claims.