PEER-REVIEWED

Electroencephalography (EEG) based epilepsy diagnosis via multiple feature space fusion using shared hidden space-driven multi-view learning


1 Department of Electronics and Information Engineering, Bozhou University, Bozhou, Anhui, China
2 School of Computer Science and Engineering, Xi’an University of Technology, Xi’an, Shanxi, China
3 Department of Biomedical Engineering, Universiti Malaya, Kuala Lumpur, Malaysia
4 Department of Medical Informatics, Nantong University, Nantong, Jiangsu, China
DOI: 10.7717/peerj-cs.1874
Subject Areas: Bioinformatics, Artificial Intelligence, Data Mining and Machine Learning, Data Science, Databases
Keywords: Multi-view learning, EEG, Epilepsy, Shared hidden space
Copyright: © 2024 Hu et al.
Licence: This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, reproduction and adaptation in any medium and for any purpose provided that it is properly attributed. For attribution, the original author(s), title, publication source (PeerJ Computer Science) and either DOI or URL of the article must be cited.
Cite this article: Hu X, Xie Y, Zhao H, Sheng G, Lai KW, Zhang Y. 2024. Electroencephalography (EEG) based epilepsy diagnosis via multiple feature space fusion using shared hidden space-driven multi-view learning. PeerJ Computer Science 10:e1874
The authors have chosen to make the review history of this article public.

Abstract

Epilepsy is a chronic, non-communicable disease caused by paroxysmal abnormal synchronized electrical activity of brain neurons, and is one of the most common neurological diseases worldwide. Electroencephalography (EEG) is currently a crucial tool for epilepsy diagnosis. With the development of artificial intelligence, multi-view learning-based EEG analysis has become an important method for automatic epilepsy recognition because EEG contains different types of features, such as time-frequency features, frequency-domain features and time-domain features. However, current multi-view learning still faces some challenges, such as the fact that the difference between samples of the same class from different views can be greater than the difference between samples of different classes from the same view. In view of this, we propose a shared hidden space-driven multi-view learning algorithm. The algorithm uses kernel density estimation to construct a shared hidden space and combines the shared hidden space with the original space to obtain an expanded space for multi-view learning. By constructing the expanded space and utilizing the information of both the shared hidden space and the original space for learning, the relevant information of samples within and across views can be fully utilized. Experimental results on an epilepsy dataset provided by the University of Bonn show that the proposed algorithm has promising performance, with an average classification accuracy value of 0.9787, which achieves at least 4% improvement compared to single-view methods.

Introduction

Epilepsy is a chronic, non-infectious but genetic disease that affects all ages and is caused by paroxysmal abnormal hypersynchrony of brain neurons. It is one of the most common neurological diseases globally. Due to the diversity and complexity of its clinical manifestations, epilepsy is often misdiagnosed or missed. Repetitive seizures can have a persistent negative impact on the patient’s mental and cognitive functions, and can even be life-threatening. Therefore, the study of epilepsy diagnosis and treatment has important clinical significance. The electroencephalogram (EEG) is a microvolt-level electrical signal generated by synchronized neurons in the brain and recorded by electrodes placed at specific locations on the scalp. As the most commonly used and cheapest non-invasive method for detecting brain waves, EEG has a research history of over 70 years and is the most effective method for diagnosing epilepsy-related diseases, for example by identifying seizures, predicting their occurrence, and localizing the affected areas.

With the development of artificial intelligence, machine learning models are extensively used in automatic epilepsy recognition. Feature representation is a crucial step in machine learning. Research has indicated that EEG signals can be represented by both linear and non-linear features. Time-domain features are the fundamental features in EEG signal processing, primarily extracted by directly observing and calculating relevant characteristics from the raw signal. Their advantages lie in their simplicity of computation and ease of interpretation. However, the non-stationarity of EEG signals, individual differences, and external interference can easily affect time-domain features. Frequency-domain features are based on the significant changes in EEG energy during epileptic seizures, assuming that the background EEG is approximately stationary. Most frequency-domain features are derived from the signal power spectrum, and various parameter estimation methods can be used to extract spectral features. The accuracy of these parameters also affects the quality of frequency-domain features. In terms of the amount of information they contain, neither pure time-domain features nor pure frequency-domain features can comprehensively characterize an EEG signal. Additionally, EEG analysis based on the assumption of stationarity is not rigorous. Therefore, researchers have turned their attention to time-frequency analysis methods, such as time-frequency transformations, to re-represent non-stationary EEG signals and extract corresponding features. In addition to the aforementioned linear features, many studies also consider the brain as a nonlinear system and extract corresponding nonlinear features from descriptions of complexity, persistence, synchrony, and other changes in the system. These features are not affected by the non-stationarity of EEG signals and offer more flexibility in dealing with issues such as multi-channel correlation and channel loss.

Based on the aforementioned linear or nonlinear feature representations, numerous scholars have constructed machine learning models for the automatic diagnosis of epilepsy. For example, Li, Chen & Zhang (2016) employed a dual-tree complex discrete wavelet transform to extract nonlinear features from individual components. They used an ANOVA analysis to select relevant classification features, including the Hurst parameter and fuzzy entropy, and employed a support vector machine (SVM) for the classification task. Reddy & Rao (2017) computed the central correlated entropy of wavelet components obtained from a tunable Q-factor wavelet transform, and utilized models such as RF, LR, and multi-layer perceptron for epileptic signal recognition. Jaiswal & Banka (2017) proposed a feature extraction method called local gradient pattern transformation and applied classification methods such as k-nearest neighbors, SVM, and decision trees for epilepsy detection.

The aforementioned machine learning-based epilepsy diagnostic models use a single EEG feature representation, and therefore have low model complexity and high interpretability. However, these models rely on expert knowledge, and deep features are not easily observed and extracted. As a result, their accuracy is limited. Multi-view learning (Zhao et al., 2017; Jiang et al., 2020; Zhang, Chung & Wang, 2018; Yan et al., 2021) improves the classification accuracy of models by utilizing the differences and similarities between multiple views based on the principles of view consistency and complementarity. For example, Tian et al. (2019) utilized a convolutional neural network (CNN) model to extract deep features from EEG signals in the time domain, frequency domain, and time-frequency domain. These features were constructed as three views, and multi-view learning was conducted using a multi-view Takagi-Sugeno-Kang (TSK) fuzzy system, which improved the classification and detection performance compared to a single view. Yuan et al. (2018) implemented multi-view automatic epilepsy diagnosis by utilizing channel characteristics and intra-channel time-frequency features of multi-channel EEG signals extracted using an autoencoder (AE) through channel perception technology. Liu & Li (2019) utilized a user-sensitive model for channel selection and extracted time-frequency features from each sub-band of the selected channels, forming multi-view features. They extracted numerical and morphological features using a common spatial projection matrix and utilized a maximum average difference autoencoder to extract inter-channel time-frequency domain features, enabling automatic diagnosis of epilepsy with multiple views. These effective models based on collaborative regularization can construct a common feature space for multi-view learning. However, they also have certain limitations. First, they construct the density distribution of each view solely from the data observed in that view and overlook the correlated information among all views. Second, they separate the original sample space from the common space obtained through mapping and use only the common space for learning, neglecting the discriminative information present in the original space.

To overcome these shortcomings, this study constructs a shared hidden feature space using kernel density estimation and combines it with the original space to form an expanded space. An SVM is then introduced, and a multi-view SVM based on the shared hidden space is proposed that takes into account both the differences and the relationships between samples from different views. Experiments on different multi-view datasets confirm the effectiveness of the method in addressing the challenges mentioned above. The contributions of this study are mainly reflected in the following aspects:

(1) The kernel density estimation (KDE) technique is used to construct a new shared hidden space, and it is combined with the original space to construct an expanded space for multi-view learning, thus being able to effectively address the special issue mentioned above on multi-view learning.

(2) By learning in the expanded space, the model uses the information of both the shared hidden space and the original space and thus fully exploits the relevant information of samples within and across views. This effectively addresses the problem that the difference between samples of the same class from different views can be greater than the difference between samples of different classes from the same view.

(3) During the optimization phase, the proposed model is transformed into a classical quadratic programming (QP) problem, so that existing QP solvers, which are efficient and come with theoretical guarantees, can be applied directly.

The remainder of this article is organized as follows. In ‘Data’, we introduce the EEG data used in this study and the corresponding multiple feature space representation. In ‘Methodology’, we present the proposed model. In ‘Experimental studies’, experimental results are reported, and the last section summarizes the whole study.

Data

The EEG data of epileptic patients used in this study was authorized and provided by the University of Bonn in Germany (Andrzejak et al., 2001), as shown in Table 1. The dataset is divided into five groups, namely A, B, C, D, and E. Each group contains 100 single-channel EEG segments lasting 23.6 s, with a sampling rate of 173.6 Hz. The EEG signals of groups A and B were collected from healthy volunteers in a relaxed and conscious state; the volunteers’ eyes were open during the data collection of group A and closed during the data collection of group B. The signals of the remaining three groups were collected from epileptic volunteers, with group C’s signals collected from the hippocampi of the two brain hemispheres and group D’s signals collected from the epileptic foci. The signals of groups C and D were measured during periods without epileptic seizures, while the signals of group E were recorded during epileptic seizures. Figure 1 provides an example of EEG signals from the five groups.

Table 1: Basic collection information of epilepsy EEG signals.

| Group | #Volunteers | Collection information |
| --- | --- | --- |
| A | 100 | This group was collected from a group of healthy volunteers who were instructed to keep their eyes open during the recording process. These volunteers did not have any known neurological or psychiatric disorders and were not experiencing any abnormal symptoms at the time of data collection. |
| B | 100 | This group was collected from a group of healthy volunteers under conditions where they kept their eyes closed. |
| C | 100 | This group was collected from the hippocampal formation of the contralateral hemisphere of the brain during seizure-free intervals. These samples were obtained when the patient was not experiencing any epileptic seizures. |
| D | 100 | This group was collected from the epileptogenic zone during periods of seizure freedom. This implies that the recordings were obtained when the patient was not experiencing seizures. |
| E | 100 | This group was collected during the seizure activity phase, which makes it possible to study the temporal dynamics of epileptic seizures and supports the development of more accurate and reliable seizure detection and prediction algorithms. |
Figure 1: EEG signals from five groups.
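As a quick sanity check on the recording parameters quoted above, the number of samples per segment follows directly from the segment duration and the sampling rate. The snippet below is a minimal sketch of that arithmetic; the variable names are illustrative and do not come from the article.

```python
# Recording parameters quoted in the text above.
SEGMENT_SECONDS = 23.6
SAMPLING_RATE_HZ = 173.6
SEGMENTS_PER_GROUP = 100
GROUPS = ["A", "B", "C", "D", "E"]

# Number of samples in one segment, rounded to the nearest integer.
samples_per_segment = round(SEGMENT_SECONDS * SAMPLING_RATE_HZ)

# Total number of segments and samples over the five groups.
total_segments = SEGMENTS_PER_GROUP * len(GROUPS)
total_samples = total_segments * samples_per_segment

print(samples_per_segment, total_segments, total_samples)
```

Each feature extractor described in the following subsections therefore operates on a vector of roughly four thousand samples per segment.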

Frequency-domain representation extraction

Frequency-domain feature representation originates from the significant changes in EEG energy during epileptic seizures. To extract the frequency-domain representation from EEG signals, Daubechies4 wavelet coefficients are used to decompose the original signals into a series of binary wavelets. The frequency band of each Daubechies4 wavelet coefficient is provided in Table 2. With these settings, the EEG signals are divided into six distinct frequency bands. An illustrative example of the decomposed signals from group E is depicted in Fig. 2.

Table 2: Frequency band of each Daubechies4 wavelet coefficient.

| Coefficient | Frequency band |
| --- | --- |
| Daubechies4 (4, 0) | 0–2 Hz |
| Daubechies4 (4, 5) | 2–4 Hz |
| Daubechies4 (4, 4) | 4–8 Hz |
| Daubechies4 (4, 3) | 8–15 Hz |
| Daubechies4 (4, 2) | 16–30 Hz |
| Daubechies4 (4, 1) | 31–60 Hz |

Figure 2: Example of frequency-domain representation.

Time-domain feature extraction

Time-domain features are the fundamental features in EEG signal processing, primarily extracted by directly observing and calculating relevant characteristics from the raw signal. Their advantages lie in their simplicity of computation and ease of interpretation for researchers. In this study, we apply kernel principal component analysis (KPCA) (Li et al., 2022b) to the raw EEG signals to enable complex nonlinear mapping. Previous research has shown that KPCA features offer discriminative patterns suitable for pattern recognition. An example of KPCA features from group E is shown in Fig. 3.

Figure 3: Example of time-domain representation.

Time-frequency representation extraction

Pure time-domain or frequency-domain feature representations alone cannot comprehensively characterize an EEG signal, and EEG analysis based on the assumption of stationarity is not rigorous. Therefore, researchers have turned their attention to time-frequency analysis methods, such as time-frequency transformations, to re-represent non-stationary EEG signals and extract corresponding features. To capture time-frequency representation, researchers often employ the short-time Fourier transform (STFT) (Li et al., 2022a). STFT allows for the analysis of how the frequency content of a signal changes over time. It can be formulated as follows:

(1) $F_{timefre}(time, fre) = \int_{-\infty}^{+\infty} x(time)\, g(time - u)\, e^{-j 2\pi \cdot fre \cdot time}\, d(time)$.

In the context of EEG signal analysis, Eq. (1) represents the transformation of continuous EEG signals, denoted as $x(time)$, into the time-frequency plane using the window function $g(time - u)$, a window of limited width centered around $u$. This transformation, referred to as $F_{timefre}(time, fre)$, provides a means to examine the time-varying nature of the EEG signals, revealing local spectrum discrepancies at different time points. To achieve this, the STFT partitions the EEG signals into several segments that are treated as locally stationary. Through this process, the time-varying characteristics of the EEG signals are captured, highlighting variations in the spectrum. Six energy bands are extracted as features using Eq. (1), which takes into account the observed discrepancies. A visualization of these six energy bands, exemplified by group E, is given in Fig. 4.

Figure 4: Example of time-frequency representation.
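A discrete counterpart of Eq. (1) can be sketched in a few lines of plain Python. The example below is only an illustration: the Hann window, the window length of 128 samples, the hop of 64 samples, the synthetic 10 Hz test tone, and the reuse of the Table 2 band edges for the six energy bands are all assumptions, since the article does not state these settings.

```python
import cmath
import math

def stft(signal, window_len, hop):
    """Discrete STFT with a Hann window; returns one spectrum (list of complex) per frame."""
    window = [0.5 - 0.5 * math.cos(2 * math.pi * n / window_len) for n in range(window_len)]
    frames = []
    for start in range(0, len(signal) - window_len + 1, hop):
        segment = [signal[start + n] * window[n] for n in range(window_len)]
        # Naive DFT of the windowed segment; only non-negative frequencies are kept.
        spectrum = [
            sum(segment[n] * cmath.exp(-2j * math.pi * k * n / window_len)
                for n in range(window_len))
            for k in range(window_len // 2 + 1)
        ]
        frames.append(spectrum)
    return frames

def band_energies(frames, fs, window_len, bands):
    """Sum |X|^2 over all frames and over the bins whose frequency lies in [lo, hi) Hz."""
    energies = []
    for lo, hi in bands:
        bins = [k for k in range(window_len // 2 + 1) if lo <= k * fs / window_len < hi]
        energies.append(sum(abs(frame[k]) ** 2 for frame in frames for k in bins))
    return energies

fs = 173.6       # sampling rate of the recordings (Hz)
tone_hz = 10.0   # hypothetical test tone, not EEG data
x = [math.sin(2 * math.pi * tone_hz * n / fs) for n in range(512)]
bands = [(0, 2), (2, 4), (4, 8), (8, 15), (16, 30), (31, 60)]  # assumed: edges from Table 2

frames = stft(x, window_len=128, hop=64)
energies = band_energies(frames, fs, 128, bands)
strongest = bands[energies.index(max(energies))]
print(strongest)
```

For the synthetic tone, almost all of the energy falls into the band that contains 10 Hz, which is the behaviour one would expect from band-energy features computed on a real EEG segment.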

Methodology

In this section, we will design a shared hidden space-driven multi-view learning method to fuse time-frequency representation, frequency-domain representation and time-domain representation.

Construction of shared hidden feature space

Suppose that $\Omega \in R^{r \times d}$ is an orthogonal matrix subject to $\Omega\Omega^T = I \in R^{r \times r}$, $f_A = \{x_i^A, y_i \mid x_i^A \in R^d, i = 1, 2, \ldots, N\}$ represents one kind of feature space, e.g., the time-domain feature space, and $f_B = \{x_i^B, y_i \mid x_i^B \in R^d, i = 1, 2, \ldots, N\}$ represents another kind of feature space. Then the hidden feature spaces of $f_A$ and $f_B$ can be generated by $\Omega x_i^A \in R^r$ and $\Omega x_i^B \in R^r$, respectively, where $r$ represents the number of hidden features. To obtain a consistent hidden feature space between $\Omega x_i^A$ and $\Omega x_i^B$, the difference between them should be minimized as much as possible. Kernel density estimation (KDE), which is one of the non-parametric estimation methods in probability theory, is usually used to estimate an unknown probability density function (Wang, Wang & Chung, 2013). For a training set $X = \{x_i, y_i \mid x_i \in R^d, i = 1, 2, \ldots, N\}$, its corresponding kernel density estimation function can be expressed as

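(2) $P(x) = \frac{1}{N\delta}\sum_{i=1}^{N} K\left(\frac{x - x_i}{\delta}\right)$,

where $\delta$ is the kernel width and $K(\cdot)$ is the kernel function. If the Gaussian kernel function is adopted, then Eq. (2) becomes $P(x) = \frac{1}{N}\sum_{i=1}^{N}\frac{1}{\delta\sqrt{2\pi}}\exp\left(-\frac{1}{2}\left(\frac{x - x_i}{\delta}\right)^2\right)$. Therefore, when the Gaussian kernel function is used, the kernel density estimates of $\Omega x_i^A$ and $\Omega x_i^B$ can be expressed, respectively, as

(3) $P_A(\tilde{x}) = P_A(\Omega x) = \frac{1}{N\delta\sqrt{2\pi}}\sum_{i=1}^{N} e^{-\frac{\|\Omega x - \Omega x_i^A\|^2}{2\delta^2}}$,

(4) $P_B(\tilde{x}) = P_B(\Omega x) = \frac{1}{N\delta\sqrt{2\pi}}\sum_{i=1}^{N} e^{-\frac{\|\Omega x - \Omega x_i^B\|^2}{2\delta^2}}$.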
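The Gaussian kernel density estimate used above can be illustrated with a one-dimensional sketch. The sample values and the kernel width below are hypothetical; the only point of the example is that the estimate is an average of Gaussians of width $\delta$ centred on the samples and that it integrates to approximately one.

```python
import math

def gaussian_kde(x, samples, delta):
    """Gaussian KDE: average of Gaussians of width delta centred on the samples."""
    norm = 1.0 / (len(samples) * delta * math.sqrt(2.0 * math.pi))
    return norm * sum(math.exp(-0.5 * ((x - xi) / delta) ** 2) for xi in samples)

samples = [-1.2, -0.4, 0.1, 0.3, 1.5]  # hypothetical 1-D projected samples
delta = 0.5                            # hypothetical kernel width

# Riemann-sum check that the estimated density integrates to roughly one.
step = 0.01
grid = [-10.0 + step * k for k in range(2001)]
area = sum(gaussian_kde(g, samples, delta) for g in grid) * step
print(round(area, 4))
```

In the article the same construction is applied to the projected samples $\Omega x_i^A$ and $\Omega x_i^B$, which gives one density per view in the hidden space.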

In this study, the difference between $P_A(\tilde{x})$ and $P_B(\tilde{x})$ is measured by the mean square error, that is

(5) $J = \int \left(P_A(\tilde{x}) - P_B(\tilde{x})\right)^2 dx$.

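By minimizing $J$, the two-view data $x_i^A$ and $x_i^B$ can be made to have the maximum commonality in the shared hidden space, and thus the challenge of excessive variability between samples from different views can be addressed. In order to solve Eq. (5), we let $G(\Omega x, \Omega x_i, \delta^2) = \frac{1}{\delta\sqrt{2\pi}} e^{-\frac{\|\Omega x - \Omega x_i\|^2}{2\delta^2}}$, so that $P_A(\tilde{x})$ and $P_B(\tilde{x})$ can be written as $P_A(\tilde{x}) = \frac{1}{N}\sum_{i=1}^{N} G(\Omega x, \Omega x_i^A, \delta^2)$ and $P_B(\tilde{x}) = \frac{1}{N}\sum_{i=1}^{N} G(\Omega x, \Omega x_i^B, \delta^2)$. Therefore, Eq. (5) can be computed by $J = \int P_A^2(\tilde{x})\,dx - 2\int P_A(\tilde{x})P_B(\tilde{x})\,dx + \int P_B^2(\tilde{x})\,dx$. According to Wang, Wang & Chung (2013) and Hansen, Jaumard & Xiong (1994), we have $\int G(x, x_i, \delta_1^2)\,G(x, x_j, \delta_2^2)\,dx = G(x_i, x_j, \delta_1^2 + \delta_2^2)$. Therefore, we have the following equations,

(6) $\int P_A^2(\tilde{x})\,dx = \frac{1}{N^2}\sum_{i=1}^{N}\sum_{j=1}^{N} G(\tilde{x}_i^A, \tilde{x}_j^A, 2\delta^2) = \frac{1}{N}\sum_{i=1}^{N}\left[\frac{1}{N}\sum_{j=1}^{N} G(\tilde{x}_i^A, \tilde{x}_j^A, 2\delta^2)\right]$

(7) $\int P_B^2(\tilde{x})\,dx = \frac{1}{N^2}\sum_{i=1}^{N}\sum_{j=1}^{N} G(\tilde{x}_i^B, \tilde{x}_j^B, 2\delta^2) = \frac{1}{N}\sum_{i=1}^{N}\left[\frac{1}{N}\sum_{j=1}^{N} G(\tilde{x}_i^B, \tilde{x}_j^B, 2\delta^2)\right]$

(8) $\int P_A(\tilde{x})P_B(\tilde{x})\,dx = \frac{1}{N^2}\sum_{i=1}^{N}\sum_{j=1}^{N} G(\tilde{x}_i^A, \tilde{x}_j^B, 2\delta^2)$

where $\tilde{x}_i^A = \Omega x_i^A$, $\tilde{x}_j^B = \Omega x_j^B$, and $\frac{1}{N}\sum_{j=1}^{N} G(\tilde{x}_i^A, \tilde{x}_j^A, 2\delta^2)$ can be taken as another estimate of $P_A(\tilde{x}_i^A)$. Therefore, $\int P_A^2(\tilde{x})\,dx$ can be estimated by $\frac{1}{N}\sum_{i=1}^{N} P_A(\tilde{x}_i^A)$, and further by $\frac{1}{N}$. Similarly, $\int P_B^2(\tilde{x})\,dx$ can be estimated by $\frac{1}{N}$. Thus, we finally have $J \approx \frac{1}{N} + \frac{1}{N} - \frac{2}{N^2}\sum_{i=1}^{N}\sum_{j=1}^{N} G(\tilde{x}_i^A, \tilde{x}_j^B, 2\delta^2)$. Since the first two terms are constant, minimizing $J$ amounts to maximizing the double sum, which gives the following objective,

(9) $\arg\min_{\Omega} J \approx \arg\min_{\Omega} -\sum_{i=1}^{N}\sum_{j=1}^{N} G(\tilde{x}_i^A, \tilde{x}_j^B, 2\delta^2)$, s.t. $\Omega\Omega^T = I_{r \times r}$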
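The Gaussian product identity that turns the integrals into double sums can be checked numerically. The sketch below does this in one dimension with hypothetical values for the two centres and the kernel width; it approximates the integral with a midpoint rule and compares it with the closed form $G(x_i, x_j, \delta_1^2 + \delta_2^2)$.

```python
import math

def G(x, xi, var):
    """1-D Gaussian with mean xi and variance var, matching G(x, x_i, delta^2) in the text."""
    return math.exp(-((x - xi) ** 2) / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)

def product_integral(xi, xj, var1, var2, lo=-20.0, hi=20.0, n=40000):
    """Midpoint-rule approximation of the integral of G(x, xi, var1) * G(x, xj, var2) over x."""
    h = (hi - lo) / n
    total = 0.0
    for k in range(n):
        x = lo + (k + 0.5) * h
        total += G(x, xi, var1) * G(x, xj, var2)
    return total * h

xi, xj, delta = 0.3, -0.8, 0.7  # hypothetical centres and kernel width
lhs = product_integral(xi, xj, delta ** 2, delta ** 2)
rhs = G(xi, xj, 2 * delta ** 2)
print(lhs, rhs)
```

With $\delta_1 = \delta_2 = \delta$ the right-hand side has variance $2\delta^2$, which is where the $2\delta^2$ in Eqs. (6) to (8) comes from.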

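However, it is difficult to solve Eq. (9) directly. Thus, a first-order Taylor expansion is used to obtain an approximate solution. Hence, we have

(10) $G(\tilde{x}_i^A, \tilde{x}_j^B, 2\delta^2) = \frac{1}{2\sqrt{\pi}\,\delta} e^{-\frac{\|\Omega x_i^A - \Omega x_j^B\|^2}{4\delta^2}} \approx \frac{1}{2\sqrt{\pi}\,\delta}\left(1 - \frac{\|\Omega x_i^A - \Omega x_j^B\|^2}{4\delta^2}\right)$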

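Dropping the constant term and the positive scale factors, which do not affect the minimizer, Eq. (9) can be further updated as

(11) $\arg\min_{\Omega} \sum_{i=1}^{N}\sum_{j=1}^{N} \|\Omega x_i^A - \Omega x_j^B\|^2$, s.t. $\Omega\Omega^T = I_{r \times r}$

In Eq. (11), the implicit feature transformation matrix $\Omega$ still cannot be solved for directly, but it can be obtained by the gradient descent method. To this end, Eq. (11) is expanded as

(12) $\arg\min_{\Omega} J$, $J = \sum_{i=1}^{N}\sum_{j=1}^{N}\left((x_i^A)^T\Omega^T\Omega x_i^A + (x_j^B)^T\Omega^T\Omega x_j^B - 2(x_i^A)^T\Omega^T\Omega x_j^B\right)$, s.t. $\Omega\Omega^T = I_{r \times r}$

The partial derivative of $J$ w.r.t. $\Omega$ is

(13) $\frac{\partial J}{\partial \Omega} = \sum_{i=1}^{N}\sum_{j=1}^{N}\left(2\Omega x_i^A(x_i^A)^T + 2\Omega x_j^B(x_j^B)^T - 2\Omega\left(x_i^A(x_j^B)^T + x_j^B(x_i^A)^T\right)\right)$

Then the transformation matrix $\Omega$ can be solved by the gradient descent method, that is,

(14) $\Omega \leftarrow \Omega - \eta\,\frac{\partial J}{\partial \Omega}\left(I_{r \times r} - \Omega\Omega^T\right) = \Omega - \eta\,\Delta\Omega$

where $\Delta\Omega$ denotes the update direction and $\eta$ is the step size, which can be computed by

(15) $\eta = \dfrac{\sum_{i=1}^{N}\sum_{j=1}^{N}\left((x_i^A)^T(\Omega^T\Delta\Omega + \Delta\Omega^T\Omega)x_i^A + (x_j^B)^T(\Omega^T\Delta\Omega + \Delta\Omega^T\Omega)x_j^B - 2(x_i^A)^T(\Omega^T\Delta\Omega + \Delta\Omega^T\Omega)x_j^B\right)}{\sum_{i=1}^{N}\sum_{j=1}^{N}\left(2(x_i^A)^T\Delta\Omega^T\Delta\Omega\, x_i^A + 2(x_j^B)^T\Delta\Omega^T\Delta\Omega\, x_j^B - 4(x_i^A)^T\Delta\Omega^T\Delta\Omega\, x_j^B\right)}$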
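The form of the step size in Eq. (15) is consistent with an exact line search along the update direction. The short derivation below is a sketch under that reading rather than a statement taken from the article; $d_{ij}$ and $\Delta\Omega$ (the direction subtracted from $\Omega$ in Eq. (14)) are shorthand introduced here.

```latex
\begin{aligned}
d_{ij} &= x_i^A - x_j^B, \qquad
J(\Omega) = \sum_{i=1}^{N}\sum_{j=1}^{N} d_{ij}^{T}\,\Omega^{T}\Omega\, d_{ij},\\
J(\Omega - \eta\,\Delta\Omega) &= J(\Omega)
  - \eta \sum_{i,j} d_{ij}^{T}\left(\Omega^{T}\Delta\Omega + \Delta\Omega^{T}\Omega\right) d_{ij}
  + \eta^{2} \sum_{i,j} d_{ij}^{T}\,\Delta\Omega^{T}\Delta\Omega\, d_{ij},\\
\frac{\partial J(\Omega - \eta\,\Delta\Omega)}{\partial \eta} = 0
  \;&\Longrightarrow\;
  \eta = \frac{\sum_{i,j} d_{ij}^{T}\left(\Omega^{T}\Delta\Omega + \Delta\Omega^{T}\Omega\right) d_{ij}}
              {2\sum_{i,j} d_{ij}^{T}\,\Delta\Omega^{T}\Delta\Omega\, d_{ij}}.
\end{aligned}
```

For any symmetric matrix $M$, $d_{ij}^T M d_{ij} = (x_i^A)^T M x_i^A + (x_j^B)^T M x_j^B - 2(x_i^A)^T M x_j^B$, so expanding the numerator and denominator in terms of $x_i^A$ and $x_j^B$ gives sums of the same shape as those in Eq. (15).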

According to the above analysis and derivation, the algorithm for solving the implicit feature transformation matrix $\Omega$ is summarized in Algorithm 1.

Multi-view learning based on shared hidden feature space

After determining the shared hidden space between two views, the extended space can be generated by combining the original space and the shared hidden space. Then, a multi-view classifier based on SVM is designed for multi-view data classification in the extended space. In existing multi-view learning mechanisms, it is generally assumed that each view can provide a classifier containing specific information, and that classifiers constructed from different views tend to be consistent. Additionally, since views can provide specific information to each other, the proposed model establishes the objective function by considering the mutual information between two views. In summary, the proposed model, based on SVM, restructures the slack variables on each view, and then narrows the gap between the two views by using the corresponding regularization term. The objective function of multi-view learning based on shared hidden feature space can be formulated as

(16) $\arg\min_{w_A, w_B, v_A, v_B, b_A, b_B}\ \frac{1}{2}\|w_A\|^2 + \frac{1}{2}\|w_B\|^2 + \frac{1}{2}\|v_A\|^2 + \frac{1}{2}\|v_B\|^2 + C_A\sum_{i=1}^{N}\xi_i^A + C_B\sum_{i=1}^{N}\xi_i^B + \lambda\|v_A - v_B\|^2$
s.t. $y_i\left(w_A^T\phi(x_i^A) + v_A^T\phi(\Omega x_i^A) + b_A\right) \geq 1 - \xi_i^A$,
$y_i\left(w_B^T\phi(x_i^B) + v_B^T\phi(\Omega x_i^B) + b_B\right) \geq 1 - \xi_i^B$,
$\xi_i^A, \xi_i^B \geq 0,\ i = 1, 2, \ldots, N$,

where $\lambda$, $C_A$ and $C_B$ are the regularization parameters. Observe that Eq. (16) consists of three parts: the first four terms reflect the structural risk in the original feature space and the shared hidden space, respectively; the next two terms represent the empirical risk; and the last term reflects the difference between the two views in the shared hidden space. The objective function in Eq. (16) strengthens the constraints of the traditional SVM through the implicit mapping, so that the probability distributions of data from different views in the shared hidden space are as consistent as possible, which addresses the problem described at the beginning of this study. In order to solve Eq. (16) efficiently, Lagrangian multipliers are introduced according to Lagrangian optimization theory, so that Eq. (16) can be converted into its dual form as follows. The Lagrangian function corresponding to Eq. (16) is

(17) $L = \frac{1}{2}\|w_A\|^2 + \frac{1}{2}\|w_B\|^2 + \frac{1}{2}\|v_A\|^2 + \frac{1}{2}\|v_B\|^2 + C_A\sum_{i=1}^{N}\xi_i^A + C_B\sum_{i=1}^{N}\xi_i^B + \lambda\|v_A - v_B\|^2 + \sum_{i=1}^{N}\alpha_i^A\left(1 - \xi_i^A - y_i\left(w_A^T\phi(x_i^A) + v_A^T\phi(\Omega x_i^A) + b_A\right)\right) + \sum_{i=1}^{N}\alpha_i^B\left(1 - \xi_i^B - y_i\left(w_B^T\phi(x_i^B) + v_B^T\phi(\Omega x_i^B) + b_B\right)\right) - \sum_{i=1}^{N}\mu_i^A\xi_i^A - \sum_{i=1}^{N}\mu_i^B\xi_i^B$

where $\alpha_i^A \geq 0$, $\alpha_i^B \geq 0$, $\mu_i^A \geq 0$, and $\mu_i^B \geq 0$ are Lagrangian multipliers. By setting the partial derivatives of the Lagrangian function $L$ with respect to $w_A$, $w_B$, $v_A$, $v_B$, $b_A$, $b_B$, $\xi_i^A$, and $\xi_i^B$ to 0, we have

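(18) $w_A = \sum_{i=1}^{N}\alpha_i^A y_i\phi(x_i^A)$, $w_B = \sum_{i=1}^{N}\alpha_i^B y_i\phi(x_i^B)$,

(19) $v_A = \frac{1+2\lambda}{1+4\lambda}\sum_{i=1}^{N}\alpha_i^A y_i\phi(\Omega x_i^A) + \frac{2\lambda}{1+4\lambda}\sum_{i=1}^{N}\alpha_i^B y_i\phi(\Omega x_i^B)$,

(20) $v_B = \frac{1+2\lambda}{1+4\lambda}\sum_{i=1}^{N}\alpha_i^B y_i\phi(\Omega x_i^B) + \frac{2\lambda}{1+4\lambda}\sum_{i=1}^{N}\alpha_i^A y_i\phi(\Omega x_i^A)$,

(21) $\sum_{i=1}^{N}\alpha_i^A y_i = 0$, $\sum_{i=1}^{N}\alpha_i^B y_i = 0$,

(22) $C_A = \alpha_i^A + \mu_i^A$, $C_B = \alpha_i^B + \mu_i^B$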
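The coefficients $\frac{1+2\lambda}{1+4\lambda}$ and $\frac{2\lambda}{1+4\lambda}$ follow from solving the two coupled stationarity conditions for $v_A$ and $v_B$. The derivation below is a sketch that starts from the Lagrangian in Eq. (17); the shorthand $a$ and $b$ is introduced here and does not appear in the article.

```latex
\begin{aligned}
a &= \sum_{i=1}^{N}\alpha_i^A y_i\,\phi(\Omega x_i^A), \qquad
b = \sum_{i=1}^{N}\alpha_i^B y_i\,\phi(\Omega x_i^B),\\
\frac{\partial L}{\partial v_A} &= v_A + 2\lambda\,(v_A - v_B) - a = 0, \qquad
\frac{\partial L}{\partial v_B} = v_B - 2\lambda\,(v_A - v_B) - b = 0,\\
\begin{bmatrix} 1+2\lambda & -2\lambda \\ -2\lambda & 1+2\lambda \end{bmatrix}
\begin{bmatrix} v_A \\ v_B \end{bmatrix}
&= \begin{bmatrix} a \\ b \end{bmatrix},
\qquad (1+2\lambda)^2 - (2\lambda)^2 = 1 + 4\lambda,\\
v_A &= \frac{(1+2\lambda)\,a + 2\lambda\, b}{1+4\lambda}, \qquad
v_B = \frac{(1+2\lambda)\,b + 2\lambda\, a}{1+4\lambda}.
\end{aligned}
```

For $\lambda = 0$ the two views decouple and $v_A = a$, $v_B = b$; as $\lambda$ grows, both solutions approach the average $\frac{1}{2}(a + b)$, which is how the regularization term pulls the two hidden-space classifiers together.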

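By substituting Eqs. (18)–(22) into Eq. (17), we obtain the dual problem of Eq. (16), which can be defined as

(23) $\arg\max_{\tilde{\alpha}}\ -\frac{1}{2}\tilde{\alpha}^T\mathbf{K}\tilde{\alpha} + \tilde{\alpha}^T\mathbf{1}$, s.t. $\tilde{\alpha}^T f = 0$, $f = [y^T, y^T]^T$, $\tilde{\alpha}_i \geq 0,\ \forall i$

where

(24) $\tilde{\alpha} = [\alpha_1^A, \alpha_2^A, \ldots, \alpha_N^A, \alpha_1^B, \alpha_2^B, \ldots, \alpha_N^B]^T$,

(25) $K_A = K(x^A, x^A) \circ yy^T + \frac{1+2\lambda}{1+4\lambda}K(\Omega x^A, \Omega x^A) \circ yy^T$

(26) $K_B = K(x^B, x^B) \circ yy^T + \frac{1+2\lambda}{1+4\lambda}K(\Omega x^B, \Omega x^B) \circ yy^T$

(27) $K_{AB} = \frac{2\lambda}{1+4\lambda}K(\Omega x^A, \Omega x^B) \circ yy^T$

(28) $\mathbf{K} = \begin{bmatrix} K_A & K_{AB} \\ K_{AB} & K_B \end{bmatrix}$

(29) $y = [y_1, y_2, \ldots, y_N]^T$

and $K(\cdot, \cdot)$ is the kernel function, with $\circ$ denoting the element-wise product. The optimization of Eq. (23) is a QP problem, which can be solved according to Deng et al. (2013). The decision function of the proposed model in this study is defined as

(30) $f(x) = \frac{1}{2}\left(w_A^T\phi(x^A) + v_A^T\phi(\Omega x^A) + b_A + w_B^T\phi(x^B) + v_B^T\phi(\Omega x^B) + b_B\right)$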

The algorithm of multi-view learning based on shared hidden feature space is shown in Algorithm 2. From Algorithm 2, we can see that the time complexity is mainly contributed by steps 1, 3 and 4. The time complexity of Algorithm 1 is $O(Nrd + r^2)$. The time complexity of step 3 is $O((r + d)^2)$. The time complexity of step 4 is $O(N^2)$. Therefore, the time complexity of Algorithm 2 is $O(Nrd + r^2 + (r + d)^2 + N^2)$.

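Algorithm 1: Shared hidden feature space generation.
Input: $x_i^A$, $x_i^B$, and $y = [y_i]$, $i = 1, 2, \ldots, N$
Output: $\Omega$
Procedures:
1. Initialize $\Omega^{(0)} \in R^{r \times d}$, $t = 0$, $iter_{max}$, $\delta = 1e{-6}$.
2. Repeat:
3. $t = t + 1$.
4. Compute $\frac{\partial J}{\partial \Omega}$ and $\eta$ by Eqs. (13) and (15).
5. Update $\Omega^{(t)}$ by Eq. (14).
6. Until $\|\Omega^{(t)} - \Omega^{(t-1)}\| \leq \delta$ or $t > iter_{max}$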
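The loop of Algorithm 1 can be sketched in plain Python as follows. This is not the authors' implementation: it replaces the update of Eq. (14) and the step size of Eq. (15) with a conservative fixed step followed by Gram-Schmidt re-orthonormalization of the rows, which is one simple way to keep $\Omega\Omega^T = I$. The function names, the toy data and the stopping rule on the largest entry change are all assumptions made for illustration.

```python
import random

def matmul(A, B):
    """Plain-Python matrix product of two lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def scatter(XA, XB):
    """S = sum_i sum_j (x_i^A - x_j^B)(x_i^A - x_j^B)^T, so that J = trace(Omega S Omega^T)."""
    d = len(XA[0])
    S = [[0.0] * d for _ in range(d)]
    for xa in XA:
        for xb in XB:
            diff = [a - b for a, b in zip(xa, xb)]
            for p in range(d):
                for q in range(d):
                    S[p][q] += diff[p] * diff[q]
    return S

def objective(Omega, S):
    """J(Omega) of Eq. (12), evaluated as trace(Omega S Omega^T)."""
    return sum(sum(u * v for u, v in zip(r1, r2)) for r1, r2 in zip(matmul(Omega, S), Omega))

def orthonormalize_rows(Omega):
    """Gram-Schmidt on the rows, restoring Omega Omega^T = I after a gradient step."""
    rows = []
    for row in Omega:
        v = list(row)
        for u in rows:
            proj = sum(a * b for a, b in zip(v, u))
            v = [a - proj * b for a, b in zip(v, u)]
        norm = sum(a * a for a in v) ** 0.5
        rows.append([a / norm for a in v])
    return rows

def shared_hidden_space(XA, XB, r, iter_max=200, tol=1e-6, seed=0):
    rng = random.Random(seed)
    d = len(XA[0])
    S = scatter(XA, XB)
    eta = 1.0 / (2.0 * sum(S[p][p] for p in range(d)))  # assumed fixed step, not Eq. (15)
    Omega = orthonormalize_rows([[rng.gauss(0.0, 1.0) for _ in range(d)] for _ in range(r)])
    history = [objective(Omega, S)]
    for _ in range(iter_max):
        grad = [[2.0 * g for g in row] for row in matmul(Omega, S)]  # dJ/dOmega = 2 Omega S
        stepped = [[o - eta * g for o, g in zip(ro, rg)] for ro, rg in zip(Omega, grad)]
        new_Omega = orthonormalize_rows(stepped)
        change = max(abs(a - b) for rn, ro in zip(new_Omega, Omega) for a, b in zip(rn, ro))
        Omega = new_Omega
        history.append(objective(Omega, S))
        if change <= tol:
            break
    return Omega, history

# Toy two-view data: view B copies view A but adds a strong offset on the first coordinate.
rng = random.Random(1)
XA = [[rng.gauss(0.0, 1.0) for _ in range(4)] for _ in range(8)]
XB = [[xa[0] + 5.0] + xa[1:] for xa in XA]
Omega, history = shared_hidden_space(XA, XB, r=2)
print(history[0], history[-1])
```

On the toy data the objective drops substantially from its initial value, because the learned projection moves away from the coordinate on which the two views disagree.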
Algorithm 2: Multi-view learning based on shared hidden feature space.
Input: training samples of view-1: $\{x_i^A, y_i\}$, training samples of view-2: $\{x_i^B, y_i\}$, regularization parameters $C_A$, $C_B$ and $\lambda$
Output: $w_A^T$, $w_B^T$, $b_A$, $b_B$, $v_A$ and $v_B$
Procedures:
1. Use Algorithm 1 to obtain $\Omega$
2. Use $\Omega$ to obtain the shared hidden space
3. Solve for $\tilde{\alpha}_i$ according to Eq. (23)
4. Solve for $w_A^T$, $w_B^T$, $b_A$, $b_B$, $v_A$ and $v_B$ by Eqs. (18)–(22)
5. Construct the decision function based on $w_A^T$, $w_B^T$, $b_A$, $b_B$, $v_A$ and $v_B$
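Step 3 of Algorithm 2 needs the block matrix of Eq. (28) before any QP solver can be called. The sketch below assembles it from Eqs. (25) to (27) in plain Python. The Gaussian kernel parameterization with $2\sigma^2$ in the denominator, the function names and the toy data are assumptions, and the lower-left block is stored as the transpose of the upper-right block so that the matrix stays symmetric, which is a choice made here rather than something the article states.

```python
import math

def rbf(u, v, sigma):
    """Gaussian kernel between two vectors (assumed parameterization)."""
    return math.exp(-sum((a - b) ** 2 for a, b in zip(u, v)) / (2.0 * sigma ** 2))

def project(Omega, x):
    """Map x into the shared hidden space: Omega x."""
    return [sum(o * xi for o, xi in zip(row, x)) for row in Omega]

def dual_kernel(XA, XB, y, Omega, lam, sigma):
    """Assemble the 2N x 2N matrix of Eq. (28) from the blocks of Eqs. (25)-(27)."""
    N = len(y)
    HA = [project(Omega, x) for x in XA]
    HB = [project(Omega, x) for x in XB]
    c_same = (1 + 2 * lam) / (1 + 4 * lam)
    c_cross = 2 * lam / (1 + 4 * lam)
    K = [[0.0] * (2 * N) for _ in range(2 * N)]
    for i in range(N):
        for j in range(N):
            yy = y[i] * y[j]
            K[i][j] = yy * (rbf(XA[i], XA[j], sigma) + c_same * rbf(HA[i], HA[j], sigma))
            K[N + i][N + j] = yy * (rbf(XB[i], XB[j], sigma) + c_same * rbf(HB[i], HB[j], sigma))
            K[i][N + j] = yy * c_cross * rbf(HA[i], HB[j], sigma)
            K[N + j][i] = K[i][N + j]  # transpose block, keeps K symmetric
    return K

# Hypothetical toy data: three samples, two features per view, one hidden feature.
XA = [[0.0, 1.0], [1.0, 0.5], [2.0, -1.0]]
XB = [[0.2, 0.9], [1.1, 0.4], [1.8, -1.2]]
y = [1, 1, -1]
Omega = [[1.0, 0.0]]  # 1 x 2 projection satisfying Omega Omega^T = I
K = dual_kernel(XA, XB, y, Omega, lam=0.5, sigma=1.0)
K_decoupled = dual_kernel(XA, XB, y, Omega, lam=0.0, sigma=1.0)
print(len(K), K[0][0])
```

With $\lambda = 0$ the off-diagonal blocks vanish and the problem splits into two independent single-view SVMs on the expanded space, which is a useful sanity check when wiring the matrix into a QP solver.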

Experimental studies

Settings

To assess the merits of the proposed model, k-nearest neighbor (KNN) (Liu & Liu, 2016), support vector machine (SVM) (Liu & Liu, 2016), SVM-2K (Farquhar et al., 2005), multi-view L2-SVM (MV-L2-SVM) (Huang, Chung & Wang, 2016), and alternative multi-view MED (AMVMED) (Chao & Sun, 2015) are introduced for comparison. Accuracy is used as the evaluation indicator in this study. SVM, SVM-2K, MV-L2-SVM, and the proposed model (2V-SVM-SH) are all trained using a Gaussian kernel. For all methods, ten-fold cross-validation (CV) is used to determine the optimal parameters. Table 3 provides the specific parameters and ranges used for each method. All experiments are conducted on a PC with a 16-core CPU with a clock speed of 3.40 GHz and 32 GB of memory. The programming environment was Matlab R2016a.

Table 3: Parameter settings.

| Method | Parameter settings |
| --- | --- |
| KNN | k ∈ {1, 2, 3, 4, 5, 6, 7, 8, 9, 10} |
| SVM | C ∈ {2e−8, 2e−7, …, 2e0, 2e1, …, 2e7, 2e8}, σ ∈ {2e−8, 2e−7, …, 2e0, 2e1, …, 2e7, 2e8} |
| SVM-2K | CA ∈ {2e−8, 2e−7, …, 2e0, 2e1, …, 2e7, 2e8}, CB ∈ {2e−8, 2e−7, …, 2e0, 2e1, …, 2e7, 2e8}, D ∈ {2e−5, 2e−4, …, 2e0, 2e1, …, 2e4, 2e5}, σ ∈ {2e−8, 2e−7, …, 2e0, 2e1, …, 2e7, 2e8} |
| MV-L2-SVM | CA ∈ {2e−8, 2e−7, …, 2e0, 2e1, …, 2e7, 2e8}, CB ∈ {2e−8, 2e−7, …, 2e0, 2e1, …, 2e7, 2e8}, σ ∈ {2e−8, 2e−7, …, 2e0, 2e1, …, 2e7, 2e8} |
| AMVMED | CA ∈ {2e−8, 2e−7, …, 2e0, 2e1, …, 2e7, 2e8}, CB ∈ {2e−8, 2e−7, …, 2e0, 2e1, …, 2e7, 2e8}, γ ∈ {0.1, 0.2, …, 0.9} |
| Proposed model | CA ∈ {2e−8, 2e−7, …, 2e0, 2e1, …, 2e7, 2e8}, CB ∈ {2e−8, 2e−7, …, 2e0, 2e1, …, 2e7, 2e8}, σ ∈ {2e−8, 2e−7, …, 2e0, 2e1, …, 2e7, 2e8}, λ ∈ {0.1, 0.2, …, 0.9, 1} |

To construct two-view learning scenarios, the three feature extraction methods described in ‘Data’, namely wavelet packet decomposition (WPD), short-time Fourier transform (STFT) and kernel principal component analysis (KPCA), are adopted to extract frequency-domain, time-frequency and time-domain features, respectively, from the original EEG signals, as shown in Figs. 2–4. Finally, 12 datasets are constructed, as shown in Table 4.

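Table 4: Two-view learning scenarios.

| Datasets | Classification tasks | Views (view-A, view-B) | #Sample size |
| --- | --- | --- | --- |
| DS1 | AB vs CDE | WPD, STFT | 500 |
| DS2 | AB vs CDE | WPD, KPCA | 500 |
| DS3 | AB vs CDE | STFT, KPCA | 500 |
| DS4 | AB vs CD | WPD, STFT | 400 |
| DS5 | AB vs CD | WPD, KPCA | 400 |
| DS6 | AB vs CD | STFT, KPCA | 400 |
| DS7 | AB vs DE | WPD, STFT | 400 |
| DS8 | AB vs DE | WPD, KPCA | 400 |
| DS9 | AB vs DE | STFT, KPCA | 400 |
| DS10 | AB vs CE | WPD, STFT | 400 |
| DS11 | AB vs DE | WPD, KPCA | 400 |
| DS12 | AB vs CE | STFT, KPCA | 400 |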

Experimental results and analysis

The experimental results are reported in Table 5. We can see from Table 5 that the proposed model achieves the best accuracy on nine of the twelve datasets. The exceptions are DS5, where SVM-2K and SVM_A are more accurate, DS6, where SVM_A is more accurate, and DS9, where MV-L2-SVM and AMVMED are slightly more accurate. The advantages of the proposed model indicate the promising ability of the shared hidden space. By constructing the expanded space and utilizing the information of both the shared hidden space and the original space for learning, the proposed model fully exploits the relevant information of samples within and across views, and thus effectively addresses the problem that the difference between samples of the same class from different views can be greater than the difference between samples of different classes from the same view. The experimental results also indicate the power of KDE, which is used to construct the shared hidden space.

Table 5: Classification performance in terms of accuracy on all multi-view learning scenarios.

| Datasets | KNN_A (KNN on view-A) | KNN_B (KNN on view-B) | SVM_A (SVM on view-A) | SVM_B (SVM on view-B) | SVM-2K | MV-L2-SVM | AMVMED | Proposed model |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| DS1 | 0.9098 (0.0019) | 0.9176 (0.0045) | 0.9432 (0.0076) | 0.9521 (0.0087) | 0.9754 (0.0063) | 0.9543 (0.0065) | 0.9643 (0.0043) | 0.9876 (0.0023) |
| DS2 | 0.9213 (0.0032) | 0.9098 (0.0021) | 0.9583 (0.0065) | 0.9321 (0.0087) | 0.9654 (0.0063) | 0.9431 (0.0065) | 0.9546 (0.0043) | 0.9768 (0.0023) |
| DS3 | 0.9223 (0.0034) | 0.9098 (0.0021) | 0.9345 (0.0022) | 0.9321 (0.0087) | 0.9654 (0.0023) | 0.9437 (0.0013) | 0.9554 (0.0063) | 0.9764 (0.0034) |
| DS4 | 0.9214 (0.0034) | 0.9097 (0.0011) | 0.9067 (0.0073) | 0.9164 (0.0027) | 0.9567 (0.0032) | 0.9511 (0.0023) | 0.9598 (0.0044) | 0.9690 (0.0036) |
| DS5 | 0.9214 (0.0034) | 0.9481 (0.0023) | 0.9875 (0.0046) | 0.9467 (0.0056) | 0.9892 (0.0017) | 0.9564 (0.0054) | 0.9578 (0.0023) | 0.9743 (0.0045) |
| DS6 | 0.9324 (0.0052) | 0.9481 (0.0023) | 0.9875 (0.0046) | 0.9467 (0.0056) | 0.9653 (0.0018) | 0.9511 (0.0034) | 0.9587 (0.0033) | 0.9811 (0.0056) |
| DS7 | 0.9331 (0.0026) | 0.9325 (0.0026) | 0.9481 (0.0017) | 0.9435 (0.0037) | 0.9563 (0.0032) | 0.9673 (0.0026) | 0.9543 (0.0046) | 0.9781 (0.0015) |
| DS8 | 0.9331 (0.0026) | 0.9221 (0.0025) | 0.9481 (0.0017) | 0.9387 (0.0026) | 0.9612 (0.0018) | 0.9671 (0.0056) | 0.9409 (0.0055) | 0.9812 (0.0035) |
| DS9 | 0.9631 (0.0015) | 0.9221 (0.0025) | 0.9511 (0.0090) | 0.9387 (0.0026) | 0.9654 (0.0143) | 0.9786 (0.0087) | 0.9765 (0.0049) | 0.9760 (0.0054) |
| DS10 | 0.9318 (0.0079) | 0.9543 (0.0056) | 0.9345 (0.0054) | 0.9245 (0.0064) | 0.9534 (0.0048) | 0.9501 (0.0047) | 0.9534 (0.0019) | 0.9756 (0.0087) |
| DS11 | 0.9134 (0.0078) | 0.9215 (0.0056) | 0.9381 (0.0054) | 0.9275 (0.0034) | 0.9452 (0.0036) | 0.9517 (0.0045) | 0.9732 (0.0017) | 0.9789 (0.0087) |
| DS12 | 0.9532 (0.0035) | 0.9378 (0.0043) | 0.9785 (0.0038) | 0.9634 (0.0014) | 0.9763 (0.0013) | 0.9587 (0.0054) | 0.9661 (0.0064) | 0.9898 (0.0034) |
| Average | 0.9311 | 0.9333 | 0.9472 | 0.9434 | 0.9646 | 0.9561 | 0.9596 | 0.9787 |

Note:

Bold entries indicate the best performance achieved by the corresponding method.
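The per-dataset winners can be read off Table 5 mechanically. The snippet below is a small sketch that transcribes the mean accuracies from Table 5 (standard deviations omitted) and reports which method is best on each dataset; the variable names are illustrative.

```python
methods = ["KNN_A", "KNN_B", "SVM_A", "SVM_B", "SVM-2K", "MV-L2-SVM", "AMVMED", "Proposed model"]

# Mean accuracies transcribed from Table 5, one row per dataset, columns as in `methods`.
accuracy = {
    "DS1":  [0.9098, 0.9176, 0.9432, 0.9521, 0.9754, 0.9543, 0.9643, 0.9876],
    "DS2":  [0.9213, 0.9098, 0.9583, 0.9321, 0.9654, 0.9431, 0.9546, 0.9768],
    "DS3":  [0.9223, 0.9098, 0.9345, 0.9321, 0.9654, 0.9437, 0.9554, 0.9764],
    "DS4":  [0.9214, 0.9097, 0.9067, 0.9164, 0.9567, 0.9511, 0.9598, 0.9690],
    "DS5":  [0.9214, 0.9481, 0.9875, 0.9467, 0.9892, 0.9564, 0.9578, 0.9743],
    "DS6":  [0.9324, 0.9481, 0.9875, 0.9467, 0.9653, 0.9511, 0.9587, 0.9811],
    "DS7":  [0.9331, 0.9325, 0.9481, 0.9435, 0.9563, 0.9673, 0.9543, 0.9781],
    "DS8":  [0.9331, 0.9221, 0.9481, 0.9387, 0.9612, 0.9671, 0.9409, 0.9812],
    "DS9":  [0.9631, 0.9221, 0.9511, 0.9387, 0.9654, 0.9786, 0.9765, 0.9760],
    "DS10": [0.9318, 0.9543, 0.9345, 0.9245, 0.9534, 0.9501, 0.9534, 0.9756],
    "DS11": [0.9134, 0.9215, 0.9381, 0.9275, 0.9452, 0.9517, 0.9732, 0.9789],
    "DS12": [0.9532, 0.9378, 0.9785, 0.9634, 0.9763, 0.9587, 0.9661, 0.9898],
}

# Best method per dataset, and the datasets on which the proposed model is not the best.
winners = {ds: methods[row.index(max(row))] for ds, row in accuracy.items()}
not_won = {ds: name for ds, name in winners.items() if name != "Proposed model"}
wins = len(accuracy) - len(not_won)
print(wins, not_won)
```

This kind of tally is also the starting point for the rank-based statistical analysis in the next subsection, which ranks the methods on each dataset before averaging.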

Statistical analysis

We use the Friedman test (Zimmerman & Zumbo, 1993; Sakamoto et al., 2015) to conduct a statistical analysis of the experimental results of all methods across all datasets. The Friedman test is a non-parametric test that can be used to analyze whether there are significant differences in performance among multiple methods on multiple datasets. The principle is to first obtain the average rank of each method over all datasets, and then test whether these average ranks are the same. If they are the same, all methods are considered to have the same performance; otherwise, there are significant differences in performance among the methods. If there are significant differences, we further use a Holm post-hoc hypothesis test to determine which methods differ significantly from our proposed algorithm. From Fig. 5, we see that the proposed model (2V-SVM-SH) achieves the best ranking. The p-value reported in Fig. 5 for the Friedman test indicates that there are significant differences among the models. From Table 6, it can be seen that all hypotheses are rejected except for the proposed model vs AMVMED and the proposed model vs SVM-2K. These results indicate that the proposed model performs significantly better than KNN-A, KNN-B, SVM-B, SVM-A and MV-L2-SVM. Although the hypotheses for the proposed model vs AMVMED and the proposed model vs SVM-2K are not rejected, the relatively low p-values, in particular 0.0336 against AMVMED, suggest that the proposed model remains competitive.

Figure 5: Friedman rankings of all models.

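Table 6: Holm test results with α = 0.05.

| i | Algorithm | z = (R0 − Ri)/SE | p | Holm = α/i | Hypothesis |
| --- | --- | --- | --- | --- | --- |
| 7 | KNN-A | 5.583333 | 0 | 0.007143 | Rejected |
| 6 | KNN-B | 5.25 | 0 | 0.008333 | Rejected |
| 5 | SVM-B | 4.166667 | 0.000031 | 0.01 | Rejected |
| 4 | SVM-A | 3.666667 | 0.000246 | 0.0125 | Rejected |
| 3 | MV-L2-SVM | 2.5 | 0.012419 | 0.016667 | Rejected |
| 2 | AMVMED | 2.125 | 0.033587 | 0.025 | Not rejected |
| 1 | SVM-2K | 1.375 | 0.169131 | 0.05 | Not rejected |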
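The decisions in Table 6 can be reproduced from the z statistics alone. The sketch below assumes that the p-values are two-sided normal tail probabilities and that the standard error is the usual $\sqrt{k(k+1)/(6N)}$ for $k = 8$ methods (seven baselines plus the proposed model) on $N = 12$ datasets; both are common conventions for Friedman post-hoc tests, but the article does not spell them out.

```python
import math

ALPHA = 0.05
K_METHODS, N_DATASETS = 8, 12

# Standard error of a difference of average Friedman ranks (assumed convention).
SE = math.sqrt(K_METHODS * (K_METHODS + 1) / (6.0 * N_DATASETS))

# z statistics from Table 6 (proposed model versus each baseline).
z_scores = {
    "KNN-A": 5.583333, "KNN-B": 5.25, "SVM-B": 4.166667, "SVM-A": 3.666667,
    "MV-L2-SVM": 2.5, "AMVMED": 2.125, "SVM-2K": 1.375,
}

def two_sided_p(z):
    """Two-sided p-value of a standard normal statistic."""
    return math.erfc(abs(z) / math.sqrt(2.0))

def holm(z_scores, alpha):
    """Holm step-down: smallest p against alpha/m, next against alpha/(m-1), and so on."""
    ordered = sorted(z_scores, key=lambda name: two_sided_p(z_scores[name]))
    decisions, still_rejecting = {}, True
    for rank, name in enumerate(ordered):
        threshold = alpha / (len(ordered) - rank)
        p = two_sided_p(z_scores[name])
        # Once one hypothesis is retained, all remaining ones are retained as well.
        still_rejecting = still_rejecting and p < threshold
        decisions[name] = (p, threshold, still_rejecting)
    return decisions

decisions = holm(z_scores, ALPHA)
for name, (p, threshold, rejected) in decisions.items():
    print(f"{name:10s} p={p:.6f} threshold={threshold:.6f} rejected={rejected}")
```

Under these assumptions the standard error equals one, so each z value in Table 6 is simply the difference between two average ranks.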

Conclusions

In this study, a multi-view support vector machine based on a shared hidden space is constructed using kernel density estimation. The method is designed to address the decrease in recognition performance caused by differences in sample characteristics between views in multi-view learning. By incorporating SVM into the shared hidden space, training reduces to a classical QP problem that can be solved effectively with existing methods. Experimental results on EEG-based epilepsy diagnosis demonstrate that the proposed method extracts complementary information between views better than the compared methods.

In practical applications, annotating training samples is often a time-consuming task. Therefore, in subsequent research, we intend to extend the multi-view algorithm proposed in this article to transfer learning scenarios, aiming to reduce the reliance on labeled samples.

Supplemental Information

EEG datasets of the five groups.
