- Research
- Open access
Deep learning-based classification of dementia using image representation of subcortical signals
- Shivani Ranjan1 na1,
- Ayush Tripathi1 na1,
- Harshal Shende1,
- Robin Badal3,
- Amit Kumar2,
- Pramod Yadav3,
- Deepak Joshi2 &
- …
- Lalan Kumar1,4
BMC Medical Informatics and Decision Making, volume 25, Article number: 113 (2025)
Abstract
Background
Dementia is a neurological syndrome marked by cognitive decline. Alzheimer’s disease (AD) and frontotemporal dementia (FTD) are two common forms of dementia, each with distinct progression patterns. Early and accurate diagnosis of AD and FTD is crucial for effective medical care, as both conditions present similar early symptoms. EEG, a non-invasive tool for recording brain activity, has shown potential in distinguishing AD from FTD and mild cognitive impairment (MCI).
Methods
This study aims to develop a deep learning-based classification system for dementia by analyzing EEG-derived scout time-series signals from deep brain regions, specifically the hippocampus, amygdala, and thalamus. The scout time series are extracted via the standardized low-resolution brain electromagnetic tomography (sLORETA) technique. Each time series is converted to an image representation using the continuous wavelet transform (CWT) and fed as input to deep learning models. Two high-density EEG datasets are utilized to validate the efficacy of the proposed method: the online BrainLat dataset (128 channels, comprising 16 AD, 13 FTD, and 19 healthy control (HC) subjects) and the in-house IITD-AIIA dataset (64 channels, comprising 10 AD, 9 MCI, and 8 HC subjects). Different classification strategies and classifier combinations have been utilized for the accurate mapping of classes in both datasets.
Results
The best results were achieved using the product of probabilities from the classifiers for the left and right subcortical regions in conjunction with the DenseNet model architecture. This combination yields accuracies of 94.17\(\%\) and 77.72\(\%\) on the BrainLat and IITD-AIIA datasets, respectively.
Conclusions
The results highlight that the image representation-based deep learning approach has the potential to differentiate various stages of dementia. It paves the way for more accurate and early diagnosis, which is crucial for the effective treatment and management of debilitating conditions.
Introduction
Background & related work
Dementia represents a neurological syndrome impairing cognitive functioning, behaviour, and daily activities [1]. It leads to nerve cell degeneration and disrupted brain communication [2]. The number of people with dementia is expected to double worldwide by 2050 [3], with Alzheimer’s disease (AD) being the most prevalent form, significantly contributing to this increase. Despite progress in diagnosing and managing AD, no definitive cure for AD exists. Thus, early or timely detection is a global research priority [4].
Mild cognitive impairment (MCI) is an intermediate stage between healthy ageing and dementia, with a 3–15\(\%\) annual conversion rate to AD compared to 1–2\(\%\) in the general population [5]. Frontotemporal dementia (FTD), the second most common form, is characterized by changes in language, behaviour, executive function, and motor symptoms [6]. AD and FTD present similar early symptoms, often leading to misdiagnosis and complicating treatment due to their distinct progression patterns and causes [7].
Diagnostic methods face challenges due to a lack of optimal behavioural tests and the high cost of cerebrospinal fluid (CSF) and blood marker tests [8]. Screening tools such as the Clinical Dementia Rating (CDR) [9], Mini-Mental State Exam (MMSE) [10], Montreal Cognitive Assessment (MoCA) [11], and Addenbrooke’s Cognitive Examination III (ACE-III) [12] are useful but have limitations. These limitations include time-consuming administration, reliance on subjective judgments, influence by education level and premorbid intelligence, and less sensitivity at early stages [12]. There is a growing focus on identifying non-invasive brain markers to detect disease pathology before behavioural symptoms appear [13].
Mainstream early diagnosis relies on pathological biomarkers like \(\beta\)-amyloid and tau Positron Emission Tomography (PET) neuroimaging [14]. AD stages are primarily associated with \(\beta\)-amyloid plaques and tau tangles [15], while FTD involves tau or TDP-43 protein abnormalities [16]. Imaging methods like Computed Tomography (CT), PET [14], and functional Magnetic Resonance Imaging (fMRI) have been used in the literature, with fMRI showing higher sensitivity in some cases [17]. Machine learning and MRI-based differentiation [18,19,20,21] offer high accuracy in distinguishing these conditions [22]. However, the practical utility of these neuroimaging methods is restricted by high infrastructure costs, lower patient tolerance, and limited suitability for brain-computer interface applications.
Electroencephalogram (EEG) has gained significant attention as a noninvasive tool to analyze brain activity [23,24]. It has proven reliable in distinguishing dementia patients from controls [25,26]. The suitability of EEG for repeated studies and patient monitoring makes it useful in early diagnosis and continuous tracking of AD. EEG detects changes in frequency bands, each corresponding to different functional brain alterations. These include \(\delta\) (0.5–4 Hz) for slow activity, \(\theta\) (4–8 Hz) for sleep-wake transitions, \(\alpha\) (8–12 Hz) for resting states, \(\beta\) (12–30 Hz) for attention, and \(\gamma\) (above 30 Hz) for complex cognitive processes [27,28]. This capability aids in defining the neurophysiological profile of AD stages and differentiating it from FTD [29,30].
However, artifacts from physiological and external sources can obscure or distort crucial frequency bands of EEG signals. This affects neuronal information clarity and integrity. Advancements in signal processing and the use of Machine Learning tools have improved the ability of EEG to differentiate AD from other conditions [26,31]. These improvements enhance classification accuracy and artifact removal. These tools may also aid in automation and the discovery of new neurophysiological markers [32].
Previous research on differentiating AD from FTD [22,29,30,31,33] and AD from MCI [26,27,35,36,37] has primarily utilized EEG features such as sub-band power, Global Field Power (GFP), spectral ratios, and connectivity features, as detailed in Table 1. Despite these insights, diagnosing dementia remains challenging due to the extensive signal analysis required. Effective diagnosis requires a combination of complex features, including time-domain, frequency-domain, and connectivity metrics. As may be noted from Table 1, current studies have primarily targeted sensor information or variations in cortical regions. However, deep brain regions (subcortical regions), especially the hippocampus, are crucial for accurate AD and FTD classification due to their early involvement in disease progression [38,39]. AD often begins with neurodegeneration in subcortical areas like the hippocampus before affecting the cerebral cortex [40]. This early involvement makes deep brain regions essential for early diagnosis and precise differentiation between AD and FTD. Detecting changes in these regions can significantly enhance classification accuracy and provide earlier diagnostic insights [39,41]. Additionally, it has been established that subcortical signals can be detected using surface EEG [42]. Motivated by these findings, the current study focuses on utilizing time-series signals from deep brain regions, specifically the hippocampus, amygdala, and thalamus, for dementia classification.
Objectives and contributions
In this work, an image representation-based framework is presented for the classification [43] of three stages of dementia on two different high-density EEG datasets. In the online dataset, the 3-class classification task involves HC subjects and subjects with FTD and AD. The framework has additionally been validated on an in-house collected EEG dataset comprising subjects with MCI, AD, and HC. The pipeline starts with the extraction of the scout time series corresponding to the left and right regions of the thalamus, hippocampus, and amygdala, using the sLORETA technique. Subsequently, the time series epochs are converted to image representations using the continuous wavelet transform. The multi-resolution CWT-based image representation allows the time-frequency maps of signals corresponding to the three categories to be learned efficiently. In order to learn this mapping, the images corresponding to the left and right regions are fed to standard deep learning model architectures from the computer vision domain such as Xception [44], ResNet [45], InceptionResNet [46], MobileNet [47], NasNetMobile [48], EfficientNet [49], and DenseNet [50]. For classifying the corresponding images, different classification strategies have been adopted. First, the models from the left and right regions are used in isolation for prediction. Subsequently, the sum and product of the posterior probabilities are utilized for the classification task. Additionally, two fusion techniques, namely, early fusion and tensor fusion networks, have also been explored for dementia classification on both datasets.
The remainder of the paper is organized as follows: “Materials and methods” section provides the description of the datasets utilized in this study (“Dataset description” section), the scout time series extraction process (“Scout time series extraction” section), preparation of image data (“Image data preparation” section), and the adopted classification strategy (“Classification strategy” section). Experimental details and results are presented in “Experiments and results” section, and “Conclusions” section concludes the paper.
Block diagram depicting the proposed method. The processed EEG signals are utilized to extract scout time series from the hippocampus, amygdala, and thalamus using sLORETA. The signals are segmented and divided into left and right regions. Subsequently, the CWT-based images are fed to separate classifiers for images corresponding to left and right regions. \(z_L\) and \(z_R\) represent the latent representations of the classifiers, while \(\hat{y}_L\) and \(\hat{y}_R\) denote classifier predictions. The latent embeddings are fused using Early and Tensor Fusion, while the individual classifier outputs are fused using probability sum and product
Materials and methods
In this section, the datasets utilized in the study, EEG preprocessing steps, scout time series extraction, image data preparation and classification strategy are elaborated. A block diagram representing the complete pipeline is presented in Fig. 1.
Dataset description
BrainLat dataset
The first dataset utilized in this study is a subset of the publicly available, preprocessed EEG recordings released by the Latin American Brain Health Institute (BrainLat) [51]. The selected subset comprises five-minute EEG recordings from the Latin American population. More specifically, resting-state, eyes-closed recordings from 48 subjects (AD = 16; FTD = 13; HC = 19) were used for the experiments. The recordings were obtained using a 128-channel Biosemi Active II system with pin-type active, sintered Ag-AgCl electrodes referenced to contralateral linked mastoids. External electrodes were also placed periocularly to capture blinks and eye movements. Analog filters with a passband of 0.03–100 Hz were used to reduce noise. The EEG was monitored online to detect drowsiness, muscle activity, and sweat artefacts.
The recorded data was processed offline using EEGLAB [52]. The first processing step was average referencing of the EEG data. Subsequently, a bandpass filter was applied between 0.5 and 40 Hz using a zero-phase shift Butterworth filter of order 8. The data was then downsampled from 2048 to 512 Hz. Independent Component Analysis (ICA) was employed to detect artefacts induced by blinking and eye movements. The components identified as artefacts were then removed to obtain clean EEG data. Malfunctioning channels were identified using a semiautomatic detection method and replaced using weighted spherical interpolation. Finally, the processed EEG signals were stored for subsequent scout time series extraction.
IITD-AIIA dataset
The second dataset utilized in this study consists of in-house collected resting-state, eyes-closed EEG data recorded from 27 (AD = 10; MCI = 9; HC = 8) right-handed participants aged 60–80 years. The data collection protocol was approved by the Institute Ethics Committee, All India Institute of Ayurveda, New Delhi. EEG data was recorded using a 64-channel Ag/AgCl active electrode EEG setup (actiCHamp, Brain Products GmbH, Germany) with Fz as the reference electrode. The signals were recorded at a sampling rate of 1000 Hz, and the 10:10 EEG electrode placement system was adopted. A conductive EEG gel was applied under each electrode to maintain resistance below 10 k\(\Omega\), ensuring a high signal-to-noise ratio. No internal filters were used during the recording. The assignment of participants to the AD, MCI, and HC groups was based on the criteria of the MMSE [10] and MoCA [11] screening tools. Only participants whose category was consistent across both scales (MMSE: AD < 18, MCI 18–25, HC > 25; MoCA: AD < 21, MCI 21–26, HC > 26) were included in the analysis. HC group participants reported no history of neurological or psychiatric disorders. All participants provided informed consent prior to the study.
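The inclusion rule based on the two screening scales can be illustrated with a minimal sketch. The thresholds are the ones quoted above; the function names are hypothetical and this is not the authors' screening code.

```python
def mmse_category(score):
    """Map an MMSE score to a group (AD < 18, MCI 18-25, HC > 25)."""
    if score < 18:
        return "AD"
    if score <= 25:
        return "MCI"
    return "HC"


def moca_category(score):
    """Map a MoCA score to a group (AD < 21, MCI 21-26, HC > 26)."""
    if score < 21:
        return "AD"
    if score <= 26:
        return "MCI"
    return "HC"


def consistent_group(mmse, moca):
    """Return the group label if both scales agree, otherwise None (excluded)."""
    group_mmse = mmse_category(mmse)
    group_moca = moca_category(moca)
    return group_mmse if group_mmse == group_moca else None


# A participant with MMSE 24 and MoCA 28 falls into MCI on one scale and HC on
# the other, so this rule would exclude them.
print(consistent_group(24, 28))
```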
Preprocessing and analysis of EEG data were conducted using EEGLAB [52] and MATLAB 2022b. Five-minute segments of continuous EEG data were bandpass filtered between 0.5 Hz and 40 Hz using a fourth-order Butterworth filter to remove irrelevant noise and enhance the signal of interest. ICA was implemented within EEGLAB to visually identify and remove components associated with blinks, eye movements, and muscle artefacts. The preprocessed data were re-referenced to an average reference. Finally, the EEG data for each subject were downsampled to 512 Hz for further scout time series extraction. For participants whose EEG recordings were shorter than five minutes, the maximum available duration was utilized during preprocessing.
Grand Average EEG Source Localization plots (front view) for AD, FTD, and HC cases from the BrainLat dataset at timestamps 70s, 70.5s, and 71s. These plots are generated using the Brainstorm Toolbox. The activation maps were set to 20\(\%\) amplitude, with the amplitude threshold parameter set to “Maximum: Global” for each case
Scout time series extraction
EEG source localization aims to identify the primary brain current sources that generate the measured scalp potentials [53]. This process involves solving both the forward and inverse problems. The disparity between the number of EEG channels (128 or 64) and the number of current dipoles to be estimated (approximately 30,020) renders the source localization problem severely underdetermined. Nevertheless, in literature, it has been reported that source localization can be reasonably accurate with 64/128 channels [54].
Forward problem
The forward problem defines the relationship between cortical currents and scalp potentials through a lead field matrix [53]. This matrix models the propagation of currents through head tissues using Neumann and Dirichlet boundary conditions [55]. This relationship can be mathematically expressed as:
\[V = A\tilde{S} + Z\]
where \(V\) represents the scalp potentials, \(A\) is the lead field matrix, \(\tilde{S}\) denotes the cortical source currents, and \(Z\) is the sensor noise matrix.
The head model was computed using the Brainstorm toolbox, which involved a mixed model of cortical and deep structures [56,57]. This model included 30,020 vertices, combining 15,002 from the default cortex and 15,018 from the aseg atlas. The ICBM152 MRI template and the aseg atlas were used to compute the head model for both cortical and subcortical structures, employing OpenMEEG with default conductivity parameters. To focus on specific regions, an aseg subatlas was created, including the hippocampus (surface scout), thalamus, and amygdala (volume scouts).
Inverse problem
The inverse problem estimates the source currents \(\tilde{S}\) using the lead field matrix \(A\). The standardized low-resolution brain electromagnetic tomography (sLORETA) method was employed for this purpose. sLORETA assumes spatial smoothness and coherence among adjacent brain regions [58].
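The source currents are estimated by solving the following optimization problem:
\[\min _{\tilde{S}} \; \Vert V - A\tilde{S}\Vert ^2 + \alpha \Vert \tilde{S}\Vert ^2\]
The solution is given by:
\[\tilde{S} = A^{T} H \left( H A A^{T} H + \alpha H\right) ^{+} V = A_{sLORETA} V\]
where \(\alpha \ge 0\) is the regularization parameter, \(H\) is the average reference operator, \((\cdot )^{+}\) denotes the Moore-Penrose pseudoinverse, and \(A_{sLORETA}\) is the inverse kernel relating the recorded scalp potentials \(V\) to the cortical and subcortical source current estimate \(\tilde{S}\). A sample plot of the grand average brain activation for AD, FTD, and HC cases from the BrainLat dataset is depicted in Fig. 2.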
The brain was parcellated into left and right regions for the hippocampus, amygdala, and thalamus using the created aseg subatlas. The constrained current signals were then computed for these regions. For each of the six regions (hippocampus, amygdala, and thalamus, bilaterally) in AD, FTD, and HC cases from the BrainLat dataset and AD, MCI and HC cases from the IITD-AIIA dataset, the source current signals belonging to a particular region are averaged to obtain a 6-dimensional time series matrix denoted by \(\hat{S}\). This averaged time series matrix was subsequently used for image data preparation. The EEG source localization plots for the BrainLat dataset, which depict the activation differences in the hippocampus, thalamus, and amygdala for AD, FTD, and HC cases, are illustrated in Fig. 2.
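The region-wise averaging step can be sketched in plain Python as follows. This is a minimal illustration, assuming the vertex-level source currents and a per-vertex region label are already available; the function name `scout_time_series`, the region labels, and the toy values are all hypothetical.

```python
REGIONS = [
    "hippocampus_L", "amygdala_L", "thalamus_L",
    "hippocampus_R", "amygdala_R", "thalamus_R",
]


def scout_time_series(source_currents, region_of_vertex, regions=REGIONS):
    """Average vertex-level currents within each region.

    source_currents:  one time series (list of floats) per vertex
    region_of_vertex: one region label per vertex
    Returns a T x len(regions) matrix (list of rows, one row per time sample).
    """
    n_samples = len(source_currents[0])
    sums = {region: [0.0] * n_samples for region in regions}
    counts = {region: 0 for region in regions}
    for series, label in zip(source_currents, region_of_vertex):
        if label not in sums:
            continue  # vertex outside the regions of interest
        counts[label] += 1
        for t, value in enumerate(series):
            sums[label][t] += value
    empty = [region for region in regions if counts[region] == 0]
    if empty:
        raise ValueError(f"no vertices found for regions: {empty}")
    return [[sums[r][t] / counts[r] for r in regions] for t in range(n_samples)]


# Toy example: two vertices in the left hippocampus, one in every other region.
currents = [[1.0, 2.0, 3.0], [3.0, 4.0, 5.0]] + [[0.5, 0.5, 0.5]] * 5
labels = ["hippocampus_L", "hippocampus_L"] + REGIONS[1:]
s_hat = scout_time_series(currents, labels)
```

Each row of `s_hat` holds the six regional averages at one time sample, matching the 6-dimensional layout described above.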
Image data preparation
Signals corresponding to the left and right regions of the thalamus, hippocampus and amygdala were extracted using the aforementioned scout time series extraction procedure. The signals are divided into 0.25-second epochs, each containing 128 samples at the 512 Hz sampling rate. Each epoch therefore forms a \(128 \times 6\) dimensional matrix \(\hat{S}\), where 6 is the number of signals (corresponding to the left and right thalamus, hippocampus and amygdala). Each individual time series is then converted into a corresponding image representation using the Continuous Wavelet Transform (CWT). The underlying principle behind CWT is to provide a multi-resolution representation of the time series by varying the translation and scale parameters of a mother wavelet [59]. The basis functions for CWT are obtained by scaling and shifting the mother wavelet and can be mathematically expressed as:
\[\psi _{\tau ,\sigma }(t) = \frac{1}{\sqrt{\sigma }}\,\psi \left( \frac{t-\tau }{\sigma }\right)\]
Here, \(\psi (t)\) is the mother wavelet, the translation is governed by the parameter \(\tau\), which shifts the mother wavelet in time, and \(\sigma\) is the scale factor. Normalization by \(\frac{1}{\sqrt{\sigma }}\) ensures that the basis function always has unit energy. Once the basis function is defined, the CWT is computed as the inner product of the signal with the basis function at different translation and scale values. For a signal \(s(t)\), this is represented mathematically as:
\[W_s(\tau ,\sigma ) = \int _{-\infty }^{\infty } s(t)\,\psi ^{*}_{\tau ,\sigma }(t)\,dt\]
where \(^{*}\) denotes complex conjugation.
Finally, wavelet coefficients are obtained by taking all shifts and scales of the Morlet mother wavelet. The aforementioned procedure results in a \(128 \times 128 \times 6\) dimensional representation \(\mathcal {I}\) of the input data \(X\). This is subdivided into two parts, \(\mathcal {I}_L\) and \(\mathcal {I}_R\), corresponding to the left and right regions, respectively, which are subsequently stored. It is to be noted that \(\mathcal {I}_L\) and \(\mathcal {I}_R\) each have a dimension of \(128 \times 128 \times 3\).
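The conversion of a single epoch into a time-frequency image can be sketched as a direct discretization of the CWT inner product with a complex Morlet wavelet. This is an illustrative sketch in plain Python, not the authors' implementation: the centre frequency `omega0 = 6`, the choice of analysis frequencies, and the function names are assumptions, since the text does not state the scale grid that was used.

```python
import cmath
import math


def morlet(u, omega0=6.0):
    """Complex Morlet mother wavelet evaluated at dimensionless time u."""
    return math.pi ** -0.25 * cmath.exp(1j * omega0 * u) * math.exp(-0.5 * u * u)


def cwt_scalogram(signal, fs, freqs, omega0=6.0):
    """Return |CWT| as a len(freqs) x len(signal) matrix (list of rows).

    Each analysis frequency f is mapped to a scale sigma = omega0 / (2*pi*f),
    and the integral over t is approximated by a sum with step dt = 1 / fs.
    """
    dt = 1.0 / fs
    n = len(signal)
    image = []
    for f in freqs:
        sigma = omega0 / (2.0 * math.pi * f)   # scale in seconds
        norm = dt / math.sqrt(sigma)           # 1/sqrt(sigma) times the step dt
        row = []
        for k in range(n):                     # translation tau = k * dt
            acc = 0j
            for m in range(n):                 # sample time t = m * dt
                u = (m - k) * dt / sigma
                acc += signal[m] * morlet(u, omega0).conjugate()
            row.append(abs(acc * norm))
        image.append(row)
    return image


# Toy epoch: 128 samples (0.25 s at 512 Hz) of a 16 Hz cosine.
fs = 512
epoch = [math.cos(2.0 * math.pi * 16.0 * k / fs) for k in range(128)]
image = cwt_scalogram(epoch, fs, freqs=[4.0, 8.0, 16.0, 32.0])
```

Using 128 analysis frequencies instead of four would give a \(128 \times 128\) map per signal; stacking the three left-region maps and the three right-region maps would then produce arrays with the shapes of \(\mathcal {I}_L\) and \(\mathcal {I}_R\).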
Classification strategy
In order to classify the images into different classes of neurodegenerative disorders (AD, FTD, HC in the BrainLat dataset and AD, MCI, HC in the IITD-AIIA dataset), several different approaches are adopted. In the first approach, two different classifier networks, \(C_L(.)\) and \(C_R(.)\), are individually trained to map \(\mathcal {I}_L\) and \(\mathcal {I}_R\) to a 3-dimensional vector representing the probability distributions \(P(\hat{y}_L=c_i|\mathcal {I_L};\Theta _L)\) and \(P(\hat{y}_R=c_i|\mathcal {I_R};\Theta _R)\). Here, \(\hat{y}_L\) and \(\hat{y}_R\) are the predicted classes from the left and right classifiers, and \(c_i \in \{AD, FTD/MCI, HC\}\) denotes a class. \(\Theta _L\) and \(\Theta _R\) are the sets of parameters for the classifiers corresponding to the left and right sets of images. Furthermore, a combination of posterior probabilities obtained from the individual classifiers is also utilized for the classification task. The sum and product of class probabilities from the individual classifiers are computed, and the input \(X\) is assigned to class \(\hat{y}_{sum}\) or \(\hat{y}_{mul}\) based on the maximum of the computed scores. Mathematically, the scores are computed as \(P(\hat{y}_{sum}=c_i|\mathcal {I}_L,\mathcal {I}_R) = P(\hat{y}_L=c_i|\mathcal {I}_L;\Theta _L) + P(\hat{y}_R=c_i|\mathcal {I}_R;\Theta _R)\) and \(P(\hat{y}_{mul}=c_i|\mathcal {I}_L,\mathcal {I}_R) = P(\hat{y}_L=c_i|\mathcal {I}_L;\Theta _L) * P(\hat{y}_R=c_i|\mathcal {I}_R;\Theta _R)\) for the sum and product cases, respectively.
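The sum and product rules can be illustrated with a few lines of Python. This is a minimal sketch with a hypothetical function name and made-up probability vectors; it only shows how the two rules can disagree when one classifier assigns near-zero probability to a class.

```python
def fuse_posteriors(p_left, p_right):
    """Return (y_sum, y_mul): the class indices chosen by the sum and product rules."""
    sums = [left + right for left, right in zip(p_left, p_right)]
    products = [left * right for left, right in zip(p_left, p_right)]
    y_sum = max(range(len(sums)), key=sums.__getitem__)
    y_mul = max(range(len(products)), key=products.__getitem__)
    return y_sum, y_mul


# Illustrative posteriors over three classes (index 0, 1, 2).
p_left = [0.90, 0.10, 0.00]
p_right = [0.05, 0.50, 0.45]
y_sum, y_mul = fuse_posteriors(p_left, p_right)
print(y_sum, y_mul)  # prints "0 1"
```

Here the sum rule follows the confident left classifier, while the product rule penalizes class 0 because the right classifier considers it unlikely.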
Depiction of the Early Fusion and Tensor Fusion Network approaches. \(z_L\) and \(z_R\) represent latent embeddings from the left and right classifiers, respectively
Additionally, two other approaches that use the latent representations of the two classifiers are utilized for the classification task. The latent outputs of the classifiers are fused using two different strategies, Early Fusion and Tensor Fusion Network [60], to obtain predictions denoted by \(\hat{y}_{ef}\) and \(\hat{y}_{tfn}\), respectively. The two approaches are depicted in Fig. 3. It is to be noted that both early fusion and tensor fusion networks are trained in an end-to-end manner. Several different standard architectures and their variants were utilized for the classifier block:
Xception [44]: Utilizes depthwise separable convolutions, which improve computational efficiency without sacrificing accuracy.
ResNet [45]: Introduces residual connections to combat the vanishing gradient problem, enabling deeper networks to be trained effectively.
InceptionResNet [46]: Combines the strength of Inception modules for multi-scale feature extraction with residual connections.
MobileNet [47]: Focuses on lightweight architecture through depthwise separable convolutions, suitable for resource-constrained environments.
NASNetMobile [48]: An architecture discovered through Neural Architecture Search, designed for efficient mobile deployment.
EfficientNet [49]: Scales network depth, width, and resolution systematically for optimal performance.
DenseNet [50]: Employs dense connectivity, where each layer within a dense block receives the feature maps of all preceding layers, promoting feature reuse and improving gradient flow.
For all the classifiers, weights are initialized from models pre-trained on the ImageNet task. Subsequently, the Adam optimizer is used to minimize the cross-entropy loss to learn the final model parameters.
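The two latent-fusion strategies mentioned earlier in this section can be sketched on plain vectors. This sketch assumes that early fusion means concatenating \(z_L\) and \(z_R\), and that tensor fusion follows the formulation of the Tensor Fusion Network [60], i.e. the outer product of the two embeddings after appending a constant 1 to each. The function names are hypothetical, and the trainable layers that would follow the fused vector are omitted.

```python
def early_fusion(z_left, z_right):
    """Concatenate the two latent embeddings into one feature vector."""
    return list(z_left) + list(z_right)


def tensor_fusion(z_left, z_right):
    """Flattened outer product of [z_left; 1] and [z_right; 1].

    The appended 1 keeps the unimodal terms (z_left and z_right themselves)
    alongside all pairwise products.
    """
    left = list(z_left) + [1.0]
    right = list(z_right) + [1.0]
    return [a * b for a in left for b in right]


z_l = [1.0, 2.0]
z_r = [3.0, 4.0, 5.0]
fused_early = early_fusion(z_l, z_r)    # length 2 + 3 = 5
fused_tensor = tensor_fusion(z_l, z_r)  # length (2 + 1) * (3 + 1) = 12
```

The fused dimension grows additively for early fusion and multiplicatively for tensor fusion, which is the main practical trade-off between the two.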
Experiments and results
Experimental details
As elaborated in “Materials and methods” section, the scout time series corresponding to the left and right thalamus, hippocampus and amygdala are segmented to form epochs of 0.25 seconds. For the BrainLat dataset, this segmentation process yields a total of 15408, 12048, and 21648 epochs corresponding to AD, FTD and HC, respectively. Similarly, for the IITD-AIIA dataset, a total of 11088 AD, 10800 MCI and 9600 HC epochs are obtained. For both datasets, \(80\%\) of the epochs are randomly selected for training the deep learning models, while evaluation is done on the remaining \(20\%\) of the epochs. Since the classes are imbalanced in both datasets, each class is assigned a weight equal to the ratio of the number of samples in the majority class to the number of samples in that class. This leads to class weights of 1.405, 1.797, and 1 for the AD, FTD and HC classes, respectively, in the BrainLat dataset. Class weights of 1, 1.027, and 1.155 are assigned to the AD, MCI and HC samples of the IITD-AIIA dataset. The model parameters are learnt using the Adam optimizer while minimizing the cross-entropy loss. Furthermore, at the end of each training step, the validation accuracy is monitored on a randomly selected \(20\%\) of the samples from the training set. The training process is stopped if the validation accuracy does not improve over 20 consecutive training steps.
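The class weights quoted above follow directly from the epoch counts. The short sketch below (with a hypothetical function name) reproduces them as the ratio of the majority-class count to each class count.

```python
def class_weights(epoch_counts):
    """Weight each class by (majority-class count) / (count of that class)."""
    majority = max(epoch_counts.values())
    return {label: majority / count for label, count in epoch_counts.items()}


brainlat = class_weights({"AD": 15408, "FTD": 12048, "HC": 21648})
iitd_aiia = class_weights({"AD": 11088, "MCI": 10800, "HC": 9600})

for name, weights in (("BrainLat", brainlat), ("IITD-AIIA", iitd_aiia)):
    print(name, {label: round(w, 3) for label, w in weights.items()})
```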
Results and discussion
The performance of the different model architectures and the different approaches utilized for dementia classification is presented in Tables 2 and 3 for the BrainLat and IITD-AIIA data, respectively. The results of the individual classifiers (for left and right regions), fusion using the sum and product of posterior probabilities, and the latent embedding fusion approaches (using early and tensor fusion) are presented. Among the different approaches, using the product of posterior probabilities yields the best classification accuracy for most of the model architectures (12 out of 14 for both datasets). DenseNet201 emerges as the best-performing model architecture, yielding accuracies of \(94.17\%\) and \(77.72\%\) in conjunction with the product of probabilities approach on the two datasets. The superior performance of DenseNet201 can be attributed to several key architectural features. The dense connectivity of DenseNet201, which connects each layer to every other layer in a feed-forward fashion, ensures that features learned in earlier layers are reused throughout the network. This improves gradient flow and mitigates the vanishing gradient problem, enabling the model to learn more diverse and complex features critical for identifying subtle patterns in EEG-derived images. Despite its depth, DenseNet201 is parameter-efficient, with fewer parameters than models like ResNet, reducing the risk of overfitting, particularly with relatively small datasets. Additionally, its ability to learn hierarchical features across scales, from low-level patterns to high-level abstractions, proves essential in distinguishing neurodegenerative conditions by capturing subtle differences in the activation patterns of subcortical regions.
Confusion matrix using a combination of DenseNet201 and \(\hat{y}_{mul}\) for (a) the BrainLat dataset and (b) the IITD-AIIA dataset
Scatter plot depicting clusters corresponding to each of the classes obtained by applying dimensionality reduction using t-SNE on the latent embedding vector for (a) the BrainLat dataset and (b) the IITD-AIIA dataset
(a, c) Multi-class Receiver Operating Characteristic curves and (b, d) multi-class Precision-Recall curves for the combination of DenseNet201 and \(\hat{y}_{\text {mul}}\) on the BrainLat and IITD-AIIA datasets, respectively
It may be observed that the classification accuracy on the IITD-AIIA dataset is lower than on the BrainLat dataset. This may be attributed to two main factors. First, the number of samples used for training the model is considerably lower in the case of the IITD-AIIA dataset. Learning the complex dynamics from image data requires a large number of samples, so the smaller training set limits the overall efficacy of the model. Second, the subcortical source localization in this dataset is based on a lower number of EEG sensors (64 sensors). A lower number of EEG sensors leads to less accurate localization [54] of the subcortical sources, and hence the resulting image representations lead to a comparatively lower accuracy score. Nevertheless, an accuracy of \(77.72\%\) can be considered reasonable performance for a three-class task.
The classification accuracies achieved using the IITD-AIIA and BrainLat datasets emphasize the critical role of subcortical regions in the progression and classification of neurodegenerative disorders. Subcortical structures such as the hippocampus, amygdala, and thalamus play a pivotal role in maintaining cognitive functions, including memory, executive functioning, attention, and emotional regulation. Neurodegenerative conditions like AD [19], FTD [21], and MCI [20] are associated with neuronal loss and structural degeneration in these regions, leading to disrupted cognitive functions [1] and altered activation patterns. The pronounced sensitivity of the model to subcortical dynamics, particularly in the right hemisphere, aligns with prior findings of lateralized degeneration. This may be attributed to the fact that the right hippocampus often exhibits greater atrophy in FTD than the left hippocampus (e.g., 21% vs. 16% tissue loss) [38,41]. This asymmetry in degeneration disrupts neural networks and produces distinct patterns of activation that are effectively leveraged by the model for classification. These findings are consistent with existing literature, which highlights the vulnerability of the right hemisphere in FTD and AD [19,20]. The ability of DenseNet201 to leverage subcortical dynamics, combined with its hierarchical feature-learning capabilities, contributes to the accurate classification of neurodegenerative disorders. This integration underscores the significance of subcortical regions and the importance of advanced architectures in dementia classification.
In Fig. 4, the confusion matrices for the three-class classification tasks on both datasets are presented for the best-performing model. It may be noted that for the BrainLat dataset, the model is particularly adept at recognizing the healthy cases among the three classes. The majority of confusion in model predictions arises between the FTD and AD classes. For the IITD-AIIA dataset, the classification accuracies of the individual classes are similar to each other. In order to better understand the classification behaviour, a scatter plot obtained by applying t-SNE [61] for dimensionality reduction on the latent embedding vectors is presented in Fig. 5 for both datasets. From the scatter plot, it may be observed that for the BrainLat dataset, the clusters corresponding to AD and FTD (depicted by A and F, respectively) overlap significantly. In contrast, the cluster for the healthy cases (depicted by H) is well separated from the other two classes. For the IITD-AIIA dataset, there is a significant overlap between all three clusters (AD, MCI and HC). Therefore, the misclassification trend observed in the confusion matrix is supported by the clusters corresponding to the three classes as presented in Fig. 5. In Fig. 6, the Receiver Operating Characteristic (ROC) and Precision-Recall curves are depicted for the BrainLat and IITD-AIIA datasets. The average curves, along with class-wise curves, are depicted in the figures. Average area under the ROC curve values of 0.99 and 0.92 are obtained for the BrainLat and IITD-AIIA datasets, respectively. Additionally, the average precision values for the two datasets are 0.99 and 0.86. The observations from the curves complement the confusion matrices and the conclusions drawn from the scatter plots.
To evaluate the effectiveness of the continuous wavelet transform (CWT) compared to the short-time Fourier transform (STFT), we performed a classification analysis using the DenseNet201 architecture. For the product of probabilities fusion approach, CWT achieved an accuracy of 94.17%, compared with 91.71% for STFT, an improvement of 2.46 percentage points. This may be attributed to the dynamic resolution of CWT, which enables it to provide a more comprehensive analysis of transient and localized spectral variations [59], which are critical for distinguishing neurodegenerative conditions. Unlike STFT, which relies on a fixed time-frequency resolution determined by the window size, CWT adapts its resolution dynamically [62]. This adaptability allows CWT to effectively represent high-frequency, short-duration components and low-frequency, long-duration components within the same framework. These capabilities make CWT particularly suitable for capturing the nuanced and complex patterns in subcortical EEG data, where transient dynamics are pivotal in classification tasks.
Conclusions
In this work, a dementia classification framework using time-series signals from deep brain regions, specifically the hippocampus, amygdala, and thalamus, is presented. EEG source localization using sLORETA was used to extract the average scout time series, which were then transformed into image representations using CWT. The images were fed to standard image-domain model architectures to learn the complex attributes present in the data for reliable dementia classification. The efficacy of the proposed framework was validated on two high-density EEG datasets: the online BrainLat dataset, which includes AD, FTD, and HC subjects, and the in-house IITD-AIIA dataset, which comprises MCI, AD, and HC subjects. Various deep learning models, including Xception, ResNet, InceptionResNet, MobileNet, NasNetMobile, EfficientNet, and DenseNet, were used to classify the images into one of three categories for each dataset. Different classification strategies, including isolated predictions from the left and right brain regions, sum and product of posterior probabilities, early fusion, and tensor fusion networks, were explored to optimize classification performance. The best performance was observed with the combination of DenseNet201 and the product of posterior probabilities, which achieves a classification accuracy of 94.17% on the BrainLat dataset and 77.72% on the IITD-AIIA dataset, highlighting the value of deep brain regions for differentiating between AD, FTD, and MCI. By focusing on subcortical regions, which are among the earliest sites of neurodegeneration, our approach can support early diagnosis, which is critical for timely treatment and management.
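The sum and product rules for combining left- and right-region posterior probabilities can be sketched as follows. This is a minimal illustration assuming each classifier outputs a softmax vector over the three classes; the function name `fuse_posteriors` and the probability values are hypothetical, and the study's exact implementation is not reproduced here.

```python
import numpy as np


def fuse_posteriors(p_left, p_right, rule="product"):
    """Late fusion of two classifiers' class posteriors.

    p_left, p_right: arrays of shape (n_samples, n_classes).
    Returns the renormalised fused posteriors and the predicted class index.
    """
    p_left = np.asarray(p_left, dtype=float)
    p_right = np.asarray(p_right, dtype=float)
    if rule == "product":
        fused = p_left * p_right   # a class needs support from both sides
    elif rule == "sum":
        fused = p_left + p_right   # one confident side can dominate
    else:
        raise ValueError("rule must be 'product' or 'sum'")
    fused = fused / fused.sum(axis=-1, keepdims=True)
    return fused, fused.argmax(axis=-1)


# Toy posteriors for one sample (hypothetical values).
left = np.array([[0.80, 0.20, 0.00]])
right = np.array([[0.05, 0.45, 0.50]])

_, pred_product = fuse_posteriors(left, right, rule="product")
_, pred_sum = fuse_posteriors(left, right, rule="sum")
print("product rule:", pred_product, "sum rule:", pred_sum)
```

In this toy case the two rules disagree: the product rule suppresses the first class because the right-region classifier gives it almost no support, whereas the sum rule lets the confident left-region output decide.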
This study also establishes a promising baseline for future research in dementia classification using image representations of subcortical EEG signals. However, the method is limited by the sample size of the datasets and the reliance on high-density EEG recordings, which may not be readily available in all clinical settings. Moreover, the datasets used are geographically limited, which could affect the generalizability of the results to other populations. Future work will focus on increasing the sample size, exploring more sophisticated feature extraction methods, and employing advanced deep learning architectures to enhance system performance further. We believe this methodology paves the way for more accurate and early diagnosis of neurodegenerative conditions, but additional validation across diverse populations is necessary to confirm its broader applicability.
Data availability
The datasets utilized or analyzed in the present study will be accessible upon reasonable request from the corresponding author.
References
Song J, Yang H, Yan H, Lu Q, Guo L, Zheng H, et al. Structural disruption in subjective cognitive decline and mild cognitive impairment. Brain Imaging Behav. 2024;18(6):1536–48.
Maito MA, Santamaría-García H, Moguilner S, Possin KL, Godoy ME, Avila-Funes JA, et al. Classification of Alzheimer’s disease and frontotemporal dementia using routine clinical and cognitive measures across multicentric underrepresented samples: A cross sectional observational study. Lancet Reg Health–Am. 2023;17:100387.
Scheltens P, De Strooper B, Kivipelto M, Holstege H, Chételat G, Teunissen CE, et al. Alzheimer’s disease. Lancet. 2021;397(10284):1577–90.
Shah H, Albanese E, Duggan C, Rudan I, Langa KM, Carrillo MC, et al. Research priorities to reduce the global burden of dementia by 2025. Lancet Neurol. 2016;15(12):1285–94.
Michaud TL, Su D, Siahpush M, Murman DL. The risk of incident mild cognitive impairment and progression to dementia considering mild cognitive impairment subtypes. Dement Geriatr Cogn Disord Extra. 2017;7(1):15–29.
Olney NT, Spina S, Miller BL. Frontotemporal dementia. Neurol Clin. 2017;35(2):339–74.
Musa G, Slachevsky A, Muñoz-Neira C, Méndez-Orellana C, Villagra R, González-Billault C, et al. Alzheimer’s disease or behavioral variant frontotemporal dementia? Review of key points toward an accurate clinical and neuropsychological diagnosis. J Alzheimers Dis. 2020;73(3):833–48.
Olsson B, Lautner R, Andreasson U, Öhrfelt A, Portelius E, Bjerke M, et al. CSF and blood biomarkers for the diagnosis of Alzheimer’s disease: a systematic review and meta-analysis. Lancet Neurol. 2016;15(7):673–84.
Morris JC. The Clinical Dementia Rating (CDR) current version and scoring rules. Neurology. 1993;43(11):2412–2412.
Lacy M, Kaemmerer T, Czipri S. Standardized mini-mental state examination scores and verbal memory performance at a memory center: implications for cognitive screening. Am J Alzheimers Dis Other Dement®. 2015;30(2):145–52.
Freitas S, Simões MR, Alves L, Santana I. Montreal cognitive assessment: validation study for mild cognitive impairment and Alzheimer disease. Alzheimer Dis Assoc Disord. 2013;27(1):37–43.
Bruno D, Schurmann Vignaga S. Addenbrooke’s cognitive examination III in the diagnosis of dementia: a critical review. Neuropsychiatr Dis Treat. 2019;15:441–7.
Vrahatis AG, Skolariki K, Krokidis MG, Lazaros K, Exarchos TP, Vlamos P. Revolutionizing the early detection of Alzheimer’s disease through non-invasive biomarkers: the role of artificial intelligence and deep learning. Sensors. 2023;23(9):4184.
Jack CR Jr, Bennett DA, Blennow K, Carrillo MC, Dunn B, Haeberlein SB, et al. NIA-AA research framework: toward a biological definition of Alzheimer’s disease. Alzheimers Dement. 2018;14(4):535–62.
Ashrafian H, Zadeh EH, Khan RH. Review on Alzheimer’s disease: inhibition of amyloid beta and tau tangle formation. Int J Biol Macromol. 2021;167:382–94.
Goedert M, Ghetti B, Spillantini MG. Frontotemporal dementia: implications for understanding Alzheimer disease. Cold Spring Harb Perspect Med. 2012;2(2):a006254.
Ibrahim B, Suppiah S, Ibrahim N, Mohamad M, Hassan HA, Nasser NS, et al. Diagnostic power of resting-state fMRI for detection of network connectivity in Alzheimer’s disease and mild cognitive impairment: a systematic review. Hum Brain Mapp. 2021;42(9):2941–68.
Li J, Wei Y, Wang C, Hu Q, Liu Y, Xu L. 3-D CNN-based multichannel contrastive learning for Alzheimer’s disease automatic diagnosis. IEEE Trans Instrum Meas. 2022;71:1–11.
Chaddad A, Niazi T. Radiomics analysis of subcortical brain regions related to Alzheimer disease. In: 2018 IEEE Life Sciences Conference (LSC). IEEE; 2018. pp. 203–6.
Kleinerova J, McKenna MC, Finnegan M, Tacheva A, Garcia-Gallardo A, Mohammed R, et al. Clinical, Cortical, Subcortical, and White Matter Features of Right Temporal Variant FTD. Brain Sci. 2024;14(8):806.
Wang J, Liang X, Lu J, Zhang W, Chen Q, Li X, et al. Cortical and subcortical gray matter abnormalities in mild cognitive impairment. Neuroscience. 2024;557:81–8.
Sisodia PS, Ameta GK, Kumar Y, Chaplot N. A review of deep transfer learning approaches for class-wise prediction of Alzheimer’s disease using MRI images. Arch Comput Methods Eng. 2023;30(4):2409–29.
Jain A, Kumar L. Subject-independent trajectory prediction using pre-movement EEG during grasp and lift task. Biomed Signal Process Control. 2023;86:105160.
Saini M, Jain A, Muthukrishnan SP, Bhasin S, Roy S, Kumar L. BiCurNet: Pre-movement EEG based neural decoder for biceps curl trajectory estimation. IEEE Trans Instrum Meas. 2023;71:1–11.
Kongwudhikunakorn S, Kiatthaveephong S, Thanontip K, Leelaarporn P, Piriyajitakonkij M, Charoenpattarawut T, et al. A pilot study on visually stimulated cognitive tasks for EEG-based dementia recognition. IEEE Trans Instrum Meas. 2021;70:1–10.
Kim MJ, Youn YC, Paik J. Deep learning-based EEG analysis to classify normal, mild cognitive impairment, and dementia: Algorithms and dataset. NeuroImage. 2023;272:120054.
Su R, Li X, Li Z, Han Y, Cui W, Xie P, et al. Constructing biomarker for early diagnosis of aMCI based on combination of multiscale fuzzy entropy and functional brain connectivity. Biomed Signal Process Control. 2021;70:103000.
Cammisuli DM, Isella V, Verde F, Silani V, Ticozzi N, Pomati S, et al. Behavioral disorders of spatial cognition in patients with mild cognitive impairment due to Alzheimer’s disease: preliminary findings from the BDSC-MCI project. J Clin Med. 2024;13(4):1178.
Miltiadous A, Tzimourta KD, Giannakeas N, Tsipouras MG, Afrantou T, Ioannidis P, et al. Alzheimer’s disease and frontotemporal dementia: A robust classification method of EEG signals and a comparison of validation methods. Diagnostics. 2021;11(8):1437.
Rostamikia M, Sarbaz Y, Makouei S. EEG-based classification of Alzheimer’s disease and frontotemporal dementia: a comprehensive analysis of discriminative features. Cogn Neurodyn. 2024;18(6):3447–62.
Miltiadous A, Gionanidis E, Tzimourta KD, Giannakeas N, Tzallas AT. DICE-net: a novel convolution-transformer architecture for Alzheimer detection in EEG signals. IEEE Access. 2023;11:71840–58.
Komolovaitė D, Maskeliūnas R, Damaševičius R. Deep convolutional neural network-based visual stimuli classification using electroencephalography signals of healthy and Alzheimer’s disease subjects. Life. 2022;12(3):374.
Nishida K, Yoshimura M, Isotani T, Yoshida T, Kitaura Y, Saito A, et al. Differences in quantitative EEG between frontotemporal dementia and Alzheimer’s disease as revealed by LORETA. Clin Neurophysiol. 2011;122(9):1718–25.
Si Y, He R, Jiang L, Yao D, Zhang H, Xu P, Ma X, Yu L, Li F. Differentiating between Alzheimer’s Disease and Frontotemporal Dementia Based on the Resting-State Multilayer EEG Network. IEEE Trans Neural Syst Rehabil Eng. 2023;31:4521–7.
Babiloni C, Del Percio C, Lizio R, Noce G, Lopez S, Soricelli A, et al. Functional cortical source connectivity of resting state electroencephalographic alpha rhythms shows similar abnormalities in patients with mild cognitive impairment due to Alzheimer’s and Parkinson’s diseases. Clin Neurophysiol. 2018;129(4):766–82.
Farina FR, Emek-Savaş D, Rueda-Delgado L, Boyle R, Kiiski H, Yener G, et al. A comparison of resting state EEG and structural MRI for classifying Alzheimer’s disease and mild cognitive impairment. Neuroimage. 2020;215:116795.
Meghdadi AH, Stevanović Karić M, McConnell M, Rupp G, Richard C, Hamilton J, et al. Resting state EEG biomarkers of cognitive decline associated with Alzheimer’s disease and mild cognitive impairment. PLoS ONE. 2021;16(2):e0244180.
Frisoni G, Laakso M, Beltramello A, Geroldi C, Bianchetti A, Soininen H, et al. Hippocampal and entorhinal cortex atrophy in frontotemporal dementia and Alzheimer’s disease. Neurology. 1999;52(1):91–91.
Shukla A, Tiwari R, Tiwari S. Analyzing subcortical structures in Alzheimer’s disease using ensemble learning. Biomed Signal Process Control. 2024;87:105407.
Smith AD. Imaging the progression of Alzheimer pathology through the brain. Proc Natl Acad Sci. 2002;99(7):4135–7. https://doi.org/10.1073/pnas.082107399.
Quattrini G, Pini L, Boscolo Galazzo I, Jelescu IO, Jovicich J, Manenti R, et al. Microstructural alterations in the locus coeruleus-entorhinal cortex pathway in Alzheimer’s disease and frontotemporal dementia. Alzheimers Dement Diagn Assess Dis Monit. 2024;16(1):e12513.
Seeber M, Cantonas LM, Hoevels M, Sesia T, Visser-Vandewalle V, Michel CM. Subcortical electrophysiological activity is detectable with high-density EEG source imaging. Nat Commun. 2019;10(1):753.
Tripathi A, Mondal AK, Kumar L, Prathosh A. ImAiR: Airwriting recognition framework using image representation of IMU signals. IEEE Sensors Lett. 2022;6(10):1–4.
Chollet F. Xception: Deep learning with depthwise separable convolutions. In: Proceedings of the IEEE conference on computer vision and pattern recognition. 2017. pp. 1251–1258.
He K, Zhang X, Ren S, Sun J. Deep residual learning for image recognition. In: Proceedings of the IEEE conference on computer vision and pattern recognition. 2016. pp. 770–778.
Szegedy C, Ioffe S, Vanhoucke V, Alemi A. Inception-v4, inception-resnet and the impact of residual connections on learning. In: Proceedings of the AAAI conference on artificial intelligence, vol. 31. 2017.
Howard AG, Zhu M, Chen B, Kalenichenko D, Wang W, Weyand T, et al. Mobilenets: Efficient convolutional neural networks for mobile vision applications. 2017. arXiv preprint arXiv:1704.04861.
Zoph B, Vasudevan V, Shlens J, Le QV. Learning transferable architectures for scalable image recognition. In: Proceedings of the IEEE conference on computer vision and pattern recognition. 2018. pp. 8697–8710.
Tan M, Le Q. Efficientnet: Rethinking model scaling for convolutional neural networks. In: International conference on machine learning. PMLR; 2019. pp. 6105–6114.
Huang G, Liu Z, Van Der Maaten L, Weinberger KQ. Densely connected convolutional networks. In: Proceedings of the IEEE conference on computer vision and pattern recognition. 2017. pp. 4700–4708.
Prado P, Medel V, Gonzalez-Gomez R, Sainz-Ballesteros A, Vidal V, Santamaría-García H, et al. The BrainLat project, a multimodal neuroimaging dataset of neurodegeneration from underrepresented backgrounds. Sci Data. 2023;10(1):889.
Delorme A, Makeig S. EEGLAB: an open source toolbox for analysis of single-trial EEG dynamics including independent component analysis. J Neurosci Methods. 2004;134(1):9–21. https://doi.org/10.1016/j.jneumeth.2003.10.009.
Michel CM, Brunet D. EEG source imaging: a practical review of the analysis steps. Front Neurol. 2019;10:325. https://doi.org/10.3389/fneur.2019.00325.
Song J, Davey C, Poulsen C, Luu P, Turovets S, Anderson E, et al. EEG source localization: sensor density and head surface coverage. J Neurosci Methods. 2015;256:9–21.
Hallez H, Vanrumste B, Grech R, Muscat J, De Clercq W, Vergult A, et al. Review on solving the forward problem in EEG source analysis. J Neuroengineering Rehabil. 2007;4(1):1–29.
Attal Y, Bhattacharjee M, Yelnik J, Cottereau B, Lefèvre J, Okada Y, et al. Modelling and detecting deep brain activity with MEG and EEG. Irbm. 2009;30(3):133–8.
Attal Y, Schwartz D. Assessment of subcortical source localization using deep brain activity imaging model with minimum norm operators: a MEG study. PLoS ONE. 2013;8(3):e59856.
Pascual-Marqui RD, Lehmann D, Koenig T, Kochi K, Merlo MC, Hell D, et al. Low resolution brain electromagnetic tomography (LORETA) functional imaging in acute, neuroleptic-naive, first-episode, productive schizophrenia. Psychiatry Res Neuroimaging. 1999;90(3):169–79. https://doi.org/10.1016/s0925-4927(99)00013-x.
Mallat S. A wavelet tour of signal processing. Elsevier; 1999.
Zadeh A, Chen M, Poria S, Cambria E, Morency LP. Tensor fusion network for multimodal sentiment analysis. 2017. arXiv preprint arXiv:1707.07250.
Van der Maaten L, Hinton G. Visualizing data using t-SNE. J Mach Learn Res. 2008;9(11):2579–2605.
Lee T, Shair E, Abdullah A, Rahman K, Nazmi N. Comparison of Short Fast Fourier Transform and Continuous Wavelet Transform in Study of Stride Interval. J Biosensors and Bioelectronics Res. 2024;2(5):1–5. ISSN: 2976-7466.
Acknowledgements
The authors would like to thank Prof. Pradeep Kumar Prajapati, Vice Chancellor of Dr. Sarvepalli Radhakrishnan Rajasthan Ayurved University (DSSRAU), Jodhpur, and Dr. Lokesh Shekhawat, Assistant Professor in the Department of Psychiatry, Atal Bihari Vajpayee Institute of Medical Sciences (ABVIMS) and Dr. Ram Manohar Lohia Hospital, New Delhi, for their expert discussion and guidance during the diagnosis and intervention processes. Lastly, the authors extend special thanks to Dr. G.P. Bhagat, founder of SHEOWS Guru Vishram Vridh Ashram, for his support in facilitating the availability of the patients.
Funding
This work was supported in part by IIT Mandi iHub and HCI Foundation India with project number RP04502G.
Author information
Shivani Ranjan and Ayush Tripathi contributed equally to this work.
Authors and Affiliations
Department of Electrical Engineering, Indian Institute of Technology Delhi, New Delhi, India
Shivani Ranjan, Ayush Tripathi, Harshal Shende & Lalan Kumar
Centre for Biomedical Engineering, Indian Institute of Technology Delhi, New Delhi, India
Amit Kumar & Deepak Joshi
Department of RS and BK, All India Institute of Ayurveda Delhi, New Delhi, India
Robin Badal & Pramod Yadav
Bharti School of Telecommunication and Yardi School of Artificial Intelligence, Indian Institute of Technology Delhi, New Delhi, India
Lalan Kumar
Contributions
Conceptualization by S.R., A.T., P.Y., and L.K.; Methodology, Software, Formal analysis, Writing – original draft by A.T., and S.R.; Data Curation by S.R., and H.S.; Data acquisition by S.R., H.S., R.B., and A.K.; Clinical diagnosis by R.B. and P.Y.; Writing – review & editing and Supervision by L.K., P.K. and D.J. All authors critically reviewed the manuscript.
Corresponding author
Correspondence to Lalan Kumar.
Ethics declarations
Ethics approval and consent to participate
The study was approved by the Institutional Ethics Committee of All India Institute of Ayurveda, New Delhi, India (Ref No. IEC-331/27.06.2023/Rp(E)-12/2023, dated: 17/08/2023). Informed consent was obtained from all participants before the trial commenced. The ethical clearance complies with the Helsinki Declaration.
Consent for publication
Not applicable.
Competing interests
The authors declare no competing interests.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.
About this article
Cite this article
Ranjan, S., Tripathi, A., Shende, H. et al. Deep learning-based classification of dementia using image representation of subcortical signals. BMC Med Inform Decis Mak 25, 113 (2025). https://doi.org/10.1186/s12911-025-02924-w