- Yihang Dong (ORCID: 0009-0001-2786-2183)
- Xuhang Chen
- Yanyan Shen
- Michael Kwok-Po Ng
- Tao Qian
- Shuqiang Wang (ORCID: 0000-0003-1119-320X)
Part of the book series: Communications in Computer and Information Science (CCIS, volume 2183)
Abstract
Emotion recognition based on electroencephalography (EEG) has gained significant attention and diversified development in fields such as neural signal processing and affective computing. However, the unique brain anatomy of individuals leads to non-negligible natural differences in EEG signals across subjects, posing challenges for cross-subject emotion recognition. While recent studies have attempted to address these issues, they still face limitations in practical effectiveness and model framework unity. Current methods often struggle to capture the complex spatial-temporal dynamics of EEG signals and fail to effectively integrate multimodal information, resulting in suboptimal performance and limited generalizability across subjects. To overcome these limitations, we develop a pre-trained-model-based Multimodal Mood Reader for cross-subject emotion recognition that utilizes masked brain signal modeling and an interlinked spatial-temporal attention mechanism. The model learns universal latent representations of EEG signals through pre-training on a large-scale dataset, and employs the interlinked spatial-temporal attention mechanism to process Differential Entropy (DE) features extracted from the EEG data. Subsequently, a multi-level fusion layer is proposed to integrate the discriminative features, maximizing the advantages of features across different dimensions and modalities. Extensive experiments on public datasets demonstrate Mood Reader's superior performance in cross-subject emotion recognition tasks, outperforming state-of-the-art methods. Additionally, the model is dissected from an attention perspective, providing a qualitative analysis of emotion-related brain areas and offering valuable insights for affective research in neural signal processing.
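For readers unfamiliar with the Differential Entropy (DE) features referenced in the abstract, the sketch below illustrates one common way such features are computed from band-filtered EEG under a Gaussian assumption (DE = 0.5 ln(2πeσ²)). It is a minimal illustration, not the authors' implementation: the band definitions, filter order, channel count, and sampling rate are assumptions chosen for demonstration only.

```python
# Minimal sketch of Differential Entropy (DE) feature extraction from EEG.
# Assumptions (not from the paper): 5 standard bands, 4th-order Butterworth
# filters, and a (channels, samples) segment layout.
import numpy as np
from scipy.signal import butter, filtfilt

BANDS = {"delta": (1, 4), "theta": (4, 8), "alpha": (8, 14),
         "beta": (14, 31), "gamma": (31, 50)}  # commonly used EEG bands (Hz)

def de_features(eeg: np.ndarray, fs: float) -> np.ndarray:
    """Compute per-channel DE for each frequency band.

    eeg: (channels, samples) array holding one EEG segment.
    Under a Gaussian assumption, DE = 0.5 * ln(2 * pi * e * variance).
    Returns an array of shape (channels, n_bands).
    """
    feats = []
    for lo, hi in BANDS.values():
        b, a = butter(4, [lo, hi], btype="bandpass", fs=fs)
        filtered = filtfilt(b, a, eeg, axis=-1)       # band-limit the signal
        var = np.var(filtered, axis=-1) + 1e-12       # guard against log(0)
        feats.append(0.5 * np.log(2 * np.pi * np.e * var))
    return np.stack(feats, axis=-1)

# Example: a hypothetical 62-channel, 1-second segment sampled at 200 Hz.
segment = np.random.randn(62, 200)
print(de_features(segment, fs=200).shape)  # -> (62, 5)
```

In the full model, features of this kind would be passed to the spatial-temporal attention and multi-level fusion modules described above; the sketch covers only the feature-extraction step.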
Acknowledgement
This work was supported in part by the National Natural Science Foundation of China under Grant 62172403 and the Distinguished Young Scholars Fund of Guangdong under Grant 2021B1515020019. M. Ng's research is supported in part by the HKRGC GRF 17201020 and 17300021, HKRGC CRF C7004-21GF, and Joint NSFC and RGC N-HKU769/21.
Author information
Authors and Affiliations
University of Chinese Academy of Sciences, Beijing, China
Yihang Dong, Yanyan Shen & Shuqiang Wang
Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences, Shenzhen, China
Yihang Dong, Xuhang Chen, Yanyan Shen & Shuqiang Wang
Department of Mathematics, Hong Kong Baptist University, Kowloon Tong, Hong Kong
Michael Kwok-Po Ng
Faculty of Innovation Engineering, Macau University of Science and Technology, Cotai, China
Tao Qian
Corresponding author
Correspondence to Shuqiang Wang.
Editor information
Editors and Affiliations
Harbin Institute of Technology, Shenzhen, China
Haijun Zhang
Guangxi Normal University, Guilin, China
Xianxian Li
South China Normal University, Guangzhou, China
Tianyong Hao
Technical University of Denmark, Kongens Lyngby, Denmark
Weizhi Meng
Chongqing University, Chongqing, China
Zhou Wu
Guilin University of Electronic Technology, Guilin, China
Qian He
Copyright information
© 2025 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.
About this paper
Cite this paper
Dong, Y., Chen, X., Shen, Y., Ng, M.K.-P., Qian, T., Wang, S. (2025). Multi-modal Mood Reader: Pre-trained Model Empowers Cross-Subject Emotion Recognition. In: Zhang, H., Li, X., Hao, T., Meng, W., Wu, Z., He, Q. (eds.) Neural Computing for Advanced Applications. NCAA 2024. Communications in Computer and Information Science, vol 2183. Springer, Singapore. https://doi.org/10.1007/978-981-97-7007-6_13
Publisher Name: Springer, Singapore
Print ISBN: 978-981-97-7006-9
Online ISBN: 978-981-97-7007-6
eBook Packages: Artificial Intelligence (R0)