Multi-modal Mood Reader: Pre-trained Model Empowers Cross-Subject Emotion Recognition

  • Conference paper

Abstract

Emotion recognition based on Electroencephalography (EEG) has gained significant attention and diversified development in fields such as neural signal processing and affective computing. However, the unique brain anatomy of individuals leads to non-negligible natural differences in EEG signals across subjects, posing challenges for cross-subject emotion recognition. While recent studies have attempted to address these issues, they still face limitations in practical effectiveness and the unity of their model frameworks. Current methods often struggle to capture the complex spatial-temporal dynamics of EEG signals and fail to effectively integrate multimodal information, resulting in suboptimal performance and limited generalizability across subjects. To overcome these limitations, we develop a pre-trained-model-based Multimodal Mood Reader for cross-subject emotion recognition that utilizes masked brain signal modeling and an interlinked spatial-temporal attention mechanism. The model learns universal latent representations of EEG signals through pre-training on a large-scale dataset, and employs the interlinked spatial-temporal attention mechanism to process Differential Entropy (DE) features extracted from EEG data. Subsequently, a multi-level fusion layer is proposed to integrate the discriminative features, maximizing the advantages of features across different dimensions and modalities. Extensive experiments on public datasets demonstrate Mood Reader's superior performance in cross-subject emotion recognition tasks, outperforming state-of-the-art methods. Additionally, the model is dissected from an attention perspective, providing a qualitative analysis of emotion-related brain areas and offering valuable insights for affective research in neural signal processing.
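
For concreteness, the Differential Entropy (DE) features mentioned in the abstract have a standard closed form for band-filtered EEG segments that are approximately Gaussian: DE = 0.5 ln(2*pi*e*sigma^2). The Python sketch below is not the authors' implementation; the band boundaries, sampling rate, and window length are illustrative assumptions, and it only shows how per-window, per-channel, per-band DE features might be prepared before a spatial-temporal attention module.

# Minimal sketch of DE feature extraction (assumed parameters), not the paper's code.
import numpy as np
from scipy.signal import butter, filtfilt

BANDS = {"delta": (1, 4), "theta": (4, 8), "alpha": (8, 14),
         "beta": (14, 31), "gamma": (31, 50)}   # common EEG bands (assumption)

def differential_entropy(x):
    """DE of a zero-mean Gaussian signal: 0.5 * ln(2 * pi * e * sigma^2)."""
    return 0.5 * np.log(2 * np.pi * np.e * np.var(x))

def de_features(eeg, fs=200, win_sec=1.0):
    """eeg: (channels, samples) array -> DE features of shape (windows, channels, bands)."""
    win = int(fs * win_sec)
    n_win = eeg.shape[1] // win
    feats = np.zeros((n_win, eeg.shape[0], len(BANDS)))
    for b_idx, (lo, hi) in enumerate(BANDS.values()):
        # 4th-order Butterworth band-pass, applied along the time axis
        b, a = butter(4, [lo / (fs / 2), hi / (fs / 2)], btype="band")
        filtered = filtfilt(b, a, eeg, axis=1)
        for w in range(n_win):
            seg = filtered[:, w * win:(w + 1) * win]
            for c in range(eeg.shape[0]):
                feats[w, c, b_idx] = differential_entropy(seg[c])
    return feats

For example, a 62-channel recording sampled at 200 Hz would yield a (samples // 200, 62, 5) feature tensor, whose window and channel axes would naturally correspond to the temporal and spatial dimensions that an interlinked spatial-temporal attention mechanism operates over.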



Acknowledgement

This work was supported in part by the National Natural Science Foundation of China under Grant 62172403 and the Distinguished Young Scholars Fund of Guangdong under Grant 2021B1515020019. M. Ng's research is supported in part by the HKRGC GRF 17201020 and 17300021, HKRGC CRF C7004-21GF, and Joint NSFC and RGC N-HKU769/21.

Author information

Authors and Affiliations

  1. University of Chinese Academy of Sciences, Beijing, China

    Yihang Dong, Yanyan Shen & Shuqiang Wang

  2. Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences, Shenzhen, China

    Yihang Dong, Xuhang Chen, Yanyan Shen & Shuqiang Wang

  3. Department of Mathematics, Hong Kong Baptist University, Kowloon Tong, Hong Kong

    Michael Kwok-Po Ng

  4. Faculty of Innovation Engineering, Macau University of Science and Technology, Cotai, China

    Tao Qian

Authors
  1. Yihang Dong
  2. Xuhang Chen
  3. Yanyan Shen
  4. Michael Kwok-Po Ng
  5. Tao Qian
  6. Shuqiang Wang

Corresponding author

Correspondence to Shuqiang Wang.

Editor information

Editors and Affiliations

  1. Harbin Institute of Technology, Shenzhen, China

    Haijun Zhang

  2. Guangxi Normal University, Guilin, China

    Xianxian Li

  3. South China Normal University, Guangzhou, China

    Tianyong Hao

  4. Technical University of Denmark, Kongens Lyngby, Denmark

    Weizhi Meng

  5. Chongqing University, Chongqing, China

    Zhou Wu

  6. Guilin University of Electronic Technology, Guilin, China

    Qian He


Copyright information

© 2025 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.

About this paper


Cite this paper

Dong, Y., Chen, X., Shen, Y., Ng, M.K.-P., Qian, T., Wang, S. (2025). Multi-modal Mood Reader: Pre-trained Model Empowers Cross-Subject Emotion Recognition. In: Zhang, H., Li, X., Hao, T., Meng, W., Wu, Z., He, Q. (eds.) Neural Computing for Advanced Applications. NCAA 2024. Communications in Computer and Information Science, vol. 2183. Springer, Singapore. https://doi.org/10.1007/978-981-97-7007-6_13



