- Yihang Dong (ORCID: 0009-0001-2786-2183)
- Xuhang Chen
- Yanyan Shen
- Michael Kwok-Po Ng
- Tao Qian
- Shuqiang Wang (ORCID: 0000-0003-1119-320X)
Part of the book series: Communications in Computer and Information Science (CCIS, volume 2183)
Abstract
Emotion recognition based on electroencephalography (EEG) has gained significant attention and diversified development in fields such as neural signal processing and affective computing. However, the unique brain anatomy of individuals leads to non-negligible natural differences in EEG signals across subjects, posing challenges for cross-subject emotion recognition. While recent studies have attempted to address these issues, they still face limitations in practical effectiveness and model framework unity. Current methods often struggle to capture the complex spatial-temporal dynamics of EEG signals and fail to effectively integrate multimodal information, resulting in suboptimal performance and limited generalizability across subjects. To overcome these limitations, we develop a pre-trained-model-based Multimodal Mood Reader for cross-subject emotion recognition that utilizes masked brain signal modeling and an interlinked spatial-temporal attention mechanism. The model learns universal latent representations of EEG signals through pre-training on a large-scale dataset, and employs the interlinked spatial-temporal attention mechanism to process Differential Entropy (DE) features extracted from the EEG data. Subsequently, a multi-level fusion layer is proposed to integrate the discriminative features, maximizing the advantages of features across different dimensions and modalities. Extensive experiments on public datasets demonstrate Mood Reader's superior performance in cross-subject emotion recognition tasks, outperforming state-of-the-art methods. Additionally, the model is dissected from an attention perspective, providing a qualitative analysis of emotion-related brain areas and offering valuable insights for affective research in neural signal processing.
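For readers unfamiliar with the Differential Entropy (DE) features referenced in the abstract, the sketch below illustrates one common way such features are computed from band-filtered EEG under a Gaussian assumption (DE = 0.5 ln(2πeσ²)). It is a minimal illustration, not the authors' implementation: the band definitions, filter order, channel count, and sampling rate are assumptions chosen for demonstration only.

```python
# Minimal sketch of Differential Entropy (DE) feature extraction from EEG.
# Assumptions (not from the paper): 5 standard bands, 4th-order Butterworth
# filters, and a (channels, samples) segment layout.
import numpy as np
from scipy.signal import butter, filtfilt

BANDS = {"delta": (1, 4), "theta": (4, 8), "alpha": (8, 14),
         "beta": (14, 31), "gamma": (31, 50)}  # commonly used EEG bands (Hz)

def de_features(eeg: np.ndarray, fs: float) -> np.ndarray:
    """Compute per-channel DE for each frequency band.

    eeg: (channels, samples) array holding one EEG segment.
    Under a Gaussian assumption, DE = 0.5 * ln(2 * pi * e * variance).
    Returns an array of shape (channels, n_bands).
    """
    feats = []
    for lo, hi in BANDS.values():
        b, a = butter(4, [lo, hi], btype="bandpass", fs=fs)
        filtered = filtfilt(b, a, eeg, axis=-1)       # band-limit the signal
        var = np.var(filtered, axis=-1) + 1e-12       # guard against log(0)
        feats.append(0.5 * np.log(2 * np.pi * np.e * var))
    return np.stack(feats, axis=-1)

# Example: a hypothetical 62-channel, 1-second segment sampled at 200 Hz.
segment = np.random.randn(62, 200)
print(de_features(segment, fs=200).shape)  # -> (62, 5)
```

In the full model, features of this kind would be passed to the spatial-temporal attention and multi-level fusion modules described above; the sketch covers only the feature-extraction step.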
Acknowledgement
This work was supported in part by the National Natural Science Foundation of China under Grant 62172403 and the Distinguished Young Scholars Fund of Guangdong under Grant 2021B1515020019. M. Ng's research is supported in part by the HKRGC GRF 17201020 and 17300021, HKRGC CRF C7004-21GF, and Joint NSFC and RGC N-HKU769/21.
Author information
Authors and Affiliations
University of Chinese Academy of Sciences, Beijing, China
Yihang Dong, Yanyan Shen & Shuqiang Wang
Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences, Shenzhen, China
Yihang Dong, Xuhang Chen, Yanyan Shen & Shuqiang Wang
Department of Mathematics, Hong Kong Baptist University, Kowloon Tong, Hong Kong
Michael Kwok-Po Ng
Faculty of Innovation Engineering, Macau University of Science and Technology, Cotai, China
Tao Qian
Corresponding author
Correspondence to Shuqiang Wang.
Editor information
Editors and Affiliations
Harbin Institute of Technology, Shenzhen, China
Haijun Zhang
Guangxi Normal University, Guilin, China
Xianxian Li
South China Normal University, Guangzhou, China
Tianyong Hao
Technical University of Denmark, Kongens Lyngby, Denmark
Weizhi Meng
Chongqing University, Chongqing, China
Zhou Wu
Guilin University of Electronic Technology, Guilin, China
Qian He
Copyright information
© 2025 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.
About this paper
Cite this paper
Dong, Y., Chen, X., Shen, Y., Ng, M.K.-P., Qian, T., Wang, S. (2025). Multi-modal Mood Reader: Pre-trained Model Empowers Cross-Subject Emotion Recognition. In: Zhang, H., Li, X., Hao, T., Meng, W., Wu, Z., He, Q. (eds.) Neural Computing for Advanced Applications. NCAA 2024. Communications in Computer and Information Science, vol 2183. Springer, Singapore. https://doi.org/10.1007/978-981-97-7007-6_13
Publisher Name: Springer, Singapore
Print ISBN: 978-981-97-7006-9
Online ISBN: 978-981-97-7007-6
eBook Packages: Artificial Intelligence (R0)