
Part of the book series: IFIP Advances in Information and Communication Technology (IFIPAICT, volume 519)

Included in the following conference series:

IFIP International Conference on Artificial Intelligence Applications and Innovations


Abstract

Emotion recognition is an essential function for realizing human-machine interaction devices. Physiological signals, which can be collected easily and continuously by wearable sensors, are good inputs for emotion analysis. How to effectively process physiological signals, extract critical features, and choose a machine learning model for emotion classification remains a major challenge. In this paper, an entropy-based processing scheme for an emotion recognition framework is proposed, which includes entropy domain feature extraction and prediction by an XGBoost classifier. We experiment on the AMIGOS database, and the results show that the proposed scheme for multi-modal analysis outperforms conventional processing approaches. It achieves approximately 80% and 68% prediction accuracy for the two affect dimensions, valence and arousal, respectively. In the single-modality case, we found that the galvanic skin response (GSR) channel is the most promising modality for prediction and leads to the best performance.


1 Introduction

Affective computing is a key technology for human–computer interaction (HCI) in the coming era of the Internet of Things (IoT) [1]. It makes it possible for machines and computers to recognize human emotion and mental state in real time and, furthermore, to provide appropriate responses and services based on the user's current mental state. Several databases have been developed for research on affective computing. Among the open-source databases available on the Internet, FEEDTUM [2] and the Berlin Database of Emotional Speech (EmoDB) [3], both released in 2005, used facial video and speech for emotion recognition, respectively. They were used to predict basic emotions, such as anger and happiness, from a single type of signal.

As intelligent IoT develops, more and more wearable devices are equipped with different kinds of sensors. Various signals can be sensed from subjects effectively and monitored continuously. In this scenario, physiological signals are good inputs for an affective computing framework, since people do not always express their emotions through facial expressions or sounds. Integrating multiple channels of physiological signals for emotion recognition and aggregating machine learning classifiers for ensemble learning have gained attention in recent years. The DEAP database [4], proposed in 2012, has been used in many works [5,6]. It contains several types of physiological signals recorded while subjects were watching music videos. In DEAP, the emotions perceived by subjects are self-assessed on two affect dimensions, valence and arousal. Valence indicates how positive or negative the emotion is, and arousal indicates the intensity of the emotion. In 2016, the ASCERTAIN database [7] was proposed; it was the first to use commercial wearable sensors to collect data and added personality to the experiments. The AMIGOS database [8], proposed in 2017, contains stable data, several types of videos, and elements for other related research, such as social context and mood. These developments indicate that multi-modal analysis of physiological signals is a trend in affective computing. Therefore, how to effectively process different physiological signals and extract critical features from different channels for emotion classification remains a major challenge.

The aim of this paper is to establish a reliable emotion recognition framework based on physiological signals. To improve the accuracy of emotion recognition, we first extract entropy domain features that quantify the regularity and randomness of a signal and therefore have high potential to represent different levels of emotion. Next, we apply XGBoost [9] to enhance classification performance. XGBoost is highly scalable and efficient to train, and it can learn from high-dimensional data without relying on high-complexity feature selection algorithms.

The rest of the paper is organized as follows. Section 2 describes common methods for the emotion recognition task. Section 3 presents the enhanced methods for feature extraction and the machine learning engine. Section 4 shows the experimental results, and Sect. 5 concludes.

2 Common Approaches of Emotion Recognition Framework

The flow chart of the emotion recognition framework in this study is shown in Fig. 1. Three kinds of physiological signals are the inputs, and the output is the prediction of the high or low class of each affect dimension. The three remaining blocks, namely pre-processing, feature extraction, and the machine learning engine, are introduced below.

2.1 Pre-processing

We use the pre-processed data provided on the official website of the AMIGOS dataset. In the AMIGOS database, each subject watched both long and short videos as emotional stimuli. Because the long-video data contain multiple values for valence and arousal, only the short-video data are used in this work to reduce uncertainty. The database contains 40 subjects and 16 short videos in total. Each video watched by each subject is treated as one data sample. Each sample has 14 channels of electroencephalography (EEG), 2 channels of electrocardiography (ECG), and 1 channel of galvanic skin response (GSR). Seven subjects whose physiological signals contain missing values are removed (ID numbers: 9, 12, 21, 22, 23, 24, 33). Therefore, there are (40 − 7) subjects × 16 videos = 528 samples in total. We further checked the data stability. The first ECG channel suffers from less noise than the second channel and is used for further analysis. In addition, the GSR signals are filtered by a low-pass filter with a cut-off frequency of 3 Hz to remove abnormal high-frequency noise.

In this work, we perform binary classification for valence and arousal. The self-assessments of each subject on the two affect dimensions provided in the AMIGOS database are used as labels. They are originally values in the range [1, 9] and need to be transformed into either a positive or a negative class label, representing high and low, respectively. Figure 2 shows two different ways to do this. Cutting directly at a threshold of 5 is intuitive, but it makes it hard to distinguish a subject's relatively high and low emotions. With a fixed threshold, the labels depend strongly on each person's tendency to give high or low scores, which results in imbalanced labels and decreased performance. Our method is to define the labels by the subject-dependent mean value over all 16 videos of each person. Using the mean instead of the median avoids the ambiguity of whether a sample at the median should be labeled as the positive or the negative class.
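The difference between the two labelling strategies can be illustrated with a short sketch. The function names and the toy ratings are hypothetical, and treating a score exactly equal to the subject's mean as "low" is an assumption, since the text does not specify how such a tie is handled.

```python
from statistics import mean


def binarize_by_fixed_threshold(ratings, threshold=5.0):
    """Label each score 1 (high) if it exceeds a global threshold, else 0 (low)."""
    return {
        subject: [1 if score > threshold else 0 for score in scores]
        for subject, scores in ratings.items()
    }


def binarize_by_subject_mean(ratings):
    """Label each score 1 (high) if it exceeds that subject's own mean, else 0.

    Assumption: a score equal to the mean is labeled 0.
    """
    labels = {}
    for subject, scores in ratings.items():
        subject_mean = mean(scores)
        labels[subject] = [1 if score > subject_mean else 0 for score in scores]
    return labels


# Toy self-assessments on the 1..9 scale (hypothetical values).
ratings = {"high_rater": [6, 7, 8, 9], "low_rater": [1, 2, 3, 4]}

fixed_labels = binarize_by_fixed_threshold(ratings)
relative_labels = binarize_by_subject_mean(ratings)
print(fixed_labels)     # every label of a subject falls into a single class
print(relative_labels)  # each subject contributes both classes
```

With the fixed threshold, a subject who consistently rates high contributes only positive labels, whereas the subject-dependent mean yields both classes for every subject.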

2.2 Feature Extraction

Feature extraction is applied to all three kinds of signals, using a different procedure for each, to reduce the input dimension before machine learning methods are used. The features described in this section differ slightly from the feature set in the AMIGOS database. The features extracted from the physiological signals are shown in Table 1.

Table 1. Features extracted among physiological signals


GSR Signal:

Partly following the methods in [8,10], skin response (SR) related features (18 features) are extracted first, and the skin conductance (SC) signal is obtained by taking the reciprocal of the SR signal. The SC signal is then normalized and used to extract features (4 features). Next, the skin conductance slow response (SCSR) and skin conductance very slow response (SCVSR) are obtained by low-pass filtering the normalized SC signal at 0.2 Hz and 0.08 Hz, respectively. Afterwards, SCSR related features are extracted (4 features). Last, SCSR and SCVSR are de-trended by using the empirical mode decomposition (EMD) method [11] to remove the last half of the intrinsic mode functions (IMFs), and some related features are extracted.

ECG Signal:

Partly following the methods in [8,10], spectral power features in the 0–6 Hz range are extracted first (60 features, one per 0.1 Hz). Then, the R-R interval (RRI) series is calculated after detecting the R peaks in the ECG. The heart rate variability (HRV) and heart rate (HR) time series can be computed from the RRI series. Finally, HRV related features (11 features) and HR related features (6 features) are extracted.

EEG Signal:

Following the methods in [4], the average power spectral density (PSD) of the theta band (4–7 Hz), slow alpha band (8–10 Hz), alpha band (8–13 Hz), beta band (14–30 Hz), and gamma band (31–47 Hz) of each EEG channel is extracted (5 × 14 = 70 features). In addition, the PSD asymmetry of the 5 bands between 7 pairs of EEG channels is extracted (5 × 7 = 35 features).

2.3 Machine Learning Engine

This block consists of a feature selection method and a machine learning classifier, as shown in Fig. 3a. The feature space, which contains 214 features, is comparatively large, while we only have 528 (33 subjects × 16 videos) samples in the AMIGOS database. Thus, feature selection is applied to eliminate redundant features and reduce the model complexity of the machine learning algorithm in the next stage. This helps to enhance overall performance and saves considerable computational resources.

Figure 3b gives an overview of the machine learning engine block used with the AMIGOS database. For feature selection, the original AMIGOS paper [8] calculates Fisher's linear discriminant (FLD) J [12] for each feature f, which is defined as

$$ J(f) = \frac{\left| \mu_{1} - \mu_{0} \right|}{\sigma_{1}^{2} + \sigma_{0}^{2}}. \tag{1} $$
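As a minimal sketch of Eq. (1), the score can be computed per feature and used to rank features. The function names and toy data are hypothetical, the use of the population variance for each class is an assumption, and this is not the code used in [8].

```python
from statistics import mean, pvariance


def fisher_score(values, labels):
    """Eq. (1): |mu_1 - mu_0| / (sigma_1^2 + sigma_0^2) for one feature."""
    positive = [v for v, y in zip(values, labels) if y == 1]
    negative = [v for v, y in zip(values, labels) if y == 0]
    mean_gap = abs(mean(positive) - mean(negative))
    denominator = pvariance(positive) + pvariance(negative)
    if denominator == 0:
        # Both classes are constant: perfectly separable or indistinguishable.
        return float("inf") if mean_gap > 0 else 0.0
    return mean_gap / denominator


def rank_features(feature_columns, labels, top_k):
    """Return the names of the top_k features with the highest J values."""
    scores = {name: fisher_score(col, labels) for name, col in feature_columns.items()}
    return sorted(scores, key=scores.get, reverse=True)[:top_k]


labels = [0, 0, 1, 1]
feature_columns = {
    "separating": [1.0, 2.0, 5.0, 6.0],   # classes far apart, low spread
    "overlapping": [1.0, 5.0, 2.0, 6.0],  # classes overlap heavily
}
print(fisher_score(feature_columns["separating"], labels))
print(rank_features(feature_columns, labels, top_k=1))
```

A feature whose class means are far apart relative to the within-class spread receives a high score and is kept first.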

Afterward, users can decide how many discriminative features to select based on validation sets and pick the features with the highest J values. For the machine learning classifier, the Gaussian Naïve Bayes (GaussianNB) classifier is used. Assuming that the features are independent and Gaussian distributed, GaussianNB is given as

$$ G(f_{1}, \ldots, f_{n}) = \arg\max_{c} \; p(C = c) \prod_{i = 1}^{n} p(F_{i} = f_{i} \mid C = c), \tag{2} $$

where F is the feature set and C is the set of classes; the classifier calculates the probability of each sample for each class and outputs the most probable class. These two methods are relatively simple and lead to the relatively low F1-scores reported in [8].
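The decision rule in Eq. (2) can be sketched in a few lines. This is an illustrative implementation rather than the one used in [8]: the function names and toy data are hypothetical, the products are evaluated as sums of logarithms for numerical stability, and a small variance floor is added as an assumption to avoid division by zero.

```python
import math
from statistics import mean, pvariance


def fit_gaussian_nb(X, y, var_floor=1e-9):
    """Estimate the prior p(C=c) and per-feature Gaussian (mean, variance) per class."""
    model = {}
    for c in sorted(set(y)):
        rows = [x for x, label in zip(X, y) if label == c]
        prior = len(rows) / len(X)
        stats = [(mean(col), pvariance(col) + var_floor) for col in zip(*rows)]
        model[c] = (prior, stats)
    return model


def predict_gaussian_nb(model, x):
    """Eq. (2) in log form: argmax_c [log p(c) + sum_i log p(f_i | c)]."""
    best_class, best_score = None, -math.inf
    for c, (prior, stats) in model.items():
        score = math.log(prior)
        for value, (mu, var) in zip(x, stats):
            score += -0.5 * math.log(2 * math.pi * var) - (value - mu) ** 2 / (2 * var)
        if score > best_score:
            best_class, best_score = c, score
    return best_class


# Toy two-feature data set (hypothetical values).
X = [[1.0, 2.0], [1.2, 1.8], [3.0, 4.0], [3.2, 4.2]]
y = [0, 0, 1, 1]
model = fit_gaussian_nb(X, y)
print(predict_gaussian_nb(model, [1.1, 2.1]))
print(predict_gaussian_nb(model, [3.1, 3.9]))
```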

In this work, we first implement a support vector machine (SVM) [13,14] based approach as the benchmark, as shown in Fig. 3c. SVM is a well-known tool in the field of machine learning and performs well on classification and prediction in many applications. Compared with other classifiers, SVM can deal with small datasets well, since it uses only the support vectors to construct the hyperplane. For feature selection, the recursive feature elimination (RFE) algorithm [15] can efficiently remove irrelevant features and is well suited to SVM. RFE-SVM eliminates the features with the smallest weights in the SVM model in a sequential backward selection (SBS) based process. The process continues until one feature is left, and the feature subset whose SVM model achieves the highest performance is output.

3 Proposed Framework with Entropy Domain Features and XGBoost Classifier

3.1 Entropy Domain Features

Non-linear entropy domain features, such as sample entropy and permutation entropy based features, are widely used on physiological signals. The extracted entropy values help quantify the regularity of a signal and have thus been applied to medical diagnosis. For this emotion recognition work, three types of entropy domain features, namely refined composite multiscale entropy (RCMSE), turning points ratio (TPR), and Shannon entropy, are applied to measure the complexity of physiological signals. The details are as follows.

Refined Composite Multiscale Entropy (RCMSE) [16]:

Multiscale entropy (MSE) [17] has been widely used to evaluate physiological control mechanisms in conditions such as atrial fibrillation and Alzheimer's disease [18]. RCMSE is an improved version of MSE. It reduces the probability of undefined values when the signal is short and gives a more accurate entropy estimate. The concept of RCMSE is to use local matching patterns at different scales to compute the regularity of the signal. RCMSE has two steps. First step: the time series of the signal is coarse-grained into multiscale series. For each scale factor $ \tau $, $ \tau $ coarse-grained series are generated by averaging $ \tau $ points in non-overlapping windows; the k-th series starts at the k-th point, so the windows of neighboring series overlap by $ \tau - 1 $ points. The j-th point of the k-th coarse-grained series $ y_{k}^{(\tau)} = \{ y_{k,1}^{(\tau)}, y_{k,2}^{(\tau)}, y_{k,3}^{(\tau)}, \ldots, y_{k,p}^{(\tau)} \} $ of signal x at scale factor $ \tau $ is defined as:

$$ y_{k,j}^{(\tau)} = \frac{1}{\tau} \sum_{i = (j - 1)\tau + k}^{j\tau + k - 1} x_{i}, \quad 1 \le j \le \frac{N}{\tau}, \quad 1 \le k \le \tau. \tag{3} $$

In the conventional MSE algorithm, the output of coarse-graining is only the first series (k = 1) of each scale factor. Second step: for each $ \tau $, the sample entropy is calculated from the match counts pooled over the $ \tau $ series, as follows:

$$ RCMSE(x, \tau, m, r) = - \ln \left( \frac{\sum_{k = 1}^{\tau} n_{k,\tau}^{m + 1}}{\sum_{k = 1}^{\tau} n_{k,\tau}^{m}} \right), \tag{4} $$

where m is the matching pattern length and r is the similarity criterion. In (4), $ n_{k,\tau}^{m} $ is the number of pairs of length-m patterns in the k-th coarse-grained series whose difference is less than r. When the ratio of $ \sum_{k = 1}^{\tau} n_{k,\tau}^{m + 1} $ to $ \sum_{k = 1}^{\tau} n_{k,\tau}^{m} $ is small, the sample entropy is large, which means the signal has high complexity. The value of RCMSE is undefined only when $ \sum_{k = 1}^{\tau} n_{k,\tau}^{m + 1} = 0 $, which requires the match count $ n_{k,\tau}^{m + 1} $ to be zero in all $ \tau $ series. Because the matching patterns of multiple series are summed, RCMSE is less likely to be undefined than conventional MSE.

In this work, RCMSE is applied to the RRI series of the ECG signals. Because the ECG signals are relatively short (55 s–155 s per video), the scale factor $ \tau $ is set up to 3. In addition, the matching pattern length m is set up to 3, and the similarity criterion r is set to 0.2 times the standard deviation.
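The two steps in Eqs. (3) and (4) can be sketched as follows. This is an illustrative implementation under stated assumptions, not the authors' code: the distance between two patterns is taken as the maximum absolute difference of their points, the same starting points are used for patterns of length m and m + 1 (a common sample entropy convention), r is derived from the standard deviation of the original series, and the RRI values are hypothetical.

```python
import math
from statistics import pstdev


def coarse_grain(x, tau, k):
    """k-th coarse-grained series (k = 1..tau) at scale tau, following Eq. (3)."""
    offset = k - 1
    n_windows = (len(x) - offset) // tau
    return [
        sum(x[offset + j * tau: offset + (j + 1) * tau]) / tau
        for j in range(n_windows)
    ]


def count_matches(series, length, n_templates, r):
    """Count pattern pairs (i < j) of the given length whose max difference is < r."""
    count = 0
    for i in range(n_templates):
        for j in range(i + 1, n_templates):
            if all(abs(series[i + d] - series[j + d]) < r for d in range(length)):
                count += 1
    return count


def rcmse(x, tau, m, r):
    """Eq. (4): -ln of the ratio of match counts summed over the tau series."""
    total_m = total_m1 = 0
    for k in range(1, tau + 1):
        y = coarse_grain(x, tau, k)
        n_templates = len(y) - m  # same start points for lengths m and m + 1
        total_m += count_matches(y, m, n_templates, r)
        total_m1 += count_matches(y, m + 1, n_templates, r)
    if total_m == 0 or total_m1 == 0:
        return float("nan")  # undefined entropy
    return -math.log(total_m1 / total_m)


# Hypothetical RRI series in seconds; real series come from R-peak detection.
rri = [0.81, 0.79, 0.84, 0.80, 0.77, 0.83, 0.86, 0.82, 0.78, 0.80,
       0.85, 0.81, 0.79, 0.83, 0.80, 0.76, 0.82, 0.84, 0.81, 0.79]
r = 0.2 * pstdev(rri)
features = [rcmse(rri, tau, 2, r) for tau in (1, 2, 3)]
print(features)
```

For very short series, some scales may still return an undefined value (NaN here), which is the situation the pooled counts in Eq. (4) are meant to make less likely.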

Turning Points Ratio (TPR):

TPR was proposed on the basis of the nonparametric "Runs Test" to evaluate the randomness of a time series [19], and the idea has been applied to the RRI series of ECG signals [20]. The concept of TPR is to measure the complexity of a signal by the number of turning points relative to the total number of points. A turning point is found by comparing each point with its left and right neighbors, and TPR is calculated as follows:

$$ TPR = \frac{\sum_{i = 2}^{N - 1} \left[ (x_{i} - x_{i - 1})(x_{i} - x_{i + 1}) > 0 \right]}{N - 2}, \tag{5} $$

where N is the length of the signal x. Besides the original TPR, we also extract a modified TPR (MTPR) from the signals. The MTPR procedure first uses the EMD method to extract the trend of the signal, in order to remove its trivial peaks. Next, TPR is computed as in (5) on the extracted trend.

In this work, TPR and MTPR are calculated on the RRI series of the ECG signals and on the GSR signals. The EEG signal carries less information in its time series and is skipped.
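Eq. (5) translates directly into code. The sketch below covers only the original TPR; the function name is hypothetical, and the EMD-based trend extraction needed for MTPR is not shown.

```python
def turning_point_ratio(x):
    """Eq. (5): fraction of interior points that are local maxima or minima."""
    n = len(x)
    if n < 3:
        raise ValueError("TPR needs at least three points")
    turning_points = sum(
        1
        for i in range(1, n - 1)
        if (x[i] - x[i - 1]) * (x[i] - x[i + 1]) > 0
    )
    return turning_points / (n - 2)


print(turning_point_ratio([1, 3, 2, 4, 3]))  # zig-zag: every interior point turns
print(turning_point_ratio([1, 2, 3, 4, 5]))  # monotonic: no turning points
```

A point is counted when it lies strictly above or strictly below both neighbors, so a strictly alternating series gives the maximum ratio and a monotonic series gives zero.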

Shannon Entropy:

Shannon entropy is commonly used in information theory. It is defined for a given discrete probability distribution and uses the probability of each symbol to measure the uncertainty or randomness of the data. In a sampled physiological signal, however, almost every point has a different value. We cannot treat each distinct value as a new symbol, because almost the same Shannon entropy would then be obtained from different signals. Instead, we assign every point to one group of a group set. First, outliers that differ from the mean by more than three standard deviations are removed. Second, the remaining data points are sorted into N groups that equally divide the range between the maximum and minimum values. Last, the probability $ p_{i} $ of each i-th group is calculated, and the Shannon entropy is obtained by:

$$ Shannon\,Entropy = - \sum_{i = 1}^{N} p_{i} \log (p_{i}). \tag{6} $$

The concept of Shannon entropy here is to observe the complexity of the signal through its overall distribution, with each group treated as a symbol.

In this work, Shannon entropy is calculated on the RRI series of the ECG signals and on the GSR signals. Since the optimal number of groups is data dependent, we use 4, 8, 16, 32, and 64 groups in the experiments.
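The three steps described above (outlier removal, equal-width grouping, and Eq. (6)) can be sketched as follows. The function name is hypothetical, and the natural logarithm and the population standard deviation are assumptions, since the text does not state the logarithm base or the estimator.

```python
import math
from statistics import mean, pstdev


def binned_shannon_entropy(x, n_groups):
    """Shannon entropy of a signal after 3-sigma outlier removal and binning."""
    mu, sigma = mean(x), pstdev(x)
    kept = [v for v in x if abs(v - mu) <= 3 * sigma]
    low, high = min(kept), max(kept)
    if high == low:
        return 0.0  # a constant signal occupies a single group
    width = (high - low) / n_groups
    counts = [0] * n_groups
    for v in kept:
        # The maximum value falls on the upper edge; clamp it into the last group.
        index = min(int((v - low) / width), n_groups - 1)
        counts[index] += 1
    total = len(kept)
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)


signal = [0.0, 1.0, 2.0, 3.0]
print(binned_shannon_entropy(signal, 4))
# One feature per group count, as in the text.
features = [binned_shannon_entropy(signal, n) for n in (4, 8, 16, 32, 64)]
```

Empty groups are skipped in the sum, following the usual convention that a zero-probability symbol contributes nothing to the entropy.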

With these three enhanced feature extraction methods, a total of 26 features are added to the original feature space. The details are listed in Table 2.

Table 2. Enhanced features extracted among physiological signals


3.2 Extreme Gradient Boosting (XGBoost)

In this part, we replace the content of the machine learning block described in Sect. 2.3, where the RFE-based SVM approach was used. Among machine learning algorithms, gradient boosting tree based models [21] have been used in many applications across different domains. XGBoost [9] is an efficient and scalable gradient boosting machine that has won many machine learning competitions in recent years [22,23]. It is an ensemble model consisting of sets of classification and regression trees (CART). When XGBoost is used for a supervised learning problem in which training data $ x_{i} $ are used to predict a target variable $ y_{i} $, the model can be described in the form:

$$ \hat{y}_{i} = \sum_{k = 1}^{K} f_{k}(x_{i}), \quad f_{k} \in F, \tag{7} $$

where K is the total number of trees, $ f_{k} $ for the k-th tree is a function in the functional space F, and F is the set of all possible CARTs. During training, each newly trained CART tries to fit the residual left by the trees so far. The objective function optimized at the (t + 1)-th CART is:

$$ obj = \sum_{i = 1}^{n} l(y_{i}, \hat{y}_{i}^{(t)}) + \sum_{i = 1}^{t} \Omega(f_{i}), \tag{8} $$

where l() denotes the training loss function, $ y_{i} $ is the ground truth, and $ \hat{y}_{i}^{(t)} $ is the prediction value at step t. $ \Omega() $, given by

$$ \Omega \mathcal{(}f\mathcal{) = }\gamma T + \frac{1}{2}\lambda \sum\limits_{j = 1}^{T} {w_{j}^{2} } $$

(9)

is the regularization term, where T is the number of leaves and $ w_{j} $ is the score on the j-th leaf. When the objective (8) is optimized, a Taylor expansion is used so that gradient-based optimization can be applied to different loss functions. Furthermore, a separate feature selection step is not needed with the XGBoost approach. During the training of XGBoost, good features are chosen as the split nodes of the trees, which means that the features that are not used are discarded.
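To make the regularization term in Eq. (9) and the role of the Taylor expansion concrete, the sketch below evaluates Eq. (9) for one tree and computes the first- and second-order derivatives of the logistic loss. The closed-form leaf weight, minus the gradient sum divided by the Hessian sum plus lambda, comes from the XGBoost paper [9] and is not derived in this text; the function names and numbers are hypothetical.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def regularization(leaf_weights, gamma, lam):
    """Eq. (9): gamma * T + 0.5 * lambda * sum of squared leaf scores."""
    return gamma * len(leaf_weights) + 0.5 * lam * sum(w * w for w in leaf_weights)


def logistic_grad_hess(y_true, raw_score):
    """First and second derivatives of the logistic loss w.r.t. the raw score."""
    p = sigmoid(raw_score)
    return p - y_true, p * (1.0 - p)


def optimal_leaf_weight(grads, hessians, lam):
    """Leaf score minimizing the second-order approximation (from [9])."""
    return -sum(grads) / (sum(hessians) + lam)


# Two positive samples falling in one leaf, current raw prediction 0.0 for both.
pairs = [logistic_grad_hess(1, 0.0) for _ in range(2)]
grads = [g for g, _ in pairs]
hessians = [h for _, h in pairs]
print(optimal_leaf_weight(grads, hessians, lam=1.0))
print(regularization([0.5, -0.5], gamma=1.0, lam=2.0))
```

Because only the gradient and Hessian of the loss enter the leaf weight, the same tree-building procedure can be reused for any twice-differentiable loss function.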

In this work, we use the scikit-learn API for XGBoost classification. The inputs of XGBoost are 240 features in total (214 traditional features + 26 entropy domain features), and the outputs are the prediction results for valence or arousal. The logistic loss is used as the loss function in (8). The details of the features used and the performance are given in the next section.

4 Experimental Settings and Results

Single-trial classification for the two affect dimensions, arousal and valence, is evaluated, and the flow is shown in Fig. 1. After the signals are pre-processed and transformed into features, all features are normalized to [1]. A leave-one-subject-out approach is used to evaluate performance. That is, each time one of the 33 subjects is left out as the test set and the machine learning engine is trained using the features of the remaining 32 subjects. After the procedure has been repeated 33 times, the final performance is calculated by averaging the 33 values obtained.
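The leave-one-subject-out procedure can be sketched as a simple index split. The function names are hypothetical, and `evaluate_fold` stands in for any routine that trains on the training indices and returns a score on the test indices.

```python
def leave_one_subject_out(subject_ids):
    """Yield (held_out_subject, train_indices, test_indices) for each subject."""
    for held_out in sorted(set(subject_ids)):
        train = [i for i, s in enumerate(subject_ids) if s != held_out]
        test = [i for i, s in enumerate(subject_ids) if s == held_out]
        yield held_out, train, test


def loso_average(subject_ids, evaluate_fold):
    """Average the per-fold score returned by evaluate_fold(train, test)."""
    scores = [
        evaluate_fold(train, test)
        for _, train, test in leave_one_subject_out(subject_ids)
    ]
    return sum(scores) / len(scores)


# 33 subjects x 16 videos = 528 samples, matching the data set size in the text.
subject_ids = [subject for subject in range(33) for _ in range(16)]
folds = list(leave_one_subject_out(subject_ids))
print(len(folds), len(folds[0][1]), len(folds[0][2]))
```

Splitting by subject rather than by sample keeps all 16 videos of the held-out person out of training, so the score reflects performance on an unseen subject.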

Three processing schemes are compared.

Scheme_1: Using the feature set and the method described in original AMIGOS database [8], where FLD and GaussianNB were used for machine learning engine block.
Scheme_2: Using the feature set illustrated in Sect. 2.2 and RFE-SVM for machine learning engine block.
Scheme_3: Using new feature set with entropy domain features and XGBoost for machine learning engine block.

Table 3 shows the comparison of F1-score and accuracy over the three schemes. The F1-score is the harmonic mean of precision and recall; here we average the F1-scores of the positive and negative classes to obtain the final value. Four scenarios were tested: using only GSR, only ECG, only EEG, and using all modalities. Table 4 shows the features used in the trees of the XGBoost classifier for the classification of valence and arousal.
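The averaged F1-score described above can be computed as in the following sketch. The function names and toy predictions are hypothetical, and returning 0 for a class with no true positives is an assumption.

```python
def f1_for_class(y_true, y_pred, positive):
    """Harmonic mean of precision and recall, treating `positive` as the target class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p == positive)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t != positive and p == positive)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p != positive)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def mean_f1(y_true, y_pred):
    """Average of the F1-scores of the positive (1) and negative (0) classes."""
    return (f1_for_class(y_true, y_pred, 1) + f1_for_class(y_true, y_pred, 0)) / 2


def accuracy(y_true, y_pred):
    return sum(1 for t, p in zip(y_true, y_pred) if t == p) / len(y_true)


y_true = [1, 1, 1, 0]
y_pred = [1, 1, 0, 0]
print(mean_f1(y_true, y_pred), accuracy(y_true, y_pred))
```

Averaging over both classes prevents a classifier that favors the majority class from obtaining a high score on imbalanced labels.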

Table 3. Performance of Scheme_1 to Scheme_3 on the emotion recognition framework. (F1-score is the mean over the positive and negative classes. Red values indicate the highest accuracy or F1-score in each affect dimension. Scheme_1: AMIGOS feature set + FLD + GaussianNB; Scheme_2: common feature set + RFE-SVM; Scheme_3: new feature set + XGBoost)


Table 4. Features selected in XGBoost classifier for classification of two affect dimensions. (Entropy domain features are marked in red color)


The F1-score for Scheme_1 is taken directly from [8]. For the valence and arousal dimensions, the best accuracy is about 80% and 68%, respectively. Scheme_3 outperforms Scheme_1 and Scheme_2 in almost every scenario on both affect dimensions; in the valence dimension in particular, it outperforms them by more than 10% in accuracy. Using all modalities for prediction gives the highest performance, except for the F1-score in arousal. In our experiments (Scheme_2 and Scheme_3), when only one modality is used, the GSR channel gives the best performance for both the valence and arousal dimensions. However, the GSR channel has the fewest extracted features among the three modalities, which indicates that the quality rather than the quantity of features leads to better prediction ability.

5 Conclusion

In this paper, we proposed a scheme that combines entropy domain features with XGBoost. The feature extraction is enhanced with entropy domain features, which help to evaluate the complexity of physiological signals. In addition, the XGBoost classifier, which has gained popularity in recent years, is used for learning and prediction. The proposed scheme reaches approximately 80% and 68% accuracy on the valence and arousal dimensions, respectively. It outperforms both the processing scheme in the original AMIGOS paper and the scheme that combines common features with a traditional SVM model.

References

1. Picard, R.: Affective computing for HCI. In: International Conference on Human-Computer Interaction, pp. 829–833. Lawrence Erlbaum Associates Inc., Munich, Germany (1999)
2. Wallhoff, F.: Facial expressions and emotion database (2005). http://www.mmk.ei.tum.de/_waf/fgnet/feedtum.html
3. Burkhardt, F., Paeschke, A., Rolfes, M., Sendlmeier, W., Weiss, B.: A database of German emotional speech. In: Proceedings of Interspeech, pp. 1517–1520 (2005)
4. Koelstra, S., et al.: DEAP: a database for emotion analysis; using physiological signals. IEEE Trans. Affect. Comput. 3(1), 18–31 (2012)
5. Chen, M., Han, J., Guo, L., Wang, J., Patras, I.: Identifying valence and arousal levels via connectivity between EEG channels. In: 2015 International Conference on Affective Computing and Intelligent Interaction (ACII), Xi'an, pp. 63–69 (2015)
6. Wu, S., Xu, X., Shu, L., Hu, B.: Estimation of valence of emotion using two frontal EEG channels. In: 2017 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), Kansas City, MO, pp. 1127–1130 (2017)
7. Subramanian, R., Wache, J., Abadi, M., Vieriu, R., Winkler, S., Sebe, N.: ASCERTAIN: emotion and personality recognition using commercial sensors. IEEE Trans. Affect. Comput. PP(99), 1 (2016)
8. Miranda-Correa, J.A., Abadi, M.K., Sebe, N., Patras, I.: AMIGOS: a dataset for mood, personality and affect research on individuals and groups. ArXiv e-prints (2017)
9. Chen, T., Guestrin, C.: XGBoost. In: Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, KDD 2016, pp. 785–794 (2016)
10. Kim, J., André, E.: Emotion recognition based on physiological changes in music listening. IEEE Trans. Pattern Anal. Mach. Intell. 30(12), 2067–2083 (2008)
11. Huang, N.E., Shen, Z., Long, S.R., Wu, M.C., Shih, H.H., Zheng, Q., Yen, N.-C., Tung, C.C., Liu, H.H.: The empirical mode decomposition and the Hilbert spectrum for nonlinear and non-stationary time series analysis. Proc. Roy. Soc. Lond. A Math. Phys. Eng. Sci. 454(1971), 903–995 (1998)
12. Song, F., Mei, D., Li, H.: Feature selection based on linear discriminant analysis. In: 2010 International Conference on Intelligent System Design and Engineering Application (ISDEA), vol. 1, pp. 746–749 (2010)
13. Cristianini, N., Shawe-Taylor, J.: An Introduction to Support Vector Machines and Other Kernel-based Learning Methods. Cambridge University Press, Cambridge (2000)
14. Luts, J., Ojeda, F., van de Plas, R., de Moor, B., van Huffel, S., Suykens, J.A.K.: A tutorial on support vector machine-based methods for classification problems in chemometrics. Anal. Chim. Acta 665, 129–145 (2010)
15. Guyon, I., Weston, J., Barnhill, S., Vapnik, V.: Gene selection for cancer classification using support vector machines. Mach. Learn. 46(1), 389–422 (2002)
16. Wu, S., Wu, C., Lin, S., Lee, K., Peng, C.K.: Analysis of complex time series using refined composite multiscale entropy. Phys. Lett. A 378(20), 1369–1374 (2014)
17. Costa, M., Goldberger, A.L., Peng, C.K.: Multiscale entropy analysis of biological signals. Phys. Rev. E 71, 1–17 (2005)
18. Tsai, P.H., Lin, C., Tsao, J.: Empirical mode decomposition based detrended sample entropy in electroencephalography for Alzheimer's disease. J. Neurosci. Methods 210, 230–237 (2012)
19. Wallis, W.A., Moore, G.H.: A significance test for time series analysis. J. Am. Stat. Assoc. 36, 401–409 (1946)
20. Dash, S., Raeder, E., Merchant, S., Chon, K.: A statistical approach for accurate detection of atrial fibrillation and flutter. In: Proceedings of the Annual Computers in Cardiology Conference (CinC), pp. 137–140 (2009)
21. Friedman, J.: Greedy function approximation: a gradient boosting machine. Ann. Stat. 29(5), 1189–1232 (2001)
22. Adam-Bourdarios, C., Cowan, G., Germain-Renaud, C., Guyon, I., Kégl, B., Rousseau, D.: The Higgs machine learning challenge. J. Phys.: Conf. Ser. 664, 072015 (2015)
23. Phoboo, A.E.: Machine learning wins the Higgs challenge. CERN Bull. (2014). http://cds.cern.ch/journal/CERNBulletin/2014/49/News%20Articles/1972036. Accessed 24 Apr 2016


Acknowledgements

This work was supported by the Ministry of Science and Technology of Taiwan (MOST 106-2221-E-002-205-MY3 and MOST 106-2622-8-002-013-TA), National Taiwan University and Pixart Imaging Inc.

Author information

Authors and Affiliations

Graduate Institute of Electronics Engineering, National Taiwan University, Taipei, Taiwan
Sheng-Hui Wang, Huai-Ting Li, En-Jui Chang & An-Yeu (Andy) Wu


Corresponding author

Correspondence to Sheng-Hui Wang.

Editor information

Editors and Affiliations

School of Engineering, Democritus University of Thrace, Xanthi, Greece
Lazaros Iliadis
University of Piraeus, Piraeus, Greece
Ilias Maglogiannis
University of Thessaly, Lamia, Greece
Vassilis Plagianakos


Cite this paper

Wang, SH., Li, HT., Chang, EJ., Wu, AY. (2018). Entropy-Assisted Emotion Recognition of Valence and Arousal Using XGBoost Classifier. In: Iliadis, L., Maglogiannis, I., Plagianakos, V. (eds) Artificial Intelligence Applications and Innovations. AIAI 2018. IFIP Advances in Information and Communication Technology, vol 519. Springer, Cham. https://doi.org/10.1007/978-3-319-92007-8_22


DOI: https://doi.org/10.1007/978-3-319-92007-8_22
Published: 22 May 2018
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-92006-1
Online ISBN: 978-3-319-92007-8
eBook Packages: Computer Science, Computer Science (R0)
