
‘Inner voices’: the cerebral representation of emotional voice cues described in literary texts

Carolin Brück1,2, Benjamin Kreifelts1, Christina Gößling-Arnold2,3, Jürgen Wertheimer2,3, Dirk Wildgruber1,2
1Department of Psychiatry and Psychotherapy, Eberhard Karls University, Tübingen 72076, Germany; 2Werner Reichardt Centre for Integrative Neuroscience (CIN), Tübingen 72076, Germany; 3Department of Comparative Literature, Eberhard Karls University, Tübingen 72074, Germany

Correspondence should be addressed to Carolin Brück, Department of Psychiatry and Psychotherapy, Eberhard Karls University, Calwerstraße 14, 72076 Tübingen, Germany. E-mail: Carolin.Brueck@med.uni-tuebingen.de

Corresponding author.

Received 2012 Dec 13; Revised 2013 Sep 12; Accepted 2013 Dec 28; Issue date 2014 Nov.

© The Author (2014). Published by Oxford University Press. For Permissions, please email: journals.permissions@oup.com
PMCID: PMC4221224; PMID: 24396008

Abstract

While non-verbal affective voice cues are generally recognized as a crucial behavioral guide in day-to-day conversation, their role as a powerful source of information may extend well beyond close-up personal interactions to other modes of communication such as written discourse or literature. Building on the assumption that similarities between these different ‘modes’ of voice cues may not be limited to their functional role but may also include the cerebral mechanisms engaged in the decoding process, the present functional magnetic resonance imaging study explored brain responses associated with processing emotional voice signals described in literary texts. Emphasis was placed on evaluating ‘voice’-sensitive as well as task- and emotion-related modulations of brain activation frequently associated with the decoding of acoustic vocal cues. The findings suggest several similarities to the perception of acoustic voice signals: the superior temporal, lateral and medial frontal cortex as well as the posterior cingulate cortex and cerebellum contributed to the decoding process, with similarities to acoustic voice perception reflected in a ‘voice’-cue preference of the temporal voice areas, an emotion-related modulation of the medial frontal cortex and a task-modulated response of the lateral frontal cortex.

Keywords: emotion, fMRI, literature, non-verbal communication, voice

INTRODUCTION

With only a slight change in the sound of their voices, human beings are able to communicate a wealth of information: modulations of voice characteristics such as pitch, loudness, voice quality or tempo allow listeners to uncover the attitudes, intentions or emotions behind spoken words (Banse and Scherer, 1996; Szameitat et al., 2009; Sauter et al., 2010)—crucial knowledge that aids ‘survival’ in our social environment.

For decades now, research has aimed at understanding how voice signals are used to decipher affective meaning encoded in the sound of a voice. In particular, studies on the processing of affective voice cues such as speech prosody or laughter have contributed greatly to our current understanding of how the brain analyses, integrates and evaluates vocal expressions of emotions (Ackermann et al., 2004; Schirmer and Kotz, 2006; Wildgruber et al., 2006; Meyer et al., 2007; Wildgruber et al., 2009; Brück et al., 2011b).

However, vocal signals may not only serve as a valuable guide in day-to-day conversation; their role may also translate to other forms of communication. In narrative literature, for example, written descriptions of affective voice cues—just as their acoustic counterparts—may frequently be used to convey emotions and may similarly lead readers to a better understanding of the characters described as sending these signals.

Given the suggested similarities in the functional roles of vocal emotional cues in literature and in day-to-day interactions, one might ask whether such similarities also emerge with respect to the cerebral mechanisms employed in the decoding process.

To address this question, this study aimed at exploring brain responses associated with the decoding of voice signals described in literary texts. Building on the hypothesis that similarities may emerge with respect to task- or emotion-driven as well as voice-sensitive brain responses, frequently reported for the decoding of affective voice cues (Wildgruber et al., 2006, 2009; Brück et al., 2011b), particular emphasis was placed on the evaluation of task- and emotion-related as well as voice-sensitive modulations of brain activation.

Analyses of voice-sensitive effects focused on the temporal voice area (TVA), a brain region located in the superior temporal cortex that has been suggested to respond preferentially to human voices (Belin et al., 2000; Belin et al., 2002; Bestelmeyer et al., 2011) and to play a role in a broad range of voice-related abilities, including the perception of affective voice cues (Ethofer et al., 2009b).

Considering emotion-related effects, on the other hand, published data suggest emotion-driven modulations of activation for several structures implicated in emotional voice decoding, such as the amygdalae (Wiethoff et al., 2009), the anterior rostral medial frontal cortex (arMFC) (Brück et al., 2011a) and the TVA (Ethofer et al., 2012). While, particularly with respect to TVA activation, such reported emotion-driven increases in responding may reflect effects unique to the decoding of voice-based acoustic information, modulations of the amygdalae and arMFC resemble results documented for a variety of emotion perception tasks (e.g. facial emotion processing; Kesler-West et al., 2001; Fusar-Poli et al., 2009; Sabatinelli et al., 2011). The latter findings, in turn, may outline a cue-independent contribution of both structures to perceptual mechanisms more generally involved in deciphering other people’s states of mind (Zald, 2003; Amodio and Frith, 2006; Peelen et al., 2010).

Aside from effects of emotion, however, brain responses to affective voice cues have also been described to differ depending on the task instructions under which they are presented. Compared with a more implicit processing of emotions encoded in a voice (e.g. via task instructions that divert attention away from expressed emotions), instructions to focus on the explicit evaluation of emotional information have been documented to increase activation within the lateral frontal lobe (Wildgruber et al., 2004, 2005; Ethofer et al., 2006, 2009a), a brain region assumed to contribute to meaning analysis across a variety of emotion-related tasks (Kober et al., 2008; Wager et al., 2010; Lindquist et al., 2012).

METHODS

Participants

Twenty-two volunteers (11 female, all right-handed, all native speakers of German, mean age = 24.95 years, s.d. = 3.70) consented to participate. Participants were screened to exclude hearing or vision impairments as well as past or present psychiatric or neurological disorders, or current medical treatment that might affect brain function.

Ethics statement

The experiment was conducted in accordance with the ethical principles expressed in the Declaration of Helsinki, and the study protocol was reviewed and approved by the local ethics committee. All participants received detailed information about the purpose and procedure of the study, and gave written consent prior to involvement in this research.

MRI data acquisition

Magnetic resonance imaging (MRI) data were acquired on a 3 T MRI scanner (Tim Trio, Siemens, Erlangen, Germany) equipped with a 12-channel head coil. All functional images were obtained using a BOLD-sensitive echo planar imaging sequence covering the whole brain with 30 slices (slice thickness: 4 mm + 1 mm gap, FoV = 192 mm × 192 mm, 64 × 64 matrix, voxel size 3 × 3 × 4 mm3, TR = 1700 ms, TE = 30 ms, flip angle = 90°). In addition to functional data, high-resolution structural images were collected from each participant as an anatomical reference (magnetization prepared rapid gradient echo: TR = 2300 ms, TE = 2.96 ms, 176 slices, slice thickness: 1 mm, FoV = 256 mm × 256 mm).

Stimulus material, tasks and procedure

Main experiment

Stimulus material comprised a set of 78 text samples. Text samples were selected from an original pool of 212 face and voice descriptions gathered from novels and narratives published in German. The selection was based on a series of behavioral experiments and text analyses conducted to evaluate key features of each text sample, such as emotional valence, emotional arousal, aesthetic value, text length, lexical complexity and syntactic complexity. The selection procedure aimed at composing two subsets of text stimuli (one set of voice descriptions and one set of face descriptions), matched with respect to the key characteristics listed above. Of the 78 texts chosen in this process, 39 described vocal expressions and 39 facial expressions. Within each subset, one-third of the text samples conveyed emotionally neutral expressions, whereas the remaining two-thirds communicated either a positive (n = 13) or a negative (n = 13) emotional state. Examples of texts included in the subsets are provided in Table 1. Ratings of the respective text characteristics are summarized in Table 2.

Table 1.

Examples of text stimuli employed in the main experiment

Facial cue, positive state:
  German text: Sie zeigte, für die Dauer eines Herzschlags, ihr Lachen von einer Wange zur anderen.
  English translation: For the duration of a heartbeat, she smiled from cheek to cheek.
  Source: Kirchhoff, Bodo (2002) Die Weihnachtsfrau. Frankfurt a. M.: Fischer Taschenbuch, p. 32. English translation provided by the authors.

Facial cue, neutral state:
  German text: [Er] bedeckte die Oberlippe mit der Unterlippe und behielt diese Stellung […].
  English translation: [H]e pursed his lips, and retained this attitude unchanged […].
  Source: Gogol, Nikolai Vasilevich (n.d., 2nd ed., approx. 1960) Tschitschikows Abenteuer oder Tote Seelen. German translation by Elisabeth and Wladimir Wonsiatsky. München: Simon-Herold-Verlag, p. 20. English translation: Gogol, Nikolai Vasilevich (2008) Dead Souls. Translated by D. J. Hogarth, p. 15.

Facial cue, negative state:
  German text: Krespel schnitt ein Gesicht, als wenn jemand in eine bittere Pomeranze beißt, und dabei aussehen will, als wenn er süßes genossen; aber bald verzog sich dies Gesicht zur graulichen Maske […].
  English translation: Krespel made a face like someone biting into a sour orange who wants to look as if it were a sweet one; but soon his expression changed into a horrifying mask […].
  Source: Hoffmann, Ernst Theodor Amadeus (2001) Rat Krespel. In: Steinecke, H. & Segebrecht, W. (Eds.) Sämtliche Werke. Frankfurt a. M.: Deutscher Klassiker Verlag. Bd. 4, p. 44. English translation: Hoffmann, E.T.A. (1972) Councillor Krespel. In: Tales by E.T.A. Hoffmann. Edited and translated by Leonard J. Kent and Elizabeth C. Knight. University of Chicago Press, p. 129f.

Vocal cue, positive state:
  German text: Als sie sprach, klang ihre Stimme sanft und kehlig und mit einem italienischen Akzent behaftet.
  English translation: When she spoke, her voice was smooth – a throaty, accented English.
  Source: Brown, Dan (2003) Illuminati. German translation by Axel Merz. Bergisch Gladbach: Bastei Lübbe, p. 75. English original: Brown, Dan (2001) Angels and Demons. London: Corgi, p. 70.

Vocal cue, neutral state:
  German text: Die Frau […] sprach langsam und leise in einer Sprache, die Julie sich nicht erinnern konnte, jemals gehört zu haben.
  English translation: The woman […] spoke slowly and quietly in a language Julie could not remember ever having heard before.
  Source: Hoffmann, Ernst Theodor Amadeus (1992) Lebens-Ansichten des Katers Murr. In: Steinecke, H. & Segebrecht, W. (Eds.) Sämtliche Werke. Frankfurt a. M.: Deutscher Klassiker Verlag. Bd. 5, p. 218f. English translation provided by the authors.

Vocal cue, negative state:
  German text: Er sprach sehr langsam, und die Worte schienen ihm gegen seinen Willen entpreßt zu werden.
  English translation: He spoke very slowly, and the words seemed wrung out of him almost against his will.
  Source: Wilde, Oscar (1985) Das Bildnis des Dorian Gray. German translation by Hedwig Lachmann and Gustav Landauer. Frankfurt a. M.: Insel, p. 25. English original: Wilde, Oscar (2011) The Picture of Dorian Gray. London: Harvard University Press, p. 87.
Table 2.

Summary of key characteristics of text samples describing facial and vocal cues

| Cue | Valence(a), all | Valence(a), pos | Valence(a), neu | Valence(a), neg | Arousal(b) | Aesthetic value(c) | Text length (no. of characters) | Word frequency(d) (a.u.) | Syntactic complexity(d) (a.u.) |
| Facial, mean | 5.01 | 3.10 | 4.96 | 6.97 | 4.17 | 4.44 | 104.82 | 8.17 | 5.31 |
| Facial, s.d. | 1.65 | 0.50 | 0.26 | 0.41 | 1.09 | 0.54 | 57.34 | 1.37 | 0.77 |
| Vocal, mean | 5.06 | 3.35 | 5.09 | 6.75 | 4.42 | 4.51 | 101.79 | 8.25 | 5.27 |
| Vocal, s.d. | 1.46 | 0.42 | 0.47 | 0.32 | 1.04 | 0.43 | 59.01 | 1.52 | 0.77 |

(a) Valence ratings measured on a scale ranging from 1—very positive to 9—very negative; categorization: neutral = 4.5–5.5, positive <4.5, negative >5.5.

(b) Arousal ratings measured on a scale ranging from 1—very low to 9—very high.

(c) Aesthetic value measured on a scale ranging from 1—very poorly written to 9—very well written.

(d) Determined for each word using the German Reference Corpus (DeReKo, Kupietz et al., 2010) and averaged among the words within each text sample.
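
Footnote (a) above states the cut-offs used to map the 9-point valence ratings onto the three stimulus categories. The following minimal sketch (Python; variable names and the example values are illustrative only, this is not the authors' selection code) spells out that assignment rule:

```python
def valence_category(rating: float) -> str:
    """Map a 9-point valence rating (1 = very positive, 9 = very negative)
    onto the category labels used for stimulus selection.
    Cut-offs follow footnote (a): positive < 4.5, neutral 4.5-5.5, negative > 5.5."""
    if rating < 4.5:
        return "positive"
    elif rating <= 5.5:
        return "neutral"
    else:
        return "negative"

# Example: the mean valence ratings reported in Table 2 for facial cues
for label, mean_rating in [("pos", 3.10), ("neu", 4.96), ("neg", 6.97)]:
    print(label, valence_category(mean_rating))
```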

The selected text samples were used to devise two different tasks: Task 1 targeted explicit emotion processing by asking participants to focus on emotions expressed in the texts and rate each of the 78 text samples with respect to the valence of the communicated emotions (from here on referred to as the emotion judgment task or emo). Participants were offered a choice of four different response alternatives: ++ (highly positive), + (positive), − (negative) and −− (highly negative). Task 2 targeted a more implicit processing of emotional information by diverting attention away from expressed emotions through instructions to focus on the evaluation of a text’s aesthetic value (from here on referred to as the aesthetics judgment task or aes). Participants were asked to give their opinion on how well they thought each of the presented 78 text samples was written, choosing one of four answer alternatives: ++ (very well), + (well), − (poorly) and −− (very poorly).

Text samples were back-projected onto a translucent screen placed at the back of the scanner bore and viewed by the participants via a mirror system mounted on the head coil. Texts were displayed centered in the middle of the screen in a 20 pt black Arial font against a light gray background. Text lengths ranged between one and five lines. Participants were asked to refrain from reading aloud and instructed to indicate their answers by pressing one of four buttons on a fiber optic response pad (LumiTouch, Photon Control, Burnaby, Canada) placed in their right hand (−− index finger, − middle finger, + ring finger, ++ little finger; reversed key arrangement for half of the participants). Stimulus presentation was controlled using the software package Presentation 14.2 (Neurobehavioral Systems Inc., Albany, CA, USA) installed on a standard personal computer. Trial onset was synchronized with scan onset, with each trial starting with a fixation cross displayed for either 1700, 2125, 2550, 2975 or 3400 ms (i.e. the TR plus ¼ steps of the TR), allowing stimulus onset to be jittered relative to scan onset. The fixation interval was followed by the presentation of the respective text sample to read. No time limitations were imposed on the reading period; trials continued only after the reader had indicated an answer. Each trial was concluded by a second fixation interval with a fixed timeframe of 6800 ms (= 4 scans) separating consecutive trials. Moreover, fixation periods ranging from 10 200 to 11 900 ms were included as null events and randomly interspersed between stimulus presentations (= 8 null trials per task). Measurements for each task were obtained in separate runs, and the corresponding task instructions were provided immediately before starting each run. Text order within each task was fully randomized, and task order was balanced among participants.
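
The five fixation durations are simply the TR plus quarter-TR increments. As a minimal illustration (not the authors' presentation script, which used Presentation 14.2), the jitter values and a random per-trial assignment could be generated as follows:

```python
import random

TR_MS = 1700  # repetition time of the EPI sequence in milliseconds

# Fixation durations: TR plus 0, 1/4, 2/4, 3/4 and 4/4 of the TR
jitter_values = [TR_MS + k * TR_MS // 4 for k in range(5)]
assert jitter_values == [1700, 2125, 2550, 2975, 3400]

n_trials = 78
trial_jitters = [random.choice(jitter_values) for _ in range(n_trials)]
print(trial_jitters[:10])
```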

Functional localizer of temporal voice areas

To allow comparisons between activation patterns obtained in the main experiment and the TVA implicated in the direct perception of voice signals, all participants completed a functional localizer scan (adapted from Belin et al., 2000) aimed at defining voice-sensitive brain areas: participants were instructed to close their eyes and listen carefully to a series of sound stimuli presented to them. Acoustic stimulation included 12 blocks of human vocal sounds (VS), 6 blocks of environmental sounds (ES), 6 blocks of animal sounds (AS) and 12 blocks of silence. Each block measured 10 s in duration (i.e. 8 s of auditory stimulation plus 2 s of silence). Sound stimuli within the respective blocks were normalized to the same mean acoustic intensity, and block order was randomized among participants. Stimulus presentation was controlled using the software package Presentation 14.2 (Neurobehavioral Systems Inc., Albany, CA, USA), and sound stimuli were delivered via MRI-compatible headphones (Sennheiser Electronic GmbH & Co. KG, Wedemark-Wennebostel, Germany; in-house modified).

Data analysis: behavioral data

Ratings (i.e. emotional valence/aesthetic value) and reading durations (i.e. time between stimulus onset and button press) were analyzed as behavioral data. To this end, obtained ratings were re-coded to numeric values (emo: ++ = 1, + = 2, − = 3, −− = 4; aes: ++ = 4, + = 3, − = 2, −− = 1) and averaged among text samples pertaining to the same valence categories and type of cue, resulting in six single measures obtained for each participant within each task condition: meanface_pos, meanface_neu, meanface_neg, meanvoice_pos, meanvoice_neu and meanvoice_neg. Reading durations were averaged in a similar fashion.
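
A minimal sketch of the re-coding and averaging step described above, assuming trial-wise responses are available as (cue, valence category, button) tuples; the data structures and the example trials are hypothetical, as the study does not specify an implementation:

```python
from collections import defaultdict

# Numeric re-coding of the four response buttons for the emotion judgment task
EMO_CODES = {"++": 1, "+": 2, "-": 3, "--": 4}

def cell_means(trials):
    """trials: iterable of (cue, valence, button) tuples, e.g. ("voice", "pos", "+").
    Returns the per-cell (cue x valence) mean ratings for one participant and task."""
    sums, counts = defaultdict(float), defaultdict(int)
    for cue, valence, button in trials:
        key = (cue, valence)
        sums[key] += EMO_CODES[button]
        counts[key] += 1
    return {key: sums[key] / counts[key] for key in sums}

example = [("face", "pos", "+"), ("face", "pos", "++"), ("voice", "neg", "--")]
print(cell_means(example))  # {('face', 'pos'): 1.5, ('voice', 'neg'): 4.0}
```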

Data analysis: imaging data

All images were processed and analysed using the software package SPM8 (http://www.fil.ion.ucl.ac.uk/spm/).

Preprocessing

EPI raw data were realigned to correct for head motion, unwarped using a static field map, co-registered with the obtained anatomical images, normalized to MNI space and smoothed with an isotropic Gaussian kernel of 8 mm full-width at half maximum. The first five images of each run were discarded from further analyses to exclude measurements preceding T1 equilibrium.
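
The relation between the 8 mm FWHM kernel and the Gaussian standard deviation used for smoothing is FWHM = sigma · sqrt(8 ln 2). A small sketch of this step only, assuming a functional volume is available as a NumPy array with the voxel sizes reported above (the actual preprocessing was carried out in SPM8, not with this code):

```python
import numpy as np
from scipy.ndimage import gaussian_filter

FWHM_MM = 8.0
VOXEL_MM = np.array([3.0, 3.0, 4.0])  # voxel size of the EPI data (x, y, z)

# Convert FWHM in millimetres to a per-axis sigma in voxel units
sigma_vox = (FWHM_MM / np.sqrt(8.0 * np.log(2.0))) / VOXEL_MM

volume = np.random.rand(64, 64, 30)          # placeholder EPI volume (64 x 64 matrix, 30 slices)
smoothed = gaussian_filter(volume, sigma=sigma_vox)
print(sigma_vox)                             # approx. [1.13, 1.13, 0.85] voxels
```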

Statistical analysis: main experiment

Based on the research questions outlined earlier, statistical analyses aimed at evaluating cue-independent task and emotion effects as well as ‘voice’-related effects on brain activation associated with the processing of the presented text samples. The respective analyses were based on a general linear model with each event modeled as a separate regressor convolved with the canonical HRF. Events were time-locked to the onset of each stimulus, and modeled durations corresponded to the individual reading durations obtained for the respective text samples. Time series were high-pass filtered (cut-off frequency: 1/128 Hz) to remove low-frequency noise. Serial autocorrelations within the data were accounted for by modeling the error term as an autoregressive process. Estimated beta values were used to define t-contrasts for each subject corresponding to the main effect of each of the 12 different experimental factor level combinations (i.e. combinations of task, cue and valence). The computed contrasts were then subjected to a second-level group analysis of variance employing a full-factorial design with task (emo/aes), emotional valence (positive/negative/neutral) and type of cue (facial/vocal) specified as within-subject factors and unequal variances assumed for measurements in each level. Resulting main effects and interactions were assessed for significance at cluster level using a cluster-defining threshold of P < 0.001 uncorrected and a cluster-wise significance level of P < 0.05 corrected for multiple comparisons (across the whole brain) as criterion. Corrected cluster-level P-values were determined using the NS Toolbox (http://fmri.wfubmc.edu/cms/software#NS). Additionally, analyses were conducted to explore relationships between regional brain activation and the behavioral responses given by the participants (see Supplementary Data).
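
To make the event modeling concrete, the sketch below builds a single regressor by convolving a reading-period boxcar with a textbook double-gamma approximation of the canonical HRF. SPM's actual implementation differs in detail; the onset, duration, run length and sampling grid here are illustrative only:

```python
import numpy as np
from scipy.stats import gamma

TR = 1.7          # repetition time in seconds
n_scans = 400     # illustrative run length
dt = 0.1          # high-resolution time grid for convolution (seconds)

def canonical_hrf(t):
    """Double-gamma approximation of the canonical HRF (peak ~6 s, undershoot ~16 s)."""
    return gamma.pdf(t, 6) - gamma.pdf(t, 16) / 6.0

hrf = canonical_hrf(np.arange(0, 32, dt))

# One event: stimulus onset at 10 s, modeled duration = the individual reading duration (e.g. 6.5 s)
n_bins = round(n_scans * TR / dt)
timeline = np.zeros(n_bins)
onset, duration = 10.0, 6.5
timeline[round(onset / dt):round((onset + duration) / dt)] = 1.0

# Convolve the boxcar with the HRF and down-sample to the scan grid
regressor_hr = np.convolve(timeline, hrf)[:n_bins] * dt
regressor = regressor_hr[::round(TR / dt)][:n_scans]
print(regressor.shape)  # (400,) -> one column of the design matrix
```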

Statistical analysis: functional localizer

Analyses of localizer data relied on a general linear model with each of the three stimulation conditions (VS, AS and ES) modeled as a separate regressor using a boxcar function of 8 s duration convolved with the HRF. Voice-sensitive brain activation was evaluated by contrasting brain responses to VS with activation elicited by both ES and AS (t-contrast: VS > AS, ES). The respective contrasts were then subjected to a second-level random effects analysis. Results were assessed for cluster-wise significance using a cluster-defining threshold of P < 0.001 uncorrected and a cluster-wise significance level of P < 0.05 corrected for multiple comparisons (across the whole brain) as criterion.
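
The contrast VS > (AS, ES) corresponds to a contrast vector that weights the VS regressor positively and the two control regressors negatively, summing to zero. A minimal sketch of a first-level contrast computed by ordinary least squares; the design matrix and time series below are simulated placeholders, whereas the actual analysis used SPM8:

```python
import numpy as np

rng = np.random.default_rng(0)
n_scans = 300

# Simulated design matrix: one regressor per condition [VS, AS, ES] plus a constant
X = np.column_stack([rng.random(n_scans) for _ in range(3)] + [np.ones(n_scans)])
y = rng.random(n_scans)                      # placeholder voxel time series

beta = np.linalg.lstsq(X, y, rcond=None)[0]  # beta estimates for [VS, AS, ES, constant]

# Contrast vector for VS > (AS, ES): positive weight on VS, negative on AS and ES
c = np.array([1.0, -0.5, -0.5, 0.0])
contrast_value = c @ beta
print(contrast_value)
```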

Aiming to evaluate the contribution of the TVA to the processing of literary voice descriptions, TVA masks were generated based on the results of the functional localizer scans and used to explore brain activation within these regions during reading. To this end, beta values (estimated for each event) were extracted from all voxels within the left or right TVA and subsequently averaged among voxels within the same hemisphere. Aiming to further explore activation differences related to task, emotional valence or type of cue, the respective mean beta values were subjected to separate repeated-measures analyses of variance (i.e. one for the right and one for the left TVA). Moreover, to evaluate the role of writing style, a second exploratory analysis was conducted to explore effects of the use of direct speech on TVA activation during reading. The motivation to test for effects of direct speech was derived from recent research findings suggesting that differences in reporting style modulate reading-related TVA responses (Yao et al., 2011). Of the 39 voice descriptions employed in the current experiment, 13 utilized direct speech quotations (e.g. ‘Das ist nicht zu ertragen’, sprach die Fürstin leise mit zitternder Stimme1). To infer differences between the two reporting styles, beta estimates corresponding to ‘direct-speech’ or ‘no-direct-speech’ text samples were extracted from the TVA and compared by means of paired-samples t-tests.
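
A minimal sketch of the ROI averaging and the direct-speech comparison, assuming per-event beta images and a binary TVA mask are available as NumPy arrays; array shapes and the simulated values are hypothetical (the simulated group means loosely echo the right-TVA values reported in the Results), and the original analysis was run in SPM8:

```python
import numpy as np
from scipy.stats import ttest_rel

rng = np.random.default_rng(1)

def roi_mean(beta_img: np.ndarray, mask: np.ndarray) -> float:
    """Average a (flattened) beta image over the voxels of a binary ROI mask."""
    return float(beta_img[mask.astype(bool)].mean())

# Placeholder single-event example: one beta image and a right-TVA mask of 510 voxels
beta_img = rng.standard_normal(100_000)
tva_mask = np.zeros(100_000)
tva_mask[:510] = 1
print(roi_mean(beta_img, tva_mask))

# Paired-samples t-test across the 22 participants:
# per-subject mean TVA betas for direct vs. no-direct-speech samples (simulated data)
n_subjects = 22
betas_direct = 0.53 + 0.1 * rng.standard_normal(n_subjects)
betas_no_direct = 0.40 + 0.1 * rng.standard_normal(n_subjects)
t_stat, p_val = ttest_rel(betas_direct, betas_no_direct)
print(f"t({n_subjects - 1}) = {t_stat:.2f}, p = {p_val:.3f}")
```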

RESULTS

Behavioral data

On average, judgments of emotional valence replicated the valence categories assigned to the texts employed in the study (Figure 1): on a four-point scale ranging from 1—highly positive to 4—highly negative, text samples selected to represent positive states of mind received average ratings of meanpos = 1.73 (±0.06 s.e.m.), while mean ratings obtained for neutral and negative text samples averaged meanneu = 2.50 (±0.04 s.e.m.) and meanneg = 3.28 (±0.05 s.e.m.), respectively.

Fig. 1.

Fig. 1

Behavioral data: ratings of emotional valence and aesthetic values as well as corresponding mean reading durations observed for each valence category (positive = pos, neutral = neu, negative = neg) and type of cue (facial cues = dark gray bars, vocal cues = white bars). Results are shown as mean values ± 1 s.e.m.

Considering judgments of aesthetic value (Figure 1), ratings obtained on a four-point scale ranging from 1—very poorly written to 4—very well written indicated that, overall, the highest aesthetic value was assigned to text samples expressing positive emotions (meanpos = 2.83 ± 0.08 s.e.m.), followed by texts expressing neutral (meanneu = 2.70 ± 0.06 s.e.m.) and negative states of mind (meanneg = 2.59 ± 0.08 s.e.m.).

Reading durations obtained during the emotion judgment and aesthetics judgment tasks (Figure 1) revealed that participants took longest to read text samples expressing a neutral as compared to an emotional state of mind (meanneu_emo = 6455 ms ± 376 ms s.e.m.; meanneu_aes = 6518 ms ± 412 ms s.e.m.; meanpos_emo = 5753 ms ± 326 ms s.e.m.; meanpos_aes = 6143 ms ± 412 ms s.e.m.; meanneg_emo = 5764 ms ± 317 ms s.e.m.; meanneg_aes = 6270 ms ± 426 ms s.e.m.).

fMRI data: analysis of variance

Significant results are summarized in Table 3 and Figure 2.

Table 3.

Significant results obtained from an analysis of variance computed on brain activation data

| Effect | Hemisphere | Anatomical definition(a) | x | y | z | zmax | ke | Pcorr(b) |
| Main effect of task | L | Frontal mid/frontal inf | −48 | 15 | 36 | 4.38 | 168 | 0.000 |
| Main effect of emotional valence | L/R | Cerebellum | −9 | −42 | −45 | 4.55 | 191 | 0.000 |
|  | L/R | Frontal medial/cingulum ant | 0 | 48 | −6 | 4.37 | 294 | 0.000 |
|  | L/R | Cingulum post | 0 | −30 | 24 | 4.37 | 141 | 0.001 |
| Main effect of cue type | L | Temporal mid/temporal sup/temporal pole | −57 | −9 | −15 | 5.72 | 194 | 0.000 |
|  | L | Temporal mid | −60 | −36 | −3 | 4.43 | 59 | 0.048 |
|  | R | Temporal sup/temporal mid | 63 | −6 | −15 | 4.35 | 199 | 0.000 |
|  | R | Parietal sup/occipital sup/occipital mid | 30 | −69 | 36 | 3.77 | 64 | 0.037 |

(a) Anatomical definitions are based on labels obtained using the cluster labeling tool provided by the SPM toolbox Automated Anatomical Labeling (AAL; Tzourio-Mazoyer et al., 2002).

(b) Corrected for multiple comparisons across the whole brain at cluster level.

Fig. 2.

Fig. 2

Significant results obtained from the analysis of variance computed on brain activation data. Displayed are renderings of significant activation clusters (cluster-level P-value < 0.05 corrected for multiple comparisons across the whole brain) as well as beta estimates plotted as bar diagrams to further detail the findings (error bars: ±1 s.e.m., **P < 0.01, *P < 0.05). Green colors show activations corresponding to the main effect of task, whereas red colors depict activations corresponding to the main effect of cue type, and blue colors indicate activations reflecting the main effect of emotion.

Main effect of task

Analyses of fMRI data indicated a significant main effect of task on cerebral responses within the left lateral frontal cortex (i.e. left middle and inferior frontal cortex). Post hoc comparisons computed on beta values extracted from this activation cluster revealed that this main effect was driven by increased activation of the lateral frontal cortex during the emotion judgment task as compared to the aesthetics judgment task.

Main effect of valence category

A significant main effect of valence category on brain activation was observed within the arMFC (including the anterior cingulate cortex), the cerebellum and the posterior cingulate cortex (PCC). Post hoc comparisons computed for each activation cluster indicated that valence-related effects observed within the medial frontal cortex were driven by increased responses of this region to text samples expressing positive (relative to negative or neutral) emotions, while effects observed for the cerebellum and PCC were explained by stronger responses of these regions to text samples conveying both positive and negative (as compared to neutral) emotional states.

Main effect of cue type

Moreover, analyses indicated cue-related activation differences in the left posterior, mid and superior temporal cortex as well as the right superior temporal cortex and right parietal cortex extending into the superior occipital cortex. Post hoc inspections of the observed main effect of cue type showed that all the reported regions responded more strongly to descriptions of vocal as compared to facial cues.

Interactions

As far as the modeled interaction terms are concerned, no significant findings emerged at the chosen statistical thresholds.

fMRI data: TVA activation

Comparisons between cue-related activation patterns obtained in the reading experiment and voice-sensitive brain activation (as determined by the functional localizer) indicated a substantial overlap between the temporal brain structures implicated in the processing of voice descriptions and the TVA: 83% (= 165 of 199) of the voxels activated within the right superior temporal cortex, 13% (= 25 of 194) of the voxels activated within the left mid and superior temporal cortex, and 22% (= 13 of 59) of the voxels activated within the left posterior superior temporal cortex overlapped with voice-sensitive brain structures located within the right and left hemisphere (right TVA: activation peak: 60, −18, −3, ke: 510, Pcorr = 0.000; left TVA: activation peak: −60, −9, 0, ke: 289, Pcorr = 0.000; Figure 3).

Fig. 3.

Fig. 3

Diagrams depicting contrast estimates of cue-type related activation differences obtained for each individual within the left (upper panel) and right (lower panel) TVA. Printed on the right are renderings showing the overlap between TVAs (outlined in blue) and ‘voice-sensitive’ activation clusters obtained in the reading experiment (main effect of cue type, voices > faces).

Analyses conducted on beta values extracted from both the right and left TVA indicated a significant main effect of cue type on brain activation within these regions, explained by increased responses to descriptions of voices relative to descriptions of faces [left TVA: F(1,21) = 24.92, P < 0.001; right TVA: F(1,21) = 9.46, P = 0.006]. Estimates of cue-type-related activation differences obtained for each individual within the left and right TVA are displayed in Figure 3. Task instructions or emotional valence, on the other hand, did not influence TVA responses (all main effects and interactions involving task or valence P ≥ 0.115).

As far as effects of writing style are concerned, beta values extracted from the TVA indicated higher mean activation to text samples including direct speech statements for both the right (meanno_direct = 0.40 ± 0.05 s.e.m.; meandirect = 0.53 ± 0.07 s.e.m.) and left (meanno_direct = 0.59 ± 0.09 s.e.m.; meandirect = 0.69 ± 0.11 s.e.m.) TVA. However, only activation differences observed within the right TVA reached statistical significance at a conventional threshold of P < 0.05 [right TVA: t(21) = −2.55, P = 0.019; left TVA: t(21) = −1.93, P = 0.068].

DISCUSSION

Building on the assumption that similarities between written and acoustic representations of vocal cues may not only emerge with respect to their functional role in communication but may also extend to the cerebral mechanisms involved in the decoding process, this study sought to explore brain responses associated with processing emotional voice signals described in literary texts.

In line with this assumption, the obtained results suggest that the decoding of literary descriptions of non-verbal affective signals may indeed (partly) rely on a set of brain regions previously implicated in the auditory perception of emotional voices.

Similarities between the perception of acoustic and described emotional cues, for instance, emerged with respect to a task-dependent recruitment of lateral frontal brain structures, particularly the inferior frontal cortex, in situations that require the explicit evaluation of the emotional information expressed in a given cue.

Considering the frontal cortex’s role in emotion processing, recent meta-analyses (Kober et al., 2008; Wager et al., 2010) implicate the lateral frontal cortex, particularly the inferior frontal cortex, in a functional group of brain regions essential to information selection and meaning analysis (Kober et al., 2008; Wager et al., 2010; Lindquist et al., 2012) across a wide range of emotion-related tasks. Similar suggestions are also reflected in models of affective information processing which assume a key role of the lateral frontal cortex in higher-order stages of the decoding process related to the appraisal, interpretation and conceptual categorization of expressed emotions (e.g. Schirmer and Kotz, 2006; Wildgruber et al., 2009; Brück et al., 2011b). Given research findings linking particularly the inferior frontal cortex to the mirror system of the human brain (Iacoboni and Dapretto, 2006; Van Overwalle and Baetens, 2009), increased activation of this brain area in response to the emotion judgment task might be driven by increased efforts to mirror or simulate described facial or vocal expressions, a process assumed to aid or facilitate our understanding of others.

However, as far as task effects are concerned, it should be noted that the processing of emotional connotations and the associated mirroring of described emotional expressions may have partially contributed to aesthetic judgments as well. In other words, valence judgments may actually be part of the decision process that leads to judgments of aesthetic value, with text samples expressing a positive emotion receiving slightly higher judgments of aesthetic value than text samples expressing neutral or negative emotions (see ‘Behavioral data’ section; Figure 1). Analyses conducted to explore relationships between brain activation and the aesthetic or valence ratings given by each participant revealed significant relationships with activation of the arMFC for both rating types (see Supplementary Data), extending suggestions of an overlap between the two rating tasks to the level of brain activation.

Aside from task instructions and the associated shift in the focus of attention, reading-related brain responses proved to be affected by the emotionality of the described communication signals: compared to neutral text samples, the processing of cue descriptions conveying emotional states more strongly engaged a set of midline structures including aspects of the arMFC, PCC and cerebellum. While the role of the cerebellum remains elusive, hypotheses regarding the contribution of the PCC and arMFC may again be derived from several research reports tying the respective brain regions to different sub-functions and cognitive operations involved in emotion perception.

Activation of the medial frontal cortex has frequently been related to social cognitive processing, with anterior rostral aspects linked to ‘mentalizing’ (Amodio and Frith, 2006)—processes by which inferences about the mental states of others are made (Frith and Frith, 2006). Enhanced arMFC activation observed in response to the presentation of emotional face and voice descriptions might thus be assumed to reflect mentalizing, which further appeared to be modulated by the emotional salience of the respective signals (as reflected in increased arMFC responses to emotional, particularly positive, cues).

Considering the contribution of posterior cingulate brain structures, research findings linking the PCC to the processing of stimuli carrying affective meaning (e.g. Maddock and Buonocore, 1997; Maddock et al., 2003) as well as to episodic memory (e.g. Henson et al., 1999; Maddock et al., 2001) suggest a role of the PCC in the interaction between memory and emotions (Maddock, 1999). The term interaction in this case describes a regulation of memory by emotion (Maddock, 1999) that may be most commonly expressed in a more efficient encoding, and thus an enhancement of memory, for emotional events. However, such interaction effects may also involve memory search and retrieval: one could assume that observed emotions cue the recall of similar emotional states or events (personally experienced in the past), and that the recall of such memories may in turn serve as a reference to interpret current observations (Lindquist et al., 2012). In other words, the literary descriptions of emotional expressions presented in this study could have cued in the reader an emotion-related memory search and retrieval mediated by the PCC.

While the observed responses of the PCC as well as of the lateral and medial frontal cortex may be considered to reflect brain responses reported across a wide range of emotion-related tasks and phenomena, the temporal activation observed in this study appears to reveal a voice-sensitive modulation of activation. The latter assumption is corroborated by the observation of increased responding of this structure to voice descriptions, which furthermore appeared to be enhanced by the use of direct speech quotations mimicking speech acts (Yao et al., 2011). Considering the localization of the observed ‘voice’-sensitive modulations of brain activation, comparisons conducted between reading-related brain responses and functional localizer data revealed a substantial overlap between the identified ‘voice’-sensitive activation clusters and areas specialized for the perception of human voices, termed the TVA (Belin et al., 2000, 2002).

In analogy to the face-sensitive structures reported for the human visual system (i.e. the fusiform face area; Kanwisher et al., 1997), the TVA has been suggested to represent a processing module that subserves the auditory analysis of voices (Campanella and Belin, 2007; Belin et al., 2011), relevant to a rich set of voice cognition abilities including the extraction of emotional information encoded in a voice. However, the role of the TVA may not be limited to the auditory analysis and decoding of acoustic voice cues alone. Rather, recent research reports as well as the results obtained in this study demonstrate activation of voice-sensitive brain areas even in the absence of acoustic stimulation: research published on the cerebral structures recruited during (non-clinical) auditory verbal hallucinations (Linden et al., 2011) or the silent reading of text samples depicting different speech acts (Yao et al., 2011) may serve as examples of activation of voice-sensitive brain structures that is not driven by acoustic stimulation.

However, a common denominator among experiments investigating verbal hallucinations and reading studies (including the current experiment) may be the shared experience of an ‘inner voice’ in the process. As far as reading is concerned, anecdotal reports as well as observations obtained in behavioral experiments identify the occurrence of an inner voice, or the perceptual simulation of voice characteristics while reading, as a commonplace phenomenon frequently observed among individuals (Alexander and Nygaard, 2008; Kurby et al., 2009; Yao and Scheepers, 2011). Recent neuroimaging findings, moreover, link these perceptual simulations to a top-down activation of the TVA that occurs even when readers are not explicitly instructed to imagine the sound of a voice (Yao et al., 2011). The latter findings connecting TVA activation with processes of auditory mental imagery, in turn, may be interpreted to suggest that the TVA may not only represent a processing site integral to the acoustic analysis of voice information but may also store acoustic information related to different vocal sounds that is re-activated during perceptual simulations. Recalling, recombining and modifying this stored sound information may not only give rise to auditory imagery, and thus the experience of hearing an inner voice (Kosslyn et al., 2001); it may also facilitate the voice decoding process in the sense that mental representations formed on the basis of previous experiences may help in assigning meaning to presented voice descriptions. Activation of the TVA observed during reading may thus reflect the access of voice-related memories and the formation of mental images used in the process.

Aside from the more general observation of increased TVA activation in response to voice descriptions, the link between TVA responses and auditory mental imagery is further substantiated by the observation that stylistic devices such as direct speech quotations, aimed at increasing mental imagery during reading, further enhance reading-related TVA responses. As reasoned by Yao et al. (2011), direct speech quotations are ‘assumed to entail a demonstration of the reported utterance’ rather than a ‘mere description’, thus providing the reader with a more vivid and perceptually engaging exemplar of a speech act that raises the likelihood that readers will ‘activate “audible speech”-like representations’ during reading (p. 3146). Yao et al. (2011) go on to link these direct-speech-related simulation processes to increases in TVA activation, strengthening the suggestion that the TVA contributes to processes of auditory mental imagery.

CONCLUSION

Whether an acoustic phenomenon in a day-to-day conversation or a vivid description in a book, emotional voice cues may share common characteristics that not only relate to their functional role as a valuable source of information but may also include the cerebral mechanisms associated with the decoding process. Similarities emerge with respect to the recruitment of both specialized voice perception areas and brain regions such as the posterior cingulate or the lateral and medial frontal cortex assumed to subserve functions relevant to emotion perception in a broader context. The observed similarities, in turn, may suggest a common perceptual mechanism that underlies the ability to decode emotional voice cues across a wide range of tasks or forms of presentation.

SUPPLEMENTARY DATA

Supplementary data are available at SCAN online.


Acknowledgments

The authors would like to thank Cyril Belica, Holger Keibel, Marc Kupietz and Rainer Perkuhn of the Institute for the German Language (IDS) Mannheim for their support in determining measures of lexical complexity and syntactic complexity employed in the stimulus selection process.

This work was supported by the Werner Reichardt Centre for Integrative Neuroscience (CIN), Tübingen (CIN 2009-17).

Footnotes

1 English translation: ‘I cannot take this any longer’, the baroness said quietly with a quivering voice. Hoffmann, E.T.A. (1912). Lebensansichten des Katers Murr. Hamburg: Verlag Alfred Janssen, p. 194. English translation provided by the authors.

REFERENCES

  1. Ackermann H, Hertrich I, Grodd W, Wildgruber D. Das Hören von Gefühlen: Funktionell-neuroanatomische Grundlage der Verarbeitung affektiver Prosodie. Aktuelle Neurologie. 2004;31:449–60. [Google Scholar]
  2. Alexander JD, Nygaard LC. Reading voices and hearing text: talker-specific auditory imagery in reading. Journal of Experimental Psychology. Human Perception and Performance. 2008;34(2):446–59. doi: 10.1037/0096-1523.34.2.446. [DOI] [PubMed] [Google Scholar]
  3. Amodio DM, Frith CD. Meeting of minds: the medial frontal cortex and social cognition. Nature Reviews Neuroscience. 2006;7(4):268–77. doi: 10.1038/nrn1884. [DOI] [PubMed] [Google Scholar]
  4. Banse R, Scherer KR. Acoustic profiles in vocal emotion expression. Journal of Personality and Social Psychology. 1996;70(3):614–36. doi: 10.1037//0022-3514.70.3.614. [DOI] [PubMed] [Google Scholar]
  5. Belin P, Bestelmeyer PE, Latinus M, Watson R. Understanding voice perception. British Journal of Psychology. 2011;102(4):711–25. doi: 10.1111/j.2044-8295.2011.02041.x. [DOI] [PubMed] [Google Scholar]
  6. Belin P, Zatorre RJ, Ahad P. Human temporal-lobe response to vocal sounds. Brain Research. Cognitive Brain Research. 2002;13(1):17–26. doi: 10.1016/s0926-6410(01)00084-2. [DOI] [PubMed] [Google Scholar]
  7. Belin P, Zatorre RJ, Lafaille P, Ahad P, Pike B. Voice-selective areas in human auditory cortex. Nature. 2000;403(6767):309–12. doi: 10.1038/35002078. [DOI] [PubMed] [Google Scholar]
  8. Bestelmeyer PE, Belin P, Grosbras MH. Right temporal TMS impairs voice detection. Current Biology. 2011;21(20):R838–9. doi: 10.1016/j.cub.2011.08.046. [DOI] [PubMed] [Google Scholar]
  9. Brück C, Kreifelts B, Kaza E, Lotze M, Wildgruber D. Impact of personality on the cerebral processing of emotional prosody. Neuroimage. 2011a;58(1):259–68. doi: 10.1016/j.neuroimage.2011.06.005. [DOI] [PubMed] [Google Scholar]
  10. Brück C, Kreifelts B, Wildgruber D. Emotional voices in context: a neurobiological model of multimodal affective information processing. Physics of Life Reviews. 2011b;8(4):383–403. doi: 10.1016/j.plrev.2011.10.002. [DOI] [PubMed] [Google Scholar]
  11. Campanella S, Belin P. Integrating face and voice in person perception. Trends in Cognitive Science. 2007;11(12):535–43. doi: 10.1016/j.tics.2007.10.001. [DOI] [PubMed] [Google Scholar]
  12. Ethofer T, Anders S, Erb M, et al. Cerebral pathways in processing of affective prosody: a dynamic causal modeling study. Neuroimage. 2006;30(2):580–7. doi: 10.1016/j.neuroimage.2005.09.059. [DOI] [PubMed] [Google Scholar]
  13. Ethofer T, Bretscher J, Gschwind M, Kreifelts B, Wildgruber D, Vuilleumier P. Emotional voice areas: anatomic location, functional properties, and structural connections revealed by combined fMRI/DTI. Cerebral Cortex. 2012;22(1):191–200. doi: 10.1093/cercor/bhr113. [DOI] [PubMed] [Google Scholar]
  14. Ethofer T, Kreifelts B, Wiethoff S, et al. Differential influences of emotion, task, and novelty on brain regions underlying the processing of speech melody. Journal of Cognitive Neuroscience. 2009a;21(7):1255–68. doi: 10.1162/jocn.2009.21099. [DOI] [PubMed] [Google Scholar]
  15. Ethofer T, Van De Ville D, Scherer K, Vuilleumier P. Decoding of emotional information in voice-sensitive cortices. Current Biology. 2009b;19(12):1028–33. doi: 10.1016/j.cub.2009.04.054. [DOI] [PubMed] [Google Scholar]
  16. Frith CD, Frith U. The neural basis of mentalizing. Neuron. 2006;50(4):531–4. doi: 10.1016/j.neuron.2006.05.001. [DOI] [PubMed] [Google Scholar]
  17. Fusar-Poli P, Placentino A, Carletti F, et al. Functional atlas of emotional faces processing: a voxel-based meta-analysis of 105 functional magnetic resonance imaging studies. Journal of Psychiatry and Neuroscience. 2009;34(6):418–32. [PMC free article] [PubMed] [Google Scholar]
  18. Henson RN, Rugg MD, Shallice T, Josephs O, Dolan RJ. Recollection and familiarity in recognition memory: an event-related functional magnetic resonance imaging study. Journal of Neuroscience. 1999;19(10):3962–72. doi: 10.1523/JNEUROSCI.19-10-03962.1999. [DOI] [PMC free article] [PubMed] [Google Scholar]
  19. Iacoboni M, Dapretto M. The mirror neuron system and the consequences of its dysfunction. Nature Reviews Neuroscience. 2006;7(12):942–51. doi: 10.1038/nrn2024. [DOI] [PubMed] [Google Scholar]
  20. Kanwisher N, McDermott J, Chun MM. The fusiform face area: a module in human extrastriate cortex specialized for face perception. Journal of Neuroscience. 1997;17(11):4302–11. doi: 10.1523/JNEUROSCI.17-11-04302.1997. [DOI] [PMC free article] [PubMed] [Google Scholar]
  21. Kesler-West ML, Andersen AH, Smith CD, et al. Neural substrates of facial emotion processing using fMRI. Brain Research. Cognitive Brain Research. 2001;11(2):213–26. doi: 10.1016/s0926-6410(00)00073-2. [DOI] [PubMed] [Google Scholar]
  22. Kober H, Barrett LF, Joseph J, Bliss-Moreau E, Lindquist K, Wager TD. Functional grouping and cortical-subcortical interactions in emotion: a meta-analysis of neuroimaging studies. Neuroimage. 2008;42(2):998–1031. doi: 10.1016/j.neuroimage.2008.03.059. [DOI] [PMC free article] [PubMed] [Google Scholar]
  23. Kosslyn SM, Ganis G, Thompson WL. Neural foundations of imagery. Nature Reviews Neuroscience. 2001;2(9):635–42. doi: 10.1038/35090055. [DOI] [PubMed] [Google Scholar]
  24. Kupietz, M., Belica, C., Keibel, H., Witt, A. (2010). The German Reference Corpus DEREKO: a primordial sample for linguistic research. In: Calzolari, N., Choukri, K., Maegaard, B., et al., editors. Proceedings of the Seventh Conference on International Language Resources and Evaluation (LREC 2010), pp. 1848–54.
  25. Kurby CA, Magliano JP, Rapp DN. Those voices in your head: activation of auditory images during reading. Cognition. 2009;112(3):457–61. doi: 10.1016/j.cognition.2009.05.007. [DOI] [PMC free article] [PubMed] [Google Scholar]
  26. Linden DE, Thornton K, Kuswanto CN, Johnston SJ, van de Ven V, Jackson MC. The brain's voices: comparing nonclinical auditory hallucinations and imagery. Cerebral Cortex. 2011;21(2):330–7. doi: 10.1093/cercor/bhq097. [DOI] [PubMed] [Google Scholar]
  27. Lindquist KA, Wager TD, Kober H, Bliss-Moreau E, Barrett LF. The brain basis of emotion: a meta-analytic review. Behavioral and Brain Sciences. 2012;35(3):121–43. doi: 10.1017/S0140525X11000446. [DOI] [PMC free article] [PubMed] [Google Scholar]
  28. Maddock RJ. The retrosplenial cortex and emotion: new insights from functional neuroimaging of the human brain. Trends in Neuroscience. 1999;22(7):310–6. doi: 10.1016/s0166-2236(98)01374-5. [DOI] [PubMed] [Google Scholar]
  29. Maddock RJ, Buonocore MH. Activation of left posterior cingulate gyrus by the auditory presentation of threat-related words: an fMRI study. Psychiatry Research. 1997;75(1):1–14. doi: 10.1016/s0925-4927(97)00018-8. [DOI] [PubMed] [Google Scholar]
  30. Maddock RJ, Garrett AS, Buonocore MH. Remembering familiar people: the posterior cingulate cortex and autobiographical memory retrieval. Neuroscience. 2001;104(3):667–76. doi: 10.1016/s0306-4522(01)00108-7. [DOI] [PubMed] [Google Scholar]
  31. Maddock RJ, Garrett AS, Buonocore MH. Posterior cingulate cortex activation by emotional words: fMRI evidence from a valence decision task. Human Brain Mapping. 2003;18(1):30–41. doi: 10.1002/hbm.10075. [DOI] [PMC free article] [PubMed] [Google Scholar]
  32. Meyer M, Baumann S, Wildgruber D, Alter K. How the brain laughs. Comparative evidence from behavioral, electrophysiological and neuroimaging studies in human and monkey. Behavioural Brain Research. 2007;182(2):245–60. doi: 10.1016/j.bbr.2007.04.023. [DOI] [PubMed] [Google Scholar]
  33. Peelen MV, Atkinson AP, Vuilleumier P. Supramodal representations of perceived emotions in the human brain. Journal of Neuroscience. 2010;30(30):10127–34. doi: 10.1523/JNEUROSCI.2161-10.2010. [DOI] [PMC free article] [PubMed] [Google Scholar]
  34. Sabatinelli D, Fortune EE, Li Q, et al. Emotional perception: meta-analyses of face and natural scene processing. Neuroimage. 2011;54(3):2524–33. doi: 10.1016/j.neuroimage.2010.10.011. [DOI] [PubMed] [Google Scholar]
  35. Sauter DA, Eisner F, Calder AJ, Scott SK. Perceptual cues in nonverbal vocal expressions of emotion. Quarterly Journal of Experimental Psychology. 2010;63(11):2251–72. doi: 10.1080/17470211003721642. [DOI] [PMC free article] [PubMed] [Google Scholar]
  36. Schirmer A, Kotz SA. Beyond the right hemisphere: brain mechanisms mediating vocal emotional processing. Trends in Cognitive Science. 2006;10(1):24–30. doi: 10.1016/j.tics.2005.11.009. [DOI] [PubMed] [Google Scholar]
  37. Szameitat DP, Alter K, Szameitat AJ, Wildgruber D, Sterr A, Darwin CJ. Acoustic profiles of distinct emotional expressions in laughter. The Journal of the Acoustical Society of America. 2009;126(1):354–66. doi: 10.1121/1.3139899. [DOI] [PubMed] [Google Scholar]
  38. Tzourio-Mazoyer N, Landeau B, Papathanassiou D, et al. Automated anatomical labeling of activations in SPM using a macroscopic anatomical parcellation of the MNI MRI single-subject brain. Neuroimage. 2002;15(1):273–89. doi: 10.1006/nimg.2001.0978. [DOI] [PubMed] [Google Scholar]
  39. Van Overwalle F, Baetens K. Understanding others' actions and goals by mirror and mentalizing systems: a meta-analysis. Neuroimage. 2009;48(3):564–84. doi: 10.1016/j.neuroimage.2009.06.009. [DOI] [PubMed] [Google Scholar]
  40. Wager TD, Feldman Barrett L, Bliss-Moreau E, et al. The neuroimaging of emotion. In: Lewis M, Haviland-Jones JM, Feldman Barrett L, editors. Handbook of Emotions. Vol. 3. New York: Guilford Press; 2010. pp. 249–71. [Google Scholar]
  41. Wiethoff S, Wildgruber D, Grodd W, Ethofer T. Response and habituation of the amygdala during processing of emotional prosody. Neuroreport. 2009;20(15):1356–60. doi: 10.1097/WNR.0b013e328330eb83. [DOI] [PubMed] [Google Scholar]
  42. Wildgruber D, Ackermann H, Kreifelts B, Ethofer T. Cerebral processing of linguistic and emotional prosody: fMRI studies. Progress in Brain Research. 2006;156:249–68. doi: 10.1016/S0079-6123(06)56013-3. [DOI] [PubMed] [Google Scholar]
  43. Wildgruber D, Ethofer T, Grandjean D, Kreifelts B. A cerebral network model of speech prosody comprehension. International Journal of Speech-Language Pathology. 2009;11(4):277–81. [Google Scholar]
  44. Wildgruber D, Hertrich I, Riecker A, et al. Distinct frontal regions subserve evaluation of linguistic and emotional aspects of speech intonation. Cerebral Cortex. 2004;14(12):1384–9. doi: 10.1093/cercor/bhh099. [DOI] [PubMed] [Google Scholar]
  45. Wildgruber D, Riecker A, Hertrich I, et al. Identification of emotional intonation evaluated by fMRI. Neuroimage. 2005;24(4):1233–41. doi: 10.1016/j.neuroimage.2004.10.034. [DOI] [PubMed] [Google Scholar]
  46. Yao B, Belin P, Scheepers C. Silent reading of direct versus indirect speech activates voice-selective areas in the auditory cortex. Journal of Cognitive Neuroscience. 2011;23(10):3146–52. doi: 10.1162/jocn_a_00022. [DOI] [PubMed] [Google Scholar]
  47. Yao B, Scheepers C. Contextual modulation of reading rate for direct versus indirect speech quotations. Cognition. 2011;121(3):447–53. doi: 10.1016/j.cognition.2011.08.007. [DOI] [PubMed] [Google Scholar]
  48. Zald DH. The human amygdala and the emotional evaluation of sensory stimuli. Brain Research. Brain Research Reviews. 2003;41(1):88–123. doi: 10.1016/s0165-0173(02)00248-5. [DOI] [PubMed] [Google Scholar]
