- Letter
Prolonged dopamine signalling in striatum signals proximity and value of distant rewards
Nature volume 500, pages 575–579 (2013)
Abstract
Predictions about future rewarding events have a powerful influence on behaviour. The phasic spike activity of dopamine-containing neurons, and corresponding dopamine transients in the striatum, are thought to underlie these predictions, encoding positive and negative reward prediction errors (refs 1–5). However, many behaviours are directed towards distant goals, for which transient signals may fail to provide sustained drive. Here we report an extended mode of reward-predictive dopamine signalling in the striatum that emerged as rats moved towards distant goals. These dopamine signals, which were detected with fast-scan cyclic voltammetry (FSCV), gradually increased or, in rare instances, decreased as the animals navigated mazes to reach remote rewards, rather than having phasic or steady tonic profiles. These dopamine increases (ramps) scaled flexibly with both the distance and size of the rewards. During learning, these dopamine signals showed spatial preferences for goals in different locations and readily changed in magnitude to reflect changing values of the distant rewards. Such prolonged dopamine signalling could provide sustained motivational drive, a control mechanism that may be important for normal behaviour and that can be impaired in a range of neurologic and neuropsychiatric disorders.
References
Cohen, J. Y., Haesler, S., Vong, L., Lowell, B. B. & Uchida, N. Neuron-type-specific signals for reward and punishment in the ventral tegmental area. Nature 482, 85–88 (2012)
Matsumoto, M. & Hikosaka, O. Two types of dopamine neuron distinctly convey positive and negative motivational signals. Nature 459, 837–841 (2009)
Schultz, W., Dayan, P. & Montague, P. R. A neural substrate of prediction and reward. Science 275, 1593–1599 (1997)
Waelti, P., Dickinson, A. & Schultz, W. Dopamine responses comply with basic assumptions of formal learning theory. Nature 412, 43–48 (2001)
Bayer, H. M. & Glimcher, P. W. Midbrain dopamine neurons encode a quantitative reward prediction error signal. Neuron 47, 129–141 (2005)
Schultz, W. Getting formal with dopamine and reward. Neuron 36, 241–263 (2002)
Redgrave, P. & Gurney, K. The short-latency dopamine signal: a role in discovering novel actions? Nature Rev. Neurosci. 7, 967–975 (2006)
Zweifel, L. S. et al. Disruption of NMDAR-dependent burst firing by dopamine neurons provides selective assessment of phasic dopamine-dependent behavior. Proc. Natl Acad. Sci. USA 106, 7281–7288 (2009)
Wang, L. P. et al. NMDA receptors in dopaminergic neurons are crucial for habit learning. Neuron 72, 1055–1066 (2011)
Day, J. J., Jones, J. L., Wightman, R. M. & Carelli, R. M. Phasic nucleus accumbens dopamine release encodes effort- and delay-related costs. Biol. Psychiatry 68, 306–309 (2010)
Day, J. J., Roitman, M. F., Wightman, R. M. & Carelli, R. M. Associative learning mediates dynamic shifts in dopamine signaling in the nucleus accumbens. Nature Neurosci. 10, 1020–1028 (2007)
Gan, J. O., Walton, M. E. & Phillips, P. E. Dissociable cost and benefit encoding of future rewards by mesolimbic dopamine. Nature Neurosci. 13, 25–27 (2010)
Phillips, P. E., Stuber, G. D., Heien, M. L., Wightman, R. M. & Carelli, R. M. Subsecond dopamine release promotes cocaine seeking. Nature 422, 614–618 (2003)
Braun, A. A., Graham, D. L., Schaefer, T. L., Vorhees, C. V. & Williams, M. T. Dorsal striatal dopamine depletion impairs both allocentric and egocentric navigation in rats. Neurobiol. Learn. Mem. 97, 402–408 (2012)
Salamone, J. D., Correa, M., Farrar, A. & Mingote, S. M. Effort-related functions of nucleus accumbens dopamine and associated forebrain circuits. Psychopharmacology (Berl.) 191, 461–482 (2007)
Whishaw, I. Q. & Dunnett, S. B. Dopamine depletion, stimulation or blockade in the rat disrupts spatial navigation and locomotion dependent upon beacon or distal cues. Behav. Brain Res. 18, 11–29 (1985)
Salamone, J. D. & Correa, M. The mysterious motivational functions of mesolimbic dopamine. Neuron 76, 470–485 (2012)
Niv, Y., Daw, N. D., Joel, D. & Dayan, P. Tonic dopamine: opportunity costs and the control of response vigor. Psychopharmacology (Berl.) 191, 507–520 (2007)
Barnes, T. D., Kubota, Y., Hu, D., Jin, D. Z. & Graybiel, A. M. Activity of striatal neurons reflects dynamic encoding and recoding of procedural memories. Nature 437, 1158–1161 (2005)
Niv, Y., Duff, M. O. & Dayan, P. Dopamine, uncertainty and TD learning. Behav. Brain Funct. 1, 6 (2005)
Tobler, P. N., Fiorillo, C. D. & Schultz, W. Adaptive coding of reward value by dopamine neurons. Science 307, 1642–1645 (2005)
Fiorillo, C. D., Tobler, P. N. & Schultz, W. Discrete coding of reward probability and uncertainty by dopamine neurons. Science 299, 1898–1902 (2003)
Hikosaka, O., Sakamoto, M. & Usui, S. Functional properties of monkey caudate neurons. III. Activities related to expectation of target and reward. J. Neurophysiol. 61, 814–832 (1989)
Derdikman, D. & Moser, E. I. A manifold of spatial maps in the brain. Trends Cogn. Sci. 14, 561–569 (2010)
Flagel, S. B. et al. A selective role for dopamine in stimulus-reward learning. Nature 469, 53–57 (2011)
Berridge, K. C. The debate over dopamine’s role in reward: the case for incentive salience. Psychopharmacology (Berl.) 191, 391–431 (2007)
Clark, J. J. et al. Chronic microsensors for longitudinal, subsecond dopamine detection in behaving animals. Nature Methods 7, 126–129 (2010)
Keithley, R. B., Heien, M. L. & Wightman, R. M. Multivariate concentration determination using principal component regression with residual analysis. Trends Analyt. Chem. 28, 1127–1136 (2009)
Acknowledgements
We thank G. Telian, L. Li, T. Lechina, S. Ng-Evans and N. Hollon for help, and M. J. Kim, K. S. Smith, T. J. Sejnowski, L. G. Gibb and Y. Kubota for their comments. This work was supported by US National Institutes of Health (NIH) grant R01 MH060379 (A.M.G.), National Parkinson Foundation (A.M.G.), CHDI Foundation grant A-5552, the Stanley H. and Sheila G. Sydney Fund (A.M.G.), a Mark Gorenberg fellowship (M.W.H.), and NIH grants R01 DA027858 and R01 MH079292 (P.E.M.P.).
Author information
Authors and Affiliations
McGovern Institute for Brain Research and Department of Brain and Cognitive Sciences, Massachusetts Institute of Technology, Cambridge, 02139, Massachusetts, USA
Mark W. Howe, Patrick L. Tierney & Ann M. Graybiel
Department of Psychiatry and Behavioral Sciences, and Department of Pharmacology, University of Washington, Seattle, 98195, Washington, USA
Stefan G. Sandberg & Paul E. M. Phillips
Contributions
M.W.H. and A.M.G. designed the experiments, analysed the data, and wrote the manuscript. M.W.H. conducted the experiments. P.L.T. helped to develop the FSCV chronic recording setup. S.G.S. and P.E.M.P. provided technical training on FSCV and advice on data analysis. All authors contributed to discussion and interpretation of the findings.
Corresponding author
Correspondence to Ann M. Graybiel.
Ethics declarations
Competing interests
The authors declare no competing financial interests.
Extended data figures and tables
Extended Data Figure 1 Schematic drawing of recording locations.
Coloured dots indicate positions of probe tips as determined by post-experiment probe length measurements and histology (see Methods). Each colour represents probes from one rat.
Extended Data Figure 2 Comparisons of dopamine release in the VMS measured in behaving rats and evoked by electrical MFB stimulation in anaesthetized rats.
a, Current measured at each potential during successive scans with electrical MFB stimulation (red line) in an anaesthetized rat. b, Current–voltage plot from the 5 scans during the 0.5-s period after MFB stimulation. The large increase in current around 0.6 V (dotted line, peak) corresponds to the dopamine redox potential measured in vitro. c, Time course of current at the dopamine redox potential around electrical MFB stimulation (red line). d–f, Plots, as in a–c, illustrating dopamine response in the same rat, now awake, to room lights being turned on. g, Current changes averaged over all trials (n = 4,418 trials) in which ramping occurred (see Methods) during T-maze running. h, Average current–voltage plot for all identified ramping trials for the time period (−3 to −2 s) indicated by the brackets in the colour plot in g. i, Average current changes induced by tonic MFB stimulation in anaesthetized rats (n = 3). j, Average current–voltage plot from the bracketed time range (10 to 11 s) in i following the onset of stimulation.
Extended Data Figure 3 Patterns of dopamine signals in the VMS and DLS.
a, b, Dopamine concentration recorded in the VMS (a, n = 9 rats) and in the DLS (b, n = 8 rats). Data were first averaged across trials to yield session average traces for each probe in each session. These traces were averaged within rat to obtain one average trace per rat, which were then averaged across rats. These plots differ from those in Fig. 1e, f, which considered session averages for each probe to be an independent measure. Shading represents s.e.m. across rats. c, d, Distribution of average peak dopamine values for all recordings in VMS (c) and DLS (d). Each colour corresponds to an average peak dopamine concentration measured by a single probe in different sessions. e, f, Proportion of trial averaged dopamine recordings in the VMS (e, out of 300) and DLS (f, out of 262) that displayed a positive (blue) or negative (grey) ramping response during maze running, and an unclassified dopamine profile (red). g, h, Average dopamine concentration in the VMS (g) and DLS (h) for the positive ramping traces (top), the negative ramping traces (middle), and the unclassified traces (bottom). Shading represents s.e.m.
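The three-level averaging described for panels a and b (trials, then sessions within a rat, then rats) can be illustrated with a minimal Python sketch. The function names, the tuple layout of the input, and the use of plain lists for traces are assumptions made for illustration only; they are not taken from the authors' analysis code.

```python
from collections import defaultdict
from math import sqrt
from statistics import mean, stdev


def average_traces(traces):
    """Element-wise mean of equal-length traces (lists of floats)."""
    return [mean(samples) for samples in zip(*traces)]


def hierarchical_average(trials):
    """Average trials -> session/probe traces -> rat traces -> grand mean.

    `trials` is an iterable of (rat_id, probe_id, session_id, trace) tuples;
    this layout is hypothetical. Returns (grand_mean, sem_across_rats).
    """
    # Step 1: one average trace per probe per session.
    by_recording = defaultdict(list)
    for rat, probe, session, trace in trials:
        by_recording[(rat, probe, session)].append(trace)
    session_means = {key: average_traces(v) for key, v in by_recording.items()}

    # Step 2: one average trace per rat.
    by_rat = defaultdict(list)
    for (rat, _probe, _session), trace in session_means.items():
        by_rat[rat].append(trace)
    rat_means = [average_traces(v) for v in by_rat.values()]

    # Step 3: average across rats; the s.e.m. uses the number of rats as n.
    n_rats = len(rat_means)
    grand_mean = average_traces(rat_means)
    sem = [
        stdev(samples) / sqrt(n_rats) if n_rats > 1 else 0.0
        for samples in zip(*rat_means)
    ]
    return grand_mean, sem
```

Because each rat contributes exactly one trace to the final step, a rat with many probes or sessions does not outweigh a rat with few, which is the difference from treating every session average as an independent measure.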
Extended Data Figure 4 Transient responses to cues and rewards occur during the T-maze task performance and are distinct from the sustained ramping response.
a, Dopamine concentration in a representative trial that included both a phasic response to warning click and a sustained ramping response to goal-reaching. b, c, Average normalized dopamine traces from VMS (b) and DLS (c) probes from all trials (n = 890 and 640, respectively) that showed identified transients after warning click (see Methods). In the VMS, note the sharp increase in dopamine around warning click superimposed on the ramping response that followed the phasic click response. Shading represents s.e.m. calculated across trials. d, Model for dopamine release profiles in the T-maze task. Sharper transient responses are present at the start of maze running (red) and after goal-reaching (cyan). These responses can be superimposed on and modulated independently of the slower ramping signal related to goal proximity (dark blue).
Extended Data Figure 5 Dopamine signals in response to free reward outside of the task.
a, Average peak dopamine to unexpected chocolate milk delivery outside the task is positively correlated with peak ramping dopamine measured from the same probes during preceding behavioural training in the maze (n = 146 sessions; Pearson’s R = 0.45, P < 0.0001). b, Average peak dopamine concentration induced by unexpected free reward outside the task (blue) and peak amplitude of dopamine ramping during maze performance just before reward (red; paired t-test, P < 0.001).
Extended Data Figure 6 Peak ramping dopamine values do not vary as a function of trial time.
a, Dopamine release modelled as a summation of four weighted transients in response to fixed maze events on short trials (purple) and long trials (orange). The overlap of the transients is reduced on the long trials, resulting in a lower peak dopamine level at the end of the maze run. Thick lines indicate overall average dopamine, and thin lines indicate the averages of each of the 4 transients across 100 simulation runs. b, Relative predicted peak dopamine levels on short and long trials calculated as a linear decay function of trial duration for the simulated model shown in a (black), for the spatial proximity model (light blue), and for the actual experimental data (dark blue). c, An alternative multi-transient model in which the transients (3, inset) are heavily weighted towards the goal location, are highly variable in their time of occurrence, and display a long decay time-course. In this model, the difference between short and long trials is within the noise range of the data. The average of individual transients (inset) across multiple simulations is a smeared version of the single transients that is weighted towards the goal location. d, Data plotted as in b, for the alternative multi-transient model. e, f, Average run speed (e) and acceleration (f) during short (purple) and long (orange) trials, as shown in Fig. 2, for all animals and sessions.
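The logic of the multi-transient model in panel a, in which reduced overlap on long trials lowers the peak, can be sketched in a few lines of Python. This is a deterministic simplification under stated assumptions: the exponential kernel, its 0.5-s time constant, the event positions, and the weights are all hypothetical choices for illustration, and the sketch omits the trial-to-trial variability of the 100 simulation runs mentioned in the caption.

```python
import math


def transient(t, onset, weight, tau=0.5):
    """Assumed kernel: instant rise at `onset`, exponential decay (tau in s)."""
    if t < onset:
        return 0.0
    return weight * math.exp(-(t - onset) / tau)


def summed_signal(duration,
                  fractions=(0.25, 0.5, 0.75, 1.0),
                  weights=(0.4, 0.6, 0.8, 1.0),
                  dt=0.01):
    """Sum four weighted transients triggered at fixed fractions of the run.

    Event positions are fixed in space, so their onset times stretch with
    the trial duration while the kernel time constant stays the same.
    """
    onsets = [f * duration for f in fractions]
    n_steps = int(round((duration + 2.0) / dt))
    times = [i * dt for i in range(n_steps + 1)]
    return [
        sum(transient(t, onset, w) for onset, w in zip(onsets, weights))
        for t in times
    ]


def peak_dopamine(duration):
    """Peak of the summed signal for a run lasting `duration` seconds."""
    return max(summed_signal(duration))


short_peak = peak_dopamine(2.0)  # transients overlap strongly
long_peak = peak_dopamine(6.0)   # earlier transients have decayed by the goal
```

With these assumed parameters the short-trial peak exceeds the long-trial peak, which is the dependence on trial duration that this class of model predicts.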
Extended Data Figure 7 Examples of trials with pausing or exploration near the choice point of the T-maze.
a–c, Video tracker traces (a), relative proximity to reward (b), and dopamine concentration (c) measured during a single trial in which a rat paused near the choice point. d–f, Video tracker traces (d), relative proximity to reward (e) and dopamine concentration (f) measured during another trial from a different rat.
Extended Data Figure 8 Value manipulation experiments in the T-maze, M-maze, and S-maze for all rats.
a–c, Average peak dopamine levels for M-maze sessions for three individual rats (M31, M36, and M47) in the left (blue) and right (red) end-arms. Blue and red shading indicates sessions in which left and right arm contained the larger reward, respectively. Error bars indicate s.e.m. d, e, Average peak dopamine levels, as in a–c, for value bias T-maze sessions for two rats (M36 and M47). f, g, Average peak dopamine levels, as in a–c, for value bias S-maze sessions for 2 rats (M47 and M35). h–j, Average normalized dopamine levels measured in the high reward arm (light green) and low reward arm (dark green) as rats performed the M-maze (h), T-maze (i) and S-maze (j) tasks. k, l, Dopamine concentration relative to left (k) and right (l) goal-reaching during the first session following a reversal of reward values (session 17 of M47, indicated by an asterisk in c).
Extended Data Figure 9 Dopamine signals recorded in two rats trained on the S-maze task.
a, Average normalized dopamine concentrations measured from VMS probes in one rat (n = 5 sessions) performing the S-maze task as in Fig. 3j. Light green line indicates runs to the higher reward goal, and dark green lines to the lower reward goal. Shading indicates s.e.m. Red vertical lines indicate turns. b, Traces, as in a, for the second rat trained on the S-maze task (n = 4 sessions).
Extended Data Figure 10 Dopamine end-arm preferences become positively correlated with arm choices with training, but ramping dopamine signals do not change significantly with performance improvement.
a, b, Distribution of selectivity indices, as in Fig. 4d, for all probes implanted in the left (a, n = 5 rats) and right (b, n = 3 rats) hemispheres. Note the bias in both groups of the selectivity preference towards negative selectivity indices (right bias, red) relative to the shuffled data (blue). c–e, Biases in average run time (c), percentage of correct responses (d), and arm choices (e) across training blocks. Negative values indicate biases towards the right end-arm. f, Raw average dopamine selectivity indices across training blocks. Note emergence of right bias with training. g, Correlation coefficients (Pearson’s R) computed for each training block between arm choice selectivity indices and dopamine selectivity indices. Error bars indicate confidence limits of the correlations. h, Normalized peak magnitudes of dopamine signals averaged in a 0.5-s window before goal-reaching in sessions with significant (Mann–Whitney U-test, P < 0.05) pre-goal increases. Data are averaged across rats for each training block (for which, left to right, n = 48, 101, 113 and 179 trial averaged recordings, respectively). i, Average (± s.e.m.) dopamine concentration from ramping dopamine sessions in which percentage of correct trials fell above (red, n = 179) or below (blue, n = 92) the learning criterion for T-maze task acquisition (72.5% correct, chi-square test, P < 0.05). j, Average (± s.e.m.) peak dopamine levels from the sessions plotted in i, showing no significant difference between pre- and post-learning periods (t-test, P = 0.44).
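As a rough illustration of the comparison in panel g, the sketch below computes a left/right selectivity index and a Pearson correlation between two sets of indices. The exact index definition is given in the paper's Methods and is not reproduced here; the normalized contrast (left − right)/(left + right) is an assumed form, chosen only because it yields negative values for a right bias, as the caption describes. Function names are hypothetical.

```python
from math import sqrt


def selectivity_index(left, right):
    """Assumed contrast (left - right) / (left + right).

    Negative values indicate a right bias, positive values a left bias.
    This form is an illustration, not the definition from the Methods.
    """
    total = left + right
    if total == 0:
        raise ValueError("left + right must be non-zero")
    return (left - right) / total


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / sqrt(var_x * var_y)
```

In use, one index per session would be computed from the peak dopamine in each end-arm and another from the arm choices, and `pearson_r` would then be applied to the paired indices within a training block.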
Supplementary information
Supplementary Information
This file contains a Supplementary Discussion and additional references. (PDF 265 kb)
About this article
Cite this article
Howe, M., Tierney, P., Sandberg, S. et al. Prolonged dopamine signalling in striatum signals proximity and value of distant rewards. Nature 500, 575–579 (2013). https://doi.org/10.1038/nature12475