Part of Advances in Neural Information Processing Systems 13 (NIPS 2000)
Sham Kakade, Peter Dayan
Substantial data support a temporal difference (TD) model of dopamine (DA) neuron activity in which the cells provide a global error signal for reinforcement learning. However, in certain circumstances, DA activity seems anomalous under the TD model, responding to non-rewarding stimuli. We address these anomalies by suggesting that DA cells multiplex information about reward bonuses, including Sutton's exploration bonuses and Ng et al.'s non-distorting shaping bonuses. We interpret this additional role for DA in terms of the unconditional attentional and psychomotor effects of dopamine, having the computational role of guiding exploration.
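As a rough illustration of how a bonus could enter a TD error signal, the sketch below adds a potential-based shaping term of the form gamma * phi(s') - phi(s), the standard form associated with Ng et al.'s non-distorting shaping, to the usual TD error. The abstract does not give the paper's own equations, so the function names, the potential values, and the discount factor here are all assumptions chosen for illustration only.

```python
def td_error(reward, value_s, value_next, gamma=0.9):
    """Standard TD error: r + gamma * V(s') - V(s)."""
    return reward + gamma * value_next - value_s


def shaping_bonus(phi_s, phi_next, gamma=0.9):
    """Potential-based shaping bonus: gamma * phi(s') - phi(s).

    phi is a hypothetical potential function over states (an assumption
    for this sketch, not a quantity taken from the paper).
    """
    return gamma * phi_next - phi_s


def td_error_with_bonus(reward, value_s, value_next, phi_s, phi_next, gamma=0.9):
    """TD error computed on the reward plus the shaping bonus."""
    bonus = shaping_bonus(phi_s, phi_next, gamma)
    return td_error(reward + bonus, value_s, value_next, gamma)


# A non-rewarding stimulus (reward 0, values 0) that raises the potential
# from 0 to 1 still yields a positive error signal.
delta = td_error_with_bonus(reward=0.0, value_s=0.0, value_next=0.0,
                            phi_s=0.0, phi_next=1.0)

# Along a trajectory, the discounted bonuses telescope to
# gamma**T * phi(s_T) - phi(s_0), so here they sum to zero.
gamma = 0.9
phis = [0.0, 1.0, 0.5, 0.0]  # illustrative potentials only
total_bonus = sum(gamma ** t * shaping_bonus(phis[t], phis[t + 1], gamma)
                  for t in range(len(phis) - 1))
```

In this toy setting the error is positive when the stimulus appears even though no reward is delivered, and the telescoping sum shows why a bonus of this form leaves the long-run return unchanged when the potential starts and ends at the same value.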