A neural substrate of prediction and reward
The RPE hypothesis of dopamine is grounded in three main observations about the activity of dopamine neurons in animals learning to associate a stimulus with a reward:
- Dopamine neurons are initially activated by rewards, but this response shifts to an earlier reward-predicting stimulus over the course of learning.
- After an association between stimulus and reward is established, omitting an expected reward causes a dip in dopamine neuron activity.
- Unexpected positive stimuli, such as surprise rewards or unpredictable stimuli already associated with a future reward, reliably activate dopamine neurons at any point in learning.
To see how the RPE model captures these observations, use the buttons below to add and remove rewards in a simulation of TD learning.
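The same three observations can also be reproduced in a few lines of code. What follows is a minimal sketch of tabular TD(0) learning, not the simulation used on this page. The trial timeline (`CUE_T`, `REWARD_T`, `N_STEPS`), the learning rate, the discount factor, and the function name `run_trial` are all assumptions chosen for illustration. The sketch also assumes that the value before cue onset is fixed at zero, which stands in for a cue that arrives at an unpredictable time.

```python
# Hypothetical trial timeline, in time steps (illustrative values only).
CUE_T, REWARD_T, N_STEPS = 5, 15, 25


def run_trial(w, cue=True, reward_t=REWARD_T, alpha=0.2, gamma=1.0, learn=True):
    """Simulate one trial and return the TD error (RPE) at every time step.

    w[k] is the learned value k steps after cue onset. The value is 0 before
    the cue, from the expected reward time onward, and on trials with no cue.
    Pass reward_t=None to omit the reward.
    """
    def value(t):
        k = t - CUE_T
        return w[k] if cue and 0 <= k < len(w) else 0.0

    delta = [0.0] * N_STEPS
    for t in range(1, N_STEPS):
        r = 1.0 if t == reward_t else 0.0
        # TD error: reward plus discounted new value, minus the previous value.
        delta[t] = r + gamma * value(t) - value(t - 1)
        # Update the value of the state that was just left (cue states only).
        k_prev = (t - 1) - CUE_T
        if learn and cue and 0 <= k_prev < len(w):
            w[k_prev] += alpha * delta[t]
    return delta


w = [0.0] * (REWARD_T - CUE_T)      # one value per step between cue and reward
first = run_trial(w)                # very first cue-reward pairing
for _ in range(500):                # further training trials
    run_trial(w)

trained = run_trial(w, learn=False)                   # cue, then expected reward
omitted = run_trial(w, reward_t=None, learn=False)    # cue, reward withheld
surprise = run_trial(w, cue=False, reward_t=20, learn=False)  # uncued reward

print("first trial   cue:", round(first[CUE_T], 3), " reward:", round(first[REWARD_T], 3))
print("after training cue:", round(trained[CUE_T], 3), " reward:", round(trained[REWARD_T], 3))
print("omission at expected reward time:", round(omitted[REWARD_T], 3))
print("surprise reward:", round(surprise[20], 3))
```

On the first trial the error appears only at the reward. After training it appears at the cue instead, and the fully predicted reward produces an error near zero. Withholding the reward gives a negative error at the expected reward time, and an uncued reward gives a positive error regardless of how much has been learned.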