Dopamine

A neural substrate of prediction and reward

The reward prediction error (RPE) hypothesis of dopamine is grounded in three main observations about the activity of dopamine neurons in animals learning to associate a stimulus with a reward:

  1. Dopamine neurons are initially activated by rewards, but this response shifts to an earlier reward-predicting stimulus over the course of learning.
  2. After an association between stimulus and reward is established, omitting an expected reward causes a dip in dopamine neuron activity.
  3. Unexpected positive stimuli, such as surprise rewards or unpredictable stimuli already associated with a future reward, reliably activate dopamine neurons at any point in learning.

To see how the RPE model captures these observations, consider a simulation of temporal-difference (TD) learning in which rewards can be added and removed.

[Figure: two plots from the simulation. One shows TD error against time since last reward (steps); the other shows value against time since trial start.]
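
The three observations above can be reproduced with a few lines of tabular TD(0). The following is a minimal sketch, not the simulation used on this page. Everything in it is an assumption made for illustration: the function name `run_trial`, the trial length, the cue and reward steps, the learning rate `alpha`, and the discount factor `gamma`. It also assumes that steps before the cue always have value 0, because the cue arrives at an unpredictable moment.

```python
def run_trial(w, cue_step, reward_step, reward=1.0, alpha=0.3, gamma=1.0, learn=True):
    """Simulate one trial of tabular TD(0) and return the TD error at each step.

    w[t] is the learned value of being at step t of the trial. Steps before
    the cue have value 0 and are never updated (assumption: the cue cannot
    be predicted).
    """
    n_steps = len(w)

    def value(t):
        return w[t] if t >= cue_step else 0.0

    td_error = [0.0] * n_steps
    for t in range(n_steps - 1):
        # Reward, if any, is received on arrival at reward_step.
        r = reward if t + 1 == reward_step else 0.0
        delta = r + gamma * value(t + 1) - value(t)
        td_error[t + 1] = delta  # the error is registered on arrival at step t + 1
        if learn and t >= cue_step:
            w[t] += alpha * delta
    return td_error


# Hypothetical trial layout: cue at step 5, reward at step 15.
N_STEPS, CUE, REWARD = 20, 5, 15
w = [0.0] * N_STEPS

first = run_trial(w, CUE, REWARD)                  # naive animal, first pairing
for _ in range(500):                               # training
    run_trial(w, CUE, REWARD)
trained = run_trial(w, CUE, REWARD, learn=False)   # probe after learning
omitted = run_trial(w, CUE, REWARD, reward=0.0, learn=False)  # expected reward withheld
surprise = run_trial(w, CUE, reward_step=2, learn=False)      # reward before any cue

print("first trial:   cue %.2f, reward %.2f" % (first[CUE], first[REWARD]))
print("after training: cue %.2f, reward %.2f" % (trained[CUE], trained[REWARD]))
print("omission:       reward time %.2f" % omitted[REWARD])
print("surprise:       step 2 %.2f" % surprise[2])
```

Under these assumptions the TD error plays the role of the dopamine signal. On the first trial it is positive at the reward and zero at the cue. After training it is positive at the cue and close to zero at the reward. Withholding the reward produces a negative error at the expected reward time, and a reward delivered before the cue still produces a positive error even though training is complete.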