Reward functions are a fundamental part of reinforcement learning in machines. They are based partly on Pavlovian, or classical, conditioning, exemplified by repeatedly pairing the ringing of a bell (conditioned stimulus) with the presentation of food (unconditioned stimulus) to a dog, until the ringing of the bell alone causes the dog to salivate (conditioned response).
More recently, developments in reinforcement learning, particularly temporal difference learning, have been compared to the function of the reward-learning regions of the brain. Pathologies of these reward-producing regions, particularly Parkinson's disease and Huntington's disease, show the importance of the reward neurotransmitter dopamine in the brain functions that control movement and impulses, as well as in pleasure-seeking.
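The comparison between temporal difference learning and dopamine signalling centres on the reward prediction error. Here is a minimal sketch of the TD(0) value update; the state names, rewards, and learning parameters are illustrative choices, not part of any particular experiment:

```python
# Minimal TD(0) value-learning sketch: the prediction error `delta`
# is the quantity often compared to phasic dopamine signals.

def td0_update(V, s, r, s_next, alpha=0.1, gamma=0.9):
    """One TD(0) step: move V[s] toward the target r + gamma * V[s_next]."""
    delta = r + gamma * V[s_next] - V[s]  # reward prediction error
    V[s] += alpha * delta
    return delta

# An illustrative two-state chain echoing the conditioning example:
# 'bell' is followed by 'food', and only 'food' delivers reward.
V = {"bell": 0.0, "food": 0.0, "end": 0.0}
for _ in range(100):
    td0_update(V, "bell", 0.0, "food")   # the bell itself is unrewarded
    td0_update(V, "food", 1.0, "end")    # the food delivers reward 1

# After repeated pairings, value propagates back to the predictor:
# V["bell"] approaches gamma * V["food"], just as the dog comes to
# respond to the bell alone.
print(round(V["bell"], 2), round(V["food"], 2))  # prints "0.9 1.0"
```

The key point of the analogy is that once the bell reliably predicts food, the prediction error at the time of reward shrinks toward zero and instead appears at the earlier predictive stimulus, mirroring how dopamine responses shift from rewards to their predictors.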
The purpose and function of these reward centres in the basal ganglia of the brain could have important implications for the way in which we apply reinforcement learning, especially in autonomous agents and robots. An understanding of the purpose of rewards, and their impact on the development of values in machines and people, also has some interesting philosophical implications that will be discussed.
This post introduces what may become a spiral of related posts on concepts of rewards and values covering:
- How rewards occur in machines and in the brain – are they internal or external?
- Representing the world in reinforcement learning – the states and actions.
- Reinforcement learning as a machine learning process for autonomous robots – machines that reward themselves.
- Evolving rewards and values – which actually comes first in animals and machines?
- Valuing pleasure – hijacking our evolved reward function with supernormal stimuli.
- Choosing values – engineering morality and friendly artificial intelligence.
Hopefully this narrowing of post topics will give me the focus to write, and spark some interesting discourse on each of the themes of this blog. Suggestions and comments are welcome!