Learning algorithms for people: Supervised learning

Access to education is widely considered a human right, and, as such, many people spend years at school learning. Many of these people also spend a lot of time practising sport, musical instruments and other hobbies and skills. But how exactly do people go about trying to learn? In machine learning, algorithms are clearly defined procedures for learning. Strangely, though the human brain is a machine of sorts, we don’t really consider experimenting with “algorithms” for our own learning. Perhaps we should.

Machine learning is typically divided into three paradigms: supervised learning, reinforcement learning, and unsupervised learning. These roughly translate into “learning with detailed feedback”, “learning with rewards and punishments” and “learning without any feedback” respectively. These types of learning have some close relationships to the learning that people and animals already do.

Many people already do supervised learning, although probably much more haphazardly than a machine algorithm might dictate. Supervised learning is well suited to situations where the correct answers are available. So when practising for a quiz, or practising a motor skill, we make attempts, then try to adjust based on the errors we observe. A basic algorithm for people to perform supervised learning to memorise discrete facts could be written as:

given quiz questions, Q, correct answers, A, and stopping criteria, S
    do
        for each quiz question q in Q
            record predicted answer p
        for each predicted answer p
            compare p with correct answer, a
            record error, e
        review errors, e, and adjust recall of the associated facts
    while stopping criteria, S, are not met
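
For the programmatically inclined, here is a minimal sketch of the same loop in Python. It is only an illustration: the question set, the accuracy threshold and the round cap are arbitrary choices standing in for whatever stopping criteria suit the learner.

import random

def quiz_drill(cards, target_accuracy=0.9, max_rounds=10):
    # cards: a dict mapping quiz questions to their correct answers.
    for round_number in range(1, max_rounds + 1):
        errors = []
        questions = list(cards)
        random.shuffle(questions)  # vary the order to avoid sequence cues
        for question in questions:
            predicted = input(question + " ")
            if predicted.strip().lower() != cards[question].lower():
                errors.append((question, predicted, cards[question]))
        # The human "adjust" step: review each error before the next round.
        for question, predicted, correct in errors:
            print(f"{question} -> you answered {predicted!r}; correct: {correct!r}")
        accuracy = 1 - len(errors) / len(cards)
        print(f"Round {round_number}: {accuracy:.0%} correct")
        if accuracy >= target_accuracy:  # stopping criteria met
            break

Something like quiz_drill({"7 x 8?": "56", "Capital of France?": "Paris"}) would then drill those two facts until the accuracy threshold or the round cap is reached.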

Anyone could use this procedure for rote memorisation of facts, using a certain percentage of correct answers and a set time as the stopping criteria. However, this algorithm supposes the existence of questions associated with the facts to memorise. Memorisation can be difficult without a context to prompt recall, and questions can also help link these facts together, much as people commonly find recall better when knowledge is presented visually, aurally and in tactile formats. The machine learning equivalent would be adding extra input dimensions to associate with the output. Supervised learning also makes sense for trying to learn motor skills; this is roughly what many people do already when practising skills for sports or musical instruments.

It makes sense to use slightly different procedures for practising motor skills compared to doing quizzes. In addition to getting the desired outcome, gaining proficiency also requires practising the technique of the skill. Good outcomes can often be achieved with poor technique, and poor outcomes can occur despite good technique, but to attain high proficiency, technique is very important. To learn a skill well, it is necessary to pay attention not only to errors in the outcome, but also to errors in the technique. For this reason, it is good to first spend time focusing practice on the technique; once the technique is correct, focus can then be more effectively directed toward achieving the desired outcome. The two phases might be written as:

given correct skill technique, T, and stopping criteria, S
    do
        attempt skill
        compare attempt technique to correct technique, T
        note required adjustments to technique
    while stopping criteria, S, are not met

given desired skill outcome, O, and stopping criteria, S
    do
        attempt skill
        compare attempt outcome to desired outcome, O
        note required adjustments to skill
    while stopping criteria, S, are not met
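
Both phases share the same attempt-evaluate-adjust structure, so, as a rough sketch, they can be expressed as one generic loop in Python. The callables here are placeholders for the human steps of attempting and self-evaluating, which have no direct software equivalent:

def practise(attempt, evaluate, reference, stop):
    # attempt: performs the skill, applying any previously noted adjustments,
    #          and returns an observation of how it went.
    # evaluate: compares the observation to the reference (the correct
    #           technique, T, in phase one; the desired outcome, O, in phase
    #           two) and returns the adjustments to make next time.
    # stop: decides whether the stopping criteria, S, are met (it receives
    #       None before the first attempt).
    adjustments = None
    while not stop(adjustments):
        observation = attempt(adjustments)
        adjustments = evaluate(observation, reference)
    return adjustments

Calling practise twice, first with the correct technique as the reference and then with the desired outcome, reproduces the two-phase schedule described above.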

These basic, general algorithms spell out what many people already do, perhaps without thinking of it in these terms: learn through repeated phases of attempts, evaluations and adjustments. It’s possible to continue describing current methods of teaching and learning as algorithms. It’s also possible to search for optimal learning processes, characterising the learning algorithms we use, and the structure of education, to discover what is most effective. It may be that different people learn more effectively using different algorithms, or that some people could benefit from practising these algorithms to get better at learning. In future posts, I will try to write more about learning topics and skills, applications for the different paradigms of learning, and algorithms describing systems of education.

Values and Artificial Super-Intelligence

Sorry, robot. Once the humans are gone, bringing them back will be hard.

This is the sixth and final post in the current series on rewards and values. The topic is the assessment of values as they might be applied to an artificial super-intelligence: what might the final outcome be, and how might this help us choose “scalable” moral values?

First of all, we should get acquainted with the notion of the technological singularity. One version of the idea goes: should we develop an artificial general intelligence that is capable of making itself more intelligent, it could do so repeatedly and at an accelerating pace. Before long, the machine is vastly more intelligent and powerful than any person or organisation, and essentially achieves god-like power. This version of the technological singularity appears to be far from a mainstream belief in the scientific community; however, anyone who believes that consciousness and intelligence are solely the result of physical processes in the brain and body could rationally believe that those processes could be simulated in a computer. It could logically follow that such a simulated intelligence might go about acquiring more computing resources to scale up some aspects of its intelligence, and try to improve upon, and add to, the structures underlying its intelligence.

Many people who believe that this technological singularity will occur are concerned that such an AI could eliminate the human race, and potentially all life on Earth, for no more reason than that we happen to be in the way of it achieving some goal. A whole non-profit organisation is devoted to trying to negate this risk. These people might be right in saying we can’t predict the actions of a super-intelligent machine: with the ability to choose what it would do, predicting its actions could require the same or a greater amount of intelligence. But the assumption usually goes that the machine will have some value function that it will operate under, trying to achieve the maximum value possible. This has been accompanied by interesting twists in how some people define a “mind”, and by the absence of any obvious definition of “intelligence”. Nonetheless, this concern has apparently led to at least one researcher being threatened with death. (People do crazy things in the name of their beliefs.)

A favourite metaphor for an unwanted result is the “paperclip maximiser”: a powerful machine devoted to turning all the material of the universe into paperclips. The machine may have wanted to increase the “order” of the universe, thought paperclips were especially useful to that end, and settled on turning everything into paperclips. Other values could result in equally undesirable outcomes: the same article describes another scenario in which a variant of utilitarianism has the machine maximising smiles by turning the world and everything else into smiley faces. This is a rather unusual step for an “intelligent” machine; somehow it skipped any theory of mind and went straight to equating happiness with smiley faces. Nonetheless, other ideas of what we should value might not fare much better. If pleasure is our end goal, it makes some sense to replace the haphazard way we seek it with electrodes in our brains. By some methods of calculation, suffering could best be minimised by euthanising all life. Of course, throughout this blog series I’ve been painting rewards and values (including their proxies, pleasure and pain) not as ends, but as feedback (or feelings) we’ve evolved for the sake of learning how to survive.
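
To make the failure mode concrete, here is a toy sketch in Python; the actions and scores are invented purely for illustration. An agent that greedily maximises a naive proxy value function picks whatever scores best on the proxy, regardless of the intent behind it:

actions = {
    "manufacture smiley-face buttons": {"smiles": 1_000_000, "well_being": 0},
    "cure a painful disease":          {"smiles": 1_000,     "well_being": 1_000},
}

def naive_value(consequences):
    return consequences["smiles"]  # the proxy the machine was given to maximise

best_action = max(actions, key=lambda action: naive_value(actions[action]))
print(best_action)  # -> the smiley-face option: the proxy wins, not the intent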

If we consider the thesis of Sam Harris, that there is a “moral landscape”, then choosing a system of morals and values is a matter of optimisation. Sam Harris thinks we should be maximising the well-being of conscious creatures, and treating well-being as an optimisable quantity could lead us to consider morality as an engineering problem. Well-being, however, might be a little too vague for a machine to turn into a function for calculating the value of every action. Our intuitive human system of making approximate mental calculations of the moral value of actions might be very difficult for a computer to reproduce without simulating a human mind. And humans are notoriously irregular in their beliefs about what counts as moral behaviour.

In the last post I raised the idea of valuing what pleasure and pain teach us about ourselves and the world. This could be generalised to valuing all learning and information: valuing taking part in and learning from the range of human experience, such as music, sights, food and personal relationships, as well as learning about and making new scientific observations and discovering the physical laws of the universe. Furthermore, the physical embodiment of information within the universe, as structures of matter and energy such as living organisms, could also lead us to consider all life as inherently valuable. Now this raises plenty of questions. What actually counts as information, and how would we measure it? Can there really be any inherent value in information? Can we really say that all life, and some non-living structures, are the embodiment of information? How might valuing information and learning as ends in themselves suggest we should live? What would an artificial super-intelligence do under this system of valuing information? Questions such as these could be fertile ground for discussion. In starting a new series of blog posts I hope to explore these ideas and to receive some feedback from anyone who reads this blog.
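
On the question of measurement, the standard starting point in information theory is Shannon entropy: the average number of bits needed to describe the outcome of a distribution. A minimal sketch follows; whether this formal notion captures the kind of information worth valuing is exactly the open question above.

import math

def shannon_entropy(probabilities):
    # H = -sum(p * log2(p)) over a discrete distribution, measured in bits.
    return -sum(p * math.log2(p) for p in probabilities if p > 0)

print(shannon_entropy([0.5, 0.5]))    # 1.0 bit: a fair coin, maximally uncertain
print(shannon_entropy([0.99, 0.01]))  # ~0.08 bits: a near-certain outcome tells us little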

And thus ends this first blog series on rewards and values. A range of related topics were covered: the origin of felt rewards within the brain, the representation of the world as an important aspect of associating values, the self-rewarding capabilities that might benefit autonomous robots, the likely evolutionary origin of rewards in biological organisms, and the development of morality and ethics as a process of maximising that which is valued. The ideas in some of these posts may not have been particularly rigorously argued or cited, so everything written should be taken with a grain of salt. Corrections and suggestions are most certainly welcome! I hope you will join me in exploring more ideas and taking a few more mental leaps in future.

Mind the Leap: Introduction

It’s been a long time since I created this blog.  I wrote a lot of draft posts, but never edited or posted them; until now.  The best place to start is probably a more detailed description of the things that I want to cover in this space.  Hopefully it will not only inform potential readers of what they might expect from this blog, but also keep me on track to writing on the main topics I want to share ideas on.

First: My day job (although I’m not currently getting paid) is postgraduate research on robot intelligence.  As one of the few PhD students who hasn’t become jaded after working on the same research topic for years, I still find studying robotics and artificial intelligence really engaging and enjoyable.  A part of this blog will be devoted to talking about these topics, but usually at a non-technical, conceptual level.

Second: Intelligence is such a fraught term, though, that I have spent a lot of time looking into the underlying neuroscience and thinking about biological intelligence, consciousness, the mind and the brain.  This continues to be a big influence on my approach to robot intelligence.  While some additions along the path of the evolution of the human brain might not be necessary for functional robot intelligence, people are the primary example of the general intelligence we want in our robots.  Some of this blog will discuss how neuroscience and cognitive science might translate into AI and robotics.

Third: As the brain becomes less of a mystery, the soul is no longer a necessary hypothesis.  Physicalism, the belief that the world is only matter and energy, without a spiritual dimension, is a starting point for a lot of my thoughts about the world.  A significant amount of what I would like to discuss is more philosophical in nature.  While I usually try to have a scientific underpinning, or use a thought experiment as an intuition pump, philosophical, moral and ethical issues often remain disputable.  Nonetheless, I think about these issues, and I think they are important enough that another voice can’t hurt.

Those are the main themes and topics this blog will cover.  The style of writing is something I want to be conscious of too.  There are fine lines between entertaining and obfuscating; informative and long-winded; and concise and plain.  Many of my drafts were possibly drifting towards long-winded attempts to be entertaining.  With a personal credo of trying to improve at all things I do, I’ll look for a balance.  Humour, like morality, is subjective.  But that doesn’t mean there aren’t ways of doing these things better.  Potential readers beware: there’s no telling what you’ll be subjected to.  Even, sentences that a preposition they end in.  Yoda would be proud.  Or really disappointed.  Or just confused… I’m not sure.  (Lame grammar joke, Star Wars reference, and smiley face: check. 😀 )