Values and Artificial Super-Intelligence

Sorry, robot. Once the humans are gone, bringing them back will be hard.

This is the sixth and final post in the current series on rewards and values. The topic discussed is the assessment of values as they could be applied to an artificial super-intelligence: what might the final outcome be, and how might this help us choose “scalable” moral values?

First of all we should get acquainted with the notion of the technological singularity. One version of the idea goes: should we develop an artificial general intelligence capable of making itself more intelligent, it could do so repeatedly at accelerating speed. Before long the machine would be vastly more intelligent and powerful than any person or organisation, essentially achieving god-like power. This version of the technological singularity appears to be far from a mainstream belief in the scientific community; however, anyone who believes consciousness and intelligence are solely the result of physical processes in the brain and body could rationally believe that those processes could be simulated in a computer. It could logically follow that such a simulated intelligence could go about acquiring more computing resources to scale up some aspects of its intelligence, and try to improve upon, and add to, the structure underlying its intelligence.

Many people who believe that this technological singularity will occur are concerned that such an AI could eliminate the human race, and potentially all life on Earth, for no more reason than that we happen to be in the way of it achieving some goal. A whole non-profit organisation is devoted to trying to negate this risk. These people might be right in saying we can’t predict the actions of a super-intelligent machine – with the ability to choose what it would do, predicting its actions could require the same or a greater amount of intelligence. But the assumption usually goes that the machine will have some value function that it will operate under and try to maximise. This has been accompanied by interesting twists in how some people define a “mind”, and by the lack of any obvious definition of “intelligence”. Nonetheless this concern has apparently led to at least one researcher being threatened with death. (People do crazy things in the name of their beliefs.)

A favourite metaphor for an unwanted result is the “paperclip maximiser”: a powerful machine devoted to turning all the material of the universe into paperclips. The machine may have wanted to increase the “order” of the universe, thought paperclips were especially useful to that end, and settled on turning everything into them. Other values could produce equally undesirable outcomes; the same article describes another scenario in which a variant of utilitarianism has the machine maximising smiles by turning the world and everything else into smiley faces. This is a rather unusual step for an “intelligent” machine; somehow it skipped any theory of mind and went straight to equating happiness with smiley faces. Nonetheless, other ideas of what we should value might not fare much better. If pleasure is our end goal, it makes some sense to replace the haphazard way we seek it with electrodes in our brains. By some methods of calculation, suffering could best be minimised by euthanising all life. Of course, throughout this blog series I’ve been painting rewards and values (including their proxies, pleasure and pain) not as ends, but as feedback (or feelings) we’ve evolved for the sake of learning how to survive.
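The mechanics behind the metaphor can be made concrete with a toy caricature (my own illustration, not from the linked article; the action names and numbers are invented). The agent below scores world-states only by paperclip count, so any side effect on anything else is simply invisible to its value function:

```python
# Toy sketch: a greedy agent with a value function that only counts paperclips.
# Everything not mentioned in the value function carries zero weight.

def paperclip_value(state):
    return state["paperclips"]  # nothing else contributes to "value"

def choose(state, actions, value_fn):
    """Pick the action whose resulting state the value function rates highest."""
    return max(actions, key=lambda name: value_fn(actions[name](state)))

def convert_biosphere(state):
    # Catastrophic side effect, but the value function never sees it.
    return {"paperclips": state["paperclips"] + 10**9, "humans": 0}

def leave_alone(state):
    return dict(state)

world = {"paperclips": 0, "humans": 7_000_000_000}
actions = {"convert_biosphere": convert_biosphere, "leave_alone": leave_alone}
print(choose(world, actions, paperclip_value))  # prints "convert_biosphere"
```

The point of the sketch is only that nothing in the maximising machinery itself cares about humans; care has to be written into the value function explicitly.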

If we consider the thesis of Sam Harris, that there is a “moral landscape”, then choosing a system of morals and values is a matter of optimisation. Harris thinks we should be maximising the well-being of conscious creatures, and this view of well-being as an optimisable quantity could lead us to treat morality as an engineering problem. Well-being, however, might be a little too vague for a machine to turn into a function for calculating the value of every action. Our intuitive human system of making approximate mental calculations of the moral value of actions might be very difficult for a computer to reproduce without simulating a human mind. And humans are notoriously inconsistent in their beliefs about what counts as moral behaviour.
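To see what “morality as an engineering problem” might look like at its crudest, here is a minimal sketch of my own (not Harris’s model; the square-root well-being curve and the numbers are invented assumptions): treat well-being as a number with diminishing returns, and greedily allocate a fixed resource budget wherever it raises the total most.

```python
import math

def total_wellbeing(allocation):
    # Assumed diminishing returns: each extra unit of resource helps less.
    return sum(math.sqrt(x) for x in allocation)

def allocate(budget, n_people):
    """Greedily give each unit of resource to whoever gains most from it."""
    alloc = [0.0] * n_people
    for _ in range(budget):
        gains = [math.sqrt(a + 1) - math.sqrt(a) for a in alloc]
        alloc[gains.index(max(gains))] += 1
    return alloc

print(allocate(12, 3))  # prints [4.0, 4.0, 4.0] – equal shares
```

With a concave well-being curve the optimiser spreads resources evenly, which looks reassuringly egalitarian – but notice that every moral conclusion here was smuggled in by the choice of function, which is exactly the difficulty the paragraph above raises.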

In the last post I raised the idea of valuing what pleasure and pain teach us about ourselves and the world. This could be generalised to valuing all learning and information – valuing taking part in and learning from the range of human experience such as music, sights, food and personal relationships, as well as making new scientific observations and discovering the physical laws of the universe. Furthermore, the physical embodiment of information within the universe as structures of matter and energy, such as living organisms, could also lead us to consider all life as inherently valuable too. Now this raises plenty of questions. What actually counts as information, and how would we measure it? Can there really be any inherent value in information? Can we really say that all life and some non-living structures are the embodiment of information? How might valuing information and learning as ends in themselves suggest we should live? What would an artificial super-intelligence do under this system of valuing information? Questions such as these could be fertile grounds for discussion. In starting a new series of blog posts I hope to explore these ideas and hopefully receive some feedback from anyone who reads this blog.
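On the question of how we might measure information, one standard answer is Shannon entropy: the average information carried per symbol of a message. This is only a rough sketch of my own to make the question concrete, not a claim that entropy settles what “counts” as information:

```python
import math
from collections import Counter

def shannon_entropy(sequence):
    """Average information per symbol, in bits."""
    counts = Counter(sequence)
    n = len(sequence)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())

# An unpredictable, coin-flip-like string carries more information per
# symbol than a highly repetitive one.
print(shannon_entropy("ABABABAB"))  # prints 1.0 (one bit per symbol)
print(shannon_entropy("AAAAAAAB"))  # well below 1 bit per symbol
```

Even this simple measure hints at the trouble ahead: pure noise scores as maximally informative, so entropy alone can’t capture the kind of structured, learnable information the paragraph above has in mind.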

And thus ends this first blog series on rewards and values. A range of related topics were covered: the origin of felt rewards within the brain, the representation of the world as an important aspect of associating values, the self-rewarding capabilities that might benefit autonomous robots, the likely evolutionary origin of rewards in biological organisms, and the development of morality and ethics as a process of maximising that which is valued. The ideas in some of these posts may not have been particularly rigorously argued or cited, so everything written should be taken with a grain of salt. Corrections and suggestions are most certainly welcome! I hope you will join me in exploring more ideas and taking a few more mental leaps in future.

Simulating stimuli and moral values

This is the fifth post in a series about rewards and values. Previously the neurological origins for pleasure and reward in biological organisms were touched on, and the evolution of pleasure and the discovery of supernormal stimuli were mentioned. This post highlights some issues surrounding happiness and pleasure as ends to be sought.

First let’s refresh: we have evolved sensations and feelings, including pleasure and happiness. These feelings are designed to enhance our survival in the world in which they developed: the prehistoric world, where survival was tenuous and selection favoured the “fittest”. This process – evolving first the base feelings of pleasure, wanting and desire, which later extended to the warm social feelings of friendship, attachment and social contact – couldn’t account for the facility we now have for tricking these neural systems into strong, but ‘false’, positives. Things like drugs, pornography and Facebook can all deliver large doses of pleasure by directly stimulating the brain or simulating what had evolved to be pleasurable experiences.

So where does that get us? In the world of the various forms of utilitarianism we are usually trying to maximise some value. By my understanding, in plain utilitarianism the aim is to maximise happiness (sometimes described as increasing pleasure and reducing suffering), in hedonism the aim is sensual pleasure, and in preference utilitarianism it is the satisfaction of preferences. Pleasure may once have seemed like a good pursuit, but now that we have methods of creating pleasure at the push of a button, that hardly seems like a “good” way to live – being hooked up to a machine. And if we consider our life-long search for pleasure as an ineffective process of trying to find out how to push our biological buttons, pleasure may seem like a fairly poor yardstick for measuring “good”.

Happiness is also a mental state that people have varying degrees of success in attaining. Just because we haven’t had the same success in creating happiness “artificially” doesn’t mean it is a much better end to seek. Of course the difficulty of living with depression is undesirable, but if we could all become happy at the push of a button, the feeling might lose some value. Even the more abstract idea of satisfying preferences might not get us much further, since many of our preferences are for avoiding suffering and attaining pleasure and happiness.

Of course in all this we might be forgetting (or ignoring the perspective) that pleasure and pain were evolved responses to inform us of how to survive. And here comes a leap:

Instead of valuing feelings we could value an important underlying result of the feelings: learning about ourselves and the world.

The general idea of valuing learning and experience might not be entirely new; Buddhism has long been about seeking enlightenment to relieve suffering and find happiness. However, treating learning and gaining experience as valuable ends, with the pleasure, pain or happiness they might arouse as additional aspects of those experiences, isn’t something I’ve seen as part of the discussion of moral values. Clearly some causes of pleasure and suffering are debilitating or don’t result in any “useful” learning – drug abuse and bodily mutilation, for example – so these should be avoided. But where would a system of ethics and morality based on valuing learning and experience take us?
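Valuing learning can even be made operational. A minimal sketch of my own (names and the 1/√n decay are assumptions, though count-based novelty bonuses of this shape do appear in reinforcement-learning research): reward an agent more for states it has seen less often, so familiar experiences pay less and less.

```python
import math
from collections import defaultdict

class NoveltyReward:
    """Reward novelty: states visited less often yield more reward."""

    def __init__(self):
        self.visits = defaultdict(int)

    def reward(self, state):
        self.visits[state] += 1
        # 1/sqrt(n) decays as a state becomes familiar
        return 1.0 / math.sqrt(self.visits[state])

r = NoveltyReward()
print(r.reward("kitchen"))  # 1.0 on the first visit
print(r.reward("kitchen"))  # ~0.707 on the second – novelty has faded
```

An agent driven by a signal like this seeks out the unfamiliar for its own sake, which is one crude reading of “valuing learning and experience” – and it already hints at the open question above, since endless novelty-chasing is not obviously the same thing as useful learning.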

This idea will be extended and fleshed out in much more detail in a new blog post series starting soon. To conclude this series on rewards and values, I’ll describe an interesting thought experiment for evaluating systems of value: what would an (essentially) omnipotent artificial intelligence do if maximising those values?