WHAT do paper clips have to do with the end of the world? More than you might think, if you ask researchers trying to make sure that artificial intelligence acts in our interests.
This goes back to 2003, when Nick Bostrom, a philosopher at the University of Oxford, posed a thought experiment. Imagine a superintelligent AI has been set the goal of producing as many paper clips as possible. Bostrom suggested it could quickly decide that killing all humans was pivotal to its mission, both because they might switch it off and because they are full of atoms that could be converted into more paper clips.
The scenario is absurd, of course, but illustrates a troubling problem: AIs don’t “think” like us and, if we aren’t extremely careful about spelling out what we want them to do, they can behave in unexpected and harmful ways. “The system will optimise what you actually specified, but not what you intended,” says Brian Christian, author of The Alignment Problem and a visiting scholar at the University of California, Berkeley.
That problem boils down to the question of how to ensure AIs make decisions in line with human goals and values – whether you are worried about long-term existential risks, like the extinction of humanity, or immediate harms like AI-driven misinformation and bias.
Either way, the challenges of AI alignment are significant, says Christian, because of the inherent difficulty of translating fuzzy human desires into the cold, numerical logic of computers. He thinks the most promising solution is to get humans to provide feedback on AI decisions and use this to retrain …