According to a new study, artificial intelligence systems trained to align with human values could be used to develop more popular economic policies.
In an article published in Nature Human Behaviour, researchers trained an artificial intelligence (AI) system named “Democratic AI” to design a way to distribute the proceeds of an investment game and found that players preferred it to any human-designed system.
Thousands of participants were recruited to play an investment game in groups of four. Before each round, each player received funds, and the amount varied from player to player.
Each player could keep these funds or invest them in a common pool that was guaranteed to grow, but players who invested did not know in advance how the proceeds would be shared.
The researchers then used different policies to distribute the funds. One was a human-designed policy in which funds were redistributed in proportion to each player’s contribution; the other was designed by an AI trained through “deep reinforcement learning” to observe and copy how people had played earlier versions of the game and to maximize player preferences across a larger group.
When asked to vote for the policy they preferred, participants chose the AI system over policies such as redistributing funds equally or redistributing funds in proportion to each player’s contribution.
And when the researchers trained a “human policy maker,” players always preferred the Democratic AI system.
The research addresses a question that has divided the opinions of philosophers, economists and political scientists for many years: how exactly should we allocate resources across economies and societies?
Oliver Hauser, associate professor of economics at the University of Exeter Business School and co-author of the study, said: “AI systems are sometimes criticized for learning policies that are inconsistent with human values, but with this approach, the AI exploits the principle of democracy by maximizing the majority preferences of a group of people. Although this approach is only a prototype, it can help ensure that AI systems are less likely to learn dangerous or unfair policies.”
The researchers analyzed the policy uncovered by the AI and found that it incorporated a mix of ideas that had previously been proposed by human experts to solve the problem of fund redistribution.
This included taking into account a player’s initial means and redistributing funds in proportion to the players’ relative – rather than absolute – contribution.
They also found that the AI system rewarded players whose relative contribution was more generous, perhaps encouraging others to do the same.
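As a rough illustration of the difference between absolute and relative contributions, the sketch below compares three simple redistribution rules in Python. It is not the policy learned by Democratic AI, which the researchers describe as a mix of several ideas. The function names, the endowments, the contributions and the growth factor of 1.6 are all assumptions chosen for the example.

```python
# Illustrative sketch only: three simple ways to share out a common pool.
# All names and numbers here are hypothetical, not taken from the study.

def grow_pool(contributions, growth_factor):
    """Total proceeds after the invested funds grow (growth_factor is assumed)."""
    return sum(contributions) * growth_factor


def equal_split(pool, contributions, endowments):
    """Every player receives the same share, whatever they invested."""
    n = len(contributions)
    return [pool / n for _ in range(n)]


def proportional_absolute(pool, contributions, endowments):
    """Shares are proportional to the amount each player invested."""
    total = sum(contributions)
    if total == 0:
        return equal_split(pool, contributions, endowments)
    return [pool * c / total for c in contributions]


def proportional_relative(pool, contributions, endowments):
    """Shares are proportional to the fraction of their funds each player invested."""
    ratios = [c / e if e > 0 else 0.0 for c, e in zip(contributions, endowments)]
    total = sum(ratios)
    if total == 0:
        return equal_split(pool, contributions, endowments)
    return [pool * r / total for r in ratios]


if __name__ == "__main__":
    endowments = [10.0, 2.0, 2.0, 2.0]     # one well-off player, three poorer ones
    contributions = [5.0, 2.0, 2.0, 2.0]   # the poorer players invest everything
    pool = grow_pool(contributions, growth_factor=1.6)

    for rule in (equal_split, proportional_absolute, proportional_relative):
        payouts = rule(pool, contributions, endowments)
        print(rule.__name__, [round(p, 2) for p in payouts])
```

With these assumed numbers, the well-off player who invests half of their funds receives the largest payout under the absolute rule and the smallest under the relative rule, which is the kind of difference the distinction is meant to capture.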
“Importantly, the AI only discovered these policies by learning how to maximize human votes,” Professor Hauser said. “The method therefore ensures that humans stay ‘in the loop’ and the AI produces human-compatible solutions.”
The study was a collaboration between researchers from the University of Exeter, DeepMind, UCL and the University of Oxford.
“Human-centred mechanism design with Democratic AI” is published in Nature Human Behaviour.