Facebook researchers have taught chatbots the art of the deal – so well, in fact, that most people didn’t know they were negotiating with an artificial intelligence.
The team at Facebook Artificial Intelligence Research (FAIR) used machine learning to train “dialog agents” to chat with humans and negotiate a deal.
“Building machines that can hold meaningful conversations with people is challenging because it requires a bot to combine its understanding of the conversation with its knowledge of the world, and then produce a new sentence that helps it achieve its goals,” the company stated in a blog post.
For the study, published online and with open-sourced code, they gathered a dataset of 5,808 dialogues between humans on a negotiation task and explored two methods to improve the strategic reasoning skills of the models: “self play” and “dialogue rollouts”.
The first method, “self play”, had the models practice negotiating against each other to improve performance. Left to their own devices, the bots drifted away from English into a non-human shorthand of their own, so the team instead anchored the bots to fixed supervised models trained on human dialogue.
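The self-play idea can be illustrated with a toy sketch. This is not FAIR's actual setup (their agents exchange natural-language utterances and train with full reinforcement learning); here, as an assumption for illustration, two agents simply split a pool of items, one learns via a REINFORCE-style update, and the other is held fixed, mirroring the trick of pinning one side to a supervised policy so the learner has a stable, human-like opponent:

```python
import random

ITEMS = 4  # size of the pool the two agents must split

def fixed_policy():
    """Fixed opponent (stand-in for a supervised model):
    demands a uniformly random share for itself."""
    return random.randint(0, ITEMS)

def learned_demand(prefs):
    """Learner samples a demand in proportion to its preferences."""
    total = sum(prefs)
    r, acc = random.random() * total, 0.0
    for demand, p in enumerate(prefs):
        acc += p
        if r <= acc:
            return demand
    return ITEMS

def self_play(episodes=5000, lr=0.1, seed=0):
    random.seed(seed)
    # unnormalised preferences over demands 0..ITEMS, initially uniform
    prefs = [1.0] * (ITEMS + 1)
    for _ in range(episodes):
        a = learned_demand(prefs)
        b = fixed_policy()
        # the deal goes through only if the two demands are compatible
        reward = a if a + b <= ITEMS else 0
        # REINFORCE-style update: reinforce the sampled demand by its reward
        prefs[a] += lr * reward
    return prefs

prefs = self_play()
best = max(range(ITEMS + 1), key=lambda d: prefs[d])
```

Because only the learner's preferences are updated while the opponent stays fixed, the learner converges toward demands that actually close deals, rather than the two sides co-inventing a private protocol as happened when both FAIR agents trained against each other.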
The second “dialogue rollouts” method had the agents simulate complete dialogues to maximize reward, which requires long-term planning and predicting how a conversation will proceed.
To do this, the chatbots “build mental models of their interlocutors and ‘think ahead’ or anticipate directions a conversation is going to take in the future,” the team states. In this way, “they can choose to steer away from uninformative, confusing, or frustrating exchanges toward successful ones.”
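A dialogue rollout can be sketched in miniature. The names and the opponent model below are invented for illustration, not FAIR's API: before committing to a move, the agent imagines many possible continuations of the negotiation under an assumed model of the other party, and picks the candidate whose simulated outcomes score highest on average:

```python
import random

ITEMS = 4  # size of the pool being negotiated over

def simulate_continuation(my_demand, rng):
    """Assumed opponent model: replies with a random counter-demand.
    Returns our reward for this one imagined continuation."""
    their_demand = rng.randint(0, ITEMS)
    return my_demand if my_demand + their_demand <= ITEMS else 0

def rollout_value(candidate, n_rollouts=2000, seed=0):
    """Average reward of a candidate move over many simulated rollouts."""
    rng = random.Random(seed)
    total = sum(simulate_continuation(candidate, rng) for _ in range(n_rollouts))
    return total / n_rollouts

def plan_demand():
    # score every candidate opening demand by its mean rollout reward
    scores = {d: rollout_value(d, seed=d) for d in range(ITEMS + 1)}
    return max(scores, key=scores.get), scores

best, scores = plan_demand()
```

In this toy version the rollouts balance greed against the risk of the deal collapsing: demanding everything usually fails, demanding nothing earns nothing, and the planner lands on a moderate demand. The real agents do the analogous thing over full natural-language exchanges, which is what makes the planning problem so much harder.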
Similar planning models have been built for game-playing AIs, but they are applied far less often to language, where the space of possible actions is much larger. Facebook has been tinkering with chatbots for a few years now, but for this latest iteration the team trained the bots to achieve negotiation goals and then reinforced positive outcomes. This meant the bots were self-serving, trying to get the better end of the deal, even bluffing to achieve their ends.
“We find instances of the model feigning interest in a valueless issue, so that it can later ‘compromise’ by conceding it,” wrote the authors. “Deceit is a complex skill that requires hypothesizing the other agent’s beliefs, and is learnt relatively late in child development. Our agents have learnt to deceive without any explicit human design, simply by trying to achieve their goals.”