September 18, 2018 feature

Facebook researchers build a dataset to train personalized dialogue agents

by Ingrid Fadelli, Tech Xplore

Researchers at Facebook have recently compiled a dataset of 5 million personas and 700 million persona-based dialogues. This dataset could be used to train end-to-end dialogue systems, resulting in richer and more engaging dialogues between computer agents and humans.

Dialogue systems, or conversational agents (CAs), are computer systems designed to communicate coherently with human beings via text, speech, graphics, or other methods. So far, dialogue systems based on neural architectures, such as LSTMs or memory networks, have proven particularly promising for achieving fluent communication, especially when trained directly on dialogue logs.

"One of their main advantages is that they can rely on large data sources of existing dialogues to learn to cover various domains without requiring any expert knowledge," the researchers wrote in their paper, which was pre-published on arXiv. "However, the flip side is that they also exhibit limited engagement, especially in chit-chat settings: They lack consistency and do not leverage proactive engagement strategies as (even partially) scripted chatbots do."

In a recent study, a different team of researchers at the Montreal Institute for Learning Algorithms (MILA) and Facebook AI created a dataset called PERSONA-CHAT, which includes dialogues between agents that have text profiles, or personas, attached to them. They found that training a dialogue system on a particular persona improved its engagement in interactions.

"However, the PERSONA-CHAT dataset was created using an artificial data collection mechanism based on Mechanical Turk," the researchers explained in their paper. "As a result, neither dialogs nor personas can be fully representative of real user-bot interactions and the dataset coverage remains limited, containing a bit more than 1k different personas."

To address the limitations of the previously compiled dataset, the Facebook researchers created a new, large-scale persona-based dialogue dataset composed of conversations extracted from the online platform Reddit. Their study takes the work of their predecessors one step further by using more representative interactions.

"In this paper, we build a very large-scale persona-based dialogue dataset using conversations previously extracted from Reddit," the researchers wrote. "With simple heuristics, we create a corpus of over 5 million personas spanning more than 700 million conversations."
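The article does not spell out which heuristics were used. As a purely illustrative sketch, assume a rule that treats short first-person sentences in a user's comments as persona statements. The function names, the token limits and the first-person word list below are all hypothetical choices made for this example, not the researchers' published rules.

```python
import re

# Hypothetical rule: a persona sentence mentions "I" or "my".
FIRST_PERSON = {"i", "my"}


def split_sentences(text):
    """Naively split text after '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def is_persona_sentence(sentence, min_tokens=4, max_tokens=20):
    """Keep sentences of moderate length that contain a first-person word.

    The length bounds are assumptions chosen for illustration.
    """
    tokens = re.findall(r"[a-z']+", sentence.lower())
    has_first_person = any(token in FIRST_PERSON for token in tokens)
    return min_tokens <= len(tokens) <= max_tokens and has_first_person


def build_personas(comments):
    """Group persona sentences by author.

    `comments` is an iterable of (user, text) pairs; users with no
    qualifying sentence are left out of the result.
    """
    personas = {}
    for user, text in comments:
        for sentence in split_sentences(text):
            if not is_persona_sentence(sentence):
                continue
            user_sentences = personas.setdefault(user, [])
            if sentence not in user_sentences:  # skip exact duplicates
                user_sentences.append(sentence)
    return personas


if __name__ == "__main__":
    sample = [
        ("alice", "I have two dogs and a cat. The weather is nice today. Cool!"),
        ("bob", "Agreed."),
    ]
    print(build_personas(sample))
```

A rule this simple will miss contractions such as "I'm" and will keep some sentences that say little about the author. That trade-off between coverage and noise is what makes heuristic extraction practical at the scale of millions of users.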

To evaluate its effectiveness, the researchers trained persona-based end-to-end dialogue systems on their newly developed dataset. Systems trained on their dataset were able to conduct more engaging conversations, outperforming other conversational agents that did not have access to personas during their training.

Interestingly, their dataset led to state-of-the-art results even when dialogue systems were merely pre-trained on it. In the future, these findings could lead to the development of more engaging chatbots, which could also be personalized and trained to acquire a particular persona.

"We show that training models to align answers both with the persona of their author and the context improves the predicting performance," the researchers wrote. "As pre-training leads to considerable improvement in performance, future work could fine-tune this model for various dialogue systems."

More information: Training Millions of Personalized Dialogue Agents. arXiv:1809.01984 [cs.CL]. arxiv.org/abs/1809.01984

Personalizing Dialogue Agents: I have a dog, do you have pets too? arxiv.org/abs/1801.07243

Citation: Facebook researchers build a dataset to train personalized dialogue agents (2018, September 18) retrieved 16 April 2024 from https://techxplore.com/news/2018-09-facebook-dataset-personalized-dialogue-agents.html

