Neural Generation Meets Real People: Towards Emotionally Engaging Mixed-Initiative Conversations

Ashwin Paranjape, Abigail See, Kathleen Kenealy, Haojun Li, Amelia Hardy, Peng Qi, Kaushik Ram Sadagopan, Nguyet Minh Phu, Dilara Soylu, Christopher D. Manning

TL;DR

This work presents Chirpy Cardinal, an open-domain socialbot developed for the 2019 Alexa Prize to achieve knowledge-rich, emotionally engaging, mixed-initiative conversations. It integrates a modular dialogue manager (DM) with navigational-intent handling, an entity-tracking system, and a collection of diverse response generators (RGs), including neural generators and knowledge-grounded modules, coordinated via multi-tier priority and prompting mechanisms. The system sustains long, varied conversations with real users, and the paper analyzes how engagement, topic coverage, and RG selection relate to ratings, highlighting both the strengths of neural generation for naturalness and the challenges of maintaining consistency and safety. Key contributions include the Treelets framework for dialogue graphs, robust entity linking with phonetically aware matching, and empirical insights into how initialization strategies and opinion/discussion mechanics affect user experience in open-domain chat. The work advances practical full-stack NLP for emotionally engaging socialbots and outlines concrete future directions for improving initiative, domain adaptation, and content quality at scale.

Abstract

We present Chirpy Cardinal, an open-domain dialogue agent, as a research platform for the 2019 Alexa Prize competition. Building an open-domain socialbot that talks to real people is challenging: such a system must meet multiple user expectations such as broad world knowledge, conversational style, and emotional connection. Our socialbot engages users on their terms, prioritizing their interests, feelings, and autonomy. As a result, our socialbot provides a responsive, personalized user experience, capable of talking knowledgeably about a wide variety of topics, as well as chatting empathetically about ordinary life. Neural generation plays a key role in achieving these goals, providing the backbone for our conversational and emotional tone. At the end of the competition, Chirpy Cardinal progressed to the finals with an average rating of 3.6/5.0, a median conversation duration of 2 minutes 16 seconds, and a 90th percentile duration of over 12 minutes.

Paper Structure

This paper contains 48 sections, 4 equations, 11 figures, 6 tables.

Figures (11)

  • Figure 1: Overall system design.
  • Figure 2: An example treelet for the Movies RG.
  • Figure 3: Strategies for the emotion-focused Neural Chat starter question. POS/NEG/NEGOPT refer to positive/negative/negative+optimistic emotion. OTHERS/BOT refer to whether the emotion is attributed to other people, or to the bot. STORY indicates that the bot shares a personal anecdote.
  • Figure 4: Effect of Neural Chat emotion-focused starter question strategies on user response length.
  • Figure 5: Engagement metrics vs. rating.
  • ...and 6 more figures