Collaborative and Proactive Management of Task-Oriented Conversations
Arezoo Saedi, Afsaneh Fatemi, Mohammad Ali Nematbakhsh, Sophie Rosset, Anne Vilnat
TL;DR
This work tackles the challenge of proactive, goal-aware task-oriented dialogue (TOD) by embedding constructive intermediate information within an information-state dialogue management framework. It implements a domain-independent TOD system using in-context learning with LLMs (GPT-4o) and a domain-specific entity search that orders retrieved entities by their congruence with text_part, enabling complete and non-contradictory information presentation. The architecture introduces predefined_slots and text_part components, a set of informational components, dialogue moves, and an update strategy for navigating information states during a conversation. Evaluated on MultiWOZ 2.2 with single-domain conversations, the approach achieves maximal inform and success and outperforms prior methods on task completion metrics, highlighting the value of rich intermediate information and proactive planning for robust TOD across domains.
Abstract
Task-oriented dialogue (TOD) systems complete particular tasks based on user preferences expressed through natural language interactions. Given the impressive performance of large language models (LLMs) on natural language processing (NLP) tasks, most recent TOD systems are built around LLMs. Although proactive planning is crucial for task completion, many existing TOD systems overlook effective goal-aware planning. This paper proposes a model for managing task-oriented conversations, based on the information state approach to dialogue management. The model incorporates constructive intermediate information into planning. First, predefined slots and text part informational components are defined to model user preferences. By examining intermediate information, critical circumstances are identified, and informational components corresponding to these circumstances are defined. The possible configurations of these informational components yield a limited number of information states. Dialogue moves are then defined, which specify the transitions between these information states and the procedures that must be performed during those transitions. Finally, the update strategy is constructed. The model is implemented using in-context learning with LLMs. In this model, database queries are built from the specified predefined slots, and the order of the retrieved entities is determined by the text part. This mechanism makes it possible to present all entities that correspond to the preferences, ordered by congruency. Evaluations on the complete set of MultiWOZ test conversations that involve no more than one domain per conversation show maximal inform and success, and improvement over previous methods.
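The entity-search mechanism described above (a query built from predefined slots, followed by ordering on the text part) can be illustrated with a minimal Python sketch. It is not the authors' implementation: the `Preferences` class, the function names, the toy records, and in particular the word-overlap `congruency` score are assumptions made for illustration only. The paper relies on in-context learning with an LLM, whereas this stand-in simply counts shared words.

```python
from dataclasses import dataclass, field


@dataclass
class Preferences:
    """User preferences: exact-match slots plus free-text wishes (hypothetical layout)."""
    predefined_slots: dict[str, str] = field(default_factory=dict)
    text_part: str = ""


def query_entities(database, predefined_slots):
    """Keep only entities that match every specified predefined slot."""
    return [
        entity for entity in database
        if all(entity.get(slot) == value for slot, value in predefined_slots.items())
    ]


def congruency(entity, text_part):
    """Stand-in score (assumption): number of text_part words found in the description."""
    wanted = set(text_part.lower().split())
    described = set(entity.get("description", "").lower().split())
    return len(wanted & described)


def search(database, prefs):
    """Return all matching entities, most congruent with text_part first."""
    matches = query_entities(database, prefs.predefined_slots)
    return sorted(matches, key=lambda e: congruency(e, prefs.text_part), reverse=True)


# Made-up records for illustration; these are not MultiWOZ data.
RESTAURANTS = [
    {"name": "Alpha", "area": "centre", "food": "italian",
     "description": "quiet place with a garden"},
    {"name": "Beta", "area": "centre", "food": "italian",
     "description": "lively bar with live music"},
    {"name": "Gamma", "area": "north", "food": "italian",
     "description": "garden terrace and live music"},
]

prefs = Preferences({"area": "centre", "food": "italian"}, "live music")
ranked_names = [entity["name"] for entity in search(RESTAURANTS, prefs)]
```

In this sketch every entity that satisfies the slot constraints is kept rather than only the best match, which mirrors the stated goal of passing all corresponding entities to the user in order of congruency.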
