Strength Lies in Differences! Improving Strategy Planning for Non-collaborative Dialogues via Diversified User Simulation

Tong Zhang, Chen Huang, Yang Deng, Hongru Liang, Jia Liu, Zujie Wen, Wenqiang Lei, Tat-Seng Chua

TL;DR

This work tackles non-collaborative dialogues where agents must strategically negotiate with diverse users. It introduces Trip, combining a user-aware strategic planning module that uses Theory-of-Mind concepts with a population-based training paradigm to train planners with diverse user simulators. Across two benchmark tasks—price negotiation and charity persuasion—Trip outperforms strong baselines and demonstrates balanced improvements across personas, validated by human evaluations. The approach enhances adaptability and practicality of LLM-driven agents in real-world, asymmetric negotiations and has potential to reduce training costs through population-aware methods.

Abstract

We investigate non-collaborative dialogue agents, which are expected to engage in strategic conversations with diverse users, for securing a mutual agreement that leans favorably towards the system's objectives. This poses two main challenges for existing dialogue agents: 1) The inability to integrate user-specific characteristics into the strategic planning, and 2) The difficulty of training strategic planners that can be generalized to diverse users. To address these challenges, we propose Trip to enhance the capability in tailored strategic planning, incorporating a user-aware strategic planning module and a population-based training paradigm. Through experiments on benchmark non-collaborative dialogue tasks, we demonstrate the effectiveness of Trip in catering to diverse users.

Paper Structure (30 sections, 1 equation, 8 figures, 20 tables)

This paper contains 30 sections, 1 equation, 8 figures, 20 tables.

Figures (8)

  • Figure 1: The overall evaluation process.
  • Figure 2: TRIP Overview. This method includes a user-aware strategic planning module (UASP) and a population-based training paradigm (PBTP). The UASP incorporates user-specific characteristics into strategic planning using the Theory-of-Mind (ToM). The PBTP diversifies training user simulators to promote agents' adaptation. We use numbers to indicate the overall process of TRIP.
  • Figure 3: Agent performance across various personas. We report success rates on two tasks, namely price negotiation (Left) and charity persuasion (Right). Trip achieves balanced improvements on all personas, outperforming other agents by a considerable margin. Due to limited space, we report results on other metrics in Appendix \ref{app:radar_persona_results}.
  • Figure 4: Human Evaluation Results. Trip shows high practical utility when dealing with real users.
  • Figure 5: Case study on the charity persuasion task (Top-3 conversation rounds). The user resistance strategies and agent strategies are marked in blue and red respectively. While PPDPP repeats the same strategy usage pattern across different user types, Trip effectively tailors its strategies to different users. When dealing with the Openness persona (Left), Trip introduces the charitable organization and evokes specific emotions to sway the user's decision. Conversely, when addressing the Neuroticism persona (Right), Trip tends to discuss personal experiences related to charity and employs reasoning to persuade the user.
  • ...and 3 more figures