Towards Full Delegation: Designing Ideal Agentic Behaviors for Travel Planning

Song Jiang, Da JU, Andrew Cohen, Sasha Mitts, Aaron Foss, Justine T Kao, Xian Li, Yuandong Tian

TL;DR

This work proposes the APEC Agent Constitution, a list of criteria that an agent should follow for good agentic behaviors, including Accuracy, Proactivity, Efficiency and Credibility, and develops APEC-Travel, a travel planning agent that proactively extracts hidden personalized needs via multi-round dialog with travelers.

Abstract

How will LLM-based agents be used in the future? While much of the existing work on agents has focused on improving performance on a specific family of objective and challenging tasks, in this work we take a different perspective by thinking about full delegation: agents take over humans' routine decision-making processes and are trusted by humans to find solutions that fit people's personalized needs and adapt to ever-changing context. To achieve this goal, the behavior of agents, i.e., agentic behaviors, should be evaluated not only on what they achieve (i.e., outcome evaluation), but also on how they achieve it (i.e., procedure evaluation). For this, we propose the APEC Agent Constitution, a list of criteria that an agent should follow for good agentic behaviors, including Accuracy, Proactivity, Efficiency and Credibility. To verify whether APEC aligns with human preferences, we develop APEC-Travel, a travel planning agent that proactively extracts hidden personalized needs via multi-round dialog with travelers. APEC-Travel is constructed purely from synthetic data generated by Llama3.1-405B-Instruct with a diverse set of traveler personas to simulate a rich distribution of dialogs. Iteratively fine-tuned to follow the APEC Agent Constitution, APEC-Travel surpasses baselines by 20.7% on rule-based metrics and 9.1% on LLM-as-a-Judge scores across the constitution axes.

Paper Structure

This paper contains 40 sections, 1 equation, 9 figures, 10 tables.

Figures (9)

  • Figure 1: We develop APEC-Travel, a travel planning agent that effectively extracts hidden personalized preferences through multi-round dialogs with travelers. Compared to baseline models (left subfigure, with worse behaviors highlighted in blue), APEC-Travel (right subfigure) prioritizes critical travel entries, asks for clarification, and proactively moves forward with new topics to gain more information (highlighted in red). These positive agent behaviors lead to improved accuracy in understanding personalized travel preferences.
  • Figure 2: An overview of APEC-Travel. (a) We prompt Llama3.1-405B-Instruct to synthesize seed dialogs between a travel agent and travelers based on a diverse set of simulated traveler personas. These dialogs are used to fine-tune (SFT) Llama3.1-8B-Instruct, resulting in APEC-Travel-SFT. Next, APEC-Travel is trained with iterative Direct Preference Optimization (DPO), in which the latest APEC-Travel-DPO agent generates new dialogs with the traveler model in each iteration. These dialogs are ranked by a weighted combination of rule-based objectives and APEC scores assigned by a judge model (also Llama3.1-405B-Instruct). Note that the reference model is fixed as APEC-Travel-SFT throughout this process. (b) Overall workflow: APEC-Travel extracts the traveler's personalized preferences via multi-round dialog, after which the stenographer model summarizes the dialog into a symbolic representation (JSON).
  • Figure 3: Correlation matrix from human annotation study examining the initial five agentic scores: Planning, Prioritization, Proactive, Clarification, and Helpfulness from which we derive the axes in APEC.
  • Figure 4: Breakdown of agentic scores across all axes. We compare APEC-Travel-SFT and APEC-Travel-DPO for each axis. The median of each box plot is highlighted in orange. Axes from left to right: Plan & Priority, Proactive, Clarification, and the Total of these three axes.
  • Figure 5: Comparison of the Total agentic scores across all axes for Llama3.1-405B-Instruct BF16 versus FP8. From left: seed data, APEC-Travel-SFT, and APEC-Travel-DPO.
  • ...and 4 more figures
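
The Figure 2 caption says that, in each DPO iteration, newly generated dialogs are ranked by a weighted combination of rule-based objectives and judge-assigned APEC scores. The sketch below illustrates one way such a ranking could produce a preference pair. It is not the authors' implementation: the `ScoredDialog` type, the `rule_weight` value, the [0, 1] score ranges, and the choice of top-ranked versus bottom-ranked dialog as the (chosen, rejected) pair are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class ScoredDialog:
    """Hypothetical container for one generated dialog and its two scores."""
    dialog_id: str
    rule_score: float   # rule-based objective, assumed to lie in [0, 1]
    judge_score: float  # APEC score from the judge model, assumed rescaled to [0, 1]


def combined_score(d: ScoredDialog, rule_weight: float = 0.5) -> float:
    """Weighted combination of the two signals; the default weight is assumed."""
    return rule_weight * d.rule_score + (1.0 - rule_weight) * d.judge_score


def build_preference_pair(dialogs, rule_weight=0.5):
    """Rank candidate dialogs and return (chosen, rejected) for one DPO example.

    Taking the best and worst dialog is an assumption of this sketch; the
    source only states that dialogs are ranked by a weighted combination.
    """
    if len(dialogs) < 2:
        raise ValueError("need at least two candidate dialogs to form a pair")
    ranked = sorted(
        dialogs,
        key=lambda d: combined_score(d, rule_weight),
        reverse=True,
    )
    return ranked[0], ranked[-1]


# Toy candidates for a single traveler persona (made-up scores).
candidates = [
    ScoredDialog("a", rule_score=0.9, judge_score=0.6),
    ScoredDialog("b", rule_score=0.4, judge_score=0.5),
    ScoredDialog("c", rule_score=0.7, judge_score=0.9),
]
chosen, rejected = build_preference_pair(candidates)
print(chosen.dialog_id, rejected.dialog_id)
```

Changing `rule_weight` shifts which dialog wins: with equal weights dialog "c" ranks first, while weighting only the rule-based score favors dialog "a".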