ATLAS: Constraints-Aware Multi-Agent Collaboration for Real-World Travel Planning

Jihye Choi, Jinsung Yoon, Jiefeng Chen, Somesh Jha, Tomas Pfister

TL;DR

The paper tackles constraint-aware real-world travel planning where explicit, implicit, and evolving constraints challenge grounding in LLM-based systems. It introduces ATLAS, a robust multi-agent framework that decouples constraint construction, constraint-aware answering, and information-gap resolution via a Planner-Checker loop augmented by an adaptive interleaved search, all within a dynamic CSP formalism. Empirical results on TravelPlanner and live-search benchmarks show ATLAS achieving superior final pass rates and substantially reducing hallucinations, outperforming strong baselines across single-turn, multi-turn, and live settings. The work demonstrates the practical viability of constraint-grounded, live-information travel planning and suggests wide applicability to open-world planning tasks beyond sandbox environments.

Abstract

While Large Language Models (LLMs) have shown remarkable advancements in reasoning and tool use, they often fail to generate optimal, grounded solutions under complex constraints. Real-world travel planning exemplifies these challenges, evaluating agents' abilities to handle constraints that are explicit, implicit, and even evolving based on interactions with dynamic environments and user needs. In this paper, we present ATLAS, a general multi-agent framework designed to effectively handle the complex nature of constraint awareness in real-world travel planning tasks. ATLAS introduces a principled approach to address the fundamental challenges of constraint-aware planning through dedicated mechanisms for dynamic constraint management, iterative plan critique, and adaptive interleaved search. ATLAS demonstrates state-of-the-art performance on the TravelPlanner benchmark, improving the final pass rate from 23.3% to 44.4% over its best alternative. More importantly, our work is the first to demonstrate quantitative effectiveness on real-world travel planning tasks with live information search and multi-turn feedback. In this realistic setting, ATLAS showcases its superior overall planning performance, achieving an 84% final pass rate, which significantly outperforms baselines including ReAct (59%) and a monolithic agent (27%).

Paper Structure

This paper contains 31 sections, 8 equations, 12 figures, 11 tables, 1 algorithm.

Figures (12)

  • Figure 1: Monolithic agent cannot solve real-world travel planning. The true challenge in real-world travel planning is satisfying both explicit user requests and implicit, commonsense expectations (in dotted bubble). Even advanced models like Gemini-2.5-Pro fall short, as seen in critical failures like omitting lunch after a 9 a.m. arrival or suggesting a restaurant in a different city. This highlights the vital need for a multi-agentic solution like ATLAS.
  • Figure 2: An overview of our framework's workflow on a task in TravelPlanner (Xie et al., 2024). Initially, the Search Agent populates a domain of available options, while the Constraint Manager identifies all constraints that should be considered. These include explicit constraints from the user (e.g., must allow children $> 10$) and search results (e.g., minimum night stays), as well as implicit, commonsense constraints. The Planner then proposes a plan, which is iteratively validated by the Checker. If the Checker finds the problem is unsatisfiable, it triggers an interleaved search. The Search Advisor diagnoses the failure and provides feedback to guide a new, more informed search.
  • Figure 3: Understanding the individual contribution of key components in ATLAS. (a) compares the full ATLAS framework with a variant where the Constraint Manager is disabled. In (b), the baseline ($K=0$) is a sequential search-then-plan pipeline by ReAct. In (c), the baseline ($L=0$) is ReAct augmented with three check steps after each search. Refer to Table \ref{tab:travelplanner-ablation} for full results.
  • Figure 4: Live travel planning with multi-turn feedback. See Table \ref{tab:multiturn_liveplanner} for full results.
  • Figure 5: Distribution of the total number of check steps. We allow at most three critique steps per search step, with up to 10 additional interleaved search steps.
  • ...and 7 more figures
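
The workflow described in the Figure 2 caption can be sketched as a small toy loop. This is only an illustration under stated assumptions, not the authors' implementation: every name here (`Hotel`, `search`, `build_constraints`, `check`, `plan_trip`), the three-hotel catalog, and the "widen the price cap" advisor rule are hypothetical. The budgets of three checks per search and ten extra searches follow one reading of the Figure 5 caption.

```python
from dataclasses import dataclass
from typing import Callable, Optional

# Hypothetical toy model; none of these names come from the paper.

@dataclass(frozen=True)
class Hotel:
    name: str
    price: int             # price per night
    min_nights: int        # minimum stay required by the hotel
    allows_children: bool

Constraint = Callable[[Hotel], bool]

def search(price_cap: int) -> list[Hotel]:
    """Stand-in for the Search Agent: return options under a price cap."""
    catalog = [
        Hotel("A", 80, 3, True),
        Hotel("B", 120, 1, False),
        Hotel("C", 150, 2, True),
    ]
    return [h for h in catalog if h.price <= price_cap]

def build_constraints(nights: int, budget: int) -> dict[str, Constraint]:
    """Stand-in for the Constraint Manager: user and search-derived rules."""
    return {
        "allows_children": lambda h: h.allows_children,
        "min_nights": lambda h: h.min_nights <= nights,
        "budget": lambda h: h.price * nights <= budget,
    }

def check(plan: Hotel, constraints: dict[str, Constraint]) -> list[str]:
    """Stand-in for the Checker: names of the constraints the plan violates."""
    return [name for name, ok in constraints.items() if not ok(plan)]

def plan_trip(nights: int, budget: int,
              max_checks: int = 3, max_searches: int = 10) -> Optional[Hotel]:
    constraints = build_constraints(nights, budget)
    price_cap = 100                      # assumed initial search scope
    for _ in range(1 + max_searches):    # initial search plus interleaved ones
        domain = search(price_cap)
        rejected: set[str] = set()
        for _ in range(max_checks):      # Planner-Checker loop
            candidates = [h for h in domain if h.name not in rejected]
            if not candidates:
                break                    # nothing left to propose in this domain
            plan = candidates[0]         # Planner proposes a candidate
            if not check(plan, constraints):
                return plan              # Checker accepts the plan
            rejected.add(plan.name)      # Checker rejects; try another option
        price_cap += 50                  # toy Search Advisor: widen the search
    return None                          # budgets exhausted without a valid plan

result = plan_trip(nights=2, budget=400)
```

With these toy inputs the first search returns only hotel A, which fails the minimum-stay rule, so the loop falls through to a wider search before a valid option is found. A real system would replace the fixed price-cap rule with feedback that depends on which constraints failed.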

Theorems & Definitions (2)

  • Definition 2.1: Static Travel Planning Objective
  • Definition 2.2: Dynamic Travel Planning Objective