TP-RAG: Benchmarking Retrieval-Augmented Large Language Model Agents for Spatiotemporal-Aware Travel Planning

Hang Ni, Fan Liu, Xinyu Ma, Lixin Su, Shuaiqiang Wang, Dawei Yin, Hui Xiong, Hao Liu

TL;DR

TP-RAG is a benchmark for spatiotemporal-aware travel planning that pairs LLM agents with trajectory-level web knowledge to improve plan coherence. Its experiments show that retrieval-augmented planning improves spatial efficiency and POI rationality, but that conflicting or noisy references limit universality and robustness. To address this, the authors propose EvoRAG, an evolutionary framework that combines diverse retrieved trajectories with the LLM's intrinsic reasoning and achieves state-of-the-art performance across multiple metrics. The work highlights the value of combining Web knowledge with LLM reasoning for adaptive travel planning and points toward more reliable, context-aware agents.

Abstract

Large language models (LLMs) have shown promise in automating travel planning, yet they often fall short in addressing nuanced spatiotemporal rationality. While existing benchmarks focus on basic plan validity, they neglect critical aspects such as route efficiency, point-of-interest (POI) appeal, and real-time adaptability. This paper introduces TP-RAG, the first benchmark tailored for retrieval-augmented, spatiotemporal-aware travel planning. Our dataset includes 2,348 real-world travel queries, 85,575 POIs with fine-grained annotations, and 18,784 high-quality travel trajectory references sourced from online tourist documents, enabling dynamic and context-aware planning. Through extensive experiments, we reveal that integrating reference trajectories significantly improves the spatial efficiency and POI rationality of travel plans, while challenges persist in universality and robustness due to conflicting references and noisy data. To address these issues, we propose EvoRAG, an evolutionary framework that effectively synergizes diverse retrieved trajectories with LLMs' intrinsic reasoning. EvoRAG achieves state-of-the-art performance, improving spatiotemporal compliance and reducing commonsense violations compared to ground-up and retrieval-augmented baselines. Our work underscores the potential of hybridizing Web knowledge with LLM-driven optimization, paving the way for more reliable and adaptive travel planning agents.
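
To make "spatial efficiency" concrete, one simple proxy is the total great-circle distance traveled when visiting POIs in the planned order. The sketch below is illustrative only: the abstract does not specify how TP-RAG scores spatial efficiency, and the function names `haversine_km` and `route_length_km`, the (latitude, longitude) input format, and the choice of haversine distance are all assumptions made for this example.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius; an approximation


def haversine_km(a, b):
    """Great-circle distance in km between two (lat, lon) points in degrees."""
    lat1, lon1 = map(math.radians, a)
    lat2, lon2 = map(math.radians, b)
    dlat = lat2 - lat1
    dlon = lon2 - lon1
    h = math.sin(dlat / 2) ** 2 + math.cos(lat1) * math.cos(lat2) * math.sin(dlon / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(h))


def route_length_km(pois):
    """Sum of leg distances when visiting POIs in the given order.

    Hypothetical proxy for spatial efficiency: shorter is better.
    """
    return sum(haversine_km(p, q) for p, q in zip(pois, pois[1:]))


# Three made-up POIs along the equator, visited in order vs. with backtracking.
ordered = [(0.0, 0.0), (0.0, 1.0), (0.0, 2.0)]
zigzag = [(0.0, 0.0), (0.0, 2.0), (0.0, 1.0)]
print(route_length_km(ordered) < route_length_km(zigzag))
```

Under this proxy, a plan that backtracks between POIs scores worse than one that visits the same POIs in a geographically coherent order, which is the kind of difference a spatially aware benchmark would be expected to surface.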

Paper Structure

This paper contains 31 sections, 11 figures, 8 tables.

Figures (11)

  • Figure 1: TP-RAG distinguishes itself from existing works by: (1) query-specific spatiotemporal contextualization and (2) trajectory-level knowledge utilization.
  • Figure 2: Dataset construction pipeline.
  • Figure 3: Similarities between plans and trajectories.
  • Figure 4: Correlation analysis.
  • Figure 5: Sensitivity analysis of retrieval-augmented methods under different retrieval quantities, using noisy and clean trajectory knowledge.
  • ...and 6 more figures