Iti-Validator: A Guardrail Framework for Validating and Correcting LLM-Generated Itineraries
Shravan Gadbail, Masumi Desai, Kamalakar Karlapalem
TL;DR
Iti-Validator addresses the problem of temporal infeasibility in LLM-generated travel itineraries by introducing a model-agnostic guardrail that grounds plans in real-world flight durations using the AeroDataBox API and enforces explicit temporal constraints. The framework combines an LLM itinerary generator with a rule-based validator and deterministic corrections, achieving feasible itineraries even when raw LLM outputs are inconsistent. Key contributions include a formal set of validation rules (e.g., $t_{\text{min}}$ between legs, $2t_{\text{min}}$ maximum, and a minimum stay of $2$ days) and a fast, scalable correction pipeline that operates in a few seconds per itinerary. The results demonstrate significant improvement over raw LLM outputs and highlight the practical potential for deployment in travel planning, with future work aimed at multi-modal transport, richer constraints, and interactive user collaboration.
Abstract
The rapid advancement of Large Language Models (LLMs) has enabled them to generate complex, multi-step plans and itineraries. However, these generated plans often lack temporal and spatial consistency, particularly in scenarios involving physical travel constraints. This research studies the temporal performance of different LLMs and presents a validation framework that evaluates and improves the temporal consistency of LLM-generated travel itineraries. The system employs multiple state-of-the-art LLMs to generate travel plans and validates them against real-world flight duration constraints using the AeroDataBox API. This work contributes to the understanding of LLM capabilities in complex temporal reasoning tasks such as itinerary generation, and provides a framework that rectifies temporal inconsistencies, such as overlapping journeys or unrealistic transit times, before the itinerary is given to the user. Our experiments reveal that while current LLMs frequently produce temporally inconsistent itineraries, these can be systematically and reliably corrected using our framework, enabling their practical deployment in large-scale travel planning.
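To make the temporal checks summarized in the TL;DR concrete, the following is a minimal sketch of a rule-based validator, not the authors' implementation. It assumes one possible reading of the rules: each leg's scheduled duration must lie between $t_{\text{min}}$ and $2t_{\text{min}}$, and the stay between consecutive legs must be at least 2 days. The `Leg` class, the `validate` function, and the `t_min` lookup table are hypothetical names introduced for illustration; in the framework, the minimum durations would come from the AeroDataBox API rather than a hard-coded dictionary. All times are assumed to be in a single time zone.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

# Minimum stay between arriving in a city and departing from it (from the TL;DR).
MIN_STAY = timedelta(days=2)


@dataclass
class Leg:
    """One flight leg of an itinerary (hypothetical structure)."""
    origin: str
    destination: str
    depart: datetime
    arrive: datetime


def validate(legs, t_min):
    """Return a list of violation messages for an ordered list of legs.

    t_min maps (origin, destination) to the minimum real-world flight
    duration as a timedelta. Assumed rule: t_min <= duration <= 2 * t_min.
    """
    issues = []
    for i, leg in enumerate(legs):
        lower = t_min[(leg.origin, leg.destination)]
        duration = leg.arrive - leg.depart
        if duration < lower:
            issues.append(f"leg {i}: duration {duration} is below t_min {lower}")
        elif duration > 2 * lower:
            issues.append(f"leg {i}: duration {duration} exceeds 2*t_min {2 * lower}")
        if i > 0:
            stay = leg.depart - legs[i - 1].arrive
            if stay < timedelta(0):
                issues.append(f"leg {i}: departs before leg {i - 1} arrives")
            elif stay < MIN_STAY:
                issues.append(f"leg {i}: stay of {stay} is shorter than {MIN_STAY}")
    return issues


# Illustrative data only; the durations below are made-up placeholders.
example_t_min = {
    ("DEL", "DXB"): timedelta(hours=3, minutes=30),
    ("DXB", "LHR"): timedelta(hours=7),
}
example_legs = [
    Leg("DEL", "DXB", datetime(2025, 1, 1, 10, 0), datetime(2025, 1, 1, 11, 0)),
    Leg("DXB", "LHR", datetime(2025, 1, 2, 9, 0), datetime(2025, 1, 2, 16, 30)),
]
for message in validate(example_legs, example_t_min):
    print(message)
```

A deterministic correction step could then shift departure and arrival times until `validate` returns an empty list; that step is omitted here because its exact policy is not specified in this section.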
