Harnessing LLMs for Cross-City OD Flow Prediction
Chenyang Yu, Xinpeng Xie, Yan Huang, Chenxi Qiu
TL;DR
This work addresses cross-city origin-destination (OD) flow prediction by using large language models (LLMs) to transfer mobility patterns from a data-rich source city to a target city that lacks OD data. The authors introduce LLM-COD, a four-step framework that (1) collects OD-POI data in the source city, (2) instruction-tunes an LLM with a novel POI-distance loss, (3) predicts destination POIs in the target city, and (4) matches those POIs to destination cells to form the OD matrix $\hat{\mathbf{F}}^B$. The loss jointly encodes POI semantics and travel distance, which supports cross-city transfer even with limited target-city data. Experiments on Beijing, Chengdu, and Xi'an show that LLM-COD outperforms traditional and learning-based baselines in cross-city OD prediction, especially for high-volume and long-distance flows, and that city indicators, multi-POI representations, and appropriate distance weighting each contribute to accuracy. The approach has practical implications for rapid urban planning in data-sparse cities and points to a promising direction for combining LLMs with mobility data to generalize across urban contexts. It also leaves room for further work on interpretability and dataset expansion.
Abstract
Understanding and predicting Origin-Destination (OD) flows is crucial for urban planning and transportation management. Traditional OD prediction models, while effective within single cities, often face limitations when applied across different cities due to varied traffic conditions, urban layouts, and socio-economic factors. In this paper, by employing Large Language Models (LLMs), we introduce a new method for cross-city OD flow prediction. Our approach leverages the advanced semantic understanding and contextual learning capabilities of LLMs to bridge the gap between cities with different characteristics, providing a robust and adaptable solution for accurate OD flow prediction that can be transferred from one city to another. Our novel framework involves four major components: collecting OD training datasets from a source city, instruction-tuning the LLMs, predicting destination POIs in a target city, and identifying the locations that best match the predicted destination POIs. We introduce a new loss function that integrates POI semantics and trip distance during training. By extracting high-quality semantic features from human mobility and POI data, the model understands spatial and functional relationships within urban spaces and captures interactions between individuals and various POIs. Extensive experimental results demonstrate the superiority of our approach over the state-of-the-art learning-based methods in cross-city OD flow prediction.
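The final component, matching predicted destination POIs to locations and aggregating the results into an OD matrix, can be illustrated with a minimal sketch. This is not the authors' implementation. The cell layout, the `match_cell` and `build_od_matrix` helpers, and the rule "pick the cell that contains the predicted POI category and whose distance from the origin is closest to the predicted trip distance" are all assumptions made for illustration. The abstract does not specify how the best-matching location is selected.

```python
from math import hypot

def match_cell(origin_xy, pred_category, pred_km, cells):
    """Hypothetical matching rule: among cells that contain the predicted
    POI category, return the id of the cell whose straight-line distance
    from the origin is closest to the predicted trip distance."""
    candidates = [c for c in cells if pred_category in c["pois"]]
    if not candidates:
        return None  # no cell offers this POI category

    def distance_gap(cell):
        dx = cell["xy"][0] - origin_xy[0]
        dy = cell["xy"][1] - origin_xy[1]
        return abs(hypot(dx, dy) - pred_km)

    return min(candidates, key=distance_gap)["id"]

def build_od_matrix(predictions, cells):
    """Aggregate per-trip predictions into an n x n OD count matrix.
    Assumes cell ids are the integers 0..n-1."""
    n = len(cells)
    flows = [[0] * n for _ in range(n)]
    by_id = {c["id"]: c for c in cells}
    for origin_id, category, km in predictions:
        dest_id = match_cell(by_id[origin_id]["xy"], category, km, cells)
        if dest_id is not None:
            flows[origin_id][dest_id] += 1
    return flows

# Toy target city with three grid cells (coordinates in km, assumed).
cells = [
    {"id": 0, "xy": (0.0, 0.0), "pois": {"residence"}},
    {"id": 1, "xy": (3.0, 4.0), "pois": {"office", "restaurant"}},
    {"id": 2, "xy": (6.0, 8.0), "pois": {"office"}},
]

# Stand-ins for LLM outputs: (origin cell, destination POI category, trip km).
predictions = [
    (0, "office", 5.2),
    (0, "office", 9.0),
    (1, "residence", 5.0),
    (0, "museum", 2.0),  # no matching cell, so this trip is skipped
]

od_matrix = build_od_matrix(predictions, cells)
```

In this toy setup the two "office" trips from cell 0 land in different cells because their predicted distances differ, which shows why a destination prediction that carries both POI semantics and trip distance is more useful for location matching than a POI category alone.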
