A Foundational individual Mobility Prediction Model based on Open-Source Large Language Models

Zhenlin Qin, Leizhen Wang, Francisco Camara Pereira, Zhenliang Ma

TL;DR

This paper introduces MoBLLM, a foundational mobility prediction model built by fine-tuning open-source LLMs using a practitioner-friendly instruction dataset generated with a strong teacher-student framework. It addresses cross-city transferability by normalizing location labels and training on diverse mobility data, enabling robust next-location and trip-prediction tasks across GPS, check-in, and AFC data. The approach combines zero-shot CoT prompting and parameter-efficient fine-tuning (LoRA and variants), achieving state-of-the-art accuracy and transferability across six real-world datasets while significantly reducing cost versus commercial LLMs. Empirical results demonstrate MoBLLM’s robustness to contextual changes and its potential for broad mobility-policy applications, with future work focusing on prompt robustness, interpretability, and broader task coverage.

Abstract

Large Language Models (LLMs) are widely applied to domain-specific tasks because of their broad general knowledge and strong inference capabilities. Recent studies have shown that LLMs hold considerable potential for individual mobility prediction. However, most LLM-based mobility prediction models are either trained on specific datasets or rely on a single well-designed prompt, which makes them difficult to adapt to different cities and to users with diverse contexts. To fill these gaps, this paper proposes a unified fine-tuning framework for training a foundational mobility prediction model based on open-source LLMs. We conducted extensive experiments on six real-world mobility datasets to validate the proposed model. The results show that it outperforms state-of-the-art deep learning and LLM-based models in both prediction accuracy and transferability.

Paper Structure

This paper contains 25 sections, 2 equations, 11 figures, 11 tables.

Figures (11)

  • Figure 1: Examples of the base tasks. Subfigure (a) shows an example of Task 1 or Task 2; subfigures (b) and (c) illustrate Tasks 3 and 4, respectively.
  • Figure 2: Overview of the proposed MoBLLM framework. Given 4 base task prompt templates, Step 1 generates multi-style semi-complete instructions, without any mobility data, by prompting advanced commercial LLMs such as GPT-4o mini. Step 2 constructs the instruction dataset by randomly sampling real-world mobility data and assembling it with the semi-complete instructions. Step 3 obtains the MoBLLM model (for prediction) through a standard supervised fine-tuning pipeline that uses a parameter-efficient fine-tuning (PEFT) approach to train an open-source LLM on the instruction dataset.
  • Figure 3: An example of instruction data. In the data input, <history> and <context> are the user's mobility sequences, containing long-term and recent stays or trips, respectively. <target> holds the information about the current state except for the location, e.g., the tap-out station and time of the last trip.
  • Figure 4: An example prompt for generating multi-style instructions. A prompt template of the base task is placed at the end so that the LLM can learn its style.
  • Figure 5: The framework of LoRA.
  • ...and 6 more figures
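
Step 2 in the Figure 2 caption (randomly sampling mobility data and assembling it with semi-complete instructions) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the template wording, the record fields (`history`, `context`, `target`, `label`), and the function names are all hypothetical; only the `<history>`, `<context>`, and `<target>` placeholders come from the Figure 3 caption.

```python
import random

# Hypothetical semi-complete instructions, standing in for Step 1 output.
# Only the <history>/<context>/<target> placeholders come from the source.
TEMPLATES = [
    "Predict the user's next location.\n<history>\n<context>\n<target>",
    "Where will the user go next?\nHistory: <history>\n"
    "Recent: <context>\nCurrent: <target>",
]


def assemble_instruction(template, history, context, target):
    """Fill the three placeholders of one semi-complete instruction."""
    filled = template
    for tag, value in (
        ("<history>", history),
        ("<context>", context),
        ("<target>", target),
    ):
        filled = filled.replace(tag, str(value))
    return filled


def build_dataset(samples, templates, seed=0):
    """Pair each mobility sample with a randomly chosen instruction style."""
    rng = random.Random(seed)  # seeded so the sketch is reproducible
    dataset = []
    for sample in samples:
        template = rng.choice(templates)
        dataset.append(
            {
                "instruction": assemble_instruction(
                    template,
                    sample["history"],
                    sample["context"],
                    sample["target"],
                ),
                # The ground-truth answer used for supervised fine-tuning.
                "output": str(sample["label"]),
            }
        )
    return dataset
```

Each resulting record is an instruction/output pair of the kind a supervised fine-tuning pipeline (Step 3) would consume; the tuple layout of the stays in a real dataset would depend on the data source (GPS, check-in, or AFC).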