Aligning LLM with human travel choices: a persona-based embedding learning approach

Tianming Liu, Manzi Li, Yafeng Yin

TL;DR

The paper addresses how to align large language models (LLMs) with human travel choices under the constraints of typical travel-demand data. It introduces a two-stage framework: (i) infer personas from detailed data using an expert LLM, grounding behavior in interpretable preferences, and (ii) learn a persona loading function that maps socio-demographics to loaded personas via embeddings. The loading function conditions the prompt through $P(\boldsymbol{Z_k}|\boldsymbol{d})$ and a structured prompt $g(\boldsymbol{d},\boldsymbol{X},\boldsymbol{Z})$. It is estimated with a Monte Carlo stochastic EM algorithm and incorporates a cosine-based embedding $\boldsymbol{e_i}=e(\boldsymbol{d_i};\boldsymbol{\beta})$ and a weighted softmax to balance population subgroups. Empirical results on the Swissmetro dataset show that the method yields better aggregate mode-share predictions (lower $D_{JS}$) and higher individual-level F1 scores than traditional discrete-choice models and lightweight LLM baselines, while offering interpretable latent groups and embedding maps that illuminate the drivers of behavior. The approach enables robust, accessible, and interpretable LLM-based travel demand simulation without extensive supervised fine-tuning, suggesting a practical path for integrating LLMs into transportation planning and policy analysis.
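To make the persona loading idea more concrete, the following is a minimal sketch of how $P(\boldsymbol{Z_k}|\boldsymbol{d})$ could be computed from a cosine-based embedding and a weighted softmax. It is not the authors' implementation. The linear form of `embed`, the `temperature` parameter, the per-persona `weights`, and all function and variable names are assumptions introduced for illustration; the summary above does not specify them.

```python
import numpy as np

def embed(d, beta):
    """Hypothetical embedding e(d; beta): a linear projection of the
    socio-demographic vector d. The paper's actual form is not given here."""
    return d @ beta

def persona_probs(d, beta, persona_emb, temperature=1.0, weights=None):
    """Sketch of a loading function returning P(Z_k | d) over K personas.

    d           : (F,) socio-demographic features
    beta        : (F, E) embedding parameters (assumed linear)
    persona_emb : (K, E) one embedding per base persona (assumed)
    weights     : optional (K,) positive weights; an assumed way of
                  realizing a "weighted softmax"
    """
    e = embed(d, beta)
    # Cosine similarity between the individual's embedding and each persona.
    e_unit = e / (np.linalg.norm(e) + 1e-12)
    p_unit = persona_emb / (
        np.linalg.norm(persona_emb, axis=1, keepdims=True) + 1e-12
    )
    logits = (p_unit @ e_unit) / temperature
    if weights is not None:
        # Multiplying each exponential by w_k equals adding log(w_k) to the logit.
        logits = logits + np.log(np.asarray(weights, dtype=float))
    logits = logits - logits.max()  # subtract the max for numerical stability
    expd = np.exp(logits)
    return expd / expd.sum()

# Illustrative usage with made-up dimensions and random parameters.
rng = np.random.default_rng(0)
beta = rng.normal(size=(4, 3))
personas = rng.normal(size=(5, 3))
d = np.array([1.0, 0.0, 35.0, 1.0])
probs = persona_probs(d, beta, personas)
sampled_persona = rng.choice(len(probs), p=probs)
```

In a simulation loop, a persona index sampled this way would select the persona text that is inserted into the structured prompt $g(\boldsymbol{d},\boldsymbol{X},\boldsymbol{Z})$ before querying the LLM.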

Abstract

The advent of large language models (LLMs) presents new opportunities for travel demand modeling. However, behavioral misalignment between LLMs and humans hinders their use, and existing alignment methods are frequently inefficient or impractical given the constraints of typical travel demand data. This paper introduces a novel framework for aligning LLMs with human travel choice behavior, tailored to currently available travel demand data sources. Our framework uses a persona inference and loading process to condition LLMs with suitable prompts and thereby enhance alignment. The inference step establishes a set of base personas from empirical data, and a learned persona loading function driven by behavioral embeddings guides the loading process. We validate our framework on the Swissmetro mode choice dataset, and the results show that our proposed approach significantly outperforms baseline choice models and LLM-based simulation models in predicting both aggregate mode choice shares and individual choice outcomes. Furthermore, we show that our framework can generate insights into population behavior through interpretable parameters. Overall, our research offers a more adaptable, interpretable, and resource-efficient pathway to robust LLM-based travel behavior simulation, paving the way for integrating LLMs into travel demand modeling practice.
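The aggregate comparison of mode choice shares mentioned above is summarized in the TL;DR by $D_{JS}$, the Jensen-Shannon divergence between predicted and observed shares. The sketch below shows one standard way to compute it. The function name, the choice of base-2 logarithm, and the example shares are assumptions for illustration only; the numbers are made up and are not results from the paper.

```python
import numpy as np

def js_divergence(p, q, base=2.0):
    """Jensen-Shannon divergence between two discrete distributions.

    With base=2 the value lies in [0, 1]; 0 means identical shares.
    """
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    p = p / p.sum()  # normalize so counts or shares both work
    q = q / q.sum()
    m = 0.5 * (p + q)  # mixture distribution

    def kl(a, b):
        # Terms with a == 0 contribute zero to the KL sum and are skipped.
        mask = a > 0
        return np.sum(a[mask] * np.log(a[mask] / b[mask])) / np.log(base)

    return 0.5 * kl(p, m) + 0.5 * kl(q, m)

# Made-up shares for three modes (train, Swissmetro, car), for illustration.
observed = [0.15, 0.60, 0.25]
predicted = [0.20, 0.55, 0.25]
d_js = js_divergence(observed, predicted)
```

A lower value indicates that the simulated mode shares are closer to the observed ones, which is the sense in which the TL;DR reports "lower $D_{JS}$".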

Paper Structure

This paper contains 22 sections, 25 equations, 8 figures, 3 tables, 1 algorithm.

Figures (8)

  • Figure 1: Overview of our alignment framework
  • Figure 2: Illustration of the persona inference process
  • Figure 3: Illustration of the persona loading function
  • Figure 4: Illustration of the model estimation process
  • Figure 5: Comparison of predicted mode shares
  • ...and 3 more figures