Enhancing Traffic Prediction with Textual Data Using Large Language Models

Xiannan Huang

TL;DR

The paper tackles the challenge of incorporating rich textual context into short-term traffic prediction, proposing to generate embeddings from textual information via large language models and feed them into traditional spatiotemporal forecasters. By introducing auxiliary nodes in a graph—regional-level nodes connected to all districts and district-level nodes connected to their respective districts—the approach integrates textual context without direct LLM-based prediction, mitigating cost and non-determinism concerns. On NYC bike-sharing data, the method consistently improves predictive accuracy (lower MAE and RMSE) across eight baseline models for both city-wide and grid-specific tasks, including Grid 84 near the Barclays Center. This work offers a practical pathway to leverage textual data in urban traffic forecasting, achieving improved robustness to holidays, events, and weather while maintaining compatibility with established spatiotemporal models.

Abstract

Traffic prediction is pivotal for rational transportation supply scheduling and allocation. Existing research on short-term traffic prediction, however, struggles to handle exceptional circumstances and to integrate non-numerical contextual information, such as weather, into models. Large language models offer a promising solution because of their inherent world knowledge. However, using them directly for traffic prediction has drawbacks such as high cost, lack of determinism, and limited mathematical capability. To mitigate these issues, this study proposes a novel approach. Instead of directly employing large models for prediction, it uses them to process textual information and obtain embeddings. These embeddings are then combined with historical traffic data and fed into traditional spatiotemporal forecasting models. The study investigates two types of special scenarios: regional-level and node-level. For regional-level scenarios, textual information is represented as a node connected to the entire network. For node-level scenarios, embeddings from the large model represent additional nodes connected only to their corresponding nodes. In our experiments on the New York bike dataset, this approach significantly improves prediction accuracy.
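The two graph constructions described in the abstract can be illustrated with a small sketch. This is not the paper's implementation: the helper name `augment_adjacency`, the index layout, and the use of undirected unit-weight edges are all assumptions made for illustration. It appends one regional text node linked to every traffic node, plus one node-level text node per listed traffic node, linked only to that node.

```python
from typing import List, Sequence

Matrix = List[List[int]]


def augment_adjacency(adj: Matrix, event_nodes: Sequence[int]) -> Matrix:
    """Hypothetical helper: append auxiliary text nodes to a traffic graph.

    Assumed index layout of the returned matrix:
      0 .. n-1   original traffic nodes
      n          regional text node, linked to every traffic node
      n+1 ..     one node-level text node per entry in event_nodes,
                 linked only to its corresponding traffic node
    Edges are undirected with weight 1 (an assumption, not from the paper).
    """
    n = len(adj)
    size = n + 1 + len(event_nodes)
    out = [[0] * size for _ in range(size)]

    # Copy the original traffic graph unchanged.
    for i in range(n):
        for j in range(n):
            out[i][j] = adj[i][j]

    # Regional-level text node: connected to the entire traffic network.
    regional = n
    for i in range(n):
        out[regional][i] = 1
        out[i][regional] = 1

    # Node-level text nodes: each connected only to its own traffic node.
    for k, target in enumerate(event_nodes):
        aux = n + 1 + k
        out[aux][target] = 1
        out[target][aux] = 1

    return out


# Toy chain graph 0 - 1 - 2, where node 2 has local textual context
# (for example, an event at a nearby venue).
base = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]
aug = augment_adjacency(base, event_nodes=[2])
```

In a full pipeline, the feature vector attached to each auxiliary node would be the language-model embedding of the relevant text, presumably projected to the dimension the forecaster expects; that step is omitted here because the summary does not specify it.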

Paper Structure (14 sections, 6 equations, 3 figures, 1 table)

Figures (3)

  • Figure 1: Method Overview.
  • Figure 2: The grid partitioning scheme. The grid marked at the center represents Grid 84, with the Barclays Center in New York highlighted by the red circle.
  • Figure 3: Bike flow on selected days.