ShipTraj-R1: Reinforcing Ship Trajectory Prediction in Large Language Models via Group Relative Policy Optimization

Yang Zhan; Yunhao Li; Zhang Chao; Yuxu Lu; Yan Li

TL;DR

ShipTraj-R1 is an LLM-based framework that reformulates ship trajectory prediction as a text-to-text generation problem and achieves the lowest prediction error among state-of-the-art deep learning and LLM-based baselines.

Abstract

Recent advances in reinforcement fine-tuning have significantly improved the reasoning ability of large language models (LLMs). In particular, methods such as group relative policy optimization (GRPO) have demonstrated strong capabilities across various fields. However, applying LLMs to ship trajectory prediction remains largely unexplored. In this paper, we propose ShipTraj-R1, a novel LLM-based framework that reformulates ship trajectory prediction as a text-to-text generation problem. (1) We design a dynamic prompt that contains trajectory information about conflicting ships and guides the model toward adaptive chain-of-thought (CoT) reasoning. (2) We introduce a comprehensive rule-based reward mechanism that incentivizes both the reasoning format and the prediction accuracy of the model. (3) ShipTraj-R1 uses Qwen3 as the model backbone and is reinforced through the GRPO mechanism, guided by domain-specific prompts and rewards. Extensive experiments on two complex real-world maritime datasets show that ShipTraj-R1 achieves the lowest prediction error among state-of-the-art deep learning and LLM-based baselines.
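
To make the interplay between a rule-based reward and GRPO concrete, the following is a minimal Python sketch, not the authors' implementation. It assumes a hypothetical output convention (reasoning inside `<think>` tags, predicted waypoints as `(lat, lon)` pairs inside `<answer>` tags) and a hypothetical accuracy term based on mean Euclidean error; the abstract does not specify either. The names `parse_points`, `reward`, and `group_relative_advantages` are illustrative. The last function shows the group-relative step as GRPO is commonly described: rewards for several completions of the same prompt are standardized within the group.

```python
import math
import re
import statistics

# Assumed (hypothetical) output convention: <think>...</think><answer>...</answer>.
ANSWER_RE = re.compile(r"<think>.*?</think>\s*<answer>(.*?)</answer>", re.DOTALL)
POINT_RE = re.compile(r"\(\s*(-?\d+(?:\.\d+)?)\s*,\s*(-?\d+(?:\.\d+)?)\s*\)")


def parse_points(completion):
    """Return the (lat, lon) pairs inside <answer>, or None if the format is violated."""
    match = ANSWER_RE.fullmatch(completion.strip())
    if match is None:
        return None
    return [(float(a), float(b)) for a, b in POINT_RE.findall(match.group(1))]


def reward(completion, target):
    """Format reward (0 or 1) plus a hypothetical accuracy reward in (0, 1]."""
    points = parse_points(completion)
    if points is None:
        return 0.0  # malformed output earns nothing
    if not target or len(points) != len(target):
        return 1.0  # well-formed but wrong number of waypoints: format credit only
    mean_err = sum(math.dist(p, t) for p, t in zip(points, target)) / len(target)
    return 1.0 + 1.0 / (1.0 + mean_err)


def group_relative_advantages(rewards, eps=1e-6):
    """Standardize rewards within one group of completions for the same prompt."""
    mean = statistics.fmean(rewards)
    std = statistics.pstdev(rewards)  # population std; an assumption of this sketch
    return [(r - mean) / (std + eps) for r in rewards]
```

In this sketch, a completion that follows the format and matches the target exactly scores higher than a malformed one, so it receives a positive advantage while the malformed one receives a negative advantage. The actual reward terms and their weights in ShipTraj-R1 may differ.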

Paper Structure (15 sections, 6 equations, 3 figures, 4 tables)

Introduction
Related Work
Ship Trajectory Prediction
LLM-based Trajectory Prediction
Methodology
Problem Definition
ShipTraj-R1 Model
Rule-based Reward Function
Reinforcement Fine-Tuning
Experiment
Experimental Settings
Comparison with State-of-the-Art Methods
Ablation Studies
Conclusion

Figures (3)

Figure 1: Overview of our proposed ShipTraj-R1 framework.
Figure 2: Qualitative examples of our ShipTraj-R1 on the CSJP dataset. In this case, the ShipTraj-R1 adaptively generates CoT reasoning by itself based on the user prompt.
Figure 3: Illustration of predicted ship trajectories from the CSJP and CFDP datasets.
