TrAISformer -- A Transformer Network with Sparse Augmented Data Representation and Cross Entropy Loss for AIS-based Vessel Trajectory Prediction

Duong Nguyen, Ronan Fablet

TL;DR

TrAISformer tackles medium-range AIS-based vessel trajectory prediction by explicitly modeling heterogeneity and multimodality through a discrete four-hot representation of AIS features and a probabilistic transformer. The method reframes prediction as a multimodal classification problem using a cross-entropy loss over four feature heads, enabling sampling of multiple plausible futures. On public DMA AIS data, it substantially outperforms state-of-the-art approaches, achieving sub-1 nmi errors at short horizons and below 10 nmi up to around 10 hours, with ablations confirming the necessity of its encoding, sparsity, and multimodal loss. The work advances maritime surveillance and routing applications and opens avenues for weather-conditioned predictions, interaction modeling, and model compression for operational deployment.

Abstract

Vessel trajectory prediction plays a pivotal role in numerous maritime applications and services. While the Automatic Identification System (AIS) offers a rich source of information to address this task, forecasting vessel trajectories using AIS data remains challenging, even for modern machine learning techniques, because of the inherently heterogeneous and multimodal nature of motion data. In this paper, we propose a novel approach to tackle these challenges. We introduce a discrete, high-dimensional representation of AIS data and a new loss function designed to explicitly address heterogeneity and multimodality. The proposed model, referred to as TrAISformer, is a modified transformer network that extracts long-term temporal patterns in AIS vessel trajectories in the proposed enriched space to forecast the positions of vessels several hours ahead. We report experimental results on real, publicly available AIS data. TrAISformer significantly outperforms state-of-the-art methods, with an average prediction error below 10 nautical miles up to ~10 hours.

Paper Structure

This paper contains 12 sections, 12 equations, 9 figures, 2 tables, 4 algorithms.

Figures (9)

  • Figure 1: Illustration of long-term dependence patterns in AIS vessel trajectories: At E, vessels typically follow one of the two main maritime routes indicated by the red and the yellow dashed arrows. In order to forecast whether a vessel will continue straight ahead (the red path) or turn right (the yellow path), the prediction model may need to roll back several time steps to D, C, B, and A to understand the vessel's previous movements. Moreover, if the prediction model is not multimodal, it may output as a prediction an unusual green dashed path, which is a merged path of the true red and yellow ones.
  • Figure 2: Proposed representation of AIS data: To overcome the challenge of representing the heterogeneous and multimodal nature of motion data with relatively low-dimensional observations in AIS-based vessel trajectory prediction, a new representation of AIS data is proposed in this study. For each attribute $att \in \{lat, lon, SOG, COG\}$, the observed value (which is continuous) is discretized into a one-hot vector $\boldsymbol{\mathrm{h}}^{att}_t$. Each $\boldsymbol{\mathrm{h}}^{att}_t$ is then associated with a high-dimensional real-valued embedding vector $\boldsymbol{\mathrm{e}}^{att}_t$.
  • Figure 3: Example of multi-resolution "four-hot" vectors for AIS data: The model uses the fine-resolution vector $\boldsymbol{\mathrm{h}}_t$ in the embedding module (see Fig. \ref{fig:e}), while the loss function uses both $\boldsymbol{\mathrm{h}}_t$ and a coarse-resolution $\boldsymbol{\mathrm{h}}'_t$.
  • Figure 4: Illustration of the loss function $\mathcal{L}_{CE}$ to account for a multimodal posterior: Consider a scenario where, at a specific waypoint, half of the vessels in the training set turn left and the other half turn right. The true distribution of the longitude at the next time step is then a bimodal distribution, depicted by the blue curve. If we use a real-valued scalar to represent the longitude and use $\mathcal{L}_{MSE}$ as the loss function, the implicit distribution of the model is a unimodal Gaussian distribution. Consequently, the model tends to merge the two modes of the true distribution, as illustrated by the orange curve. By contrast, if we use a one-hot vector to represent the longitude and use $\mathcal{L}_{CE}$ as the loss function, the implicit distribution of the model is a categorical distribution. With this distribution, the model preserves the two modes, as shown by the green curve.
  • Figure 5: Sketch of the TrAISformer architecture: each AIS observation $\boldsymbol{\mathrm{x}}_t$ is discretized into a "four-hot" vector $\boldsymbol{\mathrm{h}}_t$ (for visualization purposes, we illustrate a one-hot vector instead of a "four-hot" vector for $\boldsymbol{\mathrm{h}}_t$). Subsequently, each $\boldsymbol{\mathrm{h}}_t$ is paired with a high-dimensional real-valued embedding vector $\boldsymbol{\mathrm{e}}_t$. The sequence of embeddings $\boldsymbol{\mathrm{e}}_{0:t}$ is then fed into a transformer network to predict $p_{t+1} \triangleq p(\boldsymbol{\mathrm{h}}_{t+1}|\boldsymbol{\mathrm{e}}_{0:t})$. During the training phase, the model is optimized to minimize the cross-entropy loss between the true value $\boldsymbol{\mathrm{h}}_{t+1}$ and $p_{t+1}$. To enhance prediction accuracy, we introduce a "multi-resolution" loss, which computes the cross-entropy at different spatial resolutions of $\boldsymbol{\mathrm{h}}_{t+1}$. In the forecasting phase, we generate vessel positions recursively: we sample $\boldsymbol{\mathrm{h}}^{pred}_{t+1}$ from $p_{t+1}$, then compute its "pseudo-inverse" to obtain $\boldsymbol{\mathrm{x}}^{pred}_{t+1}$. The predicted $\boldsymbol{\mathrm{x}}^{pred}_{t+1}$ is fed back into the network to sample the next position (as shown by the red path in the diagram). This iterative process continues until we reach the desired prediction horizon $L$.
  • ...and 4 more figures
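
The discretization described in the captions of Figures 2 and 3 can be illustrated with a minimal sketch. It is not the authors' implementation: the value ranges, the bin counts, and the names `RANGES`, `N_BINS`, `to_bin`, `four_hot` and `pseudo_inverse` are all assumptions made for illustration, and the "pseudo-inverse" is taken here to be the bin center.

```python
# Illustrative sketch only: ranges and bin counts below are assumptions,
# not values stated in the text above.
ATTRS = ("lat", "lon", "sog", "cog")
RANGES = {"lat": (55.5, 58.0), "lon": (10.3, 13.0), "sog": (0.0, 30.0), "cog": (0.0, 360.0)}
N_BINS = {"lat": 250, "lon": 270, "sog": 30, "cog": 72}


def to_bin(value, lo, hi, n_bins):
    """Map a continuous value in [lo, hi) to an integer bin index."""
    idx = int((value - lo) / (hi - lo) * n_bins)
    return min(max(idx, 0), n_bins - 1)  # clamp out-of-range values to the edge bins


def four_hot(obs):
    """Return per-attribute bin indices and the concatenated 'four-hot' vector."""
    indices = {a: to_bin(obs[a], RANGES[a][0], RANGES[a][1], N_BINS[a]) for a in ATTRS}
    vec = []
    for a in ATTRS:
        one_hot = [0] * N_BINS[a]
        one_hot[indices[a]] = 1
        vec.extend(one_hot)  # four one-hot blocks, so exactly four entries equal 1
    return indices, vec


def pseudo_inverse(indices):
    """Map bin indices back to continuous values, using bin centers (an assumption)."""
    out = {}
    for a in ATTRS:
        lo, hi = RANGES[a]
        width = (hi - lo) / N_BINS[a]
        out[a] = lo + (indices[a] + 0.5) * width
    return out
```

Calling `four_hot({"lat": 56.003, "lon": 11.01, "sog": 12.3, "cog": 187.0})` returns a sparse vector with four non-zero entries; passing the returned indices to `pseudo_inverse` recovers each attribute to within half a bin width. A coarse-resolution vector in the spirit of Figure 3 could be obtained the same way with smaller bin counts.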
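
The argument in the caption of Figure 4 can be checked on a toy example. The sketch below assumes hypothetical data (half of the next-step longitudes at 10.25, half at 10.75) and compares the best constant prediction under a squared-error loss (the sample mean) with the best categorical distribution under a cross-entropy loss (the empirical bin frequencies). The function names and the binning are illustrative assumptions, not part of the paper.

```python
from collections import Counter


def mse_optimal_point(samples):
    """The constant that minimizes mean squared error is the sample mean."""
    return sum(samples) / len(samples)


def ce_optimal_categorical(bin_indices, n_bins):
    """The categorical distribution that minimizes cross-entropy is the empirical frequency."""
    counts = Counter(bin_indices)
    return [counts.get(k, 0) / len(bin_indices) for k in range(n_bins)]


def bin_of(x, lo, hi, n_bins):
    """Bin index of x on a uniform grid over [lo, hi)."""
    return min(int((x - lo) / (hi - lo) * n_bins), n_bins - 1)


# Hypothetical toy data: two equally likely routes.
lons = [10.25] * 50 + [10.75] * 50
lo, hi, n_bins = 10.0, 11.0, 10

mean_lon = mse_optimal_point(lons)                  # lies between the two routes
probs = ce_optimal_categorical([bin_of(x, lo, hi, n_bins) for x in lons], n_bins)
merged_bin = bin_of(mean_lon, lo, hi, n_bins)       # bin that the MSE solution falls into
```

Here `mean_lon` lands midway between the two routes, in a bin that no training sample occupies (`probs[merged_bin]` is zero), while `probs` keeps half of the mass on each of the two observed bins. This is the merging effect versus mode preservation that the caption describes.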
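
The recursive forecasting procedure in the caption of Figure 5 can be sketched as a sampling loop. This is a simplified illustration under stated assumptions: `toy_predictor` is a hypothetical stand-in for the transformer, returning one categorical distribution per attribute, and the loop feeds sampled bin indices back directly instead of going through the pseudo-inverse and the embedding module.

```python
import random

N_BINS = 20  # hypothetical number of bins per attribute


def sample_categorical(probs, rng):
    """Draw one bin index from a categorical distribution given as a list of weights."""
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]


def toy_predictor(seq):
    """Stand-in for the network: stay in the last bin or move up one bin, each with p = 0.5."""
    heads = {}
    for att, idx in seq[-1].items():
        probs = [0.0] * N_BINS
        probs[idx] += 0.5
        probs[min(idx + 1, N_BINS - 1)] += 0.5
        heads[att] = probs
    return heads


def rollout(history, predict_next, horizon, rng):
    """Sample `horizon` future steps, feeding each sampled step back as input."""
    seq = list(history)
    for _ in range(horizon):
        heads = predict_next(seq)  # dict: attribute -> bin probabilities
        seq.append({att: sample_categorical(p, rng) for att, p in heads.items()})
    return seq[len(history):]


history = [{"lat": 3, "lon": 4, "sog": 5, "cog": 6}]
# Sampling several rollouts with different seeds yields several plausible futures.
futures = [rollout(history, toy_predictor, 8, random.Random(seed)) for seed in range(3)]
```

Because each step is sampled rather than taken as an average, repeated rollouts can follow different branches, which is how a categorical output supports multiple plausible trajectories.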