ST-ProC: A Graph-Prototypical Framework for Robust Semi-Supervised Travel Mode Identification

Luyao Niu, Nuoxian Huang

TL;DR

Travel mode identification from GPS trajectories suffers from label scarcity, limiting supervised learning. ST-ProC introduces a graph-prototypical multi-objective SSL framework that couples a dynamic semantic graph with prototypical anchoring, reinforced by contrastive learning and teacher-student consistency to mitigate confirmation bias. A robust, dual-filtered pseudo-labeling strategy uses both prototype similarity and graph propagation to expand supervision without propagating noise. On GeoLife, ST-ProC achieves state-of-the-art results under sparse labeling (e.g., 0.635 F1 at 5% labels, surpassing FixMatch and fully supervised baselines), illustrating strong leverage of intrinsic data topology for robust TMI in realistic deployment scenarios.

Abstract

Travel mode identification (TMI) from GPS trajectories is critical for urban intelligence, but the high cost of annotation leads to severe label scarcity. Prevailing semi-supervised learning (SSL) methods are ill-suited to this task: they suffer from catastrophic confirmation bias and ignore the intrinsic data manifold. We propose ST-ProC, a novel graph-prototypical multi-objective SSL framework that addresses these limitations. The framework combines a graph-prototypical core with foundational SSL support. The core exploits the data manifold via graph regularization, prototypical anchoring, and a novel margin-aware pseudo-labeling strategy that actively rejects noisy pseudo-labels. This core is supported and stabilized by contrastive and teacher-student consistency losses, which promote high-quality representations and robust optimization. ST-ProC outperforms all baselines by a significant margin in real-world sparse-label settings, with a performance boost of 21.5% over state-of-the-art methods such as FixMatch.
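To make the idea of margin-aware, dual-filtered pseudo-labeling concrete, the following is a minimal sketch and not the authors' implementation. It assumes that prototypes are class means of labeled embeddings, that the margin is the gap between the top two cosine similarities, and it substitutes a simple k-nearest-neighbor vote for graph propagation. All function names, the `margin` and `k` values, and the toy 2-D embeddings are hypothetical.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0

def class_prototypes(labeled):
    """Assumption: one prototype per class, the mean of its labeled embeddings."""
    sums, counts = {}, {}
    for z, label in labeled:
        acc = sums.setdefault(label, [0.0] * len(z))
        for d, value in enumerate(z):
            acc[d] += value
        counts[label] = counts.get(label, 0) + 1
    return {c: [v / counts[c] for v in s] for c, s in sums.items()}

def prototype_vote(z, prototypes, margin):
    """Nearest-prototype class, or None if top-1 beats top-2 by less than margin."""
    ranked = sorted(((cosine(z, p), c) for c, p in prototypes.items()), reverse=True)
    if len(ranked) > 1 and ranked[0][0] - ranked[1][0] < margin:
        return None  # ambiguous sample: reject rather than risk a noisy label
    return ranked[0][1]

def graph_vote(z, labeled, k):
    """Stand-in for graph propagation: majority label of the k most similar labeled points."""
    nearest = sorted(labeled, key=lambda item: cosine(z, item[0]), reverse=True)[:k]
    counts = {}
    for _, label in nearest:
        counts[label] = counts.get(label, 0) + 1
    return max(counts, key=counts.get)

def dual_filter(unlabeled, labeled, margin=0.2, k=3):
    """Keep a pseudo-label only when the prototype vote passes the margin test
    and agrees with the neighborhood vote."""
    prototypes = class_prototypes(labeled)
    accepted = {}
    for i, z in enumerate(unlabeled):
        label = prototype_vote(z, prototypes, margin)
        if label is not None and label == graph_vote(z, labeled, k):
            accepted[i] = label
    return accepted

# Toy 2-D embeddings (hypothetical): two well-separated travel modes.
labeled = [
    ([1.0, 0.0], "walk"), ([0.9, 0.1], "walk"), ([0.95, 0.05], "walk"),
    ([0.0, 1.0], "bus"), ([0.1, 0.9], "bus"), ([0.05, 0.95], "bus"),
]
unlabeled = [[1.0, 0.1], [0.7, 0.7], [0.1, 1.0]]
accepted = dual_filter(unlabeled, labeled)
print(accepted)
```

In this toy setting the middle sample lies equally close to both prototypes, so the margin test rejects it, while the other two samples pass both filters. Requiring agreement between two independent signals is one plausible way to limit the confirmation bias that single-threshold pseudo-labeling is prone to.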

Paper Structure

This paper contains 19 sections, 6 equations, 2 figures, 1 table.

Figures (2)

  • Figure 1: Schematic illustration of the ST-ProC framework.
  • Figure 2: Confusion Matrix of ST-ProC's performance at the 20% labeled data ratio.