SSP-GNN: Learning to Track via Bilevel Optimization

Griffin Golias, Masa Nakura-Fan, Vitaly Ablavsky

TL;DR

The paper tackles multi-object tracking with feature-rich detections by formulating tracking as a global optimization over a tracking graph solved via successive shortest paths (SSP). An edge-cost function, implemented as a graph neural network, is learned end-to-end through a bilevel optimization framework that aligns SSP-derived tracks with ground-truth trajectories. This approach combines expressive cost modeling with guaranteed optimal track solutions and demonstrates favorable performance against a strong GNN-based baseline on synthetic scenarios. The method emphasizes computational efficiency and robustness to varying ReID strength, false alarms, and training data sizes, with practical implications for real-time or batch MOT systems.
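One generic way to write a bilevel problem of this kind is sketched below. The notation is assumed for illustration and may not match the paper's own equations: $\mathcal{L}$ stands for a training loss, $\mathcal{P}^{\textrm{gt}}$ for the ground-truth tracks, $c_{\boldsymbol{\theta}}(e)$ for the learned cost of edge $e$, and $\mathcal{P}$ ranges over sets of node-disjoint source-to-sink paths in the tracking graph.

```latex
\begin{aligned}
\min_{\boldsymbol{\theta}} \quad
  & \mathcal{L}\bigl(\mathcal{P}^{*}(\boldsymbol{\theta}),\, \mathcal{P}^{\textrm{gt}}\bigr)
  && \text{(outer problem: learn the edge-cost parameters)} \\
\text{s.t.} \quad
  & \mathcal{P}^{*}(\boldsymbol{\theta}) \in \arg\min_{\mathcal{P}}
    \sum_{e \in \mathcal{P}} c_{\boldsymbol{\theta}}(e)
  && \text{(inner problem: solved by SSP)}
\end{aligned}
```

In this reading, the inner problem returns the lowest-cost set of tracks for the current parameters, and the outer problem adjusts the parameters so that those tracks agree with the ground truth.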

Abstract

We propose a graph-based tracking formulation for multi-object tracking (MOT) where target detections contain kinematic information and re-identification features (attributes). Our method applies a successive shortest paths (SSP) algorithm to a tracking graph defined over a batch of frames. The edge costs in this tracking graph are computed via a message-passing network, a graph neural network (GNN) variant. The parameters of the GNN, and hence, the tracker, are learned end-to-end on a training set of example ground-truth tracks and detections. Specifically, learning takes the form of bilevel optimization guided by our novel loss function. We evaluate our algorithm on simulated scenarios to understand its sensitivity to scenario aspects and model hyperparameters. Across varied scenario complexities, our method compares favorably to a strong baseline.
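To make the inner SSP step concrete, here is a minimal sketch of successive shortest paths on a small tracking graph. It is not the authors' implementation: the function names (`build_tracking_graph`, `ssp_tracks`), the fixed entry, exit, and detection costs, and the distance-based link cost are all assumptions made for illustration, and the hand-written link cost stands in for the edge costs that the paper computes with a GNN. Bellman-Ford is used for the shortest-path search because the graph contains negative costs.

```python
from collections import defaultdict

def build_tracking_graph(detections, det_cost=-2.0, entry_cost=1.5, exit_cost=1.5):
    """Build unit-capacity edges {(u, v): cost} from (frame, position) detections.

    All costs here are hand-picked assumptions; in the paper the edge costs
    are produced by a learned GNN instead.
    """
    edges = {}
    for i, (fi, xi) in enumerate(detections):
        edges[("s", ("in", i))] = entry_cost           # a track may start here
        edges[(("in", i), ("out", i))] = det_cost      # reward for using detection i
        edges[(("out", i), "t")] = exit_cost           # a track may end here
        for j, (fj, xj) in enumerate(detections):
            if fj == fi + 1:                           # link consecutive frames only
                edges[(("out", i), ("in", j))] = abs(xi - xj)
    return edges

def bellman_ford(residual, nodes, source):
    """Shortest paths from source; tolerates negative edge costs."""
    dist = {n: float("inf") for n in nodes}
    pred = {}
    dist[source] = 0.0
    for _ in range(len(nodes) - 1):
        changed = False
        for u in list(residual):
            if dist[u] == float("inf"):
                continue
            for v, c in residual[u].items():
                if dist[u] + c < dist[v] - 1e-12:
                    dist[v], pred[v], changed = dist[u] + c, u, True
        if not changed:
            break
    return dist, pred

def ssp_tracks(edges, source="s", sink="t"):
    """Augment along shortest source-sink paths while they lower the total cost."""
    residual = defaultdict(dict)
    nodes = set()
    for (u, v), c in edges.items():
        residual[u][v] = c
        nodes.update((u, v))
    flow = set()                                       # original edges carrying flow
    while True:
        dist, pred = bellman_ford(residual, nodes, source)
        if dist[sink] >= -1e-9:                        # no cost-reducing path is left
            break
        v = sink
        while v != source:                             # walk the path backwards
            u = pred[v]
            c = residual[u].pop(v)
            residual[v][u] = -c                        # add the reverse residual edge
            if (u, v) in edges:
                flow.add((u, v))                       # forward edge: push flow
            else:
                flow.discard((v, u))                   # reverse edge: cancel flow
            v = u
    # Follow the flow from the source to read off each track as detection indices.
    nxt = {u: v for (u, v) in flow if u != source}
    tracks = []
    for (u, v) in flow:
        if u == source:
            track, node = [], v
            while node != sink:
                if node[0] == "in":
                    track.append(node[1])
                node = nxt[node]
            tracks.append(track)
    return sorted(tracks)

# Two targets observed over three frames, plus one false alarm at position 5.0.
detections = [(0, 0.0), (0, 10.0), (1, 0.5), (1, 5.0), (1, 10.4), (2, 1.1), (2, 9.9)]
tracks = ssp_tracks(build_tracking_graph(detections))
print(tracks)
```

With these assumed costs, a track is kept only when its detection rewards outweigh its entry, exit, and link costs, so the isolated false alarm (index 3) is left out of every track.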

Paper Structure

This paper contains 16 sections, 6 equations, 6 figures, 7 tables, 2 algorithms.

Figures (6)

  • Figure 1: System diagram of our learnable method. $f_{\boldsymbol{\theta}}$ corresponds to Eq. \ref{eq:edge_cost}, explained in detail by Algorithm \ref{alg:message_passing}. In our bilevel formulation (Eq. \ref{eq:bilevel_opt}), learning ${\boldsymbol{\theta}}$ is the outer objective, while SSP is the graph-based inner optimization that explicitly produces globally optimal tracks $\mathcal{P}^{*}$ in $\mathcal{G}^{\textrm{trk}}$. GNN parameters ${\boldsymbol{\theta}}$ are updated in the backward pass through Eq. \ref{eq:ssp_gnn_chain_rule}.
  • Figure 2: Our formulation employs two types of graphs: (a) a detection graph $\mathcal{G}^{\textrm{det}}$ constructed from measurements in a given temporal window; (b) its corresponding tracking graph $\mathcal{G}^{\textrm{trk}}$.
  • Figure 3: The average Stage II loss values across 20 SSP-GNN models with randomly initialized weights. As the number of epochs increases, the average loss converges to zero with decreasing standard deviation (shaded).
  • Figure 4: Predicted tracks for the train (left) and test (right) scenario.
  • Figure 5: Effect of the number of GNN message-passing layers and the hidden dimension size. Tracking accuracy increases with model capacity until it saturates.
  • ...and 1 more figure

Theorems & Definitions (2)

  • Claim 1
  • Claim 2