HybridTrack: A Hybrid Approach for Robust Multi-Object Tracking

Leandro Di Bella, Yangxintong Lyu, Bruno Cornelis, Adrian Munteanu

TL;DR

HybridTrack tackles robust 3D multi-object tracking in autonomous driving by embedding a learnable Kalman filter within a tracking-by-detection framework. It introduces a Transition Residual Predictor to model motion and a Kalman Gain Estimation Module to refine updates, while a dynamic scaling factor stabilizes early predictions; all components are trained end-to-end with a lightweight design. On KITTI, it achieves a HOTA of 82.72 and runs at 112 FPS, surpassing many model-based trackers and maintaining real-time performance without scene-specific tuning. The approach demonstrates strong data efficiency and generalization, offering a practical solution for ADAS that handles occlusion and distant vehicles with minimal hand-crafted parameter design.

Abstract

The evolution of Advanced Driver Assistance Systems (ADAS) has increased the need for robust and generalizable algorithms for multi-object tracking. Traditional statistical model-based tracking methods rely on predefined motion models and assumptions about system noise distributions. Although computationally efficient, they often lack adaptability to varying traffic scenarios and require extensive manual design and parameter tuning. To address these issues, we propose a novel 3D multi-object tracking approach for vehicles, HybridTrack, which integrates a data-driven Kalman Filter (KF) within a tracking-by-detection paradigm. In particular, it learns the transition residual and Kalman gain directly from data, which eliminates the need for manual motion and stochastic parameter modeling. Validated on the real-world KITTI dataset, HybridTrack achieves 82.72% HOTA accuracy, significantly outperforming state-of-the-art methods. We also evaluate our method under different configurations, achieving the fastest processing speed of 112 FPS. Consequently, HybridTrack eliminates the dependency on scene-specific designs while improving performance and maintaining real-time efficiency. The code is publicly available at: https://github.com/leandro-svg/HybridTrack.

Paper Structure (18 sections, 8 equations, 4 figures, 5 tables)

Figures (4)

  • Figure 1: The proposed HybridTrack. During inference, HybridTrack begins with 3D object detection, localizing vehicles using sensor data. The resulting detections $\mathcal{R}^{N_k}_k$ are sent to the State Prediction Module (SPM), where new detections $\mathbf{r}^i_k$ initialize trajectories $\mathcal{T}_{k}^{M_k}$. For each trajectory $T_{k}^j$, the Transition Residual Predictor (TRP) predicts the next state $\hat{\mathbf{x}}^j_{k} = \alpha_k S^j_{k} + \mathbf{x}^j_{k-1}$, forming the set of predicted states $\hat{X}_{k}^{M_k}$. In the Data Association module, $\hat{X}_{k}^{M_k}$ is matched with the current detections $\mathcal{R}^{N_k}_k$. The State Update Module (SUM) refines the matched pairs, updating the vehicle states ${X}_{k}^{M_k}$. Finally, the Trajectory Management module maintains and updates trajectories.
  • Figure 2: Qualitative comparison of ground truth, UG3DMOT, and HybridTrack on sequence 15 of the validation set. $^*$ uses the CasA detector [wu2022casa].
  • Figure 3: Performance metrics vs. training dataset size.
  • Figure 4: Share of cumulative execution time for sequence 1 of the validation set.
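
The prediction step quoted in the Figure 1 caption, $\hat{\mathbf{x}}^j_{k} = \alpha_k S^j_{k} + \mathbf{x}^j_{k-1}$, can be sketched in a few lines of Python. This is only an illustration of that one formula, not the authors' implementation: the function names, the three-element state layout, and the constant `toy_residual` standing in for the learned TRP are all assumptions made for the example, and the scaling factor is passed in as a plain number because its schedule is not given here.

```python
from typing import Callable, Dict, List, Sequence


def predict_state(
    prev_state: Sequence[float],
    residual: Sequence[float],
    alpha: float,
) -> List[float]:
    """Apply x_hat_k = alpha_k * S_k + x_{k-1} element-wise."""
    if len(prev_state) != len(residual):
        raise ValueError("state and residual must have the same length")
    return [alpha * s + x for s, x in zip(residual, prev_state)]


def predict_all(
    tracks: Dict[int, Sequence[float]],
    residual_fn: Callable[[Sequence[float]], Sequence[float]],
    alpha: float,
) -> Dict[int, List[float]]:
    """Predict the next state of every active trajectory."""
    return {
        track_id: predict_state(state, residual_fn(state), alpha)
        for track_id, state in tracks.items()
    }


def toy_residual(state: Sequence[float]) -> List[float]:
    """Hypothetical stand-in for the learned TRP: a fixed displacement."""
    return [1.0, 0.0, 0.0]


# Two illustrative trajectories with an assumed (x, y, z) state layout.
tracks = {0: [10.0, 2.0, 0.5], 1: [-4.0, 1.0, 0.5]}
predicted = predict_all(tracks, toy_residual, alpha=0.5)
print(predicted)
```

With `alpha=0.5`, only half of the residual is added to the previous state, so a small scaling factor keeps the prediction close to the last known state. In the actual tracker the residual would come from the trained network rather than a constant.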