Beyond Frame-wise Tracking: A Trajectory-based Paradigm for Efficient Point Cloud Tracking

BaiChen Fan, Yuanxi Cui, Jian Li, Qin Wang, Shibo Zhao, Muqing Cao, Sifan Zhou

TL;DR

This work proposes a novel trajectory-based paradigm and its instantiation, TrajTrack, a lightweight framework that enhances a base two-frame tracker by implicitly learning motion continuity from historical bounding box trajectories alone—without requiring additional, costly point cloud inputs.

Abstract

LiDAR-based 3D single object tracking (3D SOT) is a critical task in robotics and autonomous systems. Existing methods typically follow either a two-frame (frame-wise motion estimation) paradigm or a sequence-based paradigm. Two-frame methods are efficient but lack long-term temporal context, which makes them vulnerable in sparse or occluded scenes. Sequence-based methods that process multiple point clouds gain robustness, but at a significant computational cost. To resolve this dilemma, we propose a novel trajectory-based paradigm and its instantiation, TrajTrack. TrajTrack is a lightweight framework that enhances a base two-frame tracker by implicitly learning motion continuity from historical bounding box trajectories alone, without requiring additional, costly point cloud inputs. It first generates a fast, explicit motion proposal and then uses an implicit motion modeling module to predict the future trajectory, which in turn refines and corrects the initial proposal. Extensive experiments on the large-scale nuScenes benchmark show that TrajTrack achieves new state-of-the-art performance, improving tracking precision by 3.02% over a strong baseline while running at 55 FPS. We also demonstrate that TrajTrack generalizes well across different base trackers. Code is available at https://github.com/FiBonaCci225/TrajTrack.
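The propose-then-refine flow described in the abstract can be illustrated with a toy sketch. This is not the TrajTrack implementation: a constant-velocity extrapolation stands in for the learned implicit motion module, the refinement is a fixed-weight blend, and every name here (`extrapolate_center`, `refine_proposal`, `weight`) is hypothetical. It only shows how a prediction built from past box centers alone, with no extra point clouds, could pull a drifted two-frame proposal back toward the trajectory.

```python
from typing import List, Tuple

Center = Tuple[float, float, float]  # (x, y, z) of a 3D box center


def extrapolate_center(history: List[Center]) -> Center:
    """Predict the next box center from past centers only.

    Assumption: constant velocity. This is a simple stand-in for a
    learned trajectory predictor, not the paper's module.
    """
    if len(history) < 2:
        return history[-1]
    steps = len(history) - 1
    # Average per-frame displacement over the whole history.
    velocity = tuple(
        (last - first) / steps for first, last in zip(history[0], history[-1])
    )
    return tuple(p + v for p, v in zip(history[-1], velocity))


def refine_proposal(explicit: Center, trajectory: Center, weight: float = 0.5) -> Center:
    """Blend the two-frame proposal with the trajectory-based prediction.

    `weight` is a hypothetical fixed trust in the trajectory prediction.
    """
    return tuple((1 - weight) * e + weight * t for e, t in zip(explicit, trajectory))


# Past box centers of a target moving along x at one unit per frame.
history = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0)]
explicit = (3.5, 0.5, 0.0)  # hypothetical drifted output of a two-frame tracker
trajectory = extrapolate_center(history)
refined = refine_proposal(explicit, trajectory, weight=0.5)
print(trajectory, refined)
```

In this toy case the refined center lies halfway between the drifted proposal and the extrapolated one. The sketch works on a handful of box coordinates per frame, which is the intuition behind the claim that trajectory input is far cheaper than processing additional point clouds.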

Paper Structure

This paper contains 14 sections, 8 equations, 8 figures, 6 tables.

Figures (8)

  • Figure 1: Different tracking paradigms. (a) The two-frame paradigm exploits two-frame inputs for tracking through appearance matching or motion prediction. (b) The sequence-based paradigm uses multi-frame inputs to integrate target information. (c) Our trajectory-based paradigm considers both short- and long-term motion cues.
  • Figure 2: Overview of the proposed Trajectory-Based Paradigm and its instantiation TrajTrack. Explicit Motion Proposal uses a two-frame tracking baseline to obtain the local-aware motion proposal. Implicit Trajectory Prediction learns the object’s global-aware motion continuity to produce a trajectory proposal. Finally, Trajectory-guided Proposal Refinement combines the local- and global-aware motion cues to produce the refined 3D bounding box as the final output.
  • Figure 3: (a) Framework of Implicit Trajectory Prediction. (b) The TrajFormer Encoder. (c) The TrajFormer Decoder.
  • Figure 4: Tracking performance across varying numbers of template points in the first frame.
  • Figure 5: Visualization results on nuScenes. Starting from the second frame, the global-aware trajectory proposal from Implicit Motion Modeling corrects the biased tracking results, achieving accurate tracking.
  • ...and 3 more figures