CompTrack: Information Bottleneck-Guided Low-Rank Dynamic Token Compression for Point Cloud Tracking
Sifan Zhou, Yichao Cao, Jiahao Nie, Yuqian Fu, Ziyu Zhao, Xiaobo Lu, Shuo Wang
TL;DR
CompTrack targets 3D single object tracking (SOT) in sparse LiDAR point clouds by addressing two forms of redundancy: spatial redundancy from background noise and informational redundancy in foreground geometry. It introduces a Spatial Foreground Predictor to suppress background and an Information Bottleneck-guided Dynamic Token Compression module that uses online SVD and learnable queries to distill the foreground into a compact, high-information set of proxy tokens, enabling accurate tracking at 90 FPS. The method achieves state-of-the-art results on nuScenes and Waymo and competitive performance on KITTI, supported by extensive ablations. The approach offers a practical, end-to-end framework for real-time autonomous driving scenarios, balancing precision and latency through rank-aware token compression.
Abstract
3D single object tracking (SOT) in LiDAR point clouds is a critical task in computer vision and autonomous driving. Despite considerable progress, the inherent sparsity of point clouds introduces a dual-redundancy challenge that limits existing trackers: (1) vast spatial redundancy from background noise impairs accuracy, and (2) informational redundancy within the foreground hinders efficiency. To tackle these issues, we propose CompTrack, a novel end-to-end framework that systematically eliminates both forms of redundancy in point clouds. First, CompTrack incorporates a Spatial Foreground Predictor (SFP) module that filters out irrelevant background noise based on information entropy, addressing spatial redundancy. Second, at its core is an Information Bottleneck-guided Dynamic Token Compression (IB-DTC) module that eliminates the informational redundancy within the foreground. Theoretically grounded in low-rank approximation, this module leverages an online SVD analysis to adaptively compress the redundant foreground into a compact and highly informative set of proxy tokens. Extensive experiments on the KITTI, nuScenes and Waymo datasets demonstrate that CompTrack achieves top-tier tracking accuracy with superior efficiency, running in real time at 90 FPS on a single RTX 3090 GPU.
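To make the low-rank idea concrete, the following is a minimal NumPy sketch of rank-adaptive token compression. It is not the authors' implementation: the function name `compress_tokens`, the `energy` threshold, and the rule of picking the rank from cumulative spectral energy are all assumptions made for illustration. The paper's learnable queries are also swapped out here for a plain truncated-SVD projection, so the "proxy tokens" below are simply scaled right singular vectors.

```python
import numpy as np

def compress_tokens(tokens, energy=0.99):
    """Compress an (n, d) token matrix into k proxy tokens, k chosen from the spectrum.

    Hypothetical helper: `energy` (fraction of squared singular-value mass to keep)
    is an illustrative assumption, not a parameter taken from the paper.
    Returns (proxies, coeffs) with shapes (k, d) and (n, k), where
    coeffs @ proxies is the rank-k approximation of `tokens`.
    """
    u, s, vt = np.linalg.svd(tokens, full_matrices=False)
    total = np.sum(s ** 2)
    if total == 0.0:
        k = 1  # degenerate all-zero input: keep a single (zero) proxy token
    else:
        cum = np.cumsum(s ** 2) / total
        # Smallest k whose top-k singular values reach the energy threshold.
        k = int(np.searchsorted(cum, energy)) + 1
        k = min(k, len(s))
    proxies = s[:k, None] * vt[:k]  # (k, d): scaled right singular vectors
    coeffs = u[:, :k]               # (n, k): per-token mixing weights
    return proxies, coeffs

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Synthetic "foreground" tokens: 128 tokens of dim 32 lying near a rank-4 subspace.
    low_rank = rng.standard_normal((128, 4)) @ rng.standard_normal((4, 32))
    tokens = low_rank + 1e-3 * rng.standard_normal((128, 32))
    proxies, coeffs = compress_tokens(tokens, energy=0.99)
    err = np.linalg.norm(tokens - coeffs @ proxies) / np.linalg.norm(tokens)
    print("proxy tokens kept:", proxies.shape[0], "relative error:", err)
```

The number of proxy tokens adapts to the input: a highly redundant foreground yields few proxies, while a more varied one yields more, which is the intuition behind rank-aware compression under these assumptions.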
