FlowTrack: Point-level Flow Network for 3D Single Object Tracking
Shuo Li, Yubo Cui, Zhiheng Li, Zheng Fang
TL;DR
FlowTrack reframes 3D single object tracking as a multi-frame point-level flow estimation problem. It combines a Historical Information Fusion Module to inject history via a learnable target feature, a Point-level Motion Module to generate multi-scale point-level flow, and an Instance Flow Head to convert per-point motion into a global, instance-level target motion for rigid-body transformation. The approach yields strong gains on KITTI and NuScenes, maintains real-time speed, and demonstrates robustness in sparse and occluded scenarios. This work highlights the value of integrating dense point-level motion cues with historical context for improved 3D tracking performance.
Abstract
3D single object tracking (SOT) is a crucial task in mobile robotics and autonomous driving. Traditional motion-based approaches track the target by estimating its relative movement between two consecutive frames. However, they usually overlook the target's local motion information and fail to exploit historical frame information effectively. To overcome these limitations, we propose FlowTrack, a point-level flow method that uses multi-frame information for the 3D SOT task. Specifically, by estimating the flow for each point in the target, our method captures the target's local motion details, thereby improving tracking performance. At the same time, to handle scenes with sparse points, we present a learnable target feature as a bridge to efficiently integrate target information from past frames. Moreover, we design a novel Instance Flow Head to transform dense point-level flow into instance-level motion, effectively aggregating local motion information to obtain global target motion. Finally, our method achieves competitive performance with improvements of 5.9% on the KITTI dataset and 2.9% on NuScenes. The code will be made publicly available soon.
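
To make the step from dense point-level flow to instance-level motion concrete, the sketch below fits a single rigid motion (translation plus yaw) to a set of per-point flow vectors. It is not the Instance Flow Head described above, which is a learned module; it substitutes a closed-form weighted least-squares fit purely for illustration. The function name `instance_motion_from_flow`, the optional per-point `weights`, and the choice to rotate about the target's centroid around the z-axis are all assumptions that do not come from the paper.

```python
import math


def instance_motion_from_flow(points, flows, weights=None):
    """Fit one rigid motion (translation + yaw) to per-point flow vectors.

    Hypothetical helper, not the paper's learned Instance Flow Head.
    points, flows: equal-length lists of [x, y, z] values.
    weights: optional per-point confidences (assumed; uniform if omitted).
    Returns ((dx, dy, dz), dyaw), with yaw taken about the z-axis through
    the weighted centroid of the source points.
    """
    n = len(points)
    if n == 0 or n != len(flows):
        raise ValueError("points and flows must be non-empty and equal in length")
    if weights is None:
        weights = [1.0] * n
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")

    # Each point displaced by its flow gives its position in the next frame.
    moved = [[p[k] + f[k] for k in range(3)] for p, f in zip(points, flows)]

    # Weighted centroids before and after applying the flow.
    src_c = [sum(w * p[k] for w, p in zip(weights, points)) / total for k in range(3)]
    dst_c = [sum(w * q[k] for w, q in zip(weights, moved)) / total for k in range(3)]

    # Closed-form 2D least-squares rotation in the x-y plane.
    sin_acc = 0.0
    cos_acc = 0.0
    for w, p, q in zip(weights, points, moved):
        ax, ay = p[0] - src_c[0], p[1] - src_c[1]
        bx, by = q[0] - dst_c[0], q[1] - dst_c[1]
        sin_acc += w * (ax * by - ay * bx)
        cos_acc += w * (ax * bx + ay * by)
    dyaw = math.atan2(sin_acc, cos_acc)

    # With rotation about the centroid, translation is the centroid shift.
    translation = tuple(dst_c[k] - src_c[k] for k in range(3))
    return translation, dyaw
```

Averaging the flow alone would recover only the translation. The cross and dot accumulators additionally recover the heading change, which is the kind of local motion detail that a single box-to-box offset cannot express.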
