SRRT: Exploring Search Region Regulation for Visual Object Tracking
Jiawen Zhu, Xin Chen, Pengyu Zhang, Xinying Wang, Dong Wang, Wenda Zhao, Huchuan Lu
TL;DR
The paper tackles the rigidity of fixed-size search regions in visual object tracking: a region that is too small loses the target under fast motion, while one that is too large admits distractors. It introduces SRRT, a dynamic paradigm that uses a Search Region Regulator (SRR) to predict an optimal per-frame search radius $\tilde{\boldsymbol{\tilde{y}}}$ and a locking-state-based update that refreshes the dynamic reference when needed. The method shows consistent gains across eight benchmarks, notably improving SiamRPN++ and TransT by 4.6% and 3.1% absolute AUC on LaSOT, and delivers state-of-the-art results on several datasets while preserving real-time speed. SRRT is designed as a plug-and-play enhancement that can be integrated with existing trackers with minimal overhead.
Abstract
The dominant trackers generate a fixed-size rectangular region, based on the previous prediction or the initial bounding box, as the model input, i.e., the search region. While this approach achieves promising tracking efficiency, a fixed-size search region lacks flexibility and is likely to fail in some cases, e.g., fast motion and distractor interference. Trackers tend to lose the target object when the search region is too limited, or to suffer interference from distractors when the search region is excessive. Drawing inspiration from the way humans track an object, we propose a novel tracking paradigm, called Search Region Regulation Tracking (SRRT), that applies a small eyereach when the target is captured and zooms out the search field when the target is about to be lost. SRRT applies a proposed search region regulator to estimate an optimal search region dynamically for each frame, by which the tracker can flexibly respond to transient changes in the target's location. To adapt to the object's appearance variation during online tracking, we further propose a locking-state-determined updating strategy for reference frame updating. The proposed SRRT is concise without bells and whistles, yet achieves evident improvements and competitive results with other state-of-the-art trackers on eight benchmarks. On the large-scale LaSOT benchmark, SRRT improves SiamRPN++ and TransT with absolute gains of 4.6% and 3.1% in terms of AUC. The code and models will be released.
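The regulation idea described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the candidate scale factors, the per-candidate confidence scores, the threshold, the rule that ties the "locked" state to the smallest region, and the names `search_box` and `regulate` are all assumptions made for illustration. The side-length rule (factor times the square root of the target area) is a common tracking convention, assumed here rather than taken from the abstract.

```python
import math


def search_box(prev_box, factor):
    """Square search region centred on the previous prediction.

    prev_box is (cx, cy, w, h). The side length factor * sqrt(w * h)
    is an assumed convention, not a detail stated in the abstract.
    Returns (x, y, side, side) with (x, y) the top-left corner.
    """
    cx, cy, w, h = prev_box
    side = factor * math.sqrt(w * h)
    return (cx - side / 2, cy - side / 2, side, side)


def regulate(confidences, factors=(2.0, 4.0, 6.0), lock_threshold=0.5):
    """Pick a search-region factor for the current frame.

    confidences[i] is a hypothetical score that the target is visible
    inside the region built with factors[i] (factors sorted ascending).
    The smallest sufficient region is chosen; if none is sufficient,
    the search field is zoomed out to the largest candidate.
    Returns (factor, locked), where locked means the smallest region
    was enough, used here as a stand-in for the locking state.
    """
    for factor, conf in zip(factors, confidences):
        if conf >= lock_threshold:
            return factor, factor == factors[0]
    return factors[-1], False


# Target confidently captured: keep a small region and report "locked".
factor, locked = regulate([0.9, 0.9, 0.9])
region = search_box((50.0, 50.0, 10.0, 40.0), factor)
```

In this sketch a tracker would refresh its dynamic reference only on frames where `locked` is true, which mirrors the locking-state-determined update in spirit only.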
