ClickTrack: Towards Real-time Interactive Single Object Tracking

Kuiran Wang, Xuehui Yu, Wenwen Yu, Guorong Li, Xiangyuan Lan, Qixiang Ye, Jianbin Jiao, Zhenjun Han

TL;DR

ClickTrack is a new paradigm for single object tracking that uses click interaction to initialize trackers in real-time scenarios. Its Guided Click Refiner (GCR) accepts a point and optional textual information as inputs and transforms the point into the bounding box expected by the operator.

Abstract

Single object tracking (SOT) relies on precise initialization of the object bounding box. In this paper, we reconsider the deficiencies of current approaches to initializing single object trackers and propose ClickTrack, a new paradigm that uses click interaction in real-time scenarios. However, a click as an input type inherently lacks hierarchical information. To address the resulting ambiguity in certain special scenarios, we design the Guided Click Refiner (GCR), which accepts a point and optional textual information as inputs and transforms the point into the bounding box expected by the operator. The bounding box is then used as the input to a single object tracker. Experiments on the LaSOT and GOT-10k benchmarks show that a tracker combined with GCR achieves stable performance in real-time interactive scenarios. Furthermore, we explore the integration of GCR into the Segment Anything Model (SAM), which significantly reduces ambiguity when SAM receives point inputs.
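
The click ambiguity described in the abstract can be illustrated with a toy sketch. This is not the GCR model: the candidate boxes, their labels, and the helper `resolve_click` are hypothetical, and a plain substring match stands in for the learned text guidance. It only shows why a single point can map to several nested boxes and how an optional text hint can select among them.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2)


@dataclass(frozen=True)
class Candidate:
    """A hypothetical labelled box; GCR itself does not expose such a list."""
    label: str
    box: Box


def contains(box: Box, point: Tuple[float, float]) -> bool:
    x1, y1, x2, y2 = box
    px, py = point
    return x1 <= px <= x2 and y1 <= py <= y2


def area(box: Box) -> float:
    x1, y1, x2, y2 = box
    return (x2 - x1) * (y2 - y1)


def resolve_click(candidates: List[Candidate],
                  point: Tuple[float, float],
                  text: Optional[str] = None) -> Optional[Candidate]:
    """Pick one box for a click; the text hint (if any) narrows the choice."""
    hits = [c for c in candidates if contains(c.box, point)]
    if not hits:
        return None
    if text is not None:
        # Assumption: a substring match replaces the learned text guidance.
        matched = [c for c in hits if c.label in text.lower()]
        if matched:
            hits = matched
    # Without further guidance, fall back to the smallest enclosing box.
    return min(hits, key=lambda c: area(c.box))


candidates = [
    Candidate("car", (100, 80, 400, 260)),
    Candidate("license plate", (220, 210, 280, 230)),
]
click = (250, 220)  # lands on the plate, which also lies inside the car

no_text = resolve_click(candidates, click)
with_text = resolve_click(candidates, click, text="the car")
print(no_text.label, "|", with_text.label)
```

With the point alone, the sketch returns the license plate box; adding the hint "the car" switches the result to the car box, and whichever box is chosen would then be passed to the tracker as its initialization.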

Paper Structure

This paper contains 22 sections, 7 equations, 17 figures, 11 tables.

Figures (17)

  • Figure 1: (a) Performance drops as the deviation rate of the annotated bounding box increases. (b) The green dashed box marks the object position in the current frame, where annotation starts at the top-left of the object; the yellow box marks the annotated box in the $k$-th frame, where annotation finishes at the bottom-right of the object. Because the object moves during annotation, the annotated bounding box is inaccurate.
  • Figure 2: Different initialization methods for single object tracker.
  • Figure 3: Tracking ambiguity: when the click falls on an overlapping region, e.g., the green point on the license plate, the tracker cannot tell which object is the intended target.
  • Figure 4: The Guided Click Refiner (GCR) framework, including the Prototype Selection module and Iterative Regression module. With the click annotation and guiding information, the refined bounding box is obtained by GCR.
  • Figure 5: Visualization of the bounding boxes (blue) generated from the same single point (green) with different text information (red).
  • ...and 12 more figures
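
The effect described in the Figure 1(b) caption can be worked through numerically. The sketch below assumes a box moving at constant velocity while an operator drags from its top-left corner (frame 0) to its bottom-right corner (frame $k$); the function names, the motion model, and the use of IoU as the error measure are illustrative assumptions, not definitions taken from the paper.

```python
from typing import Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2)


def area(box: Box) -> float:
    return max(0.0, box[2] - box[0]) * max(0.0, box[3] - box[1])


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    union = area(a) + area(b) - inter
    return inter / union if union > 0 else 0.0


def drag_annotated_box(start_box: Box,
                       velocity: Tuple[float, float],
                       k: int) -> Tuple[Box, Box]:
    """Return (annotated box, true box at frame k) for a k-frame drag.

    Assumption: the object translates by `velocity` pixels per frame.
    """
    dx, dy = velocity[0] * k, velocity[1] * k
    true_box = (start_box[0] + dx, start_box[1] + dy,
                start_box[2] + dx, start_box[3] + dy)
    # Top-left corner was clicked at frame 0, bottom-right at frame k.
    xs = sorted((start_box[0], true_box[2]))
    ys = sorted((start_box[1], true_box[3]))
    return (xs[0], ys[0], xs[1], ys[1]), true_box


annotated, true_box = drag_annotated_box((100, 100, 200, 200), (2.0, 0.0), k=15)
print(annotated, true_box, round(iou(annotated, true_box), 3))
```

In this example a 100×100 box moving 2 pixels per frame to the right yields an annotated box that is 30 pixels too wide after 15 frames, so its IoU with the true box is 10000/13000 ≈ 0.769, even though both corners were clicked exactly.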