Solution for Point Tracking Task of ECCV 2nd Perception Test Challenge 2024

Yuxuan Zhang, Pengsong Niu, Kun Yu, Qingguo Chen, Yang Yang

TL;DR

An improved method for the Tracking Any Point (TAP) task, which follows points on physical surfaces through video footage. The method, called Fine-grained Point Discrimination (FPD), perceives and rectifies point tracks at multiple granularities in a zero-shot manner.

Abstract

This report introduces an improved method for the Tracking Any Point (TAP) task, which follows points on physical surfaces through video footage. Despite their success in short-sequence scenarios, TAP methods still suffer from performance degradation and resource overhead on long sequences. To address these issues, we propose a simple yet effective approach called Fine-grained Point Discrimination (FPD), which perceives and rectifies point tracks at multiple granularities in a zero-shot manner, especially for static points in videos shot by a static camera. FPD contains two key components: (1) multi-granularity point perception, which detects static video sequences and static points; and (2) dynamic trajectory correction, which replaces point trajectories according to the type of tracked point. Our approach achieved the second-highest score in the final test, with a score of 0.4720.

Paper Structure

This paper contains 9 sections, 5 equations, 2 figures, 3 tables.

Figures (2)

  • Figure 1: The common issues of mainstream TAP methods.
  • Figure 2: Framework of the proposed Fine-grained Point Discrimination (FPD). MCMD: multi-granularity camera motion detection. We first initialize the tracks of each point using three SOTA TAP methods (i.e., TAPIR, DOT, and TAPTR). For the original video sequence, we conduct a hierarchical assessment at both the camera and point levels. At the camera level, we use a multi-granularity motion detection algorithm to determine if the video is stationary. At the point level, we apply a Moving Point Detection algorithm to identify stationary predicted points. Based on these evaluations, we use dynamic trajectory correction to refine the tracks and achieve the final results.
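The two-level assessment and the correction step described in the Figure 2 caption can be illustrated with a minimal sketch. This is not the authors' implementation. The function names, the frame-differencing test used as a stand-in for camera motion detection, the pixel thresholds, and the choice to pin a stationary point to its query location are all assumptions made for illustration only.

```python
import numpy as np


def camera_is_static(frames, thresh=1.0):
    """Hypothetical stand-in for camera-level motion detection.

    frames: (T, H, W) grayscale video. The clip is treated as static when the
    mean absolute difference between consecutive frames is below `thresh`
    (an assumed value, not taken from the report).
    """
    if len(frames) < 2:
        return True
    diffs = np.abs(np.diff(frames.astype(np.float64), axis=0))
    return bool(diffs.mean() < thresh)


def is_static_point(track, visible, max_disp=2.0):
    """Hypothetical point-level check.

    track: (T, 2) predicted positions, visible: (T,) boolean mask.
    A point counts as stationary when every visible prediction lies within
    `max_disp` pixels (assumed value) of the median visible position.
    """
    pts = track[visible]
    if len(pts) == 0:
        return False
    center = np.median(pts, axis=0)
    return bool(np.all(np.linalg.norm(pts - center, axis=1) <= max_disp))


def correct_tracks(tracks, visible, queries, static_camera, max_disp=2.0):
    """Assumed correction rule: with a static camera, stationary points are
    pinned to their query location; all other tracks are left unchanged.

    tracks: (N, T, 2), visible: (N, T), queries: (N, 2).
    """
    out = tracks.copy()
    if not static_camera:
        return out
    for i in range(len(tracks)):
        if is_static_point(tracks[i], visible[i], max_disp):
            out[i] = queries[i]  # broadcasts the (2,) query over all T frames
    return out


# Toy example: point 0 jitters around (10, 10), point 1 moves steadily.
tracks = np.array([
    [[10.0, 10.0], [10.5, 10.0], [10.0, 10.5], [9.5, 10.0]],
    [[0.0, 0.0], [10.0, 0.0], [20.0, 0.0], [30.0, 0.0]],
])
visible = np.ones((2, 4), dtype=bool)
queries = tracks[:, 0].copy()
frames = np.zeros((4, 8, 8))  # identical frames, so the clip counts as static

static_camera = camera_is_static(frames)
corrected = correct_tracks(tracks, visible, queries, static_camera)
```

In this toy example the jittering point is snapped to its query location in every frame, while the moving point keeps its predicted track. In the report the initial tracks come from TAPIR, DOT, and TAPTR; here they are synthetic arrays.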