Table of Contents
Fetching ...

The Solution for Single Object Tracking Task of Perception Test Challenge 2024

Zhiqiang Zhong, Yang Yang, Fengqiang Wan, Henglu Wei, Xiangyang Ji

TL;DR

The essence of the work lies in adapting LoRA, a technique that fine-tunes a small subset of model parameters without adding inference latency, to the domain of visual tracking, to help track a specified object throughout a video sequence.

Abstract

This report presents our method for Single Object Tracking (SOT), which aims to track a specified object throughout a video sequence. We employ the LoRAT method. The essence of the work lies in adapting LoRA, a technique that fine-tunes a small subset of model parameters without adding inference latency, to the domain of visual tracking. We train our model using the extensive LaSOT and GOT-10k datasets, which provide a solid foundation for robust performance. Additionally, we implement the alpha-refine technique for post-processing the bounding box outputs. Although the alpha-refine method does not yield the anticipated results, our overall approach achieves a score of 0.813, securing first place in the competition.

The Solution for Single Object Tracking Task of Perception Test Challenge 2024

TL;DR

The essence of the work lies in adapting LoRA, a technique that fine-tunes a small subset of model parameters without adding inference latency, to the domain of visual tracking, to help track a specified object throughout a video sequence.

Abstract

This report presents our method for Single Object Tracking (SOT), which aims to track a specified object throughout a video sequence. We employ the LoRAT method. The essence of the work lies in adapting LoRA, a technique that fine-tunes a small subset of model parameters without adding inference latency, to the domain of visual tracking. We train our model using the extensive LaSOT and GOT-10k datasets, which provide a solid foundation for robust performance. Additionally, we implement the alpha-refine technique for post-processing the bounding box outputs. Although the alpha-refine method does not yield the anticipated results, our overall approach achieves a score of 0.813, securing first place in the competition.

Paper Structure

This paper contains 5 sections, 2 figures, 3 tables.

Figures (2)

  • Figure 1: the examples of object tracking annotations.
  • Figure 2: Architecture of LoRAT.