MSITrack: A Challenging Benchmark for Multispectral Single Object Tracking

Tao Feng, Tingfa Xu, Haolin Qin, Tianhao Li, Shuaihao Han, Xuyang Zou, Zhan Lv, Jianan Li

TL;DR

MSITrack introduces the largest and most diverse multispectral single-object tracking dataset to date, addressing limitations of RGB trackers in real-world conditions by leveraging eight spectral bands ($395$ nm to $950$ nm) across 300 videos and 129k frames spanning 55 categories. The dataset emphasizes challenging attributes such as similar-object interference and color/texture similarity, and provides thorough annotations with over 1,300 hours of manual labeling. Evaluations show that multispectral inputs significantly improve tracking performance over RGB alone, with GRM, UNTrack, and AQATrack achieving top performance and notable gains across challenging conditions like low resolution and occlusion. The work demonstrates the practical value of spectral cues for robust object tracking and provides publicly available data and code to drive future research in multispectral tracking.

Abstract

Visual object tracking in real-world scenarios presents numerous challenges, including occlusion, interference from similar objects, and complex backgrounds, all of which limit the effectiveness of RGB-based trackers. Multispectral imagery, which captures pixel-level spectral reflectance, enhances target discriminability. However, the availability of multispectral tracking datasets remains limited. To bridge this gap, we introduce MSITrack, the largest and most diverse multispectral single object tracking dataset to date. MSITrack offers the following key features: (i) More Challenging Attributes: interference from similar objects and similarity in color and texture between targets and backgrounds in natural scenarios, along with a wide range of real-world tracking challenges; (ii) Richer and More Natural Scenes: spanning 55 object categories and 300 distinct natural scenes, MSITrack far exceeds the scope of existing benchmarks, and many of these scenes and categories are introduced to the multispectral tracking domain for the first time; (iii) Larger Scale: 300 videos comprising over 129k frames of multispectral imagery. To ensure annotation precision, each frame has undergone meticulous processing, manual labeling, and multi-stage verification. Extensive evaluations using representative trackers demonstrate that the multispectral data in MSITrack significantly improves performance over RGB-only baselines, highlighting its potential to drive future advancements in the field. The MSITrack dataset is publicly available at: https://github.com/Fengtao191/MSITrack.

Paper Structure

This paper contains 11 sections, 6 figures, 6 tables.

Figures (6)

  • Figure 1: The target’s spectral information differs significantly from the background and aligns with the template’s spectral data, facilitating differentiation and localization.
  • Figure 2: Comparison of scenes between MSITrack and HOT.
  • Figure 3: Visualization of several annotation examples in the proposed MSITrack.
  • Figure 4: Comparison of datasets, category distribution, object size, relative area and dataset splitting in MSITrack.
  • Figure 5: Comparison of some state-of-the-art trackers when dealing with various challenge attributes.
  • ...and 1 more figure