Pursuing Better Decision Boundaries for Long-Tailed Object Detection via Category Information Amount

Yanbiao Ma, Wei Dai, Jiayi Chen

TL;DR

This work tackles the mismatch between dataset imbalance and observed category bias in object detection by introducing category information amount, a measure of intra-class learning difficulty derived from the volume of a category’s perceptual manifold. It shows a strong negative correlation between information amount and accuracy, motivating the Information Amount-Guided Angular Margin (IGAM) loss, which allocates larger decision spaces to categories with a larger information amount (and hence greater learning difficulty). IGAM dynamically updates each category's information amount with a low-cost strategy, combining angular-margin adjustments with covariance-based estimates. Across LVIS v1.0, COCO-LT, and Pascal VOC, IGAM yields substantial gains on rare categories, improves overall performance, and reduces model bias, demonstrating the utility of category information amount for long-tailed detection.
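
As a rough illustration of how a covariance-based volume measure could be computed, the sketch below uses the log-determinant of a regularized covariance matrix as a proxy for the volume of a category's feature distribution. The function name `information_amount`, the identity regularizer, and the factor of 0.5 are assumptions made for this example; the paper's exact definition is not reproduced on this page.

```python
import numpy as np

def information_amount(features):
    """Log-volume proxy for one category's feature distribution.

    features: (n, d) array of embeddings belonging to a single category.
    Assumption: volume is approximated by 0.5 * log det(I + Sigma), where
    Sigma is the sample covariance. This is an illustrative choice, not
    necessarily the formula used in the paper.
    """
    features = np.asarray(features, dtype=float)
    cov = np.atleast_2d(np.cov(features, rowvar=False))  # (d, d) covariance
    d = cov.shape[0]
    # slogdet is numerically safer than log(det(...)) for larger d.
    _, logdet = np.linalg.slogdet(np.eye(d) + cov)
    return 0.5 * logdet
```

Under this proxy, a category whose features spread widely in embedding space receives a larger value than one whose features cluster tightly, which matches the intuition that a larger manifold volume corresponds to a harder category.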

Abstract

In object detection, the instance count is typically used to define whether a dataset exhibits a long-tail distribution, implicitly assuming that models will underperform on categories with fewer instances. This assumption has led to extensive research on category bias in datasets with imbalanced instance counts. However, models still exhibit category bias even in datasets where instance counts are relatively balanced, clearly indicating that instance count alone cannot explain this phenomenon. In this work, we first introduce the concept and measurement of category information amount. We observe a significant negative correlation between category information amount and accuracy, suggesting that category information amount more accurately reflects the learning difficulty of a category. Based on this observation, we propose Information Amount-Guided Angular Margin (IGAM) Loss. The core idea of IGAM is to dynamically adjust the decision space of each category based on its information amount, thereby reducing category bias in long-tail datasets. IGAM Loss not only performs well on long-tailed benchmark datasets such as LVIS v1.0 and COCO-LT but also shows significant improvement for underrepresented categories in the non-long-tailed dataset Pascal VOC. Comprehensive experiments demonstrate the potential of category information amount as a tool and the generality of our proposed method.
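
The idea of adjusting each category's decision space can be sketched with a generic additive angular margin on the target-class logit, where the margin grows with the category's information amount. This is a minimal NumPy sketch under stated assumptions: the helper names, the linear min-max mapping from information amount to margin, and the `scale` and `max_margin` values are all hypothetical, and the exact form of the IGAM loss is not given on this page.

```python
import numpy as np

def margins_from_information(info, max_margin=0.5):
    """Map per-category information amounts to margins in [0, max_margin].

    Assumption: a linear min-max mapping, chosen only for illustration.
    """
    info = np.asarray(info, dtype=float)
    span = info.max() - info.min()
    if span == 0:
        return np.zeros_like(info)
    return max_margin * (info - info.min()) / span

def angular_margin_loss(features, weights, labels, margins, scale=16.0):
    """Cross-entropy over cosine logits with a per-class angular margin."""
    labels = np.asarray(labels)
    margins = np.asarray(margins, dtype=float)
    f = features / np.linalg.norm(features, axis=1, keepdims=True)
    w = weights / np.linalg.norm(weights, axis=1, keepdims=True)
    theta = np.arccos(np.clip(f @ w.T, -1.0, 1.0))   # (n, C) angles
    rows = np.arange(len(labels))
    # Add the margin to the target-class angle only, capped at pi so the
    # cosine stays monotonic.
    theta[rows, labels] = np.minimum(theta[rows, labels] + margins[labels], np.pi)
    logits = scale * np.cos(theta)
    logits -= logits.max(axis=1, keepdims=True)       # numerical stability
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return float(-log_prob[rows, labels].mean())
```

A larger margin makes the target class harder to satisfy during training, which pushes the decision boundary away from that class and so leaves it more room in angular space.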

Paper Structure

This paper contains 21 sections, 17 equations, 4 figures, 7 tables.

Figures (4)

  • Figure 1: The left vertical axis shows the number of instances per class. The right vertical axis shows the per-class performance of Faster R-CNN with an R-50-FPN backbone, trained with cross-entropy loss on Pascal VOC using the settings described in Section 4.2. The red text box displays the Pearson correlation coefficient between class performance and the number of instances.
  • Figure 2: Pearson correlation coefficients between category information amount and category average precision and between category instance count and category average precision, under two backbone networks and three loss function settings.
  • Figure 3: The function of the storage space ratio $R$ as it varies with the queue length $d$ on the Pascal VOC and MS COCO datasets.
  • Figure 4: Model bias from models trained with different methods on LVIS v1.0.
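
The Pearson correlation coefficients reported in Figures 1 and 2 compare a per-category statistic (instance count or information amount) against per-category average precision. A minimal sketch of that computation is shown below; the function name `pearson` and the input values in the usage lines are hypothetical placeholders, not numbers from the paper.

```python
import numpy as np

def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    # corrcoef returns the 2x2 correlation matrix; the off-diagonal entry
    # is the coefficient between x and y.
    return float(np.corrcoef(x, y)[0, 1])

# Hypothetical per-category values, for illustration only.
info_amount = [1.2, 2.5, 3.1, 4.0]
average_precision = [0.81, 0.74, 0.66, 0.52]
r = pearson(info_amount, average_precision)
```

A value of `r` close to -1 would indicate that categories with a larger statistic tend to have lower average precision.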

Theorems & Definitions (1)

  • Proof 1: Integrating Local Covariance Matrices to Obtain the Global Covariance Matrix.
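
The proof itself is not reproduced on this page. As a hedged illustration of the kind of result its title describes, the sketch below merges per-chunk ("local") covariance matrices into a global one using the standard law of total covariance: the global covariance is the count-weighted average of the local covariances plus the count-weighted scatter of the local means around the global mean. The function name `merge_covariances` and the use of population (biased) covariances are assumptions of this example and may differ from the paper's formulation.

```python
import numpy as np

def merge_covariances(counts, means, covs):
    """Combine local statistics into a global mean and covariance.

    counts: (k,) number of samples in each chunk
    means:  (k, d) per-chunk mean vectors
    covs:   (k, d, d) per-chunk population covariances (divide by n, not n-1)
    """
    counts = np.asarray(counts, dtype=float)
    means = np.asarray(means, dtype=float)
    covs = np.asarray(covs, dtype=float)
    total = counts.sum()
    global_mean = (counts[:, None] * means).sum(axis=0) / total
    diffs = means - global_mean                        # (k, d)
    within = np.einsum("k,kij->ij", counts, covs)      # weighted local covariances
    between = np.einsum("k,ki,kj->ij", counts, diffs, diffs)  # scatter of means
    return global_mean, (within + between) / total
```

With this identity, only the count, mean, and covariance of each chunk need to be stored, rather than every feature vector, which is one way a global covariance can be maintained at low cost.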