Size-invariance Matters: Rethinking Metrics and Losses for Imbalanced Multi-object Salient Object Detection

Feiran Li, Qianqian Xu, Shilong Bao, Zhiyong Yang, Runmin Cong, Xiaochun Cao, Qingming Huang

TL;DR

This work reveals that standard salient object detection metrics are biased toward larger objects in images with multiple salient targets due to size-weighted contributions. It introduces a size-invariant evaluation protocol and per-object metrics (\(\mathsf{SI\text{-}MAE}, \mathsf{SI\text{-}F}, \mathsf{SI\text{-}AUC}\)) by partitioning images into foreground frames and a background frame, effectively removing the weight \(P_{X_i}\). It also proposes a generic size-invariant optimization objective \(\mathcal{L}_{\mathsf{SI}}(f)=\sum_{k=1}^K \ell(f_k^{fore}) + \alpha \ell(f_{K+1}^{back})\) and provides a generalization bound showing favorable scaling with sample size \(N\) and image size \(K=H\times W\). Empirically, SI-SOD yields consistent improvements across benchmarks (MSOD, DUTS-TE) for multiple backbones, notably enhancing small-object and multi-object detection while maintaining competitive traditional metrics; code is available at the authors' GitHub repository.
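As a rough illustration of the objective \(\mathcal{L}_{\mathsf{SI}}\) above, the following NumPy sketch sums a per-frame loss over the foreground frames and adds an \(\alpha\)-weighted background term. It is a minimal sketch under stated assumptions, not the authors' implementation: the choice of mean binary cross-entropy for \(\ell\), the `(top, left, bottom, right)` box format, and the names `bce` and `si_loss` are all hypothetical.

```python
import numpy as np

def bce(pred, gt, eps=1e-7):
    """Mean binary cross-entropy over the given pixels (assumed choice of the per-frame loss)."""
    pred = np.clip(pred, eps, 1.0 - eps)
    return float(-(gt * np.log(pred) + (1.0 - gt) * np.log(1.0 - pred)).mean())

def si_loss(pred, gt, boxes, alpha=1.0):
    """Sum of per-frame losses over foreground frames plus alpha times the background loss.

    boxes: hypothetical list of (top, left, bottom, right) frames, one per salient
    object, with bottom/right exclusive. Pixels outside every box form the background frame.
    """
    background = np.ones(gt.shape, dtype=bool)
    total = 0.0
    for top, left, bottom, right in boxes:
        total += bce(pred[top:bottom, left:right], gt[top:bottom, left:right])
        background[top:bottom, left:right] = False
    if background.any():
        total += alpha * bce(pred[background], gt[background])
    return total

# Toy image: one 10x10 object and one 2x2 object on a 20x20 canvas.
gt = np.zeros((20, 20))
gt[0:10, 0:10] = 1.0
gt[15:17, 15:17] = 1.0
boxes = [(0, 0, 10, 10), (15, 15, 17, 17)]

# Both predictions get exactly four pixels wrong.
miss_small = gt.copy()
miss_small[15:17, 15:17] = 0.0   # the small object is missed entirely
miss_large = gt.copy()
miss_large[0, 0:4] = 0.0         # four pixels of the large object are missed

print(bce(miss_small, gt), bce(miss_large, gt))                        # identical image-level losses
print(si_loss(miss_small, gt, boxes), si_loss(miss_large, gt, boxes))  # per-frame loss penalizes the missed small object more
```

In this toy case the image-level loss cannot tell the two predictions apart, while the per-frame sum assigns a much larger penalty to missing the small object, because each frame is averaged over its own area.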

Abstract

This paper explores the size-invariance of evaluation metrics in Salient Object Detection (SOD), especially when multiple targets of diverse sizes co-exist in the same image. We observe that current metrics are size-sensitive: larger objects dominate the score, and smaller ones tend to be ignored. We argue that the evaluation should be size-invariant because bias based on size is unjustified without additional semantic information. In pursuit of this, we propose a generic approach that evaluates each salient object separately and then combines the results, effectively alleviating the imbalance. We further develop an optimization framework tailored to this goal, achieving considerable improvements in detecting objects of different sizes. Theoretically, we provide evidence supporting the validity of our new metrics and present the generalization analysis of SOD. Extensive experiments demonstrate the effectiveness of our method. The code is available at https://github.com/Ferry-Li/SI-SOD.
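To make the "evaluate each object separately, then combine" idea concrete, here is a small NumPy sketch that compares plain MAE with a per-frame variant. It is only an illustration under assumptions: the frames are passed in as hypothetical `(top, left, bottom, right)` boxes, and the combination rule (sum of foreground-frame errors plus `alpha` times the background error, divided by `K + alpha`) is an assumed normalization rather than a formula taken from this summary.

```python
import numpy as np

def mae(pred, gt):
    """Standard mean absolute error over all given pixels."""
    return float(np.abs(pred - gt).mean())

def si_mae(pred, gt, boxes, alpha=1.0):
    """Per-frame MAE: each foreground frame counts equally, regardless of its area.

    boxes: hypothetical list of (top, left, bottom, right) frames, bottom/right exclusive.
    The division by (K + alpha) is an assumed normalization.
    """
    background = np.ones(gt.shape, dtype=bool)
    fore_errors = []
    for top, left, bottom, right in boxes:
        fore_errors.append(mae(pred[top:bottom, left:right], gt[top:bottom, left:right]))
        background[top:bottom, left:right] = False
    back_error = mae(pred[background], gt[background]) if background.any() else 0.0
    return (sum(fore_errors) + alpha * back_error) / (len(boxes) + alpha)

# Toy image: a 10x10 object and a 2x2 object on a 20x20 canvas.
gt = np.zeros((20, 20))
gt[0:10, 0:10] = 1.0
gt[15:17, 15:17] = 1.0
boxes = [(0, 0, 10, 10), (15, 15, 17, 17)]

# Predictor A: perfect on the large object, misses the small one (4 wrong pixels).
pred_a = gt.copy()
pred_a[15:17, 15:17] = 0.0

# Predictor B: finds both objects, with 6 wrong pixels inside the large one.
pred_b = gt.copy()
pred_b[0, 0:6] = 0.0

print(mae(pred_a, gt), mae(pred_b, gt))                      # MAE prefers A (0.010 vs 0.015)
print(si_mae(pred_a, gt, boxes), si_mae(pred_b, gt, boxes))  # per-frame score prefers B
```

Predictor A misses an entire object yet wins on MAE because the missed object covers only 1% of the image; once every frame contributes equally, the ranking flips.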

Paper Structure (41 sections, 9 theorems, 67 equations, 18 figures, 11 tables)

Key Result

Proposition 3.3

Given two different predictors $f_{A}$ and $f_{B}$, the following two possible cases suggest that $\mathsf{SI\text{-}MAE}$ is more effective than $\mathsf{MAE}$ during evaluation. Case 1: Assume that there is a single salient object (i.e., $K=1$), with two different results from predictors $f_A$ and

Figures (18)

  • Figure 1: Statistics on the MSOD dataset. The object-size panel (fig:msod_area) shows that small salient objects are widespread, where Size(%) is the proportion of the image occupied by an object. The object-count panel (fig:msod_num) shows that practical SOD scenarios usually involve multiple salient objects.
  • Figure 2: (c) is the result of the EDN backbone, and (d) is the prediction optimized by our approach. (c) detects fewer salient objects yet attains a lower $\mathsf{MAE}$ than (d), whereas $\mathsf{SI\text{-}MAE}$ correctly distinguishes the two detections.
  • Figure 3: Examples of partitions. In the first example (fig:object_frame1), there is one foreground frame ➀ and a background frame ➁. In the second (fig:object_frame2), there are five foreground frames, ➀ to ➄, and a background frame ➅.
  • Figure 4: $\mathsf{SI\text{-}MAE}$ performance on objects with different sizes on two representative datasets, with EDN and PoolNet as backbones.
  • Figure 5: $\mathsf{SI\text{-}MAE}$ performance with different object numbers on two representative datasets, with EDN and PoolNet as backbones.
  • ...and 13 more figures
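
The partition described in the Figure 3 caption (several foreground frames plus one background frame) can be sketched as follows. This is a hedged illustration only: it assumes each object is already given a distinct integer id in a hypothetical `instance_map`, and it assumes a frame is the object's minimum bounding box; this summary does not state how the paper actually constructs the frames.

```python
import numpy as np

def object_frames(instance_map):
    """Return one (top, left, bottom, right) box per nonzero label, bottom/right exclusive.

    instance_map: hypothetical integer array where 0 is background and each
    salient object carries its own positive id.
    """
    boxes = []
    for label in np.unique(instance_map):
        if label == 0:
            continue
        rows, cols = np.nonzero(instance_map == label)
        boxes.append((int(rows.min()), int(cols.min()),
                      int(rows.max()) + 1, int(cols.max()) + 1))
    return boxes

def background_frame(shape, boxes):
    """Boolean mask of the pixels that fall outside every foreground frame."""
    mask = np.ones(shape, dtype=bool)
    for top, left, bottom, right in boxes:
        mask[top:bottom, left:right] = False
    return mask

# Two objects on an 8x8 canvas.
instance_map = np.zeros((8, 8), dtype=int)
instance_map[1:4, 1:5] = 1
instance_map[6:8, 6:7] = 2

frames = object_frames(instance_map)
back = background_frame(instance_map.shape, frames)
print(frames)           # two foreground frames, ordered by label id
print(int(back.sum()))  # number of pixels left in the background frame
```

With K objects this yields K foreground frames and one background frame, matching the ➀…➅ numbering in the caption for K = 5.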

Theorems & Definitions (18)

  • Definition 3.1: Separable Function
  • Definition 3.2: Composite Function
  • Proposition 3.3: Informal
  • Proposition 4.1: Mechanism of SI-SOD
  • Theorem 4.2: Generalization Bound for SI-SOD
  • Proposition 2.1: Informal
  • proof
  • proof
  • proof
  • proof
  • ...and 8 more