Towards Size-invariant Salient Object Detection: A Generic Evaluation and Optimization Approach
Shilong Bao, Qianqian Xu, Feiran Li, Boyu Han, Zhiyong Yang, Xiaochun Cao, Qingming Huang
TL;DR
This work tackles the sensitivity of SOD evaluation metrics to object size by formalizing a size-invariant evaluation framework, SIEva, which evaluates each salient region independently before aggregating the results. It then turns this principle into SIOpt, an optimization framework that reformulates losses on a per-object basis (e.g., SI-BCE, SI-Dice, SI-AUC) and introduces PBAcc to make AUC-based, size-invariant optimization tractable. The authors provide a generalization bound for SIOpt, and extensive experiments show consistent improvements across RGB, RGB-D, RGB-T, and foundation-model settings, with pronounced gains for small-object and multi-object scenes. The approach is model-agnostic, improves robustness to size imbalance, and scales in practice with modest overhead and memory-efficient training when PBAcc is used. Overall, SIOpt and SIEva offer a principled path to fairer, more reliable SOD in real-world, multi-object environments.
Abstract
This paper investigates a fundamental yet underexplored issue in Salient Object Detection (SOD): the size-invariant property of evaluation protocols, particularly in scenarios where multiple salient objects of significantly different sizes appear within a single image. We first present a novel perspective that exposes the inherent size sensitivity of existing, widely used SOD metrics. Through careful theoretical derivations, we show that the evaluation outcome of an image under current SOD metrics can be decomposed into a sum of several separable terms, with the contribution of each term being directly proportional to its corresponding region size. Consequently, prediction errors are dominated by the larger regions, while smaller yet potentially more semantically important objects are often overlooked, leading to biased performance assessments and practical degradation. To address this challenge, we propose a generic Size-Invariant Evaluation (SIEva) framework. The core idea is to evaluate each separable component individually and then aggregate the results, thereby mitigating the impact of size imbalance across objects. Building upon this, we further develop a dedicated optimization framework (SIOpt), which adheres to the size-invariant principle and significantly enhances the detection of salient objects across a broad range of sizes. Notably, SIOpt is model-agnostic and can be seamlessly integrated with a wide range of SOD backbones. Theoretically, we also present a generalization analysis of SOD methods and provide evidence supporting the validity of our new evaluation protocols. Finally, comprehensive experiments demonstrate the efficacy of our proposed approach. The code is available at https://github.com/Ferry-Li/SI-SOD.
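The size sensitivity described above can be illustrated with a small numerical sketch. It uses mean absolute error (MAE) as a stand-in metric on a toy 10x10 image; the partition into two object bounding boxes plus a background region, the equal-weight aggregation, and the helper name `mae` are assumptions made for illustration and are not necessarily the exact definitions used by SIEva.

```python
import numpy as np

def mae(pred, gt, mask=None):
    """Mean absolute error over the whole image, or over `mask` only."""
    err = np.abs(pred - gt)
    return float(err.mean() if mask is None else err[mask].mean())

# Toy ground truth: one large object (36 px) and one small object (4 px).
gt = np.zeros((10, 10))
gt[0:6, 0:6] = 1.0
gt[8:10, 8:10] = 1.0

# Prediction that recovers the large object perfectly and misses the small one.
pred = np.zeros((10, 10))
pred[0:6, 0:6] = 1.0

# Assumed partition: each object's bounding box, plus the remaining background.
large = np.zeros_like(gt, dtype=bool)
large[0:6, 0:6] = True
small = np.zeros_like(gt, dtype=bool)
small[8:10, 8:10] = True
background = ~(large | small)
regions = [large, small, background]

# Conventional metric: every pixel counts equally.
global_score = mae(pred, gt)

# The same number, rewritten as a size-weighted sum of per-region errors.
weighted = sum(m.sum() / gt.size * mae(pred, gt, m) for m in regions)

# Illustrative size-invariant variant: evaluate each region, then average
# with equal weight per region regardless of its area.
size_invariant = float(np.mean([mae(pred, gt, m) for m in regions]))

print(global_score, weighted, size_invariant)
```

In this toy case the completely missed small object contributes only 4 of 100 pixels to the global MAE, whereas under equal per-region weighting it accounts for one third of the aggregate, so the miss is far more visible.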
