Unified Unsupervised Salient Object Detection via Knowledge Transfer
Yao Yuan, Wutao Liu, Pan Gao, Qun Dai, Jie Qin
TL;DR
This work tackles unsupervised salient object detection (USOD) across diverse tasks with a unified framework that learns saliency knowledge from Natural Still Image (NSI) SOD and transfers it to non-NSI tasks. It has three core components. Progressive Curriculum Learning-based Saliency Distilling (PCL-SD) extracts saliency cues robustly by moving from easy samples to hard ones. Self-rectify Pseudo-label Refinement (SPR) progressively improves the pseudo-labels through posterior and prior rectifications. An adapter-tuning strategy then transfers the learned knowledge to non-NSI domains. The approach achieves state-of-the-art or competitive results on RGB, RGB-D, RGB-T, video, and remote sensing image (RSI) SOD benchmarks, demonstrating strong cross-task generalization and effective zero-shot transfer with targeted fine-tuning. Because the pipeline shares saliency knowledge across modalities, it is practical for data-scarce SOD tasks and real-world applications where annotated data is limited or unavailable.
Abstract
Recently, unsupervised salient object detection (USOD) has gained increasing attention due to its annotation-free nature. However, current methods mainly focus on specific tasks such as RGB and RGB-D, neglecting the potential for task migration. In this paper, we propose a unified USOD framework for generic USOD tasks. First, we propose a Progressive Curriculum Learning-based Saliency Distilling (PCL-SD) mechanism to extract saliency cues from a pre-trained deep network. This mechanism starts with easy samples and progressively moves towards harder ones, avoiding the initial interference caused by hard samples. Afterwards, the obtained saliency cues are used to train a saliency detector, and we employ a Self-rectify Pseudo-label Refinement (SPR) mechanism to improve the quality of the pseudo-labels. Finally, an adapter-tuning method is devised to transfer the acquired saliency knowledge, leveraging shared knowledge to attain superior transfer performance on the target tasks. Extensive experiments on five representative SOD tasks confirm the effectiveness and feasibility of our proposed method. Code and supplementary materials are available at https://github.com/I2-Multimedia-Lab/A2S-v3.
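The easy-to-hard idea behind PCL-SD can be illustrated with a small sketch. This is not the authors' implementation: the function name `curriculum_distill_loss`, the `start_margin` parameter, the linear schedule, and the definition of a "hard" sample as a pixel whose predicted saliency lies close to 0.5 are all assumptions made for illustration. The sketch only shows how a curriculum gate might exclude ambiguous pixels early in training and admit them later.

```python
def curriculum_distill_loss(probs, epoch, total_epochs, start_margin=0.3):
    """Hypothetical curriculum-gated distilling loss (illustration only).

    probs: iterable of predicted saliency probabilities in [0, 1].
    Returns (loss, selected), where selected flags the pixels used.
    """
    # Assumed difficulty measure: distance from the undecided value 0.5.
    # Small distance = ambiguous = "hard" pixel.
    confidences = [abs(p - 0.5) for p in probs]

    # Assumed linear schedule: the margin shrinks from start_margin to 0,
    # so hard pixels are admitted gradually as training proceeds.
    progress = min(max(epoch / max(total_epochs - 1, 1), 0.0), 1.0)
    margin = start_margin * (1.0 - progress)

    selected = [c >= margin for c in confidences]
    kept = [c for c, keep in zip(confidences, selected) if keep]
    if not kept:
        return 0.0, selected

    # Minimizing this value pushes the selected predictions toward 0 or 1.
    loss = sum(0.5 - c for c in kept) / len(kept)
    return loss, selected


if __name__ == "__main__":
    probs = [0.05, 0.45, 0.90, 0.55]
    for epoch in (0, 9):
        loss, selected = curriculum_distill_loss(probs, epoch, total_epochs=10)
        print(epoch, round(loss, 4), selected)
```

In the first epoch only the two confident pixels contribute to the loss; by the last epoch all four do. A real training loop would apply the same gate to tensors inside a differentiable framework, and the schedule would need tuning.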
