CONDA: Condensed Deep Association Learning for Co-Salient Object Detection
Long Li, Nian Liu, Dingwen Zhang, Zhongyu Li, Salman Khan, Rao Anwer, Hisham Cholakkal, Junwei Han, Fahad Shahbaz Khan
TL;DR
This work addresses co-salient object detection by tackling the limitations of relying on raw inter-image associations. It introduces CONDA, a deep association learning framework that converts pixel-wise hyperassociations into deep association features within an FPN-based decoder, leveraging a Progressive Association Generation (PAG) module, Correspondence-induced Association Condensation (CAC) to condense associations, and an Object-aware Cycle Consistency (OCC) loss to supervise pixel-level correspondences. The approach achieves state-of-the-art results across CoCA, CoSal2015, and CoSOD3k under various training setups, with ablations confirming the importance of PAG, CAC, and OCC and demonstrating reduced computation via condensation. By explicitly modeling inter-image association knowledge and pixel-level correspondences, CONDA offers a robust and scalable pathway for improving co-saliency detection and related inter-image tasks.
Abstract
Inter-image association modeling is crucial for co-salient object detection. Despite satisfactory performance, previous methods still model inter-image associations insufficiently, because most of them focus on optimizing image features under the guidance of heuristically calculated raw inter-image associations. They rely directly on raw associations, which are not reliable in complex scenarios, and their image feature optimization does not model inter-image associations explicitly. To alleviate these limitations, this paper proposes a deep association learning strategy that deploys deep networks on raw associations to explicitly transform them into deep association features. Specifically, we first create hyperassociations to collect dense pixel-pair-wise raw associations and then deploy deep aggregation networks on them. We design a progressive association generation module for this purpose, with additional enhancement of the hyperassociation calculation. More importantly, we propose a correspondence-induced association condensation module that introduces a pretext task, i.e., semantic correspondence estimation, to condense the hyperassociations, reducing the computational burden and eliminating noise. We also design an object-aware cycle consistency loss for high-quality correspondence estimation. Experimental results on three benchmark datasets demonstrate the effectiveness of our proposed method under various training settings.
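To make the terms in the abstract concrete, the following is a minimal sketch of three ideas it mentions: dense pixel-pair-wise raw associations between two images, condensing them through estimated correspondences, and a cycle consistency check. It is an illustration under stated assumptions, not the method as implemented in CONDA. Cosine similarity as the raw association, hard argmax matching as the correspondence estimate, top-k selection as the condensation rule, and all function names are hypothetical choices made for this example. The object-aware part of the loss (restricting supervision to co-salient object pixels) is omitted.

```python
import math

def l2_normalize(vec):
    """Scale a feature vector to unit length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / norm for v in vec] if norm > 0 else list(vec)

def raw_associations(feats_a, feats_b):
    """Dense pixel-pair-wise associations between two images.

    feats_a, feats_b: lists of per-pixel feature vectors (flattened H*W).
    Returns an N_a x N_b matrix of cosine similarities (an assumed choice).
    """
    a = [l2_normalize(f) for f in feats_a]
    b = [l2_normalize(f) for f in feats_b]
    return [[sum(x * y for x, y in zip(fa, fb)) for fb in b] for fa in a]

def best_match(assoc):
    """Hard correspondence: for each row pixel, the index of its most similar column pixel."""
    return [max(range(len(row)), key=row.__getitem__) for row in assoc]

def condense(assoc, k):
    """Keep only the k strongest (index, score) pairs per pixel.

    A simplified stand-in for correspondence-based condensation: fewer
    pairs means less computation and fewer noisy associations.
    """
    return [
        sorted(enumerate(row), key=lambda pair: pair[1], reverse=True)[:k]
        for row in assoc
    ]

def cycle_consistent(assoc_ab, assoc_ba):
    """True for each pixel i in A whose match in B maps back to i (A -> B -> A)."""
    forward = best_match(assoc_ab)
    backward = best_match(assoc_ba)
    return [backward[forward[i]] == i for i in range(len(forward))]

# Toy example: 3 pixels in image A, 2 pixels in image B, 2-D features.
feats_a = [[1.0, 0.0], [0.0, 1.0], [2.0, 1.0]]
feats_b = [[0.0, 2.0], [3.0, 0.0]]

assoc_ab = raw_associations(feats_a, feats_b)
assoc_ba = [list(col) for col in zip(*assoc_ab)]  # transpose gives B -> A

matches = best_match(assoc_ab)                      # [1, 0, 1]
condensed = condense(assoc_ab, k=1)                 # one pair kept per pixel
consistency = cycle_consistent(assoc_ab, assoc_ba)  # [True, True, False]
```

In this toy case the third pixel of image A matches the second pixel of image B, but that pixel matches back to the first pixel of A, so the cycle fails. A cycle consistency loss would penalize such disagreements; the sketch only flags them.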
