Collaborative Learning of Semantic-Aware Feature Learning and Label Recovery for Multi-Label Image Recognition with Incomplete Labels

Zhi-Fen He, Ren-Dong Xie, Bo Li, Bin Liu, Jin-Yan Hu

TL;DR

Addresses multi-label image recognition with incomplete labels by proposing CLSL, a framework that jointly learns semantic-aware features and recovers missing labels in a collaborative loop. It combines Semantic-Related Feature Learning (SRFL) and Semantic-Guided Feature Enhancement (SGFE) to align visual and semantic spaces, and uses a Label Recovery module to generate pseudo-labels guiding training via an ASL-based loss on refined and global predictions. The joint optimization enables mutual reinforcement between feature discriminability and label completeness, improving performance under sparse supervision. Experiments on MS-COCO, VOC2007, and NUS-WIDE demonstrate state-of-the-art results and robustness to label sparsity, highlighting practical impact for weakly supervised multi-label recognition.

Abstract

Multi-label image recognition with incomplete labels is a critical learning task and has emerged as a focal topic in computer vision. However, this task is confronted with two core challenges: semantic-aware feature learning and missing label recovery. In this paper, we propose a novel Collaborative Learning of Semantic-aware feature learning and Label recovery (CLSL) method for multi-label image recognition with incomplete labels, which addresses both challenges within a unified learning framework. More specifically, we design a semantic-related feature learning module to learn robust semantic-related features by discovering semantic information and label correlations. Then, a semantic-guided feature enhancement module is proposed to generate high-quality discriminative semantic-aware features by effectively aligning visual and semantic feature spaces. Finally, we introduce a collaborative learning framework that integrates semantic-aware feature learning and label recovery, which can not only dynamically enhance the discriminability of semantic-aware features but also adaptively infer and recover missing labels, forming a mutually reinforcing loop between the two processes. Extensive experiments on three widely used public datasets (MS-COCO, VOC2007, and NUS-WIDE) demonstrate that CLSL outperforms state-of-the-art methods for multi-label image recognition with incomplete labels.
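
As a rough illustration of how label recovery and an asymmetric loss (ASL) can interact under incomplete labels, the following is a minimal NumPy sketch. It is not the CLSL implementation: the function names, the confidence threshold `pos_thresh`, the label encoding (+1 positive, -1 negative, 0 unknown), and the hyperparameter defaults are all assumptions made for this example, and the paper's actual Label Recovery module may work differently. The loss follows the commonly used ASL form, with a focusing exponent per polarity and a probability margin on negatives.

```python
import numpy as np

def recover_labels(probs, labels, pos_thresh=0.9):
    """Fill unknown entries (0) with +1 where the model is confident.

    Hypothetical rule for illustration only; known entries (+1 / -1)
    are never overwritten and the input array is left unchanged.
    """
    recovered = labels.copy()
    unknown = labels == 0
    recovered[unknown & (probs >= pos_thresh)] = 1
    return recovered

def asymmetric_loss(probs, labels, gamma_pos=0.0, gamma_neg=4.0,
                    margin=0.05, eps=1e-8):
    """ASL averaged over annotated entries; entries equal to 0 are ignored."""
    pos = labels == 1
    neg = labels == -1
    # Shift negative probabilities down by the margin, clipped at zero.
    p_shift = np.clip(probs - margin, 0.0, None)
    loss_pos = -((1.0 - probs) ** gamma_pos) * np.log(np.clip(probs, eps, None))
    loss_neg = -(p_shift ** gamma_neg) * np.log(np.clip(1.0 - p_shift, eps, None))
    total = np.where(pos, loss_pos, 0.0) + np.where(neg, loss_neg, 0.0)
    n_used = max(int(pos.sum() + neg.sum()), 1)  # avoid division by zero
    return float(total.sum() / n_used)

# One image, three classes: the first label is missing.
probs = np.array([[0.95, 0.20, 0.60]])
labels = np.array([[0, -1, 1]])
pseudo = recover_labels(probs, labels)   # the confident unknown becomes +1
loss = asymmetric_loss(probs, pseudo)    # loss now also covers the recovered entry
```

In a collaborative setup like the one described above, the recovered labels would be recomputed as the predictions improve, so that better features yield more reliable pseudo-labels and vice versa.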

Paper Structure

This paper contains 15 sections, 12 equations, 4 figures, 6 tables.

Figures (4)

  • Figure 1: Examples of completely labeled and incompletely labeled instances. Here, "$\checkmark$" denotes positive labels, "$\times$" denotes negative labels, and "$?$" denotes unknown (unannotated) labels.
  • Figure 2: Pipeline of the CLSL Method.
  • Figure 3: Visual analysis of our CLSL method. (a) Class-attention maps from image features $F$; (b) Class-attention maps from semantic-aware features $E$.
  • Figure 4: Analysis of label recovery. (a) Class-attention maps from image features $F$; (b) Class-attention maps from semantic-aware features $E$; (c) Image with fully annotated labels; (d) Image with incomplete labels; (e) Image with recovered labels. In the incompletely labeled setting, some annotations (e.g., chair, plant, car, and bottle) are missing.