SemCovNet: Towards Fair and Semantic Coverage-Aware Learning for Underrepresented Visual Concepts

Sakib Ahammed, Xia Cui, Xinqi Fan, Wenqi Lu, Moi Hoon Yap

TL;DR

SemCovNet tackles Semantic Coverage Imbalance (SCI), a descriptor-level fairness issue in which semantic concepts are covered unevenly across classes and subgroups, leaving rare concepts underrepresented. It introduces a Semantic Descriptor Map (SDM), Descriptor Attention Modulation (DAM), and a Descriptor–Visual Alignment (DVA) loss, together with a Coverage Disparity Index (CDI) that measures how strongly model error tracks semantic coverage and also serves as a regularizer. Empirical results on MILK10k and ISIC-DICM-17K show substantial reductions in CDI and improved tail performance, while maintaining calibration and enabling cross-domain generalization to CelebA. The work formalizes semantic fairness, demonstrates a closed-loop learning approach, and provides a foundation for interpretable, fair vision learning in both medical and non-medical domains.

Abstract

Modern vision models increasingly rely on rich semantic representations that extend beyond class labels to include descriptive concepts and contextual attributes. However, existing datasets exhibit Semantic Coverage Imbalance (SCI), a previously overlooked bias arising from long-tailed semantic representations. Unlike class imbalance, SCI occurs at the semantic level, affecting how models learn and reason about rare yet meaningful semantics. To mitigate SCI, we propose the Semantic Coverage-Aware Network (SemCovNet), a novel model that explicitly learns to correct semantic coverage disparities. SemCovNet integrates a Semantic Descriptor Map (SDM) for learning semantic representations, a Descriptor Attention Modulation (DAM) module that dynamically weights visual and concept features, and a Descriptor-Visual Alignment (DVA) loss that aligns visual features with descriptor semantics. We quantify semantic fairness using a Coverage Disparity Index (CDI), which measures the alignment between coverage and error. Extensive experiments across multiple datasets demonstrate that SemCovNet enhances model reliability and substantially reduces CDI, achieving fairer and more equitable performance. This work establishes SCI as a measurable and correctable bias, providing a foundation for advancing semantic fairness and interpretable vision learning.
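
The abstract describes CDI as a measure of the alignment between coverage and error but does not give its formula here. As a rough illustration only, the sketch below assumes a stand-in statistic: the Pearson correlation between per-group coverage and per-group error rate. The function name `coverage_error_correlation` and the choice of Pearson correlation are assumptions made for this example, not the paper's definition of CDI.

```python
from math import sqrt


def coverage_error_correlation(coverage, error):
    """Pearson correlation between per-group coverage and per-group error.

    Hypothetical stand-in for a coverage-error alignment statistic;
    this is NOT the paper's CDI definition, which is not given in the abstract.
    """
    n = len(coverage)
    if n != len(error) or n < 2:
        raise ValueError("need two equal-length sequences with at least 2 groups")

    mean_c = sum(coverage) / n
    mean_e = sum(error) / n

    # Co-variation of coverage and error across groups.
    cov = sum((c - mean_c) * (e - mean_e) for c, e in zip(coverage, error))
    var_c = sum((c - mean_c) ** 2 for c in coverage)
    var_e = sum((e - mean_e) ** 2 for e in error)

    # If either quantity is constant across groups, there is no linear coupling.
    if var_c == 0 or var_e == 0:
        return 0.0
    return cov / sqrt(var_c * var_e)


if __name__ == "__main__":
    # Toy values: error falls as coverage rises (strongly coupled).
    coupled = coverage_error_correlation([0.1, 0.2, 0.3, 0.4], [0.4, 0.3, 0.2, 0.1])
    # Toy values: error is identical in every group (decoupled).
    decoupled = coverage_error_correlation([0.1, 0.2, 0.3, 0.4], [0.2, 0.2, 0.2, 0.2])
    print(coupled, decoupled)
```

Under this assumed proxy, a value near zero means error no longer depends linearly on how well a group is covered, while a strongly negative value means poorly covered groups carry most of the error.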

Paper Structure (20 sections, 15 equations, 6 figures, 9 tables)

Figures (6)

  • Figure 1: Semantic Coverage Imbalance. (Left) Long-tailed distribution of training coverage across Semantic Coverage Groups (SCGs) $g = (\text{class}, \text{descriptor}, \text{subgroup})$; minority SCGs consistently show lower coverage, indicating substantial semantic representation imbalance. (Right) Coverage--performance alignment across SCGs; greater deviation from ideal alignment indicates stronger coverage--error misalignment, motivating coverage-aware fairness learning. SemCovNet achieves a lower Coverage Disparity Index (CDI), demonstrating improved semantic fairness.
  • Figure 2: Overview of the SemCovNet architecture. Given an image and its descriptor probabilities, the SDM (left) fuses descriptor priors and visual features into semantic attention maps. The DAM (centre) applies descriptor-conditioned channel modulation. A descriptor token, refined through Cross-Attention (right), reproduces semantic details by updating descriptor priors. The Semantic Encoder alternates between SDM generation, DAM modulation, and token attention, forming a closed loop that aligns descriptor coverage with prediction confidence.
  • Figure 3: Semantic Coverage–Error with and without the CDI Regularizer ($\mathcal{R}_{\mathrm{CDI}}$) on MILK10k (Dermoscopic). (Left) The CDI statistic progressively decays toward zero during training, indicating effective coverage–error decoupling. (Right) $\mathcal{R}_{\mathrm{CDI}}$ reduces CDI, demonstrating improved semantic fairness.
  • Figure 4: Conceptual workflow (grouping$\rightarrow$coverage estimation$\rightarrow$modulation$\rightarrow$error analysis$\rightarrow$diagnosis) of SemCovNet for addressing SCI.
  • Figure 5: Semantic Coverage Imbalance (SCI) across SCGs for the melanoma class (MEL) on MILK10k ($\approx$1:10 class-imbalanced) and ISIC-DICM-17K (1:1 class-balanced). (Left) SCG coverage heatmap; rows correspond to semantic descriptors and columns to demographic or contextual subgroups. The patterns are sparse and heterogeneous, showing that descriptor presence varies significantly across SCGs. (Right) Long-tailed distribution of SCG coverage values. Most SCGs exhibit low coverage, indicating severe semantic under-representation and highlighting the long-tailed structure of SCI within the training data.
  • ...and 1 more figures
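
The captions above group training data into SCGs $g = (\text{class}, \text{descriptor}, \text{subgroup})$ and report a coverage value for each group. The captions do not state how coverage is normalized, so the sketch below assumes the simplest reading: the fraction of training samples that fall into each group. The function name `scg_coverage`, the input layout, and the descriptor and subgroup strings are hypothetical and serve only to illustrate the grouping step.

```python
from collections import Counter


def scg_coverage(samples):
    """Estimate per-SCG coverage from annotated training samples.

    samples: iterable of (class_label, descriptors, subgroup) tuples, where
    descriptors is a list of descriptor names present in the sample.
    Returns a dict mapping (class, descriptor, subgroup) to the fraction of
    all samples in that group. This normalization is an assumption.
    """
    counts = Counter()
    total = 0
    for class_label, descriptors, subgroup in samples:
        total += 1
        # Count each descriptor at most once per sample.
        for descriptor in set(descriptors):
            counts[(class_label, descriptor, subgroup)] += 1
    if total == 0:
        return {}
    return {group: n / total for group, n in counts.items()}


if __name__ == "__main__":
    # Toy annotations; descriptor and subgroup names are made up for illustration.
    toy = [
        ("MEL", ["pigment network"], "male"),
        ("MEL", ["pigment network", "streaks"], "male"),
        ("NEV", ["globules"], "female"),
        ("MEL", ["streaks"], "female"),
    ]
    coverage = scg_coverage(toy)
    # Sorting by coverage exposes the long tail of rarely covered groups.
    for group, value in sorted(coverage.items(), key=lambda kv: kv[1], reverse=True):
        print(group, value)
```

Sorting the resulting values in descending order gives the kind of long-tailed profile shown in the right-hand panels of Figures 1 and 5, and pivoting descriptors against subgroups for a fixed class gives a heatmap like the left-hand panel of Figure 5.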