SCoRe: Submodular Combinatorial Representation Learning

Anay Majee, Suraj Kothawade, Krishnateja Killamsetty, Rishabh Iyer

TL;DR

SCoRe addresses inter-class bias and intra-class variance in long-tail recognition by recasting representation learning as a set-based optimization over class-sets with submodular information measures. It introduces two core objectives, Total Information $L_{S_f}(\theta)$ and Total Correlation $L_{C_f}(\theta)$, and implements three instantiations—SCoRe-FL, SCoRe-GC, and SCoRe-LogDet—grounded in submodular functions; these formulations enable both intra-class compactness and inter-class separation while generalizing existing metric/contrastive losses. Empirically, SCoRe yields substantial gains across long-tail classification benchmarks (up to $7.6\%$) and object detection tasks (up to $19.4\%$), and demonstrates faster convergence and reduced inter-class bias compared to state-of-the-art methods. The framework’s versatility is evidenced by its ability to reproduce or improve upon SupCon, N-pairs, and OPL, and by its demonstrated applicability to large-scale, real-world imbalanced data. Overall, SCoRe provides a principled, scalable approach to robust representation learning in the presence of strong class imbalance.

Abstract

In this paper we introduce the SCoRe (Submodular Combinatorial Representation Learning) framework, a novel approach in representation learning that addresses inter-class bias and intra-class variance. SCoRe provides a new combinatorial viewpoint to representation learning, by introducing a family of loss functions based on set-based submodular information measures. We develop two novel combinatorial formulations for loss functions, using the Total Information and Total Correlation, that naturally minimize intra-class variance and inter-class bias. Several commonly used metric/contrastive learning loss functions like supervised contrastive loss, orthogonal projection loss, and N-pairs loss, are all instances of SCoRe, thereby underlining the versatility and applicability of SCoRe in a broad spectrum of learning scenarios. Novel objectives in SCoRe naturally model class-imbalance with up to 7.6% improvement in classification on CIFAR-10-LT, CIFAR-100-LT, MedMNIST, 2.1% on ImageNet-LT, and 19.4% in object detection on IDD and LVIS (v1.0), demonstrating its effectiveness over existing approaches.

Paper Structure (41 sections, 9 theorems, 21 equations, 10 figures, 7 tables)

This paper contains 41 sections, 9 theorems, 21 equations, 10 figures, 7 tables.

Key Result

Theorem 3.1

If $f(A, \theta) = \sum_{i \in \mathcal{V}} \max_{j \in A} S_{ij}(\theta)$ represents the facility-location function over a set $A$, then $L_{S_f}(\theta)$ and $L_{C_f}(\theta)$ shown in Eq. (eq:fl) represent the SCoRe-FL objective with $N_f(A_k) = |\mathcal{V}|$. Both $L_{S_f}(\theta)$ …
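The facility-location function in Theorem 3.1 can be illustrated with a minimal NumPy sketch. It covers only $f(A)$ itself, not the SCoRe-FL losses, whose full statement is truncated above. The helper names (`similarity_matrix`, `facility_location`, `marginal_gain`), the random features, and the choice of a cosine similarity rescaled to $[0, 1]$ for $S_{ij}$ are illustrative assumptions, not taken from the paper.

```python
import numpy as np


def similarity_matrix(features):
    """Pairwise similarity S_ij: cosine similarity rescaled to [0, 1].

    The rescaling is an assumption made here so that all entries are
    non-negative; the paper's exact kernel is not given in this summary.
    """
    normed = features / np.linalg.norm(features, axis=1, keepdims=True)
    return (1.0 + normed @ normed.T) / 2.0


def facility_location(S, A):
    """f(A) = sum over i in V of max over j in A of S_ij (0 for the empty set)."""
    A = list(A)
    if not A:
        return 0.0
    # For every ground-set element i, keep its best match inside A, then sum.
    return float(S[:, A].max(axis=1).sum())


def marginal_gain(S, A, v):
    """Gain in f from adding element v to the set A."""
    return facility_location(S, list(A) + [v]) - facility_location(S, A)


# Hypothetical toy data: 8 feature vectors of dimension 4.
rng = np.random.default_rng(0)
features = rng.normal(size=(8, 4))
S = similarity_matrix(features)

small_set = [0, 1]
large_set = [0, 1, 2, 3]  # superset of small_set
new_item = 5

gain_small = marginal_gain(S, small_set, new_item)
gain_large = marginal_gain(S, large_set, new_item)
f_full = facility_location(S, range(S.shape[0]))
```

Because facility location is submodular, adding `new_item` to the smaller set should help at least as much as adding it to the superset, so `gain_small >= gain_large` is the diminishing-returns property in miniature. Evaluating `f` on the whole ground set gives $|\mathcal{V}|$ under this rescaled similarity, since every element is its own best match with $S_{ii} = 1$.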

Figures (10)

  • Figure 1: Objectives in SCoRe are resilient to inter-class bias and intra-class variance in long-tail settings. Applying $L(\theta)$ to (a) reduces inter-class bias by promoting inter-cluster separation in (b) while reducing intra-class variance in (c) by inducing intra-cluster compactness.
  • Figure 2: Submodular functions $f(A)$ in SCoRe model diversity (A = Cluster 1) and cooperation (A = Cluster 2).
  • Figure 3: Overview of Combinatorial Objectives in SCoRe with respect to contrastive and metric learners.
  • Figure 4: Resilience to intra-class variance and inter-class bias under the long-tail setting. Case 1 shows no intra-class variance or inter-class bias, case 2 shows larger variance for the head class, and case 3 shows larger variance for the tail class, which induces inter-cluster overlaps. Details of the experiment are given in the appendix (app:exp_setup).
  • Figure 5: Comparison of confusion matrices for (a) SupCon [supcon2020], (b) Graph-Cut (GC), (c) Log Determinant, and (d) Facility Location (FL) in the long-tail imbalanced setting of the CIFAR-10 dataset. The combinatorial objectives in SCoRe significantly reduce inter-class bias, which shows up as reduced confusion between classes.
  • ...and 5 more figures

Theorems & Definitions (20)

  • Theorem 3.1
  • Theorem 3.2
  • Theorem 3.3
  • proof
  • proof
  • proof
  • Theorem 1.1
  • proof
  • proof
  • Theorem 1.2
  • ...and 10 more