SCoRe: Submodular Combinatorial Representation Learning

Anay Majee; Suraj Kothawade; Krishnateja Killamsetty; Rishabh Iyer

TL;DR

SCoRe addresses inter-class bias and intra-class variance in long-tail recognition by recasting representation learning as a set-based optimization over class-sets with submodular information measures. It introduces two core objectives, Total Information $L_{S_f}(\theta)$ and Total Correlation $L_{C_f}(\theta)$, and implements three instantiations—SCoRe-FL, SCoRe-GC, and SCoRe-LogDet—grounded in submodular functions; these formulations enable both intra-class compactness and inter-class separation while generalizing existing metric/contrastive losses. Empirically, SCoRe yields substantial gains across long-tail classification benchmarks (up to $7.6\%$) and object detection tasks (up to $19.4\%$), and demonstrates faster convergence and reduced inter-class bias compared to state-of-the-art methods. The framework’s versatility is evidenced by its ability to reproduce or improve upon SupCon, N-pairs, and OPL, and by its demonstrated applicability to large-scale, real-world imbalanced data. Overall, SCoRe provides a principled, scalable approach to robust representation learning in the presence of strong class imbalance.

Abstract

In this paper we introduce the SCoRe (Submodular Combinatorial Representation Learning) framework, a novel approach in representation learning that addresses inter-class bias and intra-class variance. SCoRe provides a new combinatorial viewpoint on representation learning by introducing a family of loss functions based on set-based submodular information measures. We develop two novel combinatorial formulations for loss functions, using the Total Information and Total Correlation, that naturally minimize intra-class variance and inter-class bias. Several commonly used metric/contrastive learning loss functions, such as supervised contrastive loss, orthogonal projection loss, and N-pairs loss, are all instances of SCoRe, underlining the versatility and applicability of SCoRe across a broad spectrum of learning scenarios. Novel objectives in SCoRe naturally model class imbalance, with up to 7.6% improvement in classification on CIFAR-10-LT, CIFAR-100-LT, and MedMNIST, 2.1% on ImageNet-LT, and 19.4% in object detection on IDD and LVIS (v1.0), demonstrating its effectiveness over existing approaches.
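The two objectives named in the abstract can be illustrated on a toy example. The sketch below is not the paper's implementation: it assumes the standard definitions from the submodular information measures literature, Total Information $S_f = \sum_k f(A_k)$ and Total Correlation $C_f = \sum_k f(A_k) - f(\cup_k A_k)$, uses the facility-location function as $f$, and omits the normalization and the dependence on $\theta$ that the paper's losses carry. The function names and the hand-made similarity matrices are hypothetical.

```python
import numpy as np

def facility_location(S, A):
    """f(A) = sum over all i in the ground set of max over j in A of S[i, j]."""
    A = list(A)
    if not A:
        return 0.0
    return float(S[:, A].max(axis=1).sum())

def total_information(S, class_sets):
    """Assumed form: sum of f over the class sets (no normalization)."""
    return sum(facility_location(S, A) for A in class_sets)

def total_correlation(S, class_sets):
    """Assumed form: total information minus f of the union of all class sets."""
    union = sorted(set().union(*class_sets))
    return total_information(S, class_sets) - facility_location(S, union)

def toy_kernel(cross):
    """Two classes of two points each; `cross` is the cross-class similarity."""
    return np.array([
        [1.0, 0.9, cross, cross],
        [0.9, 1.0, cross, cross],
        [cross, cross, 1.0, 0.8],
        [cross, cross, 0.8, 1.0],
    ])

class_sets = [{0, 1}, {2, 3}]
separated = toy_kernel(0.1)    # classes far apart
overlapping = toy_kernel(0.7)  # classes close together

for name, S in [("separated", separated), ("overlapping", overlapping)]:
    print(name, total_information(S, class_sets), total_correlation(S, class_sets))
```

On these toy kernels both quantities are smaller when the classes are well separated (about 4.4 and 0.4) than when they overlap (about 6.8 and 2.8), which is consistent with using them as losses to be minimized.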

Paper Structure

This paper contains 41 sections, 9 theorems, 21 equations, 10 figures, 7 tables.

Introduction
Related Work
Long-tail Learning:
Metric and Contrastive Learning:
Submodular Functions
SCoRe: Submodular Combinatorial Representation Learning Framework
Combinatorial Loss Functions
Instantiations of Combinatorial Objectives in SCoRe
SCoRe Generalizes Existing Metric/Contrastive Learning Objectives
Contrasting Instantiations of SCoRe
Experiments
Datasets and Experimental Setup
Results on Long-tail Image Classification
Benchmark Results:
Generalization to Existing Metric/Contrastive Learners:
...and 26 more sections

Key Result

Theorem 3.1

If $f(A, \theta) = \sum_{i \in \mathcal{V}} \max_{j \in A} S_{ij}(\theta)$ is the facility-location function over a set $A$, then $L_{S_f}(\theta)$ and $L_{C_f}(\theta)$ shown in Eq. (eq:fl) represent the SCoRe-FL objective with $N_f(A_k) = |\mathcal{V}|$. Both $L_{S_f}(\theta)$ ...
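As an illustration of the function in Theorem 3.1, the following sketch evaluates the facility-location value on random embeddings and checks the diminishing-returns property that makes it submodular. It assumes $S_{ij}$ is the cosine similarity between embeddings $i$ and $j$, which this excerpt does not state; the helper names and the random data are hypothetical, and the normalizer $N_f(A_k)$ is not modeled.

```python
import numpy as np

def cosine_kernel(Z):
    """Pairwise cosine similarity between the rows of Z (assumed form of S)."""
    Zn = Z / np.linalg.norm(Z, axis=1, keepdims=True)
    return Zn @ Zn.T

def facility_location(S, A):
    """f(A) = sum over all i in the ground set of max over j in A of S[i, j]."""
    A = list(A)
    if not A:
        return 0.0
    return float(S[:, A].max(axis=1).sum())

def marginal_gain(S, A, x):
    """Increase in f when element x is added to the set A."""
    return facility_location(S, set(A) | {x}) - facility_location(S, A)

rng = np.random.default_rng(0)
Z = rng.normal(size=(8, 4))  # 8 hypothetical embeddings of dimension 4
S = cosine_kernel(Z)

# Diminishing returns: adding x to a smaller (non-empty) set helps at least
# as much as adding it to a superset of that set.
small, large, x = {0, 1}, {0, 1, 2, 3}, 5
gain_small = marginal_gain(S, small, x)
gain_large = marginal_gain(S, large, x)
print(gain_small >= gain_large)
```

The comparison is restricted to non-empty sets because cosine similarities can be negative, in which case defining $f(\emptyset) = 0$ would not fit the same pattern.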

Figures (10)

Figure 1: Objectives in SCoRe are resilient to inter-class bias and intra-class variance in long-tail settings. Applying $L(\theta)$ to (a) reduces inter-class bias by promoting inter-cluster separation in (b) while reducing intra-class variance in (c) by inducing intra-cluster compactness.
Figure 2: Submodular functions $f(A)$ in SCoRe model diversity (A = Cluster 1) and cooperation (A = Cluster 2).
Figure 3: Overview of Combinatorial Objectives in SCoRe with respect to contrastive and metric learners.
Figure 4: Resilience to intra-class variance and inter-class bias under the long-tail setting. Case 1 shows no intra-class variance or inter-class bias, Case 2 shows larger variance for the head class, and Case 3 shows larger variance for the tail class, inducing inter-cluster overlaps. Details of the experiment are given in \ref{app:exp_setup}.
Figure 5: Comparison of confusion matrices for (a) SupCon (supcon2020), (b) Graph-Cut (GC), (c) Log Determinant, and (d) Facility Location (FL) in the long-tail imbalanced setting of the CIFAR-10 dataset. The combinatorial objectives in SCoRe significantly reduce inter-class bias, as shown by the reduced confusion between classes.
...and 5 more figures

Theorems & Definitions (20)

Theorem 3.1
Theorem 3.2
Theorem 3.3
proof
proof
proof
Theorem 1.1
proof
proof
Theorem 1.2
...and 10 more
