Breaking the Geometric Bottleneck: Contrastive Expansion in Asymmetric Cross-Modal Distillation

Kabir Thayani

TL;DR

A critical capacity-density trade-off is revealed: overparameterization within fixed manifolds induces brittleness, while capacity-constrained models act as optimal low-pass semantic filters, successfully recovering inherent noise immunity.

Abstract

Knowledge distillation between asymmetric architectures often induces severe geometric constraints on the learned representation space. In this work, we investigate the Dimensional Collapse phenomenon when distilling global Vision Transformers (CLIP and DINOv2) into capacity-constrained CNNs. By employing strictly centered SVD and Effective Rank, we first demonstrate a capacity-agnostic phase transition on CIFAR-10 where standard cosine distillation collapses representations to an intrinsic Effective Rank of ~16. To reverse this, we integrate an auxiliary contrastive objective (InfoNCE), expanding the student's manifold by 2.4x (to ~38 effective dimensions). We further demonstrate that while DINOv2's uniform geometry partially prevents collapse, contrastive expansion remains a universal requirement to reach the CNN's topological capacity limit (~82 dimensions). Finally, we reveal a critical capacity-density trade-off: overparameterization within fixed manifolds induces brittleness, while capacity-constrained models act as optimal low-pass semantic filters, successfully recovering inherent noise immunity.
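The Effective Rank figures quoted above can be made concrete with a small sketch. The abstract does not give the formula, so this assumes the common entropy-based definition (the exponential of the Shannon entropy of the normalized singular values), computed after subtracting the per-dimension mean, which is one reading of "strictly centered SVD". The function name `effective_rank` and the synthetic feature matrices are illustrative and not taken from the paper.

```python
import numpy as np


def effective_rank(features: np.ndarray, eps: float = 1e-12) -> float:
    """Entropy-based effective rank of an (n_samples, dim) feature matrix.

    Assumption: the definition used here is exp(H(p)), where p is the
    singular value distribution of the mean-centered features. The paper's
    exact formula may differ.
    """
    # Center each feature dimension so the mean direction does not
    # dominate the spectrum.
    centered = features - features.mean(axis=0, keepdims=True)
    # Only the singular values are needed.
    s = np.linalg.svd(centered, compute_uv=False)
    # Normalize singular values into a probability distribution.
    p = s / (s.sum() + eps)
    entropy = -np.sum(p * np.log(p + eps))
    return float(np.exp(entropy))


# Synthetic illustration (not the paper's data): isotropic features
# versus features confined to a 4-dimensional subspace of a 64-d space.
rng = np.random.default_rng(0)
full = rng.standard_normal((2000, 64))
basis = rng.standard_normal((4, 64))
collapsed = rng.standard_normal((2000, 4)) @ basis

rank_full = effective_rank(full)
rank_collapsed = effective_rank(collapsed)
```

Under this definition the collapsed features cannot score above 4 (up to numerical tolerance) regardless of the 64-dimensional embedding size, which is the kind of gap between nominal and used dimensionality that the reported values (~16, ~38, ~82) describe.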

Paper Structure (12 sections, 3 equations, 3 figures, 2 tables)

This paper contains 12 sections, 3 equations, 3 figures, 2 tables.

Introduction
Methodology
Architectures and Distillation
Rigorous Spectral Evaluation
Results and Analysis (Phase 1: CLIP)
Capacity-Agnostic Dimensional Collapse
Objective vs. Architecture Ablation
Breaking the Bottleneck via Contrastive Expansion
Generalization and Scaling (Phase 2: DINOv2)
Inherited Anisotropy vs. Universal Expansion
Capacity Saturation and the Robustness Sweet Spot
Conclusion

Figures (3)

Figure 1: Singular Value Spectrum (Log Scale). The InfoNCE student (green) successfully tracks the Teacher's geometric shape (black).
Figure 2: Effective Rank Expansion (DINOv2 Teacher on CIFAR-100).
Figure 3: High-Frequency Noise Robustness (CIFAR-100).
