Refine and Purify: Orthogonal Basis Optimization with Null-Space Denoising for Conditional Representation Learning

Jiaquan Wang, Yan Lyu, Chen Li, Yuheng Jia

TL;DR

This work tackles conditional representation learning (CRL) by addressing two bottlenecks: dependence on LLM-generated text bases and interference between non-orthogonal subspaces. It introduces AOBO, which uses SVD to derive an orthogonal basis truncated by a curvature criterion at an optimal count $k^*$, and NSDP, which denoises image embeddings by projecting them onto the null space of non-target subspaces. Together, these components yield pure, criterion-specific representations that outperform prior CRL methods across customized clustering, few-shot classification, and fashion retrieval, often without requiring task-specific training. Theoretical and empirical analyses show that NSDP provides substantial noise suppression with limited loss of target information, enabling robust generalization and practical efficiency in diverse downstream tasks.
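The AOBO idea (SVD followed by a curvature-based cut) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names `orthogonal_basis` and `elbow_index` are hypothetical, and the curvature rule used here (the largest discrete second difference of the singular-value curve) is an assumption standing in for the paper's criterion, which this summary does not spell out.

```python
import numpy as np


def elbow_index(singular_values):
    """Pick a truncation count k from a descending singular-value curve.

    ASSUMPTION: curvature is approximated by the discrete second
    difference; the paper's exact curvature criterion may differ.
    """
    s = np.asarray(singular_values, dtype=float)
    if s.size < 3:
        return int(s.size)
    # Second difference at interior positions 1 .. len(s) - 2.
    curvature = s[:-2] - 2.0 * s[1:-1] + s[2:]
    # curvature[j] belongs to position j + 1; keep everything before it.
    return int(np.argmax(curvature)) + 1


def orthogonal_basis(text_embeddings, k=None):
    """Return (basis, k) where basis has k orthonormal rows.

    text_embeddings: array of shape (n_texts, dim), one row per
    text embedding of the (hypothetical) LLM-generated basis.
    """
    # Rows of vt are orthonormal directions in embedding space,
    # ordered by decreasing singular value.
    _, s, vt = np.linalg.svd(text_embeddings, full_matrices=False)
    if k is None:
        k = elbow_index(s)
    return vt[:k], k
```

Because the rows of `vt` are orthonormal by construction, the truncated basis contains no redundant directions regardless of how correlated the input text embeddings are. Passing an explicit `k` bypasses the assumed elbow rule.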

Abstract

Conditional representation learning aims to extract criterion-specific features for customized tasks. Recent studies project universal features onto the conditional feature subspace spanned by an LLM-generated text basis to obtain conditional representations. However, such methods face two key limitations: sensitivity to the subspace basis and vulnerability to inter-subspace interference. To address these challenges, we propose OD-CRL, a novel framework integrating Adaptive Orthogonal Basis Optimization (AOBO) and Null-Space Denoising Projection (NSDP). Specifically, AOBO constructs orthogonal semantic bases via singular value decomposition with a curvature-based truncation. NSDP suppresses non-target semantic interference by projecting embeddings onto the null space of irrelevant subspaces. Extensive experiments across customized clustering, customized classification, and customized retrieval tasks demonstrate that OD-CRL achieves new state-of-the-art performance with superior generalization.
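The null-space projection step described for NSDP can be sketched in a few lines of NumPy. This is an illustrative sketch under stated assumptions, not the paper's formulation: the function names are hypothetical, the target basis is assumed to have orthonormal rows, and the projector is built with a pseudoinverse as $P = I - N^{+}N$, where the rows of $N$ span the non-target subspaces.

```python
import numpy as np


def null_space_projector(non_target_basis):
    """Projector onto the null space of the non-target subspace.

    non_target_basis: array of shape (m, dim) whose rows span the
    subspaces of irrelevant criteria (rows need not be orthonormal).
    """
    n = np.atleast_2d(np.asarray(non_target_basis, dtype=float))
    dim = n.shape[1]
    # pinv(n) @ n projects onto the row space of n; subtracting it
    # from the identity leaves the orthogonal complement.
    return np.eye(dim) - np.linalg.pinv(n) @ n


def conditional_representation(x, target_basis, non_target_basis):
    """Denoise x, then read off its coordinates in the target basis.

    x: embedding of shape (dim,) or batch of shape (batch, dim).
    target_basis: (k, dim) array, ASSUMED to have orthonormal rows.
    """
    p = null_space_projector(non_target_basis)
    denoised = np.asarray(x, dtype=float) @ p  # p is symmetric
    return denoised @ np.asarray(target_basis, dtype=float).T
```

When a non-target direction is not orthogonal to the target subspace, a plain projection onto the target basis picks up part of that direction as noise. Removing the non-target component first suppresses that leakage, at the cost of also discarding whatever target signal lies along the removed directions.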

Paper Structure

This paper contains 35 sections, 3 theorems, 35 equations, 4 figures, and 7 tables.

Key Result

Lemma 2.1

The noise leakage term can be expressed as:

Figures (4)

  • Figure 1: Sensitivity to subspace basis and vulnerability to inter-subspace interference constitute critical limitations in current subspace projection-based methods. (a) As the ratio of low-quality bases containing redundancy or ambiguity increases, clustering performance declines. (b) Noise components induced by subspace non-orthogonality lead to projection deviation.
  • Figure 2: The basic pipeline of subspace projection-based conditional representation learning.
  • Figure 3: t-SNE visualizations of the representations learned by CLIP, CRL, and OD-CRL on the Clevr4-10k dataset for the "shape" criterion (a, b, c) and the "color" criterion (d, e, f).
  • Figure 4: Performance with different numbers of basis vectors.

Theorems & Definitions (6)

  • Lemma 2.1: Benefit Term Expansion
  • proof
  • Lemma 2.2: Cost Term Expansion
  • proof
  • Theorem 2.3: Benefit-Cost Ratio of Null-Space Denoising
  • proof