Hyper-CL: Conditioning Sentence Representations with Hypernetworks
Young Hyun Yoo, Jii Cha, Changhyeon Kim, Taeuk Kim
TL;DR
Hyper-CL addresses the challenge of conditioning sentence representations on multiple perspectives without incurring the cost of cross- or bi-encoders. It uses a hypernetwork to generate condition-specific projection matrices, which transform precomputed sentence embeddings into condition-aware subspaces, and it optimizes contrastive objectives in those subspaces. The approach narrows the performance gap to bi-encoders on conditional semantic textual similarity (C-STS) and knowledge graph completion (KGC) while delivering substantial gains in runtime and memory efficiency over traditional tri-encoders, aided by low-rank hypernetwork approximations and caching. Analyses show effective subspace clustering and strong generalization to unseen conditions, and ablations confirm the value of combining hypernetworks with contrastive learning for conditioned representations.
Abstract
While the introduction of contrastive learning frameworks in sentence representation learning has contributed significantly to advancements in the field, it remains unclear whether state-of-the-art sentence embeddings can capture the fine-grained semantics of sentences, particularly when conditioned on specific perspectives. In this paper, we introduce Hyper-CL, an efficient methodology that integrates hypernetworks with contrastive learning to compute conditioned sentence representations. In our proposed approach, the hypernetwork is responsible for transforming pre-computed condition embeddings into corresponding projection layers. This enables the same sentence embeddings to be projected differently according to various conditions. Evaluation on two representative conditioning benchmarks, namely conditional semantic textual similarity and knowledge graph completion, demonstrates that Hyper-CL is effective in flexibly conditioning sentence representations while also being computationally efficient. We also provide a comprehensive analysis of the inner workings of our approach, leading to a better interpretation of its mechanisms.
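The mechanism described above (a hypernetwork that turns a condition embedding into a projection, which is then applied to a precomputed sentence embedding) can be illustrated with a small sketch. This is not the authors' implementation: the toy sizes `DIM` and `RANK`, the random weights `HYPER_A` and `HYPER_B` standing in for trained parameters, the specific low-rank factorization, and the generic InfoNCE-style loss are all assumptions made for illustration. Pure-Python lists are used so the example has no dependencies.

```python
import math
import random
from functools import lru_cache

DIM = 4   # toy embedding size (assumption)
RANK = 2  # toy rank of the generated projection (assumption)

_rng = random.Random(0)


def _rand_matrix(rows, cols):
    """Random weights standing in for trained hypernetwork parameters."""
    return [[_rng.gauss(0.0, 0.5) for _ in range(cols)] for _ in range(rows)]


# Hypothetical hypernetwork: two linear maps that turn a condition embedding
# into the flattened low-rank factors A (DIM x RANK) and B (RANK x DIM).
HYPER_A = _rand_matrix(DIM * RANK, DIM)
HYPER_B = _rand_matrix(RANK * DIM, DIM)


def matvec(matrix, vector):
    """Multiply a matrix (sequence of rows) by a vector."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]


def matmul(left, right):
    """Multiply two matrices given as sequences of rows."""
    columns = list(zip(*right))
    return [[sum(a * b for a, b in zip(row, col)) for col in columns]
            for row in left]


def _reshape(flat, rows, cols):
    return [flat[r * cols:(r + 1) * cols] for r in range(rows)]


@lru_cache(maxsize=None)
def projection_for(condition_embedding):
    """Generate and cache the DIM x DIM projection for one condition.

    The argument must be a tuple so that it is hashable for the cache.
    """
    a = _reshape(matvec(HYPER_A, condition_embedding), DIM, RANK)
    b = _reshape(matvec(HYPER_B, condition_embedding), RANK, DIM)
    return tuple(tuple(row) for row in matmul(a, b))


def conditioned(sentence_embedding, condition_embedding):
    """Project a precomputed sentence embedding into the condition's subspace."""
    projection = projection_for(tuple(condition_embedding))
    return matvec(projection, sentence_embedding)


def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def info_nce(anchor, positive, negatives, temperature=0.05):
    """Generic InfoNCE-style loss on already-conditioned embeddings."""
    logits = [cosine(anchor, positive) / temperature]
    logits += [cosine(anchor, n) / temperature for n in negatives]
    peak = max(logits)  # subtract the max for numerical stability
    log_sum = peak + math.log(sum(math.exp(x - peak) for x in logits))
    return log_sum - logits[0]


# The same sentence embedding is projected differently under two conditions.
sentence = [0.2, -0.1, 0.7, 0.4]
condition_a = [1.0, 0.0, 0.5, -0.3]
condition_b = [-0.4, 0.9, 0.1, 0.6]
view_a = conditioned(sentence, condition_a)
view_b = conditioned(sentence, condition_b)
```

Because `projection_for` is memoized, a condition that recurs across many sentences triggers the hypernetwork only once, and the sentence embeddings themselves are never re-encoded. Generating the two thin factors instead of a full `DIM x DIM` matrix is one way to keep the hypernetwork's output layer small, in the spirit of the low-rank approximation mentioned in the TL;DR.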
