Can Contrastive Learning Refine Embeddings

Lihui Liu, Jinha Kim, Vidit Bansal

TL;DR

This paper introduces SIMSKIP, a skip-connection-based contrastive learning framework that refines pre-existing embeddings to improve downstream task performance. The authors derive a theoretical bound indicating that the upper bound on downstream error for the refined embeddings is no larger than that for the original embeddings, and they show that the skip-connected design can reduce the upper bound on the unsupervised loss. Empirically, SIMSKIP improves downstream performance on federated knowledge graph embeddings, image embeddings, node embeddings learned via supervised learning, and transformer-based text embeddings; ablations indicate that the skip connection is critical. Overall, the work positions SIMSKIP as a versatile embedding-refinement tool that can augment existing contrastive-learning pipelines across modalities.

Abstract

Recent advancements in contrastive learning have revolutionized self-supervised representation learning and achieved state-of-the-art performance on benchmark tasks. While most existing methods focus on applying contrastive learning to input data modalities such as images, natural language sentences, or networks, they overlook the potential of utilizing outputs from previously trained encoders. In this paper, we introduce SIMSKIP, a novel contrastive learning framework that specifically refines input embeddings for downstream tasks. Unlike traditional unsupervised learning approaches, SIMSKIP takes advantage of the output embeddings of encoder models as its input. Through theoretical analysis, we provide evidence that applying SIMSKIP does not result in larger upper bounds on downstream task errors than those of the original embeddings, which serve as SIMSKIP's input. Experimental results on various open datasets demonstrate that the embeddings produced by SIMSKIP improve performance on downstream tasks.
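
To make the idea of contrastive learning over pre-computed embeddings concrete, the following is a minimal sketch and not the authors' implementation. It assumes an InfoNCE-style loss and a Gaussian-noise augmentation; the names perturb and info_nce, the noise level, and the temperature are all hypothetical choices that this summary does not specify.

```python
import math
import random


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v + 1e-12)


def perturb(embedding, noise_std, rng):
    """Hypothetical augmentation: add Gaussian noise to a pre-computed embedding."""
    return [x + rng.gauss(0.0, noise_std) for x in embedding]


def info_nce(view_a, view_b, temperature=0.5):
    """Mean InfoNCE loss; view_a[i] and view_b[i] form the positive pair."""
    total = 0.0
    for i, anchor in enumerate(view_a):
        logits = [cosine(anchor, other) / temperature for other in view_b]
        # Numerically stable log-sum-exp over all candidates in view_b.
        max_logit = max(logits)
        log_sum = max_logit + math.log(sum(math.exp(z - max_logit) for z in logits))
        total += log_sum - logits[i]
    return total / len(view_a)


# Stand-in for embeddings produced by a previously trained encoder.
rng = random.Random(0)
embeddings = [[rng.gauss(0.0, 1.0) for _ in range(8)] for _ in range(4)]

# Two noisy views of the same embeddings act as positive pairs.
view_a = [perturb(e, 0.1, rng) for e in embeddings]
view_b = [perturb(e, 0.1, rng) for e in embeddings]
loss = info_nce(view_a, view_b)
```

In a full pipeline the two views would first pass through a trainable refinement network, and the loss would be minimized with respect to that network's parameters. Here the loss is only evaluated, to show that the input is an embedding rather than raw data.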

Paper Structure (20 sections, 8 equations, 3 figures, 5 tables)

Figures (3)

  • Figure 1: The problem of existing unsupervised contrastive learning
  • Figure 2: The Skip Connection. The Adapter illustration is from Houlsby et al. (2019).
  • Figure 3: The SimSkip Encoder. Layer 1 and Layer 2 have the same architecture.
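
The skip-connected encoder named in the Figure 3 caption can be sketched roughly as follows. This is an illustration under stated assumptions, not the paper's architecture: the only details taken from the source are that a skip connection is used and that the two layers share the same architecture. The class names SkipLayer and SkipEncoder, the hidden width, the ReLU nonlinearity, and the zero-initialized output projection are all hypothetical.

```python
import random


def relu(values):
    """Element-wise ReLU."""
    return [max(0.0, v) for v in values]


def linear(weights, bias, x):
    """Dense layer; weights holds one row per output unit."""
    return [sum(w * xi for w, xi in zip(row, x)) + b for row, b in zip(weights, bias)]


class SkipLayer:
    """Residual block: output = x + W_out * relu(W_in * x + b_in) + b_out."""

    def __init__(self, dim, hidden, rng, zero_init_output=True):
        self.w_in = [[rng.gauss(0.0, 0.1) for _ in range(dim)] for _ in range(hidden)]
        self.b_in = [0.0] * hidden
        if zero_init_output:
            # Assumption: a zero output projection makes the block start as the identity map.
            self.w_out = [[0.0] * hidden for _ in range(dim)]
        else:
            self.w_out = [[rng.gauss(0.0, 0.1) for _ in range(hidden)] for _ in range(dim)]
        self.b_out = [0.0] * dim

    def __call__(self, x):
        update = linear(self.w_out, self.b_out, relu(linear(self.w_in, self.b_in, x)))
        # Skip connection: the input embedding is added back to the learned update.
        return [xi + ui for xi, ui in zip(x, update)]


class SkipEncoder:
    """Two stacked SkipLayer blocks with the same architecture."""

    def __init__(self, dim, hidden, seed=0, zero_init_output=True):
        rng = random.Random(seed)
        self.layers = [SkipLayer(dim, hidden, rng, zero_init_output) for _ in range(2)]

    def __call__(self, x):
        for layer in self.layers:
            x = layer(x)
        return x


original = [0.5, -1.0, 2.0, 0.0]
refined_at_init = SkipEncoder(dim=4, hidden=8)(original)
refined_random = SkipEncoder(dim=4, hidden=8, zero_init_output=False)(original)
```

With the zero-initialized output projection, refined_at_init equals the original embedding, so the refinement starts from the input rather than from scratch. This offers one intuition for why a skip-connected refiner need not do worse than its input; it is not a substitute for the paper's theoretical bound.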