
Representation Learning by Ranking across multiple tasks

Lifeng Gu

TL;DR

This work reframes representation learning as a ranking problem across multiple tasks and introduces an approximate NDCG loss (A-NDCG) that aligns feature-space similarity with label-space similarity. The method optimizes a unified objective $L(x)$ in which a differentiable position proxy $\pi(x_i, x_j)$ stands in for the non-differentiable rank. It is evaluated on classification, retrieval, multi-label learning, regression, and self-supervised learning, where it is reported to often outperform traditional losses and standard contrastive methods. The results also indicate that data augmentation in self-supervised settings makes the ranking objective more effective. The authors further argue that ranking provides a unifying lens for understanding pre-training and fine-tuning in modern language models and classical methods, and that this lens offers design guidance for new pre-training objectives. Overall, ranking-based representation learning is presented as broadly applicable, using both labeled and pseudo-labeled information to improve downstream tasks.

Abstract

In recent years, representation learning has become a research focus of the machine learning community. Large-scale neural networks are a crucial step toward achieving general intelligence, and their success is largely attributed to their ability to learn abstract representations of data. Several learning fields are actively discussing how to learn representations, yet a unified perspective is lacking. We convert the representation learning problem under different tasks into a ranking problem. With ranking as the unified perspective, representation learning tasks can be solved in a unified manner by optimizing a ranking loss. Experiments on various learning tasks, such as classification, retrieval, multi-label learning, and regression, demonstrate the advantages of the representation-learning-by-ranking framework. Furthermore, experiments on self-supervised learning tasks demonstrate a significant advantage of the ranking framework in processing unsupervised training data, with data augmentation techniques further enhancing its performance.

Paper Structure

This paper contains 25 sections, 7 equations, 2 figures, 10 tables.

Figures (2)

  • Figure 1: Suppose there are four samples, $x_1, x_2, x_3, x_4$, and their corresponding labels $y_1, y_2, y_3, y_4$. For the query sample $x_1$, if in the label space it satisfies $\text{sim}(y_1, y_4) > \text{sim}(y_1, y_2) > \text{sim}(y_1, y_3)$, then in the feature space, we also hope to satisfy $\text{sim}(f(x_1), f(x_4)) > \text{sim}(f(x_1), f(x_2)) > \text{sim}(f(x_1), f(x_3))$, i.e., the order of similarity is preserved.
  • Figure 2: A unified perspective on understanding modern language models and classical methods.
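
The order-preservation condition in Figure 1 can be illustrated with a small numerical sketch. The code below is not the paper's implementation: it assumes a sigmoid-smoothed rank as the differentiable position proxy (a common choice in approximate-NDCG losses), and the function names `approx_ranks` and `approx_ndcg`, the `temperature` parameter, and all similarity values are hypothetical. The query is $x_1$ and the candidates are $x_2, x_3, x_4$.

```python
import numpy as np

def approx_ranks(scores, temperature=0.1):
    """Smooth 1-based rank of each candidate (assumed proxy, not the paper's pi).

    rank_i ~= 1 + sum_{j != i} sigmoid((s_j - s_i) / temperature)
    """
    diff = (scores[None, :] - scores[:, None]) / temperature  # diff[i, j] = s_j - s_i
    sig = 1.0 / (1.0 + np.exp(-diff))
    np.fill_diagonal(sig, 0.0)  # a candidate does not outrank itself
    return 1.0 + sig.sum(axis=1)

def approx_ndcg(feature_sims, label_sims, temperature=0.1):
    """Approximate NDCG of a feature-space ranking against label-space relevance."""
    gains = 2.0 ** label_sims - 1.0
    ranks = approx_ranks(feature_sims, temperature)
    dcg = np.sum(gains / np.log2(1.0 + ranks))
    # Ideal DCG: candidates sorted by gain and placed at exact ranks 1..n.
    ideal_gains = np.sort(gains)[::-1]
    idcg = np.sum(ideal_gains / np.log2(2.0 + np.arange(len(gains))))
    return dcg / idcg

# Hypothetical label-space similarities to the query x_1 for (x_2, x_3, x_4),
# chosen so that sim(y_1, y_4) > sim(y_1, y_2) > sim(y_1, y_3) as in Figure 1.
label_sims = np.array([0.5, 0.1, 0.9])

# Feature-space similarities that preserve the label-space order ...
preserved = approx_ndcg(np.array([0.6, 0.2, 0.8]), label_sims)
# ... and ones that violate it (x_3 is ranked first, x_2 last).
violated = approx_ndcg(np.array([0.2, 0.8, 0.6]), label_sims)

print("order preserved:", preserved)
print("order violated: ", violated)
print("loss (1 - approx NDCG):", 1.0 - preserved, 1.0 - violated)
```

Under these assumptions, the order-preserving features receive the higher approximate NDCG, so a loss of the form `1 - approx_ndcg` would push a model toward the ordering described in Figure 1. A smaller `temperature` brings the smooth ranks closer to the exact integer ranks, at the cost of sharper gradients.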