Representation Learning by Ranking across multiple tasks
Lifeng Gu
TL;DR
This work reframes representation learning as a ranking problem across multiple tasks and introduces an approximate NDCG loss (A-NDCG) that aligns feature-space similarity with label-space similarity. The model optimizes a unified objective L(x) built on a differentiable position proxy pi(x_i, x_j). In the reported experiments on classification, retrieval, multi-label learning, regression, and self-supervised learning, the approach often outperforms traditional losses and standard contrastive methods. The results also indicate that data augmentation makes the ranking objective more effective in self-supervised settings. The authors further argue that ranking offers a unifying lens on pre-training and fine-tuning, in both modern language models and classical methods, and can guide the design of new pre-training objectives. Overall, ranking-based representation learning is presented as broadly applicable, using both labeled and pseudo-labeled information to improve downstream tasks.
Abstract
In recent years, representation learning has become a research focus of the machine learning community. Large-scale neural networks are a crucial step toward general intelligence, and their success is largely attributed to their ability to learn abstract representations of data. Several subfields actively study how to learn representations, yet a unified perspective is lacking. We convert the representation learning problem under different tasks into a ranking problem. With ranking as the unified perspective, representation learning tasks can be solved in a common manner by optimizing a ranking loss. Experiments on various learning tasks, such as classification, retrieval, multi-label learning, and regression, demonstrate the advantage of the representation-learning-by-ranking framework. Furthermore, experiments on self-supervised learning tasks show a significant advantage of the ranking framework in processing unsupervised training data, with data augmentation techniques further improving its performance.
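To make the idea of optimizing a ranking loss concrete, the following is a minimal sketch of a smooth NDCG-style loss. It is not the paper's exact formulation: it assumes the common sigmoid-based soft rank used in ApproxNDCG-style losses as a stand-in for the differentiable position proxy, cosine similarity in feature space, and binary same-label relevance as the label-space similarity. The function names (`soft_ranks`, `approx_ndcg_loss`, `anchor_ranking_loss`) and the `temperature` parameter are hypothetical and chosen for illustration only.

```python
import numpy as np


def soft_ranks(scores, temperature=0.1):
    """Smooth 1-based rank of each item (assumed sigmoid-based proxy).

    rank_j ~= 1 + sum_{k != j} sigmoid((s_k - s_j) / temperature)
    """
    scores = np.asarray(scores, dtype=float)
    # diff[j, k] = (s_k - s_j) / T, clipped to keep exp() numerically safe
    diff = np.clip((scores[None, :] - scores[:, None]) / temperature, -50.0, 50.0)
    sig = 1.0 / (1.0 + np.exp(-diff))
    np.fill_diagonal(sig, 0.0)  # an item does not outrank itself
    return 1.0 + sig.sum(axis=1)


def approx_ndcg_loss(scores, relevance, temperature=0.1):
    """Return 1 - (smooth DCG / ideal DCG) for one ranked list."""
    relevance = np.asarray(relevance, dtype=float)
    gains = 2.0 ** relevance - 1.0
    # Smooth DCG: hard positions replaced by soft ranks
    dcg = np.sum(gains / np.log2(1.0 + soft_ranks(scores, temperature)))
    # Ideal DCG: gains sorted in descending order at positions 1..n
    ideal = np.sort(gains)[::-1]
    idcg = np.sum(ideal / np.log2(2.0 + np.arange(len(ideal))))
    if idcg == 0.0:  # no relevant items: nothing to rank
        return 0.0
    return 1.0 - dcg / idcg


def anchor_ranking_loss(features, labels, anchor=0, temperature=0.1):
    """Rank all other samples by cosine similarity to one anchor sample.

    Relevance is 1 for samples sharing the anchor's label, else 0
    (an illustrative choice of label-space similarity).
    """
    features = np.asarray(features, dtype=float)
    labels = np.asarray(labels)
    z = features / np.linalg.norm(features, axis=1, keepdims=True)
    sims = z @ z[anchor]
    mask = np.arange(len(labels)) != anchor  # exclude the anchor itself
    relevance = (labels[mask] == labels[anchor]).astype(float)
    return approx_ndcg_loss(sims[mask], relevance, temperature)


if __name__ == "__main__":
    feats = np.array([[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]])
    # Nearest neighbour shares the anchor's label: loss close to 0
    print(anchor_ranking_loss(feats, np.array([0, 0, 1])))
    # Nearest neighbour has a different label: noticeably larger loss
    print(anchor_ranking_loss(feats, np.array([0, 1, 0])))
```

In this sketch the loss is small when samples that share the anchor's label are also the closest in feature space, and it grows as they are pushed down the ranking. In a training setting the same computation would be written in an autodiff framework so that gradients flow through the soft ranks into the encoder.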
