Top-K Pairwise Ranking: Bridging the Gap Among Ranking-Based Measures for Multi-Label Classification

Zitai Wang, Qianqian Xu, Zhiyong Yang, Peisong Wen, Yuan He, Xiaochun Cao, Qingming Huang

TL;DR

The paper addresses the inconsistent performance of models across ranking-based measures in multi-label classification by introducing Top-K Pairwise Ranking (TKPR), a unifying objective with three equivalent formulations that correspond to the pointwise, pairwise, and listwise views. It develops an Empirical Risk Minimization (ERM) framework for TKPR supported by Fisher consistency, Bayes optimality under a top-K ranking-with-ties criterion, and a data-dependent contraction technique that yields sharp generalization bounds, including for missing-label settings. The authors show that TKPR is compatible with existing measures such as precision@K, recall@K, AP@K, and NDCG@K, and that it provides a tight upper bound on the ranking loss and a bound relation to AP@K. Empirically, TKPR yields consistent improvements in mAP@K, NDCG@K, and ranking loss on benchmarks such as MS-COCO and Pascal VOC, at a cost of O(CK) per sample versus O(C^2) for the ranking loss. Optimizing TKPR can therefore serve as a practical, theoretically grounded surrogate that harmonizes multiple ranking-based measures and scales to large label sets, with potential applications in visual recognition and related retrieval tasks.
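The inconsistency among measures can be seen on a toy example. The sketch below is not taken from the paper: the function names, the toy scores, and the convention that a tie counts as a ranking error are assumptions made for illustration. It computes precision@K and the ranking loss (the fraction of mis-ordered relevant/irrelevant pairs, enumerated naively) for two score vectors over C = 5 labels.

```python
def precision_at_k(scores, labels, k):
    """Fraction of the k highest-scored labels that are relevant."""
    order = sorted(range(len(scores)), key=lambda c: -scores[c])
    return sum(labels[c] for c in order[:k]) / k


def ranking_loss(scores, labels):
    """Fraction of (relevant, irrelevant) pairs that are mis-ordered.

    Assumption: a tie counts as an error. The naive enumeration visits
    every relevant/irrelevant pair, so its cost grows quadratically in
    the number of labels.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wrong = sum(1 for p in pos for n in neg if p <= n)
    return wrong / (len(pos) * len(neg))


# Hypothetical example: labels 0 and 1 are relevant, labels 2-4 are not.
labels = [1, 1, 0, 0, 0]
scores_a = [0.9, 0.1, 0.8, 0.7, 0.6]  # one relevant label on top, one at the bottom
scores_b = [0.8, 0.7, 0.9, 0.2, 0.1]  # an irrelevant label on top, both relevant next

print(precision_at_k(scores_a, labels, 1), ranking_loss(scores_a, labels))  # 1.0 0.5
print(precision_at_k(scores_b, labels, 1), ranking_loss(scores_b, labels))  # 0.0, about 0.333
```

Model A wins on precision@1 while model B wins on the ranking loss, so which model looks better depends on the measure chosen. This is the kind of disagreement TKPR is designed to reduce.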

Abstract

Multi-label ranking, which returns multiple top-ranked labels for each instance, has a wide range of applications for visual tasks. Due to its complicated setting, prior work has proposed various measures to evaluate model performance. However, both theoretical analysis and empirical observations show that a model might perform inconsistently on different measures. To bridge this gap, this paper proposes a novel measure named Top-K Pairwise Ranking (TKPR), and a series of analyses show that TKPR is compatible with existing ranking-based measures. In light of this, we further establish an empirical surrogate risk minimization framework for TKPR. On the one hand, the proposed framework enjoys convex surrogate losses with the theoretical support of Fisher consistency. On the other hand, we establish a sharp generalization bound for the proposed framework based on a novel technique named data-dependent contraction. Finally, empirical results on benchmark datasets validate the effectiveness of the proposed framework.

Paper Structure

This paper contains 64 sections, 44 theorems, 170 equations, 9 figures, 14 tables, 1 algorithm.

Key Result

Corollary 1

Given a relevant label $i$ and an irrelevant label $j$ such that $f(\boldsymbol{x})_{i} = f(\boldsymbol{x})_{j}$, we have $\pi_{f}(i) > \pi_{f}(j)$.
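Corollary 1 can be read as a pessimistic tie-breaking rule. The following is a minimal sketch, assuming that $\pi_{f}(i)$ denotes the position of label $i$ when labels are sorted by descending score, with position 1 at the top; that reading, the function name, and the toy data are assumptions and are not stated on this page. Among labels with equal scores, irrelevant labels are placed first, so a tied relevant label always receives the larger (worse) position.

```python
def pessimistic_ranks(scores, labels):
    """Return 1-based rank positions with ties broken against relevant labels.

    Labels are sorted by descending score; among equal scores, irrelevant
    labels (label value 0) come before relevant ones (label value 1).
    """
    order = sorted(range(len(scores)), key=lambda c: (-scores[c], labels[c]))
    ranks = [0] * len(scores)
    for position, c in enumerate(order, start=1):
        ranks[c] = position
    return ranks


# Hypothetical example: label 0 is relevant and ties with irrelevant label 1.
scores = [0.5, 0.5, 0.9]
labels = [1, 0, 0]
ranks = pessimistic_ranks(scores, labels)
print(ranks)  # [3, 2, 1]: the relevant label 0 is ranked below the tied label 1
```

Under this convention a model cannot gain credit from ties, which matches the inequality $\pi_{f}(i) > \pi_{f}(j)$ in the corollary.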

Figures (9)

  • Figure 1: Overview of measure comparison: (a) the definition of the proposed TKPR measure, (b) the advantages of TKPR over existing ranking-based measures, and (c) representative ranking-based measures and their limitations.
  • Figure 2: Normalized ranking-based measures w.r.t. the training epoch on the Pascal VOC 2007 dataset in the MLC setting. (a) When optimizing the ranking loss, the changes in different measures are inconsistent. (b) By contrast, when optimizing TKPR, the changes are highly consistent.
  • Figure 3: According to the model predictions, we visualize the rank distributions of relevant labels on MS-COCO and Pascal VOC 2007. The results show that the proposed methods can rank more relevant labels on the top-1 position, which explains why the proposed methods can improve the ranking-based measures.
  • Figure 4: Normalized ranking-based measures w.r.t. the training epoch on the Pascal VOC 2007 dataset in the MLC setting. (a)-(c) When optimizing the competitors, the changes in different measures are inconsistent. (d) By contrast, when optimizing TKPR, the changes are highly consistent, which validates our analyses in the section on TKPR.
  • Figure 5: Sensitivity analysis of the proposed methods on Pascal VOC 2012-MLML. The y-axis denotes the values of the hyperparameter $K$, and the x-axis represents the value of TKPR@5 under the corresponding $K$.
  • ...and 4 more figures

Theorems & Definitions (75)

  • Corollary 1
  • Remark 1
  • Proposition 1
  • Proposition 2
  • Definition 1: Statistical discriminancy (Ling et al., IJCAI 2003)
  • Theorem 1
  • Proposition 3
  • Theorem 2
  • Remark 2
  • Definition 2: TKPR Fisher consistency
  • ...and 65 more