Active Bipartite Ranking with Smooth Posterior Distributions

James Cheshire; Stephan Clémençon

TL;DR

This paper proposes smooth-rank, a novel algorithm designed for the continuous setting, which aims to minimise the distance, w.r.t. the $\sup$ norm, between the ROC curve of the estimated ranking rule and the optimal one.

Abstract

In this article, bipartite ranking, a statistical learning problem involved in many applications and widely studied in the passive context, is approached in a much more general \textit{active setting} than the discrete one previously considered in the literature. While the latter assumes that the conditional distribution is piecewise constant, the framework we develop can instead handle continuous conditional distributions, provided that they fulfill a Hölder smoothness constraint. We first show that a naive approach, which discretises at a uniform level fixed \textit{a priori} and then applies the active strategy designed for the discrete setting, generally fails. Instead, we propose a novel algorithm, referred to as smooth-rank and designed for the continuous setting, which aims to minimise the distance between the ROC curve of the estimated ranking rule and the optimal one w.r.t. the $\sup$ norm. We show that, for a fixed confidence level $ε>0$ and probability $δ\in (0,1)$, smooth-rank is PAC$(ε,δ)$. In addition, we provide a problem-dependent upper bound on the expected sampling time of smooth-rank and establish a problem-dependent lower bound on the expected sampling time of any PAC$(ε,δ)$ algorithm. Beyond the theoretical analysis, numerical results are presented, providing empirical evidence of the performance of the proposed algorithm, which compares favorably with alternative approaches.
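To make the $\sup$-norm criterion concrete, the following is a minimal Python sketch, not the authors' code, of the distance between the ROC curve of a scoring rule and the optimal one. It works under simplifying assumptions that are ours: the feature space is split into cells of equal mass, the posterior $\eta$ is known and lies strictly between 0 and 1 on every cell, and the scores are distinct. The names `roc_points` and `sup_roc_distance` are hypothetical.

```python
import numpy as np

def roc_points(eta, scores):
    """Breakpoints (FPR, TPR) of the ROC curve obtained by ranking cells by score.

    Assumes equal-mass cells, 0 < eta < 1 on every cell, and distinct scores.
    """
    order = np.argsort(-scores, kind="stable")   # highest score first
    pos = eta[order]                             # positive mass per cell
    neg = 1.0 - eta[order]                       # negative mass per cell
    tpr = np.concatenate(([0.0], np.cumsum(pos) / pos.sum()))
    fpr = np.concatenate(([0.0], np.cumsum(neg) / neg.sum()))
    return fpr, tpr

def sup_roc_distance(eta, scores):
    """Sup-norm distance between the ROC of `scores` and the optimal ROC.

    The optimal ROC is the one obtained by ranking cells by eta itself.
    Both curves are piecewise linear, so the largest gap is attained at a
    breakpoint of one of them; evaluating on the union of breakpoints suffices.
    """
    fpr_opt, tpr_opt = roc_points(eta, eta)
    fpr_s, tpr_s = roc_points(eta, scores)
    grid = np.union1d(fpr_opt, fpr_s)
    gap = np.interp(grid, fpr_opt, tpr_opt) - np.interp(grid, fpr_s, tpr_s)
    return float(np.max(np.abs(gap)))
```

For instance, with two cells and `eta = np.array([0.2, 0.8])`, scoring by `eta` itself gives distance 0, while a scoring rule that reverses the order of the two cells gives a distance of about 0.75.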

Paper Structure (37 sections, 26 theorems, 178 equations, 3 figures, 2 algorithms)

Introduction
Our contributions
Background and Preliminaries
Notation
Bipartite ranking
The active learning setting
Assumptions on the feature space and posterior
Policies and fixed confidence regime.
Problem complexity
Related literature
Comparison to the discrete setting
Comparison to $\mathcal{X}$-armed bandits
Novelty of our results in comparison to cheshire2023active
Main Theoretical Results
The smooth-rank algorithm
...and 22 more sections

Key Result

Theorem 1

For $\varepsilon, \delta > 0$, with $\beta(t,i,\delta) = c\log(t^2\hat{\Delta}_{i,t}^{-d/\beta}/\delta)$ where $c>0$ is an absolute constant, on all problems $\nu \in \mathcal{B}$, smooth-rank is PAC$(\varepsilon,\delta)$, and its expected sampling time is upper bounded by, where $c', c'' >0$ are absolute constants.
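As an illustration of how the quantity $\beta(t,i,\delta)$ in Theorem 1 behaves, here is a small Python sketch that evaluates the formula as stated. Several points are assumptions on our part: we read $d$ as the dimension of the feature space and the exponent $\beta$ as the Hölder smoothness parameter, the absolute constant $c$ is not specified in the statement and is set to 1 purely for illustration, and the function name `exploration_rate` is hypothetical.

```python
import math

def exploration_rate(t, gap_hat, delta, d=1, holder_beta=1.0, c=1.0):
    """Evaluate c * log(t^2 * gap_hat^(-d / holder_beta) / delta).

    t           : current round, t >= 1
    gap_hat     : empirical gap (hat Delta_{i,t}), must be positive
    delta       : confidence parameter in (0, 1)
    d           : assumed to be the feature-space dimension
    holder_beta : assumed to be the Hölder smoothness exponent
    c           : unspecified absolute constant; 1.0 is an illustrative choice
    """
    if t < 1 or gap_hat <= 0 or not (0 < delta < 1) or holder_beta <= 0:
        raise ValueError("need t >= 1, gap_hat > 0, 0 < delta < 1, holder_beta > 0")
    return c * math.log(t ** 2 * gap_hat ** (-d / holder_beta) / delta)
```

Under this reading, the value grows logarithmically in $t$ and in $1/\delta$, and it increases as the empirical gap shrinks, with the increase being stronger when $d/\beta$ is larger.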

Figures (3)

Figure 1: (Left) Regression function $\eta$, generated by a random walk. (Right) Scoring function output by smooth-rank.
Figure 2: (Top left and top right, respectively) Performance of smooth-rank compared with \ref{alg:rankmessy}, for scenario 1 and scenario 2, for $K \in \{100, 200, 300, 400, 500\}$; regret is estimated from 50 Monte Carlo realisations of each algorithm. (Bottom left and bottom right, respectively) Example simulation of scenarios 1 and 2.
Figure 3: (Top right and top left, respectively) Result of KDE for modeling credit default risk given user credit and annuity on EDA data. (Bottom right and bottom left, respectively) Performance of smooth-rank compared with \ref{alg:rankmessy}, for credit default given user credit and annuity, for $K \in \{500, 600, 700\}$; regret is estimated from 50 Monte Carlo realisations of each algorithm.

Theorems & Definitions (47)

Theorem 1
Theorem 2
Lemma 1
proof
Lemma 2
Lemma 3
proof
Lemma 4
Lemma 5
proof
...and 37 more
