Probably Approximately Precision and Recall Learning

Lee Cohen, Yishay Mansour, Shay Moran, Han Shao

TL;DR

The paper addresses learning set-valued predictions under partial feedback, where only a single positive label per input is observed during training. It extends PAC-style guarantees to a framework with precision and recall losses, introducing both scalar and Pareto-loss objectives and contrasting realizable, agnostic, and semi-realizable settings. The authors propose two core algorithms—Maximum Likelihood and a Surrogate-Loss method—that achieve near-optimal sample complexities and Pareto-front guarantees, including multiplicative rather than additive errors in the agnostic case. They establish lower bounds showing the impossibility of vanishing additive error under partial feedback and discuss semi-realizable scenarios where zero-precision learning is possible when target output sizes are bounded. The work advances understanding of learning from positive-only data with set-valued outputs and lays groundwork for future exploration of optimal factors and potential complexity measures for precision-recall learning.

Abstract

Precision and Recall are fundamental metrics in machine learning tasks where both accurate predictions and comprehensive coverage are essential, such as in multi-label learning, language generation, medical studies, and recommender systems. A key challenge in these settings is the prevalence of one-sided feedback, where only positive examples are observed during training. For example, in multi-label tasks like tagging people in Facebook photos, we may observe only a few tagged individuals, without knowing who else appears in the image. To address learning under such partial feedback, we introduce a Probably Approximately Correct (PAC) framework in which hypotheses are set functions that map each input to a set of labels, extending beyond single-label predictions and generalizing classical binary, multi-class, and multi-label models. Our results reveal sharp statistical and algorithmic separations from standard settings: classical methods such as Empirical Risk Minimization provably fail, even for simple hypothesis classes. We develop new algorithms that learn from positive data alone, achieving optimal sample complexity in the realizable case, and establishing multiplicative, rather than additive, approximation guarantees in the agnostic case, where achieving additive regret is impossible.
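
To make the setting concrete, the following is a minimal sketch of set-valued hypotheses, precision and recall losses, and one-sided feedback. It is not the paper's code. The loss definitions (fraction of predicted labels outside the target set, and fraction of target labels missed), the uniform distribution over inputs, and the uniform choice of the single observed positive label are all assumptions made for illustration, as are the names `precision_loss`, `recall_loss`, `expected_losses`, `sample_partial_feedback`, and the toy photo data.

```python
import random

def precision_loss(pred, target):
    # Assumed definition: fraction of predicted labels that are NOT in the target set.
    # An empty prediction is treated as having zero precision loss.
    if not pred:
        return 0.0
    return len(pred - target) / len(pred)

def recall_loss(pred, target):
    # Assumed definition: fraction of target labels that the prediction misses.
    if not target:
        return 0.0
    return len(target - pred) / len(target)

def expected_losses(hypothesis, target, inputs):
    # Average both losses over a uniform distribution on `inputs` (an assumption).
    n = len(inputs)
    p = sum(precision_loss(hypothesis(x), target(x)) for x in inputs) / n
    r = sum(recall_loss(hypothesis(x), target(x)) for x in inputs) / n
    return p, r

def sample_partial_feedback(target, inputs, m, rng):
    # One-sided feedback: for each sampled input, the learner sees a single
    # positive label. Drawing it uniformly from the target set is an assumption.
    sample = []
    for _ in range(m):
        x = rng.choice(inputs)
        v = rng.choice(sorted(target(x)))
        sample.append((x, v))
    return sample

# Hypothetical toy data: who actually appears in each photo.
TARGET = {"photo1": {"alice", "bob", "carol"}, "photo2": {"dave"}}
INPUTS = sorted(TARGET)

def target_fn(x):
    return TARGET[x]

def guess_fn(x):
    # A hypothetical set-valued hypothesis with one wrong and two missed labels on photo1.
    return {"alice", "eve"} if x == "photo1" else {"dave"}

p_loss, r_loss = expected_losses(guess_fn, target_fn, INPUTS)
train = sample_partial_feedback(target_fn, INPUTS, m=5, rng=random.Random(0))
print(p_loss, r_loss, train)
```

Note that the training sample contains only pairs of an input and one positive label, so the learner never sees which labels are absent from the target set.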

Paper Structure

This paper contains 29 sections, 21 theorems, 103 equations, 7 figures.

Key Result

Theorem 1

In the realizable setting, there exist algorithms such that given an IID training set of size $m\geq O(\frac{\log(|\mathcal{H}|/\delta)}{\varepsilon})$, with probability at least $1-\delta$, the output hypothesis $g^{\text{output}}$ satisfies

Figures (7)

  • Figure 1: Example of hypotheses with varying precision and recall losses. Each point is a distinct hypothesis, with red points on the Pareto frontier, showing optimal trade-offs between precision and recall losses. The empty set function (which always returns the empty set) always achieves zero precision loss (but has no guarantee on the recall loss), while the complete set function (which always returns the entire label space $\mathcal{Y}$ for any given input $x$) always achieves zero recall loss (but has no guarantee on the precision loss).
  • Figure 2: The target hypothesis (black) outputs $g^{\text{target}}(x_i) = \{u_1,\ldots,u_n\}$ where $n$ is huge. The hypothesis $g_1$ (red) outputs only one label $u_n\in g^{\text{target}}(x_i)$ while $g_2$ (blue) outputs only one label $u'\notin g^{\text{target}}(x_i)$.
  • Hardness figure, panel (a): $g_1$
  • Hardness figure, panel (b): $g_2$
  • ...and 2 more figures
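
The Pareto frontier described in the Figure 1 caption can be illustrated with a short sketch. This is not taken from the paper: the function name `pareto_frontier`, the hypothesis names, and all loss values below are made-up placeholders. Each hypothesis is represented by a pair (precision loss, recall loss), and a hypothesis is on the frontier if no other hypothesis is at least as good on both losses and strictly better on one.

```python
def pareto_frontier(points):
    """Return the names of points not dominated by any other point.

    `points` maps a hypothesis name to (precision_loss, recall_loss);
    both coordinates are losses, so smaller is better.
    """
    frontier = []
    for name, (p, r) in points.items():
        dominated = any(
            (p2 <= p and r2 <= r) and (p2 < p or r2 < r)
            for other, (p2, r2) in points.items()
            if other != name
        )
        if not dominated:
            frontier.append(name)
    return sorted(frontier)

# Hypothetical loss values, chosen only for illustration.
LOSSES = {
    "empty_set": (0.0, 1.0),     # zero precision loss, poor recall
    "complete_set": (0.9, 0.0),  # zero recall loss, poor precision
    "g_a": (0.2, 0.5),
    "g_b": (0.3, 0.6),           # worse than g_a on both losses
    "g_c": (0.5, 0.1),
}

frontier = pareto_frontier(LOSSES)
print(frontier)
```

In this toy example the empty set function and the complete set function both sit on the frontier, since each attains zero on one of the two losses, while `g_b` is excluded because `g_a` improves on it in both coordinates.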

Theorems & Definitions (26)

  • Example 1
  • Theorem 1
  • Theorem 2
  • Theorem 3
  • Remark 1
  • Theorem 4
  • Theorem 5
  • Theorem 6
  • Theorem 6
  • Theorem 6
  • ...and 16 more