Leave-One-Out Learning with Log-Loss

Yaniv Fogel, Meir Feder

TL;DR

This work introduces leave-one-out regret as a natural criterion for universal batch learning with log-loss in the deterministic, individual setting. It rigorously characterizes the first-order minimax regret for three hypothesis class families: multinomial (yielding $R^*_{loo}=\frac{(m-1)}{N}+o(\frac{1}{N})$), deterministic classes with finite VC-dimension (yielding $R^*_{loo}=O\left( \frac{d\log N}{N}\right)$ and matching lower bounds for certain constructions), and general probabilistic classes (also yielding $O\left( \frac{d\log N}{N}\right)$). The results demonstrate that universal batch learning with log-loss is possible in the individual setting, with regret bounds governed by structural properties of the hypothesis class, such as VC-dimension and the one-inclusion graph. The paper also contrasts this approach with existing pNML methods, showing cases where pNML fails while the leave-one-out criterion remains learnable, thereby advancing the understanding of learning from deterministic sequences. Overall, it provides a principled framework and tight first-order bounds for universal, single-sequence learning under log-loss.

Abstract

We study batch learning with log-loss in the individual setting, where the outcome sequence is deterministic. Because empirical statistics are not directly applicable in this regime, obtaining regret guarantees for batch learning has long posed a fundamental challenge. We propose a natural criterion based on leave-one-out regret and analyze its minimax value for several hypothesis classes. For the multinomial simplex over $m$ symbols, we show that the minimax regret is $\frac{m-1}{N} + o\!\left(\frac{1}{N}\right)$, and compare it to the stochastic realizable case where it is $\frac{m-1}{2N} + o\!\left(\frac{1}{N}\right)$. More generally, we prove that every hypothesis class of VC dimension $d$ is learnable in the individual batch-learning problem, with regret at most $\frac{d\log(N)}{N} + o\!\left(\frac{\log(N)}{N}\right)$, and we establish matching lower bounds for certain classes. We further derive additional upper bounds that depend on structural properties of the hypothesis class. These results establish, for the first time, that universal batch learning with log-loss is possible in the individual setting.
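
To make the criterion concrete, here is a minimal sketch for the multinomial class. It assumes the leave-one-out regret of a sequence is the average log-loss incurred when each sample is predicted from the other $N-1$ samples, minus the average log-loss of the best multinomial distribution in hindsight (the empirical entropy). This definition, the add-$\beta$ predictor, and the names `add_beta_predictor` and `loo_regret` are illustrative assumptions, not taken from the paper.

```python
import math
from collections import Counter

def add_beta_predictor(beta, m):
    """Hypothetical add-beta predictor q(symbol | remaining samples) over m symbols."""
    def q(symbol, rest):
        counts = Counter(rest)
        return (counts[symbol] + beta) / (len(rest) + m * beta)
    return q

def loo_regret(seq, q):
    """Assumed definition: mean leave-one-out log-loss minus best hindsight log-loss."""
    n = len(seq)
    # Predict each sample from the other n - 1 samples (natural logarithm).
    loo_loss = 0.0
    for t, y in enumerate(seq):
        rest = seq[:t] + seq[t + 1:]
        loo_loss += -math.log(q(y, rest))
    loo_loss /= n
    # The best multinomial in hindsight is the empirical distribution,
    # whose average log-loss is the empirical entropy.
    counts = Counter(seq)
    best_loss = -sum((c / n) * math.log(c / n) for c in counts.values())
    return loo_loss - best_loss

seq = [0, 0, 1, 2, 0]   # N = 5 samples over m = 3 symbols
m = 3
regret = loo_regret(seq, add_beta_predictor(1.0, m))
print(regret, (m - 1) / len(seq))   # regret next to the first-order term (m-1)/N
```

Under these assumptions the add-one predictor gives a regret of $\log\left(1 + \frac{m-1}{N}\right)$ on this sequence, which agrees with the first-order term $\frac{m-1}{N}$ stated in the abstract.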

Paper Structure

This paper contains 14 sections, 11 theorems, 59 equations, 3 figures, 1 table.

Key Result

Theorem 1

The min-max solution of the leave-one-out multinomial regret, $R^*_{loo} = \min_{q(\cdot|\cdot)}\max_{\vec{v}} R_{loo} \left(\vec{v}, q(\cdot|\cdot) \right)$, must be an equalizer, i.e., the regret must be equal for all possible vectors of empirical appearances $\vec{v}$.
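
The equalizer property can be checked numerically for a candidate predictor. The sketch below assumes that the regret depends on the sequence only through its count vector $\vec{v}$ and takes the form $\sum_a \frac{v_a}{N}\left[\log\frac{v_a}{N} - \log q(a \mid \vec{v} - \vec{1}_a)\right]$, and it uses a hypothetical add-$\beta$ predictor; neither the formula nor the helper names come from the paper. It enumerates every count vector for small $m$ and $N$ and compares the smallest and largest regret.

```python
import math
from itertools import product

def count_vectors(m, n):
    """All length-m tuples of non-negative integers that sum to n."""
    return [v for v in product(range(n + 1), repeat=m) if sum(v) == n]

def regret_of_counts(v, beta):
    """Assumed leave-one-out regret of count vector v under an add-beta predictor."""
    n, m = sum(v), len(v)
    total = 0.0
    for c in v:
        if c == 0:
            continue  # symbols that never appear contribute nothing
        # Removing one occurrence of the symbol leaves c - 1 of it among n - 1 samples.
        q = (c - 1 + beta) / (n - 1 + m * beta)
        total += (c / n) * (math.log(c / n) - math.log(q))
    return total

m, N = 3, 5
for beta in (1.0, 0.5):
    regrets = [regret_of_counts(v, beta) for v in count_vectors(m, N)]
    print(beta, min(regrets), max(regrets))
```

In this sketch the add-one predictor ($\beta = 1$) yields the same regret, $\log\left(1 + \frac{m-1}{N}\right)$, for every $\vec{v}$, so it is an equalizer in the sense of the theorem, while $\beta = 0.5$ is not. The sketch does not show that the add-one predictor is the min-max solution.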

Figures (3)

  • Figure 1: One inclusion graph for the deterministic hypothesis class of 1-dimensional barrier threshold, $n=4$, $x_0 < x_1 < x_2 < x_3 < x_4$.
  • Figure 2: One inclusion graph for $2$-unique values, $N=5$. The bottom node represents an all-$0$ sequence, the middle layer contains nodes representing sequences with a single $1$, while the sequences in the upper layer contain two $1$-s.
  • Figure 3: Empirical Number of Appearances Graph, $m=3, N=5$. Note that the three edges connecting the three upper nodes represent the probability assignments for $q \left( \cdot| \vec{e} = (0, 4, 0) \right)$. If we increase the probability assigned to $1$ at the expense of the probability assigned to $2$ while keeping the probability assigned to $3$ constant, it will decrease the regret associated with $\vec{v} = (1, 4, 0)$, increase the regret associated with $\vec{v} = (0, 5, 0)$, and will not influence the regret associated with $\vec{v} = (0, 4, 1)$.
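
The trade-off described in the Figure 3 caption can be reproduced with a small numerical sketch. It assumes the count-vector form of the regret, $\sum_a \frac{v_a}{N}\left[\log\frac{v_a}{N} - \log q(a \mid \vec{v} - \vec{1}_a)\right]$, starts from a hypothetical add-one assignment, and shifts a small mass `DELTA` from symbol $2$ to symbol $1$ at $\vec{e} = (0, 4, 0)$. The helper names and the value of `DELTA` are illustrative, and the code indexes the symbols $1, 2, 3$ as $0, 1, 2$.

```python
import math

def add_one(e):
    """Hypothetical baseline assignment q(. | e): add-one smoothing of the counts e."""
    return [(c + 1) / (sum(e) + len(e)) for c in e]

def regret(v, q_at):
    """Assumed leave-one-out regret of count vector v; q_at maps e to a distribution."""
    n = sum(v)
    total = 0.0
    for a, c in enumerate(v):
        if c == 0:
            continue
        # e is v with one occurrence of symbol a removed.
        e = tuple(x - (i == a) for i, x in enumerate(v))
        total += (c / n) * (math.log(c / n) - math.log(q_at(e)[a]))
    return total

E = (0, 4, 0)
DELTA = 0.05

def perturbed(e):
    """Same as add_one, except mass DELTA moves from symbol 2 to symbol 1 at e = E."""
    q = add_one(e)
    if e == E:
        q[0] += DELTA
        q[1] -= DELTA
    return q

# The three count vectors reachable from E by adding one occurrence of a symbol.
neighbours = [(1, 4, 0), (0, 5, 0), (0, 4, 1)]
before = [regret(v, add_one) for v in neighbours]
after = [regret(v, perturbed) for v in neighbours]
for v, b, a in zip(neighbours, before, after):
    print(v, b, a)
```

Under these assumptions the regret falls for the vector with an added symbol $1$, rises for the vector with an added symbol $2$, and is unchanged for the vector with an added symbol $3$, since each of the three edges out of $\vec{e}$ affects exactly one neighbouring count vector.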

Theorems & Definitions (11)

  • Theorem 1
  • Theorem 2
  • Theorem 3
  • Theorem 4
  • Theorem 5
  • Corollary 1
  • Theorem 6
  • Theorem 7
  • Theorem 8
  • Theorem 9
  • ...and 1 more