Tight Margin-Based Generalization Bounds for Voting Classifiers over Finite Hypothesis Sets

Kasper Green Larsen; Natascha Schalburg

TL;DR

This work addresses margin-based generalization bounds for voting classifiers built from a finite base class $\mathcal{H}$. By discretizing any voting classifier $f$ to a small, unweighted average $g$ within $\mathcal{C}_N$ and leveraging the LS25 framework with Rademacher complexity, the authors derive an asymptotically tight bound that matches the known lower bounds across the margin $\theta$, margin loss $\mathcal{L}_S^\theta(f)$, sample size $n$, hypothesis size $|\mathcal{H}|$, and failure probability $\delta$. A careful analysis of the discretization cost via Lipschitz-continuous surrogate functions $\phi$ and $\rho$, together with a partitioning of parameter ranges and a union bound over blocks, yields a bound of the form

$$\mathcal{L}_\mathcal{D}(f) \le \mathcal{L}_S^\theta(f) + c\left( \sqrt{\mathcal{L}_S^\theta(f) \left[ \frac{\ln\!\left(e/\mathcal{L}_S^\theta(f)\right) \ln|\mathcal{H}|}{\theta^2 n} + \frac{\ln(e/\delta)}{n} \right]} + \frac{\ln\!\left(\theta^2 n / \ln|\mathcal{H}|\right) \ln|\mathcal{H}|}{\theta^2 n} + \frac{\ln(e/\delta)}{n} \right).$$

This establishes asymptotic tightness for finite $\mathcal{H}$. The result extends margin-based analyses to a broader, discretizable setting and clarifies the interplay between margin, training loss, and ensemble size for tight generalization guarantees. This tight bound strengthens theoretical understanding of AdaBoost-like voting classifiers and opens avenues for applying LS25-style techniques to other large-margin ensembles.
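To get a feel for how the terms of the bound trade off, the right-hand side can be evaluated numerically. The following is a minimal sketch, not the authors' code: the function name `margin_bound` is hypothetical, the universal constant $c$ is not specified in the summary and is set to the placeholder `c=1.0`, and returning the trivial value 1 when $\theta^2 n \le \ln|\mathcal{H}|$ is a convention chosen here for illustration.

```python
import math


def margin_bound(margin_loss, theta, n, num_hypotheses, delta, c=1.0):
    """Evaluate the right-hand side of the margin bound quoted above.

    Assumptions (not from the source): c is an unspecified universal
    constant, set to 1.0 as a placeholder; when theta^2 * n <= ln|H| the
    bound is treated as vacuous and the trivial value 1.0 is returned.
    """
    ln_h = math.log(num_hypotheses)
    if theta ** 2 * n <= ln_h:
        return 1.0  # illustrative convention for the vacuous regime

    complexity = ln_h / (theta ** 2 * n)       # ln|H| / (theta^2 n)
    confidence = math.log(math.e / delta) / n  # ln(e/delta) / n

    # Square-root term; t * ln(e/t) tends to 0 as t -> 0, so use 0 at t = 0.
    if margin_loss > 0:
        inner = margin_loss * (
            math.log(math.e / margin_loss) * complexity + confidence
        )
    else:
        inner = 0.0

    # Lower-order term: ln(theta^2 n / ln|H|) * ln|H| / (theta^2 n)
    low_order = math.log(theta ** 2 * n / ln_h) * complexity

    value = margin_loss + c * (math.sqrt(inner) + low_order + confidence)
    return min(1.0, value)  # an error probability never exceeds 1


if __name__ == "__main__":
    for n in (1_000, 10_000, 100_000):
        b = margin_bound(0.05, theta=0.5, n=n, num_hypotheses=100, delta=0.05)
        print(f"n={n:>7}: bound <= {b:.4f}")
```

Because $c$ is a placeholder, only the relative behaviour is meaningful: the value shrinks as $n$ or $\theta$ grows and increases with $|\mathcal{H}|$ and with the margin loss.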

Abstract

We prove the first margin-based generalization bound for voting classifiers that is asymptotically tight in the tradeoff between the size of the hypothesis set, the margin, the fraction of training points with the given margin, the number of training samples, and the failure probability.
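As a worked special case of this tradeoff, consider a voting classifier whose training points all have margin at least $\theta$, so that the margin loss is zero. Since $t \ln(e/t) \to 0$ as $t \to 0$, the square-root term in the bound stated in the TL;DR vanishes (reading it as its limit at zero), and the bound reduces to

$$\mathcal{L}_\mathcal{D}(f) \le c\left( \frac{\ln\!\left(\theta^2 n / \ln|\mathcal{H}|\right) \ln|\mathcal{H}|}{\theta^2 n} + \frac{\ln(e/\delta)}{n} \right),$$

where $c$ is the same unspecified universal constant. In this regime the error decays at a rate of order $1/n$ up to a logarithmic factor, rather than the $1/\sqrt{n}$ rate contributed by the square-root term when the margin loss is bounded away from zero.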
