Kullback-Leibler excess risk bounds for exponential weighted aggregation in Generalized linear models

The Tien Mai

TL;DR

The paper studies sparse aggregation in generalized linear models (GLMs) using an exponential weighted aggregation (EWA) scheme with a sparsity-promoting prior. It proves a sharp oracle inequality for the Kullback-Leibler risk with leading constant 1, attains the minimax-optimal rate of aggregation, and complements these results with high-probability excess risk bounds. The analysis works in a fixed-design GLM setting, uses PAC-Bayesian arguments, and relies on a scaled Student prior to promote sparsity, giving explicit non-asymptotic guarantees with a rate that adapts to the sparsity level. A Gaussian example translates the KL bounds into mean-squared-error bounds, which illustrates the implications for high-dimensional sparse GLMs and for model averaging under misspecification.
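To make the KL-to-MSE translation in the Gaussian case concrete, the standard identity below may help. The notation is an assumption for illustration and is not taken from the paper: $P_{\mu}$ denotes the law of $n$ independent observations $Y_i \sim \mathcal{N}(\mu_i, \sigma^2)$ with known variance $\sigma^2$, and the paper's own normalization (for example, a factor of $1/n$) may differ.

```latex
\mathrm{KL}\left(P_{\mu} \,\|\, P_{\mu'}\right)
  = \sum_{i=1}^{n} \frac{(\mu_i - \mu'_i)^2}{2\sigma^2}
  = \frac{\lVert \mu - \mu' \rVert_2^2}{2\sigma^2}
```

Under these assumptions, a bound on the KL excess risk is a bound on the squared Euclidean distance between mean vectors, up to the factor $2\sigma^2$.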

Abstract

Aggregation methods have emerged as a powerful and flexible framework in statistical learning, providing unified solutions across diverse problems such as regression, classification, and density estimation. In the context of generalized linear models (GLMs), where responses follow exponential family distributions, aggregation offers an attractive alternative to classical parametric modeling. This paper investigates the problem of sparse aggregation in GLMs, aiming to approximate the true parameter vector by a sparse linear combination of predictors. We prove that an exponential weighted aggregation scheme yields a sharp oracle inequality for the Kullback-Leibler risk with leading constant equal to one, while also attaining the minimax-optimal rate of aggregation. These results are further enhanced by establishing high-probability bounds on the excess risk.
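As a rough illustration of exponential weighting, the sketch below aggregates a small, fixed, finite dictionary of candidate parameter vectors in a Gaussian linear model. This is a deliberately simplified stand-in and not the paper's procedure: the paper uses a continuous sparsity-promoting prior, whereas the finite dictionary, the uniform prior, the `temperature` parameter, and all function names here are assumptions made for illustration.

```python
import numpy as np

def ewa_weights(neg_log_lik, prior, temperature=1.0):
    """Weights proportional to prior_j * exp(-neg_log_lik_j / temperature)."""
    log_w = np.log(prior) - np.asarray(neg_log_lik) / temperature
    log_w -= log_w.max()  # subtract the max for numerical stability
    w = np.exp(log_w)
    return w / w.sum()

def gaussian_nll(y, mean, sigma=1.0):
    """Gaussian negative log-likelihood, up to an additive constant."""
    return 0.5 * np.sum((y - mean) ** 2) / sigma**2

rng = np.random.default_rng(0)
n, M = 50, 5
X = rng.normal(size=(n, M))               # design matrix, treated as fixed
theta_true = np.array([2.0, 0.0, 0.0, 0.0, 0.0])
y = X @ theta_true + rng.normal(size=n)   # Gaussian responses, sigma = 1

# Hypothetical dictionary: candidate j puts coefficient 2 on column j only.
theta_candidates = 2.0 * np.eye(M)
predictions = theta_candidates @ X.T      # shape (M, n), row j = X @ theta_j

nll = np.array([gaussian_nll(y, p) for p in predictions])
weights = ewa_weights(nll, prior=np.full(M, 1.0 / M))

# The aggregate is the weighted average of the candidate mean vectors.
aggregate = weights @ predictions
```

In this toy setup the first candidate coincides with the true parameter, so nearly all of the weight should concentrate on it. Smaller values of `temperature` sharpen that concentration, and larger values move the aggregate toward the prior average.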
