Concentration inequalities and cut-off phenomena for penalized model selection within a basic Rademacher framework
Pascal Massart, Vincent Rivoirard
TL;DR
The paper addresses penalized model selection in a high-dimensional regression setting with Rademacher noise, using concentration of measure and transport inequalities. It derives tight concentration bounds for suprema of Rademacher processes via Talagrand's convex distance and Marton's transportation framework, and uses them to prove oracle-type inequalities for penalized least squares. A sharp cut-off phenomenon is established at the critical penalty constant κ = 1: the selection procedure switches abruptly between choosing large models and choosing small ones as κ crosses this value. The Rademacher setting also shows that these results do not rely on Gaussian errors. The approach extends to adaptive nonparametric estimation on [0,1] with a Fourier basis, where the estimator achieves minimax rates without prior knowledge of the smoothness.
Abstract
This article exists first and foremost to contribute to a tribute to Patrick Cattiaux. One of the two authors has known Patrick Cattiaux for a very long time, and owes him a great deal. If we are to illustrate the adage that life is made up of chance, then what could be better than the meeting of two young people in the 80s, both of whom fell in love with the mathematics of randomness, and one of whom changed the other's life by letting him in on a secret: if you really believe in it, you can turn this passion into a profession. By another happy coincidence, this tribute comes at just the right time, as Michel Talagrand has been awarded the Abel prize. The temptation was therefore great to do a double. Following one of the many galleries opened up by mathematics, we shall first draw a link between the mathematics of Patrick Cattiaux and that of Michel Talagrand. Then we shall show how the abstract probabilistic material on the concentration of product measures thus revisited can be used to shed light on cut-off phenomena in our field of expertise, mathematical statistics. Nothing revolutionary here, as everyone knows the impact that Talagrand's work has had on the development of mathematical statistics since the late 90s, but we've chosen a very simple framework in which everything can be explained with minimal technicality, bringing the main ideas to the fore.
