Some facts about the optimality of the LSE in the Gaussian sequence model with convex constraint

Akshay Prasadan; Matey Neykov

TL;DR

This paper develops a variational framework for deciding when the least squares estimator (LSE) is minimax optimal in the Gaussian sequence model with a convex constraint. Using the local Gaussian width and the local entropy of the constraint set, it derives necessary and sufficient conditions for LSE optimality and gives algorithms that search for the worst-case risk over bounded convex sets. The authors illustrate both optimal and suboptimal regimes on isotonic models, hyperrectangles, subspaces, balls, pyramids, solids of revolution, and ellipsoids, showing how the geometry of the set drives the minimax rate. The results indicate when the LSE is a sound, computationally convenient estimator and when an alternative procedure may be needed.

Abstract

We consider a convex constrained Gaussian sequence model and characterize necessary and sufficient conditions for the least squares estimator (LSE) to be minimax optimal. For a closed convex set $K\subset \mathbb{R}^n$ we observe $Y=\mu+\xi$ for $\xi\sim \mathcal{N}(0,\sigma^2\mathbb{I}_n)$ and $\mu\in K$ and aim to estimate $\mu$. We characterize the worst case risk of the LSE in multiple ways by analyzing the behavior of the local Gaussian width on $K$. We demonstrate that optimality is equivalent to a Lipschitz property of the local Gaussian width mapping. We also provide theoretical algorithms that search for the worst case risk. We then provide examples showing optimality or suboptimality of the LSE on various sets, including $\ell_p$ balls for $p\in[1,2]$, pyramids, solids of revolution, and multivariate isotonic regression, among others.
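To make the setup in the abstract concrete: for a closed convex set $K$, the LSE is the Euclidean projection of $Y$ onto $K$. The following is a minimal sketch, not the authors' code, that simulates $Y=\mu+\xi$ and estimates the LSE risk at one point $\mu$ by Monte Carlo. The choice of $K$ as an $\ell_2$ ball, the function names `project_l2_ball` and `lse_risk`, and all parameter values are illustrative assumptions.

```python
import numpy as np


def project_l2_ball(y, radius):
    """Euclidean projection of y onto K = {v : ||v||_2 <= radius}.

    For this K (an assumed example set), the projection is the LSE.
    """
    norm = np.linalg.norm(y)
    return y if norm <= radius else y * (radius / norm)


def lse_risk(mu, sigma, radius, n_rep=2000, seed=0):
    """Monte Carlo estimate of E||LSE(Y) - mu||^2 with Y = mu + sigma * N(0, I)."""
    rng = np.random.default_rng(seed)
    losses = np.empty(n_rep)
    for i in range(n_rep):
        y = mu + sigma * rng.standard_normal(mu.shape[0])  # one draw of Y
        mu_hat = project_l2_ball(y, radius)                # LSE = projection onto K
        losses[i] = np.sum((mu_hat - mu) ** 2)             # squared error loss
    return losses.mean()


# Hypothetical parameters: dimension, noise level, ball radius.
n, sigma, radius = 50, 1.0, 2.0
mu = np.zeros(n)  # true mean at the center of the ball
risk = lse_risk(mu, sigma, radius)
print(f"estimated LSE risk at mu = 0: {risk:.3f}")
```

This estimates the risk at a single $\mu$ only. The worst-case risk studied in the paper is the supremum of this quantity over $\mu\in K$, which the sketch does not attempt to locate.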

Paper Structure (33 sections, 43 theorems, 255 equations, 2 tables, 3 algorithms)


Introduction
Related Literature
Notation and Definitions
Organization
Main Results
Sufficient conditions on the worst case performance of the LSE
Characterizations of the worst case rate of the LSE
Examples
Examples with optimal LSE
Isotonic regression with known total variation bound
Multivariate Isotonic Regression
Hyperrectangle Example
Subspace (Linear Regression)
ball and balls: LSE is optimal
Examples with suboptimal LSE
...and 18 more sections

Key Result

Lemma 1.4

The minimax rate $\varepsilon^{\ast}$ satisfies $\varepsilon^{\ast} \gtrsim \sigma\wedge d$, where $a\wedge b=\min(a,b)$ and $d$ is taken to be the diameter of $K$.

Theorems & Definitions (94)

Definition 1.1: Packing Sets and Global Entropy
Definition 1.2: Local Entropy
Definition 1.3: (Local) Gaussian Width
Lemma 1.4: Minimax Rate Bound
Lemma 1.5: Equivalent Forms of Information Theoretic Lower Bound
Lemma 2.1
Lemma 2.2
proof
Lemma 2.3
Proposition 2.4
...and 84 more
