Minimax and adaptive estimation of general linear functionals under sparsity

Jie Xie; Dongming Huang

TL;DR

This work derives sharp, nonasymptotic minimax and adaptive rates for estimating linear functionals $L(\theta)=\eta^{\top}\theta$ in high-dimensional, $s$-sparse settings with arbitrary loadings $\eta$ under symmetric sub-Weibull noise. The authors define an oracle rate $\Phi_{o}(s;\eta)$ via a cutoff $\lambda_{o}$ and construct a heterogeneous-loadings estimator that treats large and small loadings separately and attains the minimax rate. They then develop an adaptive estimator with an $\eta$-dependent Lepski-type threshold that achieves the rate $\Phi_{\mathrm{adp}}(s;\eta)$ up to logarithmic factors, together with a lower bound showing that this adaptation is near-optimal under mild conditions on the loading vector. The theory extends to non-symmetric noise, unknown noise variance, and hypothesis testing, and is illustrated through concrete examples (homogeneous, two-phase, and exponentially decaying loadings) that show how heterogeneity in the loadings changes the minimax and adaptive rates. These results provide a sharp benchmark for inference on linear functionals in sparse high-dimensional models and connect to broader questions in high-dimensional regression under general loading structures.

Abstract

We study estimation of the linear functional $\eta^\top \theta$ of a high-dimensional $s$-sparse mean vector $\theta$ when the loading vector $\eta$ is arbitrary and the noise is symmetric with exponentially decaying tails. Previous analyses for equal loadings treat coordinates as exchangeable and do not yield sharp rates when loadings vary. We give a sharp nonasymptotic characterization of the oracle minimax rate that makes explicit its dependence on $s$, $\eta$, and the noise tail parameter. To attain this rate, we construct an estimator that treats large and small loadings differently with a cutoff calibrated to $\eta$, and we prove a matching lower bound using a sparse prior whose inclusion probabilities and signal magnitudes depend on $\eta$. For unknown sparsity, we identify an $\eta$-dependent threshold for a Lepski-type selection and show that the resulting estimator achieves the oracle minimax rate up to a logarithmic factor, and that it cannot be improved for a broad, verifiable class of loading vectors. In analytic examples, we demonstrate how heterogeneity in $\eta$ changes the minimax and adaptive rates. We also extend the theory to non-symmetric noise, hypothesis testing, and estimation with unknown noise variance.
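To make the setting concrete, the following is a minimal simulation sketch of the observation model and of the general idea of handling two groups of loadings differently. It is not the authors' estimator. The Gaussian noise, the function names, the `cutoff` and `tau` parameters, and the choice of which group is thresholded are all assumptions made for illustration; the paper calibrates its cutoff to $\eta$, $s$, and the noise tail, and that calibration is not reproduced here.

```python
import numpy as np


def simulate(theta, sigma, rng):
    """Draw y = theta + sigma * noise (Gaussian noise is an illustrative assumption)."""
    return theta + sigma * rng.standard_normal(theta.shape)


def plug_in_estimate(y, eta):
    """Baseline: estimate eta^T theta by eta^T y, ignoring sparsity."""
    return float(eta @ y)


def two_group_estimate(y, eta, cutoff, tau):
    """Hypothetical two-group rule, NOT the paper's construction.

    Coordinates with |eta_i| >= cutoff use the raw observation y_i;
    the remaining coordinates use y_i only when |y_i| > tau (hard thresholding).
    """
    large = np.abs(eta) >= cutoff
    kept = np.where(np.abs(y) > tau, y, 0.0)  # hard-thresholded observations
    return float(eta[large] @ y[large] + eta[~large] @ kept[~large])


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    d, s, sigma = 1000, 10, 1.0
    eta = np.exp(-0.01 * np.arange(d))          # exponentially decaying loadings
    theta = np.zeros(d)
    theta[rng.choice(d, size=s, replace=False)] = 5.0  # s-sparse mean vector
    y = simulate(theta, sigma, rng)
    target = float(eta @ theta)
    print("target       :", target)
    print("plug-in      :", plug_in_estimate(y, eta))
    print("two-group    :", two_group_estimate(y, eta, cutoff=0.5, tau=3.0))
```

The values `cutoff=0.5` and `tau=3.0` in the demo are arbitrary; comparing the two estimates over repeated draws is one way to see why a loading-dependent treatment of coordinates can matter.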
