Joint robust estimation
Xiang Li, Jun S. Liu, Qiang Sun, Lihu Xu
TL;DR
The paper tackles robust parameter estimation under heavy-tailed data by introducing joint estimators that solve two coupled Catoni-type equations to simultaneously recover trend parameters and the error variance for mean estimation, linear regression, and ridge regression. A key innovation is that these estimators cannot be derived from a single loss function, and their analysis leverages a Poincaré–Miranda framework to establish existence and non-asymptotic consistency with confidence intervals whose length scales as $O\big(\sqrt{\log(1/\epsilon)/n}\big)$. The contributions include tuning-free variance handling, flexibility in Catoni function choices, and explicit variance terms in the denominators that boost robustness to heavy tails, with rigorous non-asymptotic guarantees and practical validation via simulations and real-data experiments (e.g., Boston housing and NCI-60). The results demonstrate superior performance over traditional robust methods, especially under heavier tails or higher dimensions, and provide a general framework that can extend to other joint estimation problems in statistics under heavy-tailed noise.
Abstract
We introduce a joint robust estimation method for three parametric statistical models with heavy-tailed data: mean estimation, linear regression, and L2-penalized linear regression, where both the trend parameters and the error variance are unknown. Our approach is based on solving two coupled Catoni-type equations, one for estimating the trend parameters and the other for estimating the error variance. Notably, this joint estimation strategy cannot be obtained by minimizing a single loss function involving both the trend and variance parameters. The method offers four key advantages: (i) the length of the resulting $(1-\epsilon)$ confidence interval scales as $(\log(1/\epsilon))^{1/2}$, matching the order achieved by classical estimators for sub-Gaussian data; (ii) it is tuning-free, eliminating the need for separate variance estimation; (iii) it allows flexible selection of Catoni-type functions tailored to the data; and (iv) it delivers strong performance for high-variance data, thanks to the explicit inclusion of the variance term in the denominators of both equations. We establish the consistency and asymptotic efficiency of the proposed joint robust estimators using new analytical techniques. The coupled equations are inherently complex, which makes the theoretical analysis of their solutions challenging. To address this, we employ the Poincaré–Miranda theorem to show that the solutions lie within geometric regions, such as cylinders or cones, centered around the true parameter values. This methodology is of independent interest and extends to other statistical problems.
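To make the idea of two coupled Catoni-type equations concrete for the mean-estimation case, the following is a minimal illustrative sketch, not the paper's exact estimator. The abstract does not state the equations, so everything specific here is an assumption: the influence function is Catoni's classical $\psi(x) = \mathrm{sign}(x)\log(1 + |x| + x^2/2)$, the trend equation is $\sum_i \psi(\alpha (x_i - \mu)/\sigma) = 0$, the variance equation is $\sum_i \psi(\alpha ((x_i - \mu)^2/\sigma^2 - 1)) = 0$, the scale is $\alpha = \sqrt{2\log(1/\epsilon)/n}$, and the system is solved by alternating bisection. The function names `catoni_psi`, `bisect_decreasing`, and `joint_catoni_mean` are hypothetical.

```python
import numpy as np


def catoni_psi(x):
    """Catoni's classical influence function: sign(x) * log(1 + |x| + x^2 / 2)."""
    x = np.asarray(x, dtype=float)
    return np.sign(x) * np.log1p(np.abs(x) + 0.5 * x**2)


def bisect_decreasing(f, lo, hi, iters=100):
    """Root of a non-increasing function f on [lo, hi], assuming f(lo) >= 0 >= f(hi)."""
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if f(mid) > 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def joint_catoni_mean(x, eps=0.05, outer_iters=30):
    """Alternately solve the two ASSUMED coupled equations for (mu, sigma^2).

    Assumes the data are not all identical, so the residuals are not all zero.
    """
    x = np.asarray(x, dtype=float)
    n = x.size
    alpha = np.sqrt(2.0 * np.log(1.0 / eps) / n)  # assumed scaling, not from the paper
    mu, sigma2 = np.median(x), np.var(x)          # crude starting values
    for _ in range(outer_iters):
        sigma = np.sqrt(sigma2)
        # Trend equation: sum_i psi(alpha * (x_i - mu) / sigma) = 0, decreasing in mu.
        mu = bisect_decreasing(
            lambda m: catoni_psi(alpha * (x - m) / sigma).sum(), x.min(), x.max()
        )
        # Variance equation: sum_i psi(alpha * (r_i^2 / v - 1)) = 0, decreasing in v.
        r2 = (x - mu) ** 2
        sigma2 = bisect_decreasing(
            lambda v: catoni_psi(alpha * (r2 / v - 1.0)).sum(),
            max(r2.min(), 1e-12),
            r2.max(),
        )
    return mu, sigma2


# Heavy-tailed example: Student t with 3 degrees of freedom, shifted to mean 2.
rng = np.random.default_rng(0)
sample = 2.0 + rng.standard_t(df=3, size=2000)
mu_hat, sigma2_hat = joint_catoni_mean(sample)
print(mu_hat, sigma2_hat)
```

In this sketch the variance appears in the denominator of both equations, mirroring advantage (iv), and no separate variance estimate is supplied by the user, mirroring advantage (ii). Alternating bisection is a convenience chosen for the illustration; the abstract does not specify a numerical solver.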
