Method-of-Moments Inference for GLMs and Doubly Robust Functionals under Proportional Asymptotics

Xingyu Chen; Lin Liu; Rajarshi Mukherjee

TL;DR

This paper develops Method-of-Moments (MoM) estimators for low-dimensional functionals of high-dimensional Generalized Linear Models under proportional asymptotics with Gaussian covariates and known population covariance Σ, achieving $\

Abstract

In this paper, we consider the estimation of regression coefficients and the signal-to-noise ratio (SNR) in high-dimensional Generalized Linear Models (GLMs), and explore the implications for inferring popular estimands such as average treatment effects in high-dimensional observational studies. Under the "proportional asymptotic" regime and Gaussian covariates with known (population) covariance $\Sigma$, we derive Consistent and Asymptotically Normal (CAN) estimators of our targets of inference through Method-of-Moments type estimators that bypass estimation of high-dimensional nuisance functions and hyperparameter tuning altogether. Additionally, under non-Gaussian covariates, we demonstrate universality of our results under certain additional assumptions on the regression coefficients and $\Sigma$. We also demonstrate that knowing $\Sigma$ is not essential to our proposed methodology when the sample covariance matrix estimator is invertible. Finally, we complement our theoretical results with numerical experiments and comparisons with existing literature.
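
To make the moment-based idea concrete, here is a minimal sketch for the simplest special case only: a linear model $y = x^\top \beta + \varepsilon$ with standard Gaussian covariates, i.e. assuming $\Sigma = I$. In that case $E[y x] = \beta$, so for $i \neq j$ the product $y_i y_j x_i^\top x_j$ has mean $\|\beta\|^2$, and averaging over pairs gives a tuning-free estimate of the signal strength. This is an illustration under those assumptions, not the estimator analysed in the paper; the names `mom_squared_norm` and `simulate` are hypothetical.

```python
import random


def mom_squared_norm(X, y):
    """U-statistic estimate of ||beta||^2, assuming E[x] = 0 and Cov(x) = I.

    Uses the identity
        sum_{i != j} y_i y_j <x_i, x_j> = ||sum_i y_i x_i||^2 - sum_i y_i^2 ||x_i||^2.
    """
    n, p = len(X), len(X[0])
    s = [0.0] * p   # running sum of y_i * x_i
    diag = 0.0      # running sum of y_i^2 * ||x_i||^2
    for xi, yi in zip(X, y):
        for j in range(p):
            s[j] += yi * xi[j]
        diag += yi * yi * sum(v * v for v in xi)
    return (sum(v * v for v in s) - diag) / (n * (n - 1))


def simulate(n=1000, p=500, seed=0):
    """Draw a linear model with p/n = 0.5, ||beta||^2 = 1 and unit noise variance."""
    rng = random.Random(seed)
    beta = [1.0 / p ** 0.5] * p  # dense coefficients (illustrative choice)
    X = [[rng.gauss(0, 1) for _ in range(p)] for _ in range(n)]
    y = [sum(b * v for b, v in zip(beta, xi)) + rng.gauss(0, 1) for xi in X]
    return X, y


X, y = simulate()
estimate = mom_squared_norm(X, y)
print(f"estimated ||beta||^2: {estimate:.3f} (true value 1.0)")
```

No regression fit or tuning parameter is involved, even though $p$ is of the same order as $n$. For a general known $\Sigma$ or a non-linear link, the moment identities would need to be adapted, which this sketch does not attempt.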

Paper Structure (53 sections, 28 theorems, 139 equations, 21 figures, 8 tables)

This paper contains 53 sections, 28 theorems, 139 equations, 21 figures, 8 tables.

Introduction
Results Highlight
Related Works
Inference in GLMs
Inference in Observational Studies
Known (Population) Covariance
Inference in GLMs
Results for designs that are known to have zero mean
Results for designs with unknown and possibly non-zero means
The Case of Unknown Population Covariance
A Gaussian-centric approach when well
An alternative approach that works when ill
Inference in Observational Studies
Causal Effect of a Binary Treatment under Linear Structural Causal Models
Mean Estimation with Missing Data under Missing-At-Random (MAR)
...and 38 more sections

Key Result

Lemma 1

Under Model GLM, Assumptions as:normal mean zero, as:bounded(1) and as:var-cov(1), the following hold:

Figures (21)

Figure 1: Setting 1 of Section \ref{sec:sims glms} (Gaussian design and dense regression coefficients).
Figure 2: Setting 1 of Section \ref{sec:sims glms} (Gaussian design and dense regression coefficients): sampling distributions of the moment estimators and the parameter estimators over 500 Monte Carlo replications, displayed for the case of $n = 5000$.
Figure 3: Setting 3 of Section \ref{sec:sims glms} (Rademacher design and dense regression coefficients).
Figure 4: Setting 3 of Section \ref{sec:sims glms} (Rademacher design and dense regression coefficients): sampling distributions of the moment estimators and the parameter estimators over 500 Monte Carlo replications, displayed for the case of $n = 5000$.
Figure 5: Simulation results for Setting 1 (dense regression coefficients) in Section \ref{sec:sims mar}. The two methods proposed in \cite{celentano2023challenges} are plotted in separate columns of the figure, with a color gradient from blue to red representing increasing values of the tuning parameter $\lambda$. The MoM-based estimators are plotted with white circles and dashed black lines.
...and 16 more figures

Theorems & Definitions (62)

Remark 1
Remark 2
Remark 3
Lemma 1
Definition 1: $\sqrt{n}$-identifiability
Remark 4
Lemma 2
Proposition 1
Proposition 2
Remark 5
...and 52 more
