Adaptive Estimation of Multivariate Binary Distributions under Sparse Generalized Correlation Structures
Alexandre Belloni, Yan Chen, Matthew Harding
TL;DR
This work addresses the estimation of the joint distribution of an M-dimensional binary vector under sparse generalized correlation structures, using Bahadur's polynomial expansion to connect sparsity with conditional independence. The authors formulate the task as a high-dimensional estimation problem with nuisance parameters and develop a two-stage plug-in approach, followed by computationally tractable regularized adversarial estimators that achieve rate-optimal convergence. They extend the framework to covariates via localized estimators and establish rates of convergence for both the no-covariate and covariate settings, including detailed results for the marginal-probability estimators. The methodology is applied to causal inference with multiple binary treatments, showing finite-sample improvements over direct probability estimation and providing a robust route to estimating generalized propensity scores and ATEs in high-dimensional treatment spaces. Numerical simulations support the theoretical claims and illustrate practical gains in estimating ATEs under complex treatment structures.
Abstract
We consider the problem of estimating the joint distribution of an $M$-dimensional binary vector, which involves exponentially many parameters without additional assumptions. Using the representation from \citet{bahadur1959representation}, we relate the sparsity of its parameters to conditional independence among components. The maximum likelihood estimator is computationally infeasible and prone to overfitting. We reformulate the problem as estimating a high-dimensional vector of generalized correlation coefficients, quantifying interaction effects among all component subsets, together with low- or moderate-dimensional nuisance parameters corresponding to the marginal probabilities. Since the marginal probabilities can be consistently estimated, we first propose a two-stage procedure that estimates the marginal probabilities and then applies an $\ell_1$-regularized estimator for the generalized correlations, exploiting sparsity arising from potential independence structures. While computationally efficient, this estimator is not rate-optimal. We therefore further develop a regularized adversarial estimator that attains the optimal rate under standard regularity conditions while remaining tractable. The framework naturally extends to settings with covariates. We apply the proposed estimators to causal inference with multiple binary treatments and demonstrate substantial finite-sample improvements over non-adaptive estimators that estimate all probabilities directly. Simulation studies corroborate the theoretical results.
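
To make the two-stage idea concrete, the following is a minimal sketch rather than the authors' estimator. It assumes the standard form of Bahadur's representation: with marginals $p_j = P(Y_j = 1)$ and standardized components $z_j = (y_j - p_j)/\sqrt{p_j(1-p_j)}$, the joint probability of $y$ equals $\prod_j p_j^{y_j}(1-p_j)^{1-y_j}$ times $1 + \sum_{|S| \ge 2} r_S \prod_{j \in S} z_j$, where $r_S = E[\prod_{j \in S} Z_j]$ are the generalized correlations. The $\ell_1$-regularized step is replaced here by coordinate-wise soft-thresholding of the empirical moments, which is a simplifying stand-in. The function names `fit_two_stage` and `joint_pmf` and the penalty level `lam` are hypothetical, and the code enumerates all subsets, so it is only practical for small $M$.

```python
import itertools
import numpy as np


def fit_two_stage(Y, lam):
    """Illustrative two-stage fit (hypothetical helper, not the paper's estimator).

    Y   : (n, M) array with entries in {0, 1}
    lam : soft-threshold level, standing in for an l1 penalty
    Returns the estimated marginals and a dict {subset: thresholded r_S}.
    """
    Y = np.asarray(Y, dtype=float)
    n, M = Y.shape

    # Stage 1: estimate the marginal probabilities (the nuisance parameters).
    p = Y.mean(axis=0)
    if np.any(p <= 0.0) or np.any(p >= 1.0):
        raise ValueError("each component must take both values in the sample")

    # Standardize each component using the estimated marginals.
    Z = (Y - p) / np.sqrt(p * (1.0 - p))

    # Stage 2: empirical generalized correlations, then soft-thresholding.
    r = {}
    for k in range(2, M + 1):
        for S in itertools.combinations(range(M), k):
            r_hat = Z[:, list(S)].prod(axis=1).mean()
            r_thr = np.sign(r_hat) * max(abs(r_hat) - lam, 0.0)
            if r_thr != 0.0:
                r[S] = float(r_thr)  # keep only the surviving (sparse) terms
    return p, r


def joint_pmf(p, r):
    """Evaluate the Bahadur expansion at every point of {0, 1}^M."""
    p = np.asarray(p, dtype=float)
    M = len(p)
    scale = np.sqrt(p * (1.0 - p))
    pmf = {}
    for y in itertools.product((0, 1), repeat=M):
        y_arr = np.array(y, dtype=float)
        # Probability under independence with the given marginals.
        base = np.prod(np.where(y_arr == 1.0, p, 1.0 - p))
        z = (y_arr - p) / scale
        # Correction factor built from the retained generalized correlations.
        corr = 1.0 + sum(r_S * np.prod(z[list(S)]) for S, r_S in r.items())
        pmf[y] = float(base * corr)
    return pmf
```

With `lam = 0` the plug-in reproduces the empirical frequencies exactly, and with a large `lam` every interaction term is removed and the estimate collapses to the product of the marginals. The values always sum to one, but after thresholding individual entries are not guaranteed to be nonnegative, which this sketch does not correct.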
