Bregman projection for calibration estimation

Jae Kwang Kim, Yonghyun Kwon, Yumou Qiu

Abstract

Calibration weighting is a fundamental technique in survey sampling and data integration for incorporating auxiliary information and improving efficiency of estimators. Classical calibration methods are typically formulated through distance functions applied to weight ratios relative to design weights. In this paper we develop a unified framework for calibration estimation based on Bregman divergence defined directly on the weight vector. We show that calibration estimators obtained from Bregman divergence admit a dual representation that depends only on the dimension of the auxiliary variables and can be interpreted as a Bregman projection onto the calibration constraint set. This geometric structure leads to a general asymptotic representation showing that calibration estimators are equivalent to debiased regression estimators whose regression coefficient depends on the choice of the Bregman generator. The result provides a unifying perspective on classical calibration methods such as quadratic calibration and exponential tilting, and reveals how the choice of divergence influences efficiency. Under Poisson sampling we further characterize the generator that minimizes the asymptotic variance of the calibration estimator and obtain an optimal contrast entropy divergence. The framework also extends naturally to settings where inclusion probabilities are unknown and must be estimated, yielding cross-fitted estimators that remain root-n consistent under mild conditions. Finally, we develop a regularized calibration estimator suitable for high-dimensional auxiliary variables. Simulation studies and a real data application illustrate the practical advantages of the proposed approach.
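The dual representation mentioned in the abstract can be illustrated with a minimal sketch for one classical case, exponential tilting. It assumes the entropy generator $G(\omega) = \omega\log\omega - \omega$, so the calibration link is $g = \log$, and it assumes the dual objective takes the standard convex-conjugate form $\ell(\boldsymbol{\lambda}) = \sum_i F(g(d_i) + \boldsymbol{x}_i^\top\boldsymbol{\lambda}) - \boldsymbol{T}^\top\boldsymbol{\lambda}$ with $F = \exp$. The function name `calibrate_entropy`, the simulated data, and the Newton solver are illustrative choices, not the authors' implementation.

```python
import numpy as np

def calibrate_entropy(X, d, totals, tol=1e-10, max_iter=50):
    """Exponential-tilting calibration solved in the dual by Newton's method.

    X      : (n, p) auxiliary variables for the sampled units
    d      : (n,) initial (design) weights, all positive
    totals : (p,) known population totals of the auxiliary variables
    Returns the calibrated weights and the p-dimensional dual solution.
    """
    lam = np.zeros(X.shape[1])
    for _ in range(max_iter):
        w = d * np.exp(X @ lam)          # w_i = exp(log d_i + x_i' lam)
        grad = X.T @ w - totals          # calibration residual = dual gradient
        if np.max(np.abs(grad)) < tol * (1.0 + np.max(np.abs(totals))):
            break
        hess = X.T @ (w[:, None] * X)    # dual Hessian, p x p
        lam = lam - np.linalg.solve(hess, grad)
    return d * np.exp(X @ lam), lam

# Hypothetical data: n = 200 sampled units, equal design weights d_i = 5,
# and assumed known totals N = 1000 and sum of x over the population = 100.
rng = np.random.default_rng(0)
x = rng.normal(size=200)
X = np.column_stack([np.ones_like(x), x])
d = np.full(200, 5.0)
totals = np.array([1000.0, 100.0])

w, lam = calibrate_entropy(X, d, totals)
residual = X.T @ w - totals              # should be numerically zero
```

The optimization runs over `lam`, which has one entry per auxiliary variable, rather than over the `n` weights. This is the sense in which the dual depends only on the dimension of the auxiliary variables.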

Paper Structure (14 sections, 11 theorems, 57 equations, 4 figures, 3 tables)

Key Result

Lemma 3.1

For the primal problem in (bregman), its Lagrangian dual objective function $\ell(\boldsymbol{\lambda})$ in (eq:lagrangian-fun) can be expressed via a Bregman divergence as where $\widehat{\boldsymbol{\lambda}} = \operatorname{argmin} \, \ell(\boldsymbol{\lambda})$, $\nu_i(\boldsymbol{\lambda}) = g(\omega_i^{(0)}) + \boldsymbol{x}_i^\top \boldsymbol{\lambda}$ and the equality in (eq:3-10) holds if and

Figures (4)

  • Figure 1: Primal--dual structure of Bregman calibration weighting. The calibration link $g = G'$ maps the weight space to the natural parameter space; the inverse calibration link $g^{-1} = F'$ maps back. Both spaces carry their own Bregman divergence, and the two optimization problems---minimizing $D_G$ in the primal and $D_F$ in the dual---are coupled through this link.
  • Figure 2: Boxplots for DS and BC estimators across four PS/OR scenarios with glm and gam propensity estimation.
  • Figure 3: RMSE ($\times 10^{2}$) of the SBC estimator as a function of $\log_{10}(\tau)$. Top row: OLS pilot; bottom row: LASSO-refit pilot. Columns correspond to the three divergence functions. Dashed and dot-dashed lines indicate the Full and Oracle baselines, respectively.
  • Figure 4: LPIS unequal--probability sampling: heatmap of scaled relative RMSE across 12 outcomes. Values below 0 indicate improved efficiency relative to BC--CE.
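
As a worked illustration of the primal-dual link in Figure 1, the two classical cases named in the abstract can be written out. The generators below are standard textbook choices assumed here for illustration; the weight formula follows from $\nu_i(\boldsymbol{\lambda}) = g(\omega_i^{(0)}) + \boldsymbol{x}_i^\top\boldsymbol{\lambda}$ in Lemma 3.1 together with $g^{-1} = F'$.

```latex
\begin{align*}
\text{Quadratic:}\quad
  & G(\omega) = \tfrac{1}{2}\omega^{2}, \quad g(\omega) = \omega, \quad
    F(\nu) = \tfrac{1}{2}\nu^{2}, \quad g^{-1}(\nu) = \nu, \\
  & \omega_i(\boldsymbol{\lambda})
    = \omega_i^{(0)} + \boldsymbol{x}_i^\top \boldsymbol{\lambda}; \\[4pt]
\text{Entropy (exponential tilting):}\quad
  & G(\omega) = \omega\log\omega - \omega, \quad g(\omega) = \log\omega, \quad
    F(\nu) = e^{\nu}, \quad g^{-1}(\nu) = e^{\nu}, \\
  & \omega_i(\boldsymbol{\lambda})
    = \omega_i^{(0)} \exp\!\left(\boldsymbol{x}_i^\top \boldsymbol{\lambda}\right).
\end{align*}
```

In both cases $F$ is the convex conjugate of $G$, so $F' = (G')^{-1}$. The quadratic generator adjusts the initial weights additively, while the entropy generator adjusts them multiplicatively and keeps them positive.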

Theorems & Definitions (13)

  • Lemma 3.1
  • Theorem 4.1: Asymptotic expansion of BCE
  • Corollary 4.1
  • Lemma 4.1
  • Corollary 4.2
  • Theorem 4.2
  • Remark 1: Doubly robust interpretation of Assumption \ref{ass:est-error}
  • Lemma 4.2
  • Lemma 4.3
  • Theorem 4.3
  • ...and 3 more