A Convex Framework for Confounding Robust Inference

Kei Ishikawa; Niao He; Takafumi Kanamori

TL;DR

This work addresses robust policy evaluation for offline contextual bandits under unobserved confounding by proposing a convex framework that yields a sharp lower bound on policy value. The core idea is to reformulate the infinite-dimensional conditional moment constraints as tractable convex problems via weight reparameterization and a kernel-based low-rank approximation (KCMC), with strong duality allowing an empirical risk minimization interpretation. The framework supports extensions to f-divergence-based uncertainty, model selection via cross-validation or information criteria, and robust policy learning, while guaranteeing consistency and asymptotic normality of both evaluation and learning procedures. Empirical results demonstrate tighter bounds and effective policy learning across discrete and continuous actions, validating the practical impact of this convex, kernel-based approach for confounding-robust inference.

Abstract

We study policy evaluation of offline contextual bandits subject to unobserved confounders. Sensitivity analysis methods are commonly used to estimate the policy value under the worst-case confounding over a given uncertainty set. However, existing work often resorts to a coarse relaxation of the uncertainty set for the sake of tractability, leading to overly conservative estimates of the policy value. In this paper, we propose a general estimator that provides a sharp lower bound on the policy value using convex programming. The generality of our estimator enables various extensions such as sensitivity analysis with f-divergence, model selection with cross-validation and information criteria, and robust policy learning with the sharp lower bound. Furthermore, our estimation method can be reformulated as an empirical risk minimization problem thanks to strong duality, which enables us to provide strong theoretical guarantees for the proposed estimator using techniques from M-estimation.
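To make the convex-programming idea concrete, the following is a minimal sketch, not the authors' implementation. It assumes box constraints of the marginal sensitivity form (odds ratio between nominal and true propensity bounded in $[\Gamma^{-1}, \Gamma]$), a known nominal propensity, and a small hand-picked set of feature functions standing in for the kernel-based low-rank approximation. The function name `box_lower_bound`, the synthetic data, and the feature choice are all hypothetical. Under these assumptions the lower bound reduces to a linear program over the per-sample weights.

```python
import numpy as np
from scipy.optimize import linprog


def box_lower_bound(reward, propensity, features, gamma):
    """Lower-bound the policy value with a linear program over weights w_i.

    reward[i]     : pi(t_i | x_i) * y_i for the evaluated policy (assumed given)
    propensity[i] : nominal propensity p_obs(t_i | x_i) (assumed known)
    features      : (n, D) array psi_d(t_i, x_i), a hypothetical finite basis
    gamma         : sensitivity parameter, gamma >= 1
    """
    n = len(reward)
    inv_odds = 1.0 / propensity - 1.0
    # Box constraints on the inverse of the true propensity (assumed form).
    lower = 1.0 + inv_odds / gamma
    upper = 1.0 + inv_odds * gamma
    # Finite moment constraints: (1/n) sum_i (w_i * e_i - 1) * psi_d(z_i) = 0.
    a_eq = (features * propensity[:, None]).T / n
    b_eq = features.mean(axis=0)
    result = linprog(
        c=reward / n,
        A_eq=a_eq,
        b_eq=b_eq,
        bounds=list(zip(lower, upper)),
        method="highs",
    )
    if not result.success:
        raise RuntimeError(result.message)
    return result.fun


# Synthetic data, purely illustrative.
rng = np.random.default_rng(0)
n = 200
x = rng.uniform(-1.0, 1.0, size=n)
e1 = 1.0 / (1.0 + np.exp(-x))                      # P(T = 1 | X = x)
t = (rng.uniform(size=n) < e1).astype(float)
y = x + t + 0.1 * rng.normal(size=n)
propensity = np.where(t == 1.0, e1, 1.0 - e1)      # p_obs(t_i | x_i)
reward = t * y                                     # policy "always treat"
features = np.column_stack([np.ones(n), x, t, t * x])

ipw = np.mean(reward / propensity)                 # unconfounded IPW estimate
lb = box_lower_bound(reward, propensity, features, gamma=1.5)
ub = -box_lower_bound(-reward, propensity, features, gamma=1.5)
print("lower bound:", lb, "IPW:", ipw, "upper bound:", ub)
```

The inverse-propensity weights $1/p_\mathrm{obs}(t_i \mid x_i)$ satisfy every constraint in this sketch, so the program is always feasible and its optimum brackets the ordinary IPW estimate. Adding more feature functions tightens the bound, which is the role the low-rank kernel approximation plays in the framework described above.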

Paper Structure (39 sections, 20 theorems, 134 equations, 8 figures, 1 table)

Introduction
Motivation
Contributions
Related works
Organization of the paper
Problem Settings and Proposed Method
Confounded offline contextual bandits
Uncertainty sets of base policies
Relaxation of the uncertainty sets by reparametrization
Low-rank approximation of the conditional moment constraints
Theoretical Analysis
Characterization of the solution
Specification error
Consistency of policy evaluation
Consistency of policy learning
...and 24 more sections

Key Result

Lemma 2

Let $w^*_\mathrm{CMC}$, $w^*_\mathrm{KCMC}$, and $\hat{w}_\mathrm{KCMC}$ be defined as in eq:w_solutions. Then, there exist a function $\eta_\mathrm{CMC}:\mathcal{T}\times\mathcal{X}\to\mathbb{R}$, vectors $\eta_\mathrm{KCMC}, \eta_\mathrm{KCMC}'\in\mathbb{R}^D$, and constants $\eta_f, \eta_f', \eta_f

Figures (8)

Figure 1: Graphical models of the unconfounded and confounded contextual bandits
Figure 2: Estimated upper and lower bounds of policy value for different values of the sensitivity parameter $\Gamma$, using synthetic data of sample size 1000 with a binary action space.
Figure 3: Estimated upper and lower bounds of policy value for different sensitivity parameters. Synthetic data of sample size 1000 with a continuous action space (left) and the NLS data of sample size 668 with a binary action space (right) are used.
Figure 4: Estimated upper and lower bounds of policy value with 95% confidence intervals, with (right) and without (left) second-order correction. Solid lines and the surrounding bands indicate the point estimates (both without second-order correction) and their confidence intervals.
Figure 5: Acceptance rate of the null hypothesis under different null hypotheses with Tan's box constraints ($\Gamma=1.5$) at significance level $\alpha=0.05$. The left plot is at the original scale and the right plot is a zoomed-in version.
...and 3 more figures

Theorems & Definitions (33)

Example 1: Box constraints
Example 2: f-divergence constraint
Remark 1: A comparison to ZSB sensitivity model
Lemma 2: Characterization of solutions
Example 3: Solutions for box constraints
Theorem 3
Example 4: Lipschitz constant for box constraints
Example 5: Lipschitz constant for bounded conditional f-constraint
Lemma 4: Convergence of $\|\Pi_{\boldsymbol{\psi}} \eta^*_\mathrm{CMC} - \eta^*_\mathrm{CMC}\|$ with kernel PCA
Example 6: Derivation of the quantile balancing estimator (dorn2022sharp)
...and 23 more
