Geometry-Aware Instrumental Variable Regression
Heiner Kremer, Bernhard Schölkopf
TL;DR
The paper addresses instrumental variable (IV) regression under endogeneity by building optimal transport (OT) geometry into the estimation of conditional moment restrictions (CMR), yielding the Sinkhorn Method of Moments (SMM). It develops a dual formulation and a leading-order expansion that make the estimator amenable to stochastic gradient descent (SGD), proves consistency under standard CMR identifiability assumptions, and provides a kernel-based variant (Kernel-SMM) and a neural-network variant (Neural-SMM) for flexible instrument modeling. Empirically, SMM performs on par with state-of-the-art estimators in standard settings and is more robust to corrupted or adversarial data; the neural variant is explored as a route to scalability. The result is a practical, plug-and-play IV estimator that exploits the structure of the data manifold to improve robustness without sacrificing performance in benign settings.
Abstract
Instrumental variable (IV) regression can be approached through its formulation in terms of conditional moment restrictions (CMR). Building on variants of the generalized method of moments, most CMR estimators are implicitly based on approximating the population data distribution via reweightings of the empirical sample. While reweightings can provide sufficient flexibility for large sample sizes in the independent and identically distributed (IID) setting, they might fail to capture the relevant information in the presence of corrupted data or data prone to adversarial attacks. To address these shortcomings, we propose the Sinkhorn Method of Moments, an optimal transport-based IV estimator that takes into account the geometry of the data manifold through data-derivative information. We provide a simple plug-and-play implementation of our method that performs on par with related estimators in standard settings but improves robustness against data corruption and adversarial attacks.
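To make the moment-restriction view of IV regression concrete, the following is a minimal sketch of the classical linear case, not the Sinkhorn Method of Moments itself. It assumes a scalar linear model y = theta * x + noise with a hidden confounder, and it replaces the conditional restriction E[y - theta * x | z] = 0 with the single unconditional moment E[z (y - theta * x)] = 0. The data-generating process, the function names, and all parameter values are illustrative assumptions rather than anything taken from the paper.

```python
import numpy as np


def simulate_iv_data(n=5000, theta=2.0, seed=0):
    """Draw a toy endogenous dataset (hypothetical data-generating process)."""
    rng = np.random.default_rng(seed)
    z = rng.normal(size=n)                         # instrument
    u = rng.normal(size=n)                         # unobserved confounder
    x = z + u + 0.1 * rng.normal(size=n)           # treatment depends on z and u
    y = theta * x + u + 0.1 * rng.normal(size=n)   # outcome; u also enters here
    return x, y, z


def ols_slope(x, y):
    """Least-squares slope without intercept; biased when x correlates with the noise."""
    return float(x @ y / (x @ x))


def iv_slope(x, y, z):
    """Solve the sample moment condition (1/n) * sum_i z_i * (y_i - theta * x_i) = 0."""
    return float(z @ y / (z @ x))


x, y, z = simulate_iv_data()
theta_ols = ols_slope(x, y)
theta_iv = iv_slope(x, y, z)
print(f"OLS estimate: {theta_ols:.3f}")
print(f"IV estimate:  {theta_iv:.3f}")
```

In this toy setup the least-squares slope is pulled away from the true value theta = 2 because the confounder u enters both x and y, while the moment-based estimate stays close to it. Both estimators place equal weight on every sample point; the estimators discussed in the abstract differ in how they approximate the population distribution, and the robustness to corrupted data claimed there is not illustrated by this sketch.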
