Exploiting Missing Data Remediation Strategies using Adversarial Missingness Attacks
Deniz Koyuncu, Alex Gittens, Bülent Yener, Moti Yung
TL;DR
This work addresses security risks in learning with missing data by introducing BLAMM, a general bi-level optimization framework that learns adversarial missingness (AM) mechanisms $p_{R|X}$ to steer ERM-based models toward malicious objectives. It provides differentiable proxy objectives for common remediation techniques, including complete-case analysis and mean- and regression-based imputation, which enables gradient-based attacks. Empirical results on real-world tabular datasets show that AM attacks can suppress feature significance and drastically inflate average treatment effects, even when the adversary has only partial access to the data, and that defenses such as data valuation sometimes offer only limited protection. The findings highlight a systemic vulnerability in standard missing-data pipelines and motivate the development of robust defenses and extensions to other remediation methods.
Abstract
Adversarial Missingness (AM) attacks aim to manipulate model fitting by carefully engineering a missing data problem to achieve a specific malicious objective. AM attacks differ significantly from prior data poisoning attacks in that no malicious data is inserted and no data is maliciously perturbed. Current AM attacks are feasible only under the assumption that the modeler (victim) uses full-information maximum likelihood methods to handle missingness. This work aims to remove this limitation: in the approach taken here, the adversary achieves their goal by solving a bi-level optimization problem to engineer the adversarial missingness mechanism, where the lower-level problem incorporates a differentiable approximation of the targeted missingness remediation technique. As instantiations of this framework, AM attacks are provided for three popular techniques: (i) complete case analysis, (ii) mean imputation, and (iii) regression-based imputation for general empirical risk minimization (ERM) problems. Experiments on real-world data show that AM attacks are successful with modest levels of missingness (less than 20%). Furthermore, we show on the real-world Twins dataset that AM attacks can manipulate the estimated average treatment effect (ATE), treated as an instance of the general ERM problem: the adversary succeeds not only in reversing the sign, but also in substantially inflating the ATE from a true value of -1.61% to a manipulated value as high as 10%. These experimental results hold when the ATE is calculated using multiple regression-based estimators with different architectures, even when the adversary is restricted to modifying only a subset of the training data.
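To make the threat model concrete, the following is a minimal illustrative sketch, not the method of this work: it uses synthetic data and a hand-crafted (not learned) rule for choosing which entries to hide, and the names `ols_slope` and `handcrafted_mask` are hypothetical. It only shows that hiding entries of a feature, without inserting or perturbing any value, can shift the coefficient a victim obtains under complete case analysis while keeping missingness below 20%.

```python
import numpy as np

def ols_slope(x, y):
    """Slope b of the least-squares line y ~ a + b*x."""
    x_c = x - x.mean()
    return float(x_c @ (y - y.mean()) / (x_c @ x_c))

def handcrafted_mask(x, y, rate):
    """Boolean mask (True = entry of x is hidden).

    Hypothetical heuristic, assumed for illustration only: hide x in the
    rows that contribute most to the positive x-y association.
    """
    score = (x - x.mean()) * (y - y.mean())
    n_hide = int(rate * len(x))
    hide = np.zeros(len(x), dtype=bool)
    hide[np.argsort(score)[-n_hide:]] = True
    return hide

# Synthetic data (an assumption of this sketch): y depends linearly on x.
rng = np.random.default_rng(0)
n = 2000
x = rng.normal(size=n)
y = 0.5 * x + rng.normal(size=n)

# The adversary only hides entries; observed values are left untouched.
hide = handcrafted_mask(x, y, rate=0.15)
x_obs = np.where(hide, np.nan, x)

# The victim applies complete case analysis: drop rows with missing x.
keep = ~np.isnan(x_obs)
slope_full = ols_slope(x, y)
slope_cca = ols_slope(x_obs[keep], y[keep])
missing_rate = float(hide.mean())

print(f"missing rate: {missing_rate:.2f}")
print(f"slope on full data: {slope_full:.3f}")
print(f"slope after complete case analysis: {slope_cca:.3f}")
```

In contrast to this fixed rule, the framework described above learns the missingness mechanism by optimizing through a differentiable approximation of the victim's remediation step; the sketch conveys only why complete case analysis is exposed to such manipulation.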
