Rethinking Robustness in Machine Learning: A Posterior Agreement Approach
João Borges S. Carvalho, Victor Jimenez Rodriguez, Alessandro Torcinovich, Antonio E. Cinà, Carlos Cotrini, Lea Schönherr, Joachim M. Buhmann
TL;DR
The paper addresses the robustness of ML systems under covariate shift, arguing that accuracy alone is insufficient for evaluating resilience to distribution changes. It introduces Posterior Agreement (PA), a principled, supervision-free framework that compares two Gibbs posteriors, p(c|X′) and p(c|X″), obtained under covariate perturbations, and optimizes a shared inverse temperature β to maximize their overlap. The methodology provides a tractable, factorized posterior form and a robust kernel k(X′,X″) that captures agreement beyond task performance. In experiments on adversarial attacks and domain generalization benchmarks, PA shows higher discriminability and stability than accuracy-based metrics and shows promise for robust model selection under realistic shift scenarios. The work lays the groundwork for a principled robustness assessment that decouples performance from confidence and highlights paths for future theoretical and algorithmic extensions.
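To make the description above concrete, the following is a minimal sketch of how such an agreement score could be computed from a classifier's outputs. It rests on several assumptions that are not stated in this summary: the Gibbs posterior is taken to be a softmax of β times the model logits, the kernel is taken to be the average over observations of the log of the per-observation posterior overlap Σ_c p(c|x′)p(c|x″), and β is chosen by a simple grid search. The function names and the grid range are hypothetical; the exact definitions used by the authors should be taken from the paper body.

```python
import numpy as np


def _logsumexp(x, axis):
    """Numerically stable log(sum(exp(x))) along `axis` (keeps dims)."""
    m = x.max(axis=axis, keepdims=True)
    return m + np.log(np.exp(x - m).sum(axis=axis, keepdims=True))


def log_gibbs_posterior(logits, beta):
    """Log of a Gibbs posterior, assumed here to be softmax(beta * logits)."""
    z = beta * logits
    return z - _logsumexp(z, axis=1)


def agreement_kernel(logits_a, logits_b, beta):
    """Assumed kernel: mean over observations of log sum_c p(c|x') p(c|x'')."""
    log_pa = log_gibbs_posterior(logits_a, beta)
    log_pb = log_gibbs_posterior(logits_b, beta)
    # Per-observation log overlap of the two posteriors, computed in log space.
    log_overlap = _logsumexp(log_pa + log_pb, axis=1)
    return float(log_overlap.mean())


def posterior_agreement(logits_a, logits_b, betas=None):
    """Maximize the kernel over a shared inverse temperature (grid search).

    Returns (best_score, best_beta). The grid is an illustrative choice.
    """
    if betas is None:
        betas = np.logspace(-2, 2, 200)
    scores = [agreement_kernel(logits_a, logits_b, b) for b in betas]
    best = int(np.argmax(scores))
    return scores[best], float(betas[best])
```

Under these assumptions, `logits_a` and `logits_b` would be the model outputs on the original and the perturbed version of the same observations, each of shape (N, C). No labels enter the computation, which matches the supervision-free claim. The score approaches 0 when both posteriors are confident and agree, and it is bounded below by roughly -log C, reached near β → 0 when predictions conflict and the optimizer falls back to uninformative posteriors.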
Abstract
The robustness of algorithms against covariate shifts is a fundamental problem with critical implications for the deployment of machine learning algorithms in the real world. Current evaluation methods predominantly measure robustness through the lens of standard generalization, relying on task performance measures like accuracy. This approach lacks a theoretical justification, which underscores the need for a principled foundation of robustness assessment under distribution shifts. In this work, we set the desiderata for a robustness measure, and we propose a novel principled framework for the robustness assessment problem that directly follows the Posterior Agreement (PA) theory of model validation. Specifically, we extend the PA framework to the covariate shift setting and propose a measure for robustness evaluation. We assess the soundness of our measure in controlled environments and through an empirical robustness analysis in two different covariate shift scenarios: adversarial learning and domain generalization. We illustrate the suitability of PA by evaluating several models under shifts of different nature and magnitude, and with different proportions of affected observations. The results show that PA offers a reliable analysis of the vulnerabilities in learning algorithms across different shift conditions and provides higher discriminability than accuracy-based measures, while requiring no supervision.
