Towards Robust Influence Functions with Flat Validation Minima
Xichen Ye, Yifan Wu, Weizhong Zhang, Cheng Jin, Yifan Chen
TL;DR
The paper identifies a fundamental flaw in traditional Influence Functions (IFs) when they are applied to deep neural networks trained on noisy data: the loss-change estimate becomes unreliable when the validation risk is sharp. It establishes a theoretical link between influence estimation error, validation risk, and the sharpness of that risk, and proposes flat validation minima, obtained via Sharpness-Aware Minimization (SAM), as a remedy. Building on this, it introduces a new second-order Influence Function tailored to flat validation minima (VM/FVM), which accounts for both parameter change and loss change in the flat-minimum regime. Across mislabeled-sample detection, relabeling, and text and image generation tasks, VM/FVM consistently outperform existing methods, indicating more reliable and robust influence-based data analysis. The approach is of practical use for dataset debugging, data cleaning, and model diagnostics in real-world, noisy settings.
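For readers unfamiliar with SAM, the following is a minimal sketch of one generic SAM update on a toy least-squares problem. It is not the paper's training procedure: the model, the data, and the names `loss_and_grad`, `sam_step`, `lr`, and `rho` are all assumptions made for illustration. The step first moves the parameters to an approximate worst-case point within a radius `rho`, then descends using the gradient measured there, which biases optimization toward flatter regions.

```python
import numpy as np

def loss_and_grad(theta, X, y):
    """Half mean squared error of a linear model, and its gradient."""
    residual = X @ theta - y
    loss = 0.5 * np.mean(residual ** 2)
    grad = X.T @ residual / len(y)
    return loss, grad

def sam_step(theta, X, y, lr=0.1, rho=0.05):
    """One generic SAM update (illustrative hyperparameters)."""
    _, grad = loss_and_grad(theta, X, y)
    # Ascent step: first-order approximation of the worst-case
    # perturbation inside an L2 ball of radius rho.
    eps = rho * grad / (np.linalg.norm(grad) + 1e-12)
    # Descent step: use the gradient evaluated at the perturbed point.
    _, grad_perturbed = loss_and_grad(theta + eps, X, y)
    return theta - lr * grad_perturbed

# Toy data (hypothetical, for illustration only).
rng = np.random.default_rng(0)
X = rng.normal(size=(64, 3))
true_theta = np.array([1.0, -2.0, 0.5])
y = X @ true_theta + 0.1 * rng.normal(size=64)

theta = np.zeros(3)
initial_loss, _ = loss_and_grad(theta, X, y)
for _ in range(200):
    theta = sam_step(theta, X, y)
final_loss, _ = loss_and_grad(theta, X, y)
```

On a convex quadratic like this one, SAM behaves much like plain gradient descent; its effect on sharpness only matters for non-convex losses such as those of deep networks.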
Abstract
The Influence Function (IF) is a widely used technique for assessing the impact of individual training samples on model predictions. However, existing IF methods often fail to provide reliable influence estimates in deep neural networks, particularly when applied to noisy training data. This issue does not stem from inaccuracies in parameter change estimation, which has been the primary focus of prior research, but rather from deficiencies in loss change estimation, specifically due to the sharpness of validation risk. In this work, we establish a theoretical connection between influence estimation error, validation set risk, and its sharpness, underscoring the importance of flat validation minima for accurate influence estimation. Furthermore, we introduce a novel estimation form of Influence Function specifically designed for flat validation minima. Experimental results across various tasks validate the superiority of our approach.
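As background for the two components the abstract distinguishes, the sketch below implements the classical first-order influence function (in the style of Koh and Liang, 2017) for a small logistic regression. It is not the estimator proposed in this paper. The toy data, the `damping` term, and the function names are assumptions chosen for illustration. The parameter change from removing a training sample is estimated as H^{-1} ∇L(z) / n, and the loss change is then estimated by taking the inner product with the validation gradient; the abstract attributes the unreliability to this second step.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def per_sample_grads(theta, X, y):
    """Gradient of the logistic loss for each sample, shape (n, d)."""
    return (sigmoid(X @ theta) - y)[:, None] * X

def hessian(theta, X, damping=1e-2):
    """Hessian of the mean logistic loss plus an L2 damping term
    (the damping is an assumption, added to keep H invertible)."""
    p = sigmoid(X @ theta)
    w = p * (1.0 - p)
    return (X * w[:, None]).T @ X / len(X) + damping * np.eye(X.shape[1])

def fit(X, y, damping=1e-2, lr=1.0, steps=500):
    """Gradient descent on the L2-regularized mean logistic loss."""
    theta = np.zeros(X.shape[1])
    for _ in range(steps):
        grad = per_sample_grads(theta, X, y).mean(axis=0) + damping * theta
        theta = theta - lr * grad
    return theta

def removal_influence(theta, X_tr, y_tr, X_val, y_val, damping=1e-2):
    """First-order estimate of the change in mean validation loss
    caused by removing each training sample, shape (n_train,)."""
    n = len(X_tr)
    g_val = per_sample_grads(theta, X_val, y_val).mean(axis=0)
    g_tr = per_sample_grads(theta, X_tr, y_tr)
    # Solve H v = g_val once, then reuse v for every training sample.
    v = np.linalg.solve(hessian(theta, X_tr, damping), g_val)
    return g_tr @ v / n

# Toy data (hypothetical): the first 10 training labels are flipped.
rng = np.random.default_rng(0)
w_true = np.array([2.0, -1.0, 0.5, 0.0, 1.5])
X_tr = rng.normal(size=(200, 5))
y_tr = (X_tr @ w_true > 0).astype(float)
flipped = np.arange(10)
y_tr[flipped] = 1.0 - y_tr[flipped]
X_val = rng.normal(size=(100, 5))
y_val = (X_val @ w_true > 0).astype(float)

theta = fit(X_tr, y_tr)
scores = removal_influence(theta, X_tr, y_tr, X_val, y_val)
# Most negative scores: removal is estimated to lower validation loss.
suspects = np.argsort(scores)[:10]
```

Under this sign convention, a negative score means that removing the sample is estimated to reduce validation loss, so the lowest-scoring samples are the natural candidates for mislabeled data. In a convex toy model the estimate is usually well behaved; the abstract's concern is deep networks, where a sharp validation risk makes the first-order loss-change step inaccurate.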
