Towards Robust Influence Functions with Flat Validation Minima

Xichen Ye, Yifan Wu, Weizhong Zhang, Cheng Jin, Yifan Chen

TL;DR

The paper identifies a fundamental flaw in traditional Influence Functions when they are applied to deep neural networks trained on noisy data: loss-change estimation becomes unreliable when the validation risk is sharp. It establishes a theoretical link between influence estimation error, validation risk, and the sharpness of that risk, and proposes flat validation minima, obtained via Sharpness-Aware Minimization (SAM), as a remedy. Building on this, it introduces a novel second-order Influence Function tailored for flat minima (VM/FVM), which accounts for both parameter change and loss change under flat-regime optimization. Across mislabeled sample detection, relabeling, and generation tasks (text and image), VM/FVM consistently outperform existing methods, demonstrating improved reliability and robustness for influence-based data analysis. The approach offers practical benefits for dataset debugging, data cleaning, and model diagnostics in real-world, noisy settings.
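
As a rough illustration of the flat-minima solver named above, the following is a minimal sketch of one SAM update in the commonly described form (perturb the parameters along the normalized gradient, then descend using the gradient at the perturbed point). The toy quadratic risk, the helper names `sam_step`, `risk`, and `grad_fn`, and the values of `lr` and `rho` are assumptions made for this example, not the paper's training setup.

```python
import numpy as np

# Toy quadratic risk R(theta) = 0.5 * theta^T A theta (hypothetical, for illustration only).
A = np.diag([10.0, 0.1])

def risk(theta):
    return 0.5 * theta @ A @ theta

def grad_fn(theta):
    return A @ theta

def sam_step(theta, grad_fn, lr=0.05, rho=0.05):
    """One SAM-style update on a generic differentiable risk."""
    g = grad_fn(theta)
    # First-order approximation of the worst-case perturbation
    # inside an L2 ball of radius rho around theta.
    eps = rho * g / (np.linalg.norm(g) + 1e-12)
    # Descend using the gradient evaluated at the perturbed point.
    return theta - lr * grad_fn(theta + eps)

theta0 = np.array([1.0, 1.0])
theta = theta0.copy()
for _ in range(200):
    theta = sam_step(theta, grad_fn)
```

In this sketch the iterate settles in a small neighborhood of the minimum rather than converging exactly, because the perturbation `eps` keeps a fixed norm `rho` whenever the gradient is nonzero.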

Abstract

The Influence Function (IF) is a widely used technique for assessing the impact of individual training samples on model predictions. However, existing IF methods often fail to provide reliable influence estimates in deep neural networks, particularly when applied to noisy training data. This issue does not stem from inaccuracies in parameter change estimation, which has been the primary focus of prior research, but rather from deficiencies in loss change estimation, specifically due to the sharpness of validation risk. In this work, we establish a theoretical connection between influence estimation error, validation set risk, and its sharpness, underscoring the importance of flat validation minima for accurate influence estimation. Furthermore, we introduce a novel estimation form of Influence Function specifically designed for flat validation minima. Experimental results across various tasks validate the superiority of our approach.
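
For orientation, the two estimation steps the abstract distinguishes can be written out for the classical Influence Function of Koh & Liang (2017). This is the standard baseline formulation, not the estimator proposed in this paper, and the notation is assumed here: $\hat{\theta}$ is the trained parameter vector, $n$ the number of training samples, $\ell$ the per-sample loss, $H_{\hat{\theta}}$ the Hessian of the empirical training risk at $\hat{\theta}$, and $z_\text{val}$ a validation sample.

$$
\underbrace{\Delta\theta \approx \frac{1}{n} H_{\hat{\theta}}^{-1} \nabla_\theta \ell(z, \hat{\theta})}_{\text{parameter change after removing } z}
\qquad\qquad
\underbrace{\Delta \ell(z_\text{val}) \approx \nabla_\theta \ell(z_\text{val}, \hat{\theta})^\top \Delta\theta}_{\text{loss change on } z_\text{val}}
$$

The second step is a first-order Taylor expansion of the validation loss around $\hat{\theta}$, so its error depends on the curvature of the validation loss near $\hat{\theta}$. This is one way to read the abstract's claim that the sharpness of the validation risk affects loss change estimation.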

Paper Structure

This paper contains 35 sections, 2 theorems, 51 equations, 3 figures, 10 tables, 1 algorithm.

Key Result

Theorem 3.2

Given an influence estimator $\mathcal{I}$ in a certain function space $\mathcal{H}$, a target influence estimator $\mathcal{I}^\star$, a fixed validation set $S_\text{val}$, and an underlying distribution $\mathcal{D}$. We then condition the random variable $z \sim \mathcal{D}$ on the influence sig [...] If the following conditions hold: [...] then the generalization influence estimation error $\mathcal{E}($ [...]

Figures (3)

  • Figure 1: Illustration of flat minima. $R$-axis indicates the risk, while $\hat{\mathcal{I}}$ and $\mathcal{I}$ denote the estimated and actual influence (increase in empirical risk after deviating from minima), respectively. $\theta$ refers to the model parameters, and $\Delta \theta$ represents the parameter change. The orange arrows indicate the first-order or second-order approximation used to estimate the influence. As shown, the estimation gap is smaller in flat minima, leading to more reliable influence (marginal increase in risk) estimation.
  • Figure 2: Validation set accuracy and influence estimation performance (measured by ROC AUC) for the mislabeled sample detection task across training epochs. The orange and green curves represent the performance of the Influence Function in identifying mislabeled samples, which are assumed to have a negative influence on the clean validation set. Influence is estimated using LiSSA (Koh & Liang, 2017) for second-order parameter change approximation and TracIn (Pruthi et al., 2020) for first-order parameter change approximation. As shown, influence estimation performance is highly correlated with validation set accuracy, regardless of whether the parameter change is estimated using a first-order or second-order approximation. The experiment is conducted on the CIFAR-10N dataset (Wei et al., 2022) under the “worst” setting. For further details, see the section on noisy label detection.
  • Figure 3: (a) The influence estimation performance, measured by ROC AUC, for the mislabeled sample detection task across tuning steps. The standard influence is estimated using LiSSA (Koh & Liang, 2017), with SAM (Foret et al., 2021) employed as the flat minima solver. As shown, the performance of the standard Influence Function decreases as $\hat{R}^\gamma_\text{val} (\theta)$ decreases, while our proposed Influence Function shows an improvement. The experiment is conducted on the CIFAR-10N dataset (Wei et al., 2022) under the “worst” setting. For more details, see the section on noisy label detection. (b) Box plot of influence for clean samples estimated using the standard Influence Function across tuning steps. As illustrated, the absolute value of the estimated influence continuously decreases.
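
The intuition in the Figure 1 caption, that the estimation gap is smaller at a flat minimum, can be checked on a one-dimensional toy problem. The sketch below assumes a quadratic risk $R(\theta) = \frac{1}{2} c\,\theta^2$ with curvature $c$ and compares the actual risk increase after a parameter change `delta` with its first-order estimate at the minimum. The function `first_order_gap` and the curvature values are hypothetical choices for illustration and are not taken from the paper.

```python
def first_order_gap(curvature, delta):
    """Gap between the actual and the first-order-estimated risk increase
    at the minimum (theta = 0) of R(theta) = 0.5 * curvature * theta**2."""
    def risk(theta):
        return 0.5 * curvature * theta ** 2

    grad_at_min = curvature * 0.0           # dR/dtheta evaluated at theta = 0
    actual = risk(delta) - risk(0.0)        # true increase in risk
    estimated = grad_at_min * delta         # first-order Taylor estimate
    return abs(actual - estimated)

delta = 0.1
flat_gap = first_order_gap(curvature=0.1, delta=delta)    # flat minimum
sharp_gap = first_order_gap(curvature=10.0, delta=delta)  # sharp minimum
print(f"flat gap: {flat_gap:.4f}, sharp gap: {sharp_gap:.4f}")
```

For the same `delta`, the gap in this toy setting scales linearly with the curvature, so the sharp minimum produces a gap 100 times larger than the flat one.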

Theorems & Definitions (5)

  • Definition 3.1: Influence Estimation Error
  • Theorem 3.2: Upper Bound on Generalization Influence Estimation Error
  • Corollary 3.3: Upper Bound on Empirical Influence Estimation Error
  • proof
  • proof