Bayesian Influence Functions for Hessian-Free Data Attribution

Philipp Alexander Kreer, Wilson Wu, Maxwell Adam, Zach Furman, Jesse Hoogland

TL;DR

This work introduces the local Bayesian influence function (BIF), a Hessian-free training data attribution method that replaces Hessian inversion with covariance estimation over a localized posterior, enabling scalable data attribution for deep networks with billions of parameters. By leveraging SGLD-based covariance estimation, the local BIF captures higher-order interactions in the loss landscape and reduces to the classical influence function in non-singular settings, providing a principled generalization for modern DNNs. Empirically, the method matches or exceeds state-of-the-art Hessian-based baselines on retraining-prediction benchmarks, offers fine-grained per-token attribution in language models, and scales more favorably as model size grows. The approach is architecture-agnostic, provides interpretable visualizations, and opens avenues for dynamic, checkpoint-level data attribution, with practical trade-offs in sampling cost and hyperparameter sensitivity.

Abstract

Classical influence functions face significant challenges when applied to deep neural networks, primarily due to non-invertible Hessians and high-dimensional parameter spaces. We propose the local Bayesian influence function (BIF), an extension of classical influence functions that replaces Hessian inversion with loss landscape statistics that can be estimated via stochastic-gradient MCMC sampling. This Hessian-free approach captures higher-order interactions among parameters and scales efficiently to neural networks with billions of parameters. We demonstrate state-of-the-art results on predicting retraining experiments.
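The core idea can be sketched on a toy problem. The following is a minimal illustration, not the authors' implementation: it assumes the BIF between two samples is the negative covariance of their losses under a posterior localized around the trained weights, and that this posterior is sampled with plain SGLD. The linear-regression model, the quadratic localization term with strength `gamma`, the inverse temperature `beta`, and all hyperparameter values are assumptions chosen for illustration.

```python
import numpy as np

def sgld_loss_traces(X, y, w_star, n_draws=500, step_size=1e-3,
                     gamma=10.0, beta=1.0, batch_size=16, seed=0):
    """Sample weights near w_star with SGLD and record every sample's loss.

    Returns an array of shape (n_draws, n_samples).
    """
    rng = np.random.default_rng(seed)
    n, d = X.shape
    w = w_star.copy()
    traces = np.empty((n_draws, n))
    for t in range(n_draws):
        idx = rng.choice(n, size=batch_size, replace=False)
        resid = X[idx] @ w - y[idx]
        # Minibatch gradient of the mean squared-error loss 0.5 * resid**2.
        grad = X[idx].T @ resid / batch_size
        # Drift of the assumed localized posterior
        # exp(-n * beta * L(w) - gamma / 2 * ||w - w_star||^2).
        drift = n * beta * grad + gamma * (w - w_star)
        noise = rng.standard_normal(d)
        w = w - 0.5 * step_size * drift + np.sqrt(step_size) * noise
        # Per-sample losses on the full dataset at the current draw.
        traces[t] = 0.5 * (X @ w - y) ** 2
    return traces

def local_bif(traces):
    """Negative empirical covariance between per-sample loss traces."""
    centered = traces - traces.mean(axis=0, keepdims=True)
    cov = centered.T @ centered / (traces.shape[0] - 1)
    return -cov

# Toy data: noisy linear regression with 32 samples and 3 features.
rng = np.random.default_rng(1)
X = rng.standard_normal((32, 3))
y = X @ np.array([1.0, -2.0, 0.5]) + 0.1 * rng.standard_normal(32)
w_star, *_ = np.linalg.lstsq(X, y, rcond=None)

traces = sgld_loss_traces(X, y, w_star)
bif = local_bif(traces)
print(bif.shape)  # (32, 32)
```

No Hessian is formed or inverted at any point: the only inputs are loss values recorded along the sampling chain, which is why the cost is dominated by the number of draws rather than by the parameter count.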

Paper Structure

This paper contains 56 sections, 22 equations, 19 figures, 3 tables, 1 algorithm.

Figures (19)

  • Figure 1: From influence functions (IF) to Bayesian influence functions (BIF): We introduce the local Bayesian Influence Function (BIF), which replaces the Hessian inversion of classical Influence Functions (IF) with a covariance estimation over the local loss landscape. This approach is sensitive to higher-order geometry and scales to models with billions of parameters.
  • Figure 2: The per-token BIF captures semantic relationships in Pythia-2.8B. The posterior correlation (negative of the normalized BIF) between tokens is maximized for relationships like translations, alternate spellings, and synonyms.
  • Figure 3: BIF and EK-FAC show convergent validity on Inception-v1. For a given query image (left), our local BIF (center) and EK-FAC (right) identify similar or identical training images as most influential. See Appendix \ref{appendix:vision} for more examples.
  • Figure 4: Bayesian influence functions (BIF) vs. classical influence function approximations (EK-FAC, TRAK, GradSim) on predicting retraining experiments, measured by the linear datamodeling score (LDS). We vary the size of the query dataset and full dataset according to $\alpha_{\text{attribution}}$, then retrain on random subsets of $\alpha_{\text{retrain}}$ samples. The LDS measures the correlation between the query losses after retraining and the losses predicted by the training data attribution (TDA) method. We report the mean and the standard error across five repeated runs of the full experimental pipeline (including model retraining and the computation of BIF, EK-FAC, and the other attribution scores) with fixed hyperparameters but distinct initial seeds. The BIF consistently matches EK-FAC, the state-of-the-art baseline: it slightly underperforms EK-FAC for larger datasets (within the margin of error) and slightly outperforms it for smaller datasets. Both EK-FAC and BIF consistently outperform GradSim and TRAK.
  • Figure 5: Scaling comparison of BIF and EK-FAC across model sizes in the Pythia model suite. (Left) Evaluation time, excluding tokenization time. (Right) Peak GPU RAM usage on the node (4×A100). For the largest models, the BIF is two orders of magnitude faster while using the same GPU RAM as EK-FAC.
  • ...and 14 more figures
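
Two quantities named in the captions above can be sketched in a few lines. This is a hedged illustration rather than the paper's evaluation code. For Figure 2, it assumes "normalized" means dividing the covariance by the product of posterior standard deviations, so that the posterior correlation is the negative of the normalized BIF. For Figure 4, it assumes the common LDS convention: the predicted query loss for a retraining subset is the sum of attribution scores over the included training samples, compared to the retrained losses by Spearman rank correlation and averaged over queries. The function names and the synthetic data are hypothetical.

```python
import numpy as np

def posterior_correlation(bif):
    """Negative of the normalized BIF, assuming BIF = -covariance."""
    cov = -np.asarray(bif, dtype=float)
    std = np.sqrt(np.diag(cov))
    return cov / np.outer(std, std)

def _ranks(x):
    # Ordinal ranks; ties are not averaged (adequate for continuous losses).
    return np.argsort(np.argsort(x, kind="stable"), kind="stable").astype(float)

def linear_datamodeling_score(scores, masks, retrained_losses):
    """Mean Spearman correlation between predicted and retrained query losses.

    scores:           (n_train, n_query) attribution scores
    masks:            (n_subsets, n_train) boolean subset membership
    retrained_losses: (n_subsets, n_query) query losses after retraining
    """
    # Additive prediction: sum the scores of the samples kept in each subset.
    predicted = masks.astype(float) @ scores
    per_query = []
    for q in range(scores.shape[1]):
        r_pred = _ranks(predicted[:, q])
        r_true = _ranks(retrained_losses[:, q])
        per_query.append(np.corrcoef(r_pred, r_true)[0, 1])
    return float(np.mean(per_query))

rng = np.random.default_rng(0)

# Posterior correlation from a synthetic BIF matrix over 5 tokens.
loss_traces = rng.standard_normal((200, 5))
bif = -np.cov(loss_traces, rowvar=False)
corr = posterior_correlation(bif)

# Sanity check for the LDS: if retrained losses are an increasing function
# of the additive prediction, the rank correlation is perfect.
scores = rng.standard_normal((50, 4))
masks = rng.random((20, 50)) < 0.5
retrained = 2.0 * (masks @ scores) + 1.0
lds = linear_datamodeling_score(scores, masks, retrained)
```

In a real evaluation, `retrained` would come from actually retraining the model on each subset, and the LDS would fall below 1 to the extent that the additive prediction misorders the retrained losses.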