Revisit, Extend, and Enhance Hessian-Free Influence Functions

Ziao Yang, Han Yue, Jian Chen, Hongfu Liu

TL;DR

This paper revisits TracIn, a naive yet effective approximation method that substitutes the inverse of the Hessian matrix with an identity matrix, and extends its applications beyond measuring model utility to include fairness and robustness.

Abstract

Influence functions are crucial tools for assessing sample influence in model interpretation, training subset selection, noisy label detection, and more. By employing a first-order Taylor expansion, influence functions can estimate sample influence without expensive model retraining. However, applying influence functions directly to deep models is challenging, primarily because the loss function is non-convex and the number of model parameters is large. As a result, computing the inverse of the Hessian matrix is costly, and in some cases the inverse does not exist. Various approaches, including matrix decomposition, have been explored to expedite and approximate the inversion of the Hessian matrix, with the aim of making influence functions applicable to deep models. In this paper, we revisit a specific, naive, yet effective approximation method known as TracIn, which substitutes the inverse of the Hessian matrix with an identity matrix. We provide deeper insights into why this simple approximation performs well. Furthermore, we extend its applications beyond measuring model utility to include fairness and robustness. Finally, we enhance TracIn through an ensemble strategy. To validate its effectiveness, we conduct experiments on synthetic data and extensive evaluations on noisy label detection, sample selection for large language model fine-tuning, and defense against adversarial attacks.
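
The substitution described in the abstract can be illustrated with a minimal sketch; this is not the authors' implementation. It assumes an L2-regularized logistic regression (so the Hessian is invertible), a synthetic two-blob dataset with some flipped labels, and a sign convention in which a negative score marks a sample whose removal is estimated to lower the validation loss. All function names (`fit_logreg`, `scores`, `make_blobs`, `demo`) and hyperparameters are hypothetical choices made for illustration.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def fit_logreg(X, y, l2=1e-2, lr=0.1, steps=3000):
    """Plain gradient descent on mean log-loss + (l2 / 2) * ||theta||^2."""
    theta = np.zeros(X.shape[1])
    for _ in range(steps):
        grad = X.T @ (sigmoid(X @ theta) - y) / len(y) + l2 * theta
        theta -= lr * grad
    return theta

def per_sample_grads(X, y, theta):
    # Gradient of each sample's log-loss w.r.t. theta, shape (n, d).
    return (sigmoid(X @ theta) - y)[:, None] * X

def hessian(X, theta, l2=1e-2):
    # Hessian of the regularized mean training loss, shape (d, d).
    p = sigmoid(X @ theta)
    w = p * (1.0 - p)
    return (X.T * w) @ X / len(X) + l2 * np.eye(X.shape[1])

def scores(X_tr, y_tr, X_val, y_val, theta, l2=1e-2):
    """Return (influence, inner_product) scores for every training sample.

    influence:     grad_i^T H^{-1} grad_val  (classic influence function)
    inner_product: grad_i^T grad_val         (H^{-1} replaced by identity)
    Assumed sign convention: negative means removing the sample is
    estimated to reduce the validation loss.
    """
    g_val = per_sample_grads(X_val, y_val, theta).mean(axis=0)
    G_tr = per_sample_grads(X_tr, y_tr, theta)
    influence = G_tr @ np.linalg.solve(hessian(X_tr, theta, l2), g_val)
    inner_product = G_tr @ g_val
    return influence, inner_product

def make_blobs(n, rng):
    # Two Gaussian blobs centred at (2, 2) and (-2, -2), plus a bias column.
    y = rng.integers(0, 2, size=n).astype(float)
    centers = np.where(y[:, None] == 1, 2.0, -2.0)
    X = centers + rng.normal(size=(n, 2))
    return np.hstack([X, np.ones((n, 1))]), y

def demo(seed=0, n_train=200, n_val=100, n_flip=20):
    rng = np.random.default_rng(seed)
    X_tr, y_tr = make_blobs(n_train, rng)
    X_val, y_val = make_blobs(n_val, rng)
    # Flip the labels of a random subset to create detrimental samples.
    flipped = np.zeros(n_train, dtype=bool)
    flipped[rng.choice(n_train, size=n_flip, replace=False)] = True
    y_tr = np.where(flipped, 1.0 - y_tr, y_tr)
    theta = fit_logreg(X_tr, y_tr)
    influence, inner_product = scores(X_tr, y_tr, X_val, y_val, theta)
    return influence, inner_product, flipped

if __name__ == "__main__":
    influence, inner_product, flipped = demo()
    print("mean influence  (flipped / clean):",
          influence[flipped].mean(), influence[~flipped].mean())
    print("mean inner prod (flipped / clean):",
          inner_product[flipped].mean(), inner_product[~flipped].mean())
```

In this convex toy setting both scores can be computed exactly, which makes it easy to compare how each ranks the flipped samples. The Hessian-free score needs only gradients, so it avoids forming or inverting a d-by-d matrix, which is the cost the abstract identifies for deep models.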

Paper Structure (29 sections, 5 equations, 9 figures, 7 tables)

Figures (9)

  • Figure 1: Illustration of $\nabla v^{\textrm{util}}\mathbf{H}^{-1}_{\hat{\theta}}$ and $\nabla v^{\textrm{util}}$.
  • Figure 2: Illustration of our IP on two synthetic datasets with convex and non-convex models. A-C show a 2D linearly separable synthetic dataset, in which a subset of detrimental samples bear incorrect labels, trained with a Logistic Regression model. D-F show the same analysis on a non-linear synthetic half-moon dataset with a Multilayer Perceptron neural network. Specifically, A and D depict training sets with two classes, where detrimental samples are marked with $\times$ and regular samples with $\circ$. B and E show test sets. C and F present influence scores and IP scores, computed from the influence-function and IP equations, respectively. In the linear case, there is a clear correlation between influence scores and IP scores, and the detrimental samples have both negative influence scores and negative IP scores. In the non-linear case, however, the influence scores of detrimental samples are intermixed with those of regular samples, yet the detrimental samples can still be identified via IP.
  • Figure 3: Directions of the gradients of the validation set and of the detrimental samples in Figure 2. In the linear case, $\alpha$ is $0.66^\circ$, and we draw a larger angle for better visualization. For the non-linear case, we use the dimension reduction method of [bingham2001random] for visualization, and $\alpha$ is $94.36^\circ$.
  • Figure 3: Order-consistency in different scenarios
  • Figure 4: Speed improvement factors of IP over other baseline methods.
  • ...and 4 more figures