Influence Functions in Deep Learning Are Fragile

Samyadeep Basu, Philip Pope, Soheil Feizi

TL;DR

The paper conducts a comprehensive empirical study of influence functions in deep learning, revealing that their accuracy is highly fragile in non-convex settings. By comparing exact Hessians and stochastic inverse-Hessian methods across Iris, MNIST, CIFAR-10/100, and ImageNet with architectures ranging from simple CNNs to ResNets, it shows that depth, width, and regularization (notably weight decay) critically affect estimation quality, and that test-point choice can drastically alter results. Ground-truth estimates via leave-one-out re-training remain noisy at scale, casting doubt on the reliability of influence signals for large models. The work calls for robust, scalable influence estimation methods and suggests considering group-level analyses to mitigate ground-truth and optimization challenges in deep learning applications.

Abstract

Influence functions approximate the effect of training samples on test-time predictions and have a wide variety of applications in machine learning interpretability and uncertainty estimation. A commonly used (first-order) influence function can be implemented efficiently as a post-hoc method requiring access only to the gradients and Hessian of the model. For linear models, influence functions are well-defined due to the convexity of the underlying loss function and are generally accurate even in difficult settings where model changes are fairly large, such as estimating group influences. Influence functions, however, are not well understood in the context of deep learning with non-convex loss functions. In this paper, we provide a comprehensive and large-scale empirical study of successes and failures of influence functions in neural network models trained on datasets such as Iris, MNIST, CIFAR-10 and ImageNet. Through our extensive experiments, we show that the network architecture, its depth and width, as well as the extent of model parameterization and regularization techniques have strong effects on the accuracy of influence functions. In particular, we find that (i) influence estimates are fairly accurate for shallow networks, while for deeper networks the estimates are often erroneous; (ii) for certain network architectures and datasets, training with weight-decay regularization is important to get high-quality influence estimates; and (iii) the accuracy of influence estimates can vary significantly depending on the examined test points. These results suggest that in general influence functions in deep learning are fragile and call for developing improved influence estimation methods to mitigate these issues in non-convex setups.
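To make the quantities in the abstract concrete, the following is a minimal sketch of the convex baseline case, not the paper's experimental code. It assumes the standard first-order form, in which removing training point $z_i$ changes the test loss by approximately $\frac{1}{n}\nabla L(z_{test})^\top H^{-1} \nabla L(z_i)$, and compares that prediction against leave-one-out re-training using Spearman rank correlation, the metric named in the figure captions. The synthetic data, the L2-regularized logistic regression model, the weight-decay value `lam`, and all function names are assumptions chosen for illustration.

```python
import numpy as np

# Synthetic, non-separable binary classification data (illustrative assumption).
rng = np.random.default_rng(0)
n, d, lam = 60, 3, 0.1  # lam plays the role of weight decay
X = rng.normal(size=(n, d))
true_theta = np.array([1.0, -1.0, 0.5])
y = (rng.random(n) < 1.0 / (1.0 + np.exp(-X @ true_theta))).astype(float)
x_test, y_test = rng.normal(size=d), 1.0


def sigmoid(t):
    return 1.0 / (1.0 + np.exp(-t))


def per_example_grads(theta, X, y):
    # Gradient of the log loss for each example: (p - y) * x, shape (n, d).
    return (sigmoid(X @ theta) - y)[:, None] * X


def fit(X, y, weights, lam, steps=1000, lr=1.0):
    # Gradient descent on (1/n) * sum_i w_i * L_i + (lam/2) * ||theta||^2.
    # n stays fixed, so setting w_i = 0 removes point i without rescaling.
    n, d = X.shape
    theta = np.zeros(d)
    for _ in range(steps):
        grad = weights @ per_example_grads(theta, X, y) / n + lam * theta
        theta -= lr * grad
    return theta


def loss(theta, x, y):
    # Log loss of a single example, written in a numerically stable form.
    z = x @ theta
    return np.logaddexp(0.0, z) - y * z


def spearman(a, b):
    # Rank correlation; assumes no ties, which holds for continuous values.
    rank_a = np.argsort(np.argsort(a))
    rank_b = np.argsort(np.argsort(b))
    return np.corrcoef(rank_a, rank_b)[0, 1]


theta_hat = fit(X, y, np.ones(n), lam)

# Exact Hessian of the regularized training objective at theta_hat.
p = sigmoid(X @ theta_hat)
H = (X * (p * (1.0 - p))[:, None]).T @ X / n + lam * np.eye(d)

# First-order influence estimate of removing each training point.
g_test = (sigmoid(x_test @ theta_hat) - y_test) * x_test
predicted = per_example_grads(theta_hat, X, y) @ np.linalg.solve(H, g_test) / n

# Ground truth: re-train with each point left out and measure the change.
base = loss(theta_hat, x_test, y_test)
actual = np.empty(n)
for i in range(n):
    w = np.ones(n)
    w[i] = 0.0
    actual[i] = loss(fit(X, y, w, lam), x_test, y_test) - base

rho = spearman(predicted, actual)
print("Spearman correlation, influence vs. leave-one-out:", rho)
```

In this convex setting the Hessian is positive definite and re-training reaches a unique optimum, so the two rankings should agree closely. The fragility the abstract describes arises when those conditions fail: in deep networks the Hessian may be ill-conditioned or indefinite, and re-training may not return to a comparable optimum.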

Paper Structure

This paper contains 26 sections, 5 equations, 20 figures, 2 tables.

Figures (20)

  • Figure 1: Iris dataset experimental results - (a,b) Comparison of norm of parameter changes computed with influence function vs re-training; (a) trained with weight-decay; (b) trained without weight-decay. (c) Spearman correlation vs. network depth. (d) Spearman correlation vs. network width.
  • Figure 2: Iris dataset experimental results; (a) Spearman correlation of influence estimates with the ground-truth estimates computed with stochastic estimation vs. exact inverse-Hessian vector product. (b) Top eigenvalue of the Hessian vs. the network depth. (c) Spearman correlation between the norm of parameter changes computed with influence function vs. re-training.
  • Figure 3: Experiments on small MNIST using a CNN architecture. (a,b) Estimation of influence function with and without weight decay on (a) the top influential points and (b) training points at the $30^{th}$ percentile of the influence score distribution. (c) Correlation vs. the weight decay factor (evaluated on the top influential points).
  • Figure 4: Influence for CIFAR-100
  • Figure 5: (a) Difference in norm of parameters obtained by re-training from scratch vs. re-training from optimal parameters. (b) Correlation estimates with re-training from scratch vs. re-training from optimal parameters.
  • ...and 15 more figures