Dynamic Influence Tracker: Measuring Time-Varying Sample Influence During Training

Jie Xu, Zihan Wu

TL;DR

Dynamic Influence Tracker (DIT) enables time-windowed measurement of training-sample influence during SGD without requiring convergence. By estimating the parameter change $\Delta \theta_{-j}^{[t_1,t_2]}$ via a Hessian-informed projection and then projecting that change onto query vectors $q(t)$, DIT quantifies how removing a sample affects losses, predictions, and gradients over arbitrary intervals. The approach comes with error bounds that assume neither convexity nor convergence, and it demonstrates up to 0.99 correlation with ground-truth influence and >98% accuracy in corrupted-sample detection across diverse architectures. Empirically, DIT uncovers four influence-dynamics patterns, shows that mid-training influence aligns with full-training influence, and scales to large models, enabling practical data-quality interventions and efficient training strategies.
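To make the measured quantity concrete, the sketch below computes it by brute force on a toy linear-regression problem: it runs SGD twice (with and without sample $z_j$), takes the difference in parameter movement over a window, and projects it onto a query vector. This is not DIT's estimator, only the retraining-based ground truth it is meant to approximate. The windowed definition used here, the function names, and the toy data are all assumptions made for illustration.

```python
import numpy as np

def sgd_trajectory(X, y, batches, lr, skip=None):
    """Run SGD on squared loss and return the parameters after every step.

    If `skip` is given, that sample's gradient is dropped from every batch
    (counterfactual SGD); the batch-size normalizer is left unchanged.
    """
    theta = np.zeros(X.shape[1])
    traj = [theta.copy()]
    for idx in batches:
        grad = np.zeros_like(theta)
        for i in idx:
            if skip is not None and i == skip:
                continue
            grad += (X[i] @ theta - y[i]) * X[i]
        theta = theta - lr * grad / len(idx)
        traj.append(theta.copy())
    return traj

def windowed_change(traj_full, traj_loo, t1, t2):
    """Assumed definition: difference in parameter movement over [t1, t2]."""
    return (traj_loo[t2] - traj_loo[t1]) - (traj_full[t2] - traj_full[t1])

def query_influence(q, delta):
    """Project a parameter change onto a query vector q."""
    return float(q @ delta)

# Toy data (synthetic, for illustration only).
rng = np.random.default_rng(0)
n, d, T, bs = 32, 4, 60, 8
X = rng.normal(size=(n, d))
y = X @ rng.normal(size=d) + 0.1 * rng.normal(size=n)
batches = [rng.choice(n, size=bs, replace=False) for _ in range(T)]

full = sgd_trajectory(X, y, batches, lr=0.05)
loo = sgd_trajectory(X, y, batches, lr=0.05, skip=3)  # exclude sample j = 3

# For a linear model, the gradient of a test prediction w.r.t. theta is x_test,
# so using q = x_test measures the change in that prediction.
x_test = rng.normal(size=d)
early = query_influence(x_test, windowed_change(full, loo, 0, 20))
late = query_influence(x_test, windowed_change(full, loo, 20, T))
total = query_influence(x_test, windowed_change(full, loo, 0, T))
print(early, late, total)
```

Under this definition, influence over adjacent windows adds up to the influence over their union, which is what makes per-stage comparisons meaningful. Retraining once per sample is exactly the cost an estimator such as DIT is designed to avoid.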

Abstract

Existing methods for measuring training sample influence on models only provide static, overall measurements, overlooking how sample influence changes during training. We propose Dynamic Influence Tracker (DIT), which captures the time-varying sample influence across arbitrary time windows during training. DIT offers three key insights: 1) Samples show different time-varying influence patterns, with some samples important in the early training stage while others become important later. 2) Sample influences show a weak correlation between early and late stages, demonstrating that the model undergoes distinct learning phases with shifting priorities. 3) Analyzing influence during the convergence period provides more efficient and accurate detection of corrupted samples than full-training analysis. Supported by theoretical guarantees without assuming loss convexity or model convergence, DIT significantly outperforms existing methods, achieving up to 0.99 correlation with ground truth and above 98% accuracy in detecting corrupted samples in complex architectures.
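The second insight is a statement about rank agreement between per-sample influence scores from two windows. A minimal sketch of such a comparison is given below, using Spearman rank correlation implemented with NumPy. The choice of Spearman correlation, the helper names, and the score vectors are assumptions for illustration; the numbers are placeholders, not results from the paper.

```python
import numpy as np

def ranks(v):
    """Return ranks 0..n-1 of the entries of v (assumes no ties)."""
    v = np.asarray(v, dtype=float)
    order = np.argsort(v)
    r = np.empty(len(v), dtype=float)
    r[order] = np.arange(len(v))
    return r

def spearman(a, b):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return float(np.corrcoef(ranks(a), ranks(b))[0, 1])

# Placeholder per-sample influence scores for two training windows.
early_scores = np.array([0.10, 0.40, -0.20, 0.30, 0.05])
late_scores = np.array([0.02, -0.10, 0.25, 0.01, 0.30])

print(spearman(early_scores, late_scores))
```

A value near 1 would mean the same samples matter in both windows; a value near 0 would match the weak early/late correlation described above.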

Paper Structure

This paper contains 42 sections, 5 theorems, 88 equations, 3 figures, 7 tables, 4 algorithms.

Key Result

Theorem 1

Let $\Delta \theta_{-j}^{[t_1,t_2]}$ be the true influence of excluding sample $z_j$ on the model parameters over the interval $[t_1, t_2]$ during SGD training. Let $\widehat{\Delta \theta}_{-j}^{[t_1,t_2]}$ be its approximation using DIT. Under the following assumptions: Then, the expected estimation error is bounded as follows: where: $\eta_{\max} = \max_{t \in [t_1,t_2]} \eta_t$, $\tilde{B} =

Figures (3)

  • Figure 1: Illustration of influence dynamics patterns for MNIST training using DNN
  • Figure 2: Comparison of influence estimates for DIT and IF vs. LOO ground truth across datasets using LR and DNN. The x-axis represents the ground truth influence values obtained from the LOO method. The y-axis shows DIT (blue) and IF (red) estimates.
  • Figure : DIT Training with Checkpoints

Theorems & Definitions (18)

  • Definition 1: Stochastic Gradient Descent (SGD)
  • Definition 2: Influence Function (Koh and Liang, 2017)
  • Definition 3: Counterfactual SGD
  • Definition 4: SGD-Influence (Hara et al., 2019)
  • Definition 5: Parameter Change in Time Window
  • Definition 6: Query-based Dynamic Influence Tracker
  • Theorem 1: Error Bound for DIT Parameter Change
  • proof
  • Remark 2
  • Remark 3
  • ...and 8 more
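
For orientation, the objects named in Definitions 1, 3, 5, and 6 can be sketched as follows. The SGD and counterfactual-SGD updates follow the usual mini-batch formulation, where $S_t$ is the batch at step $t$, $\eta_t$ the learning rate, and $\ell$ the per-sample loss. The forms of the windowed parameter change and the query-based influence are an assumed reading of the definition titles, since the definitions themselves are not reproduced in this summary; the symbol $Q_{-j}^{[t_1,t_2]}$ is introduced here for illustration.

```latex
\begin{align}
\theta^{[t+1]} &= \theta^{[t]} - \frac{\eta_t}{|S_t|} \sum_{i \in S_t} \nabla_\theta \ell(z_i; \theta^{[t]}) \\
\theta_{-j}^{[t+1]} &= \theta_{-j}^{[t]} - \frac{\eta_t}{|S_t|} \sum_{i \in S_t,\, i \neq j} \nabla_\theta \ell(z_i; \theta_{-j}^{[t]}) \\
\Delta \theta_{-j}^{[t_1,t_2]} &= \bigl(\theta_{-j}^{[t_2]} - \theta_{-j}^{[t_1]}\bigr) - \bigl(\theta^{[t_2]} - \theta^{[t_1]}\bigr) \\
Q_{-j}^{[t_1,t_2]}(q) &= \bigl\langle q,\ \Delta \theta_{-j}^{[t_1,t_2]} \bigr\rangle
\end{align}
```

With $t_1 = 0$ and a shared initialization, the third line reduces to $\theta_{-j}^{[t_2]} - \theta^{[t_2]}$, the full-training parameter difference.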