Scalable Data Attribution via Forward-Only Test-Time Inference
Sibo Ma, Julian Nyarko
TL;DR
This work tackles scalable data attribution by preserving the classical influence-function target while eliminating per-query backpropagation at inference. It does so with forward-only attributions built on a training-time simulation: perturb each training example with short-horizon gradient updates, accumulate a curvature-aware parameter response, and read out a forward score that converges to the damped inverse influence, $-g_q^T H_\nabla^{-1} g_b$, via either a two-trajectory or a single-trajectory readout. Empirically, on MNIST-MLP the method matches or exceeds TRAK on standard attribution metrics (leave-one-out, LOO, and the linear datamodeling score, LDS) at far lower inference cost, suggesting that real-time data attribution is practical at scale. By shifting computation to training and using a curvature-aware forward readout, the approach offers a principled, scalable framework for tracing data provenance and valuing data sources in large pretrained models.
Abstract
Data attribution seeks to trace model behavior back to the training examples that shaped it, enabling debugging, auditing, and data valuation at scale. Classical influence-function methods offer a principled foundation but remain impractical for modern networks because they require expensive backpropagation or Hessian inversion at inference. We propose a data attribution method that preserves the same first-order counterfactual target while eliminating per-query backward passes. Our approach simulates each training example's parameter influence through short-horizon gradient propagation during training and later reads out attributions for any query using only forward evaluations. This design shifts computation from inference to simulation, reflecting real deployment regimes in which a model may serve billions of user queries yet is trained on a fixed, finite set of data sources (for example, a large language model trained on diverse corpora whose developer must compensate a specific publisher such as the New York Times). Empirically, on standard MLP benchmarks, our estimator matches or surpasses state-of-the-art baselines such as TRAK on standard attribution metrics (leave-one-out, LOO, and the linear datamodeling score, LDS) while offering orders-of-magnitude lower inference cost. By combining influence-function fidelity with first-order scalability, our method provides a theoretical framework for practical, real-time data attribution in large pretrained models.
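To make the forward-only readout concrete, the following is a minimal sketch on a toy problem, not the authors' implementation. It assumes a 2-D ridge regression (so the damped Hessian can be inverted exactly for comparison), models damping as an added ridge term `LAM`, and starts the simulation from the trained parameters. All names (`simulate`, `forward_scores`, `exact_scores`, the data, and the constants) are hypothetical. Each training example is upweighted by a small `EPS` and followed for a short horizon of gradient steps. A query is then scored using only two loss evaluations, one on the perturbed trajectory and one on the baseline trajectory.

```python
# Toy sketch (assumed setting): 2-D ridge regression, so the damped Hessian
# H = X^T X / n + LAM * I and the exact influence target are available.
X = [(1.0, 0.0), (0.0, 1.0), (1.0, 1.0), (2.0, 1.0)]  # training inputs
Y = [1.0, 2.0, 3.2, 3.9]                              # training targets
X_Q, Y_Q = (1.0, 2.0), 4.0                            # query example
LAM, ETA, EPS, STEPS = 0.1, 0.5, 1e-6, 200            # damping, step, upweight, horizon
N = len(X)


def dot(a, b):
    return a[0] * b[0] + a[1] * b[1]


def loss(theta, x, y):
    """Per-example squared error; evaluating it needs only a forward pass."""
    r = dot(theta, x) - y
    return 0.5 * r * r


def grad(theta, x, y):
    """Gradient of the per-example loss with respect to theta."""
    r = dot(theta, x) - y
    return (r * x[0], r * x[1])


def solve(h, v):
    """Solve the symmetric 2x2 system H u = v with Cramer's rule."""
    h00, h01, h11 = h
    det = h00 * h11 - h01 * h01
    return ((h11 * v[0] - h01 * v[1]) / det, (h00 * v[1] - h01 * v[0]) / det)


def train_grad(theta, b=None):
    """Gradient of the damped objective; example b is upweighted by EPS if given."""
    g0, g1 = LAM * theta[0], LAM * theta[1]
    for x, y in zip(X, Y):
        gx = grad(theta, x, y)
        g0, g1 = g0 + gx[0] / N, g1 + gx[1] / N
    if b is not None:
        gb = grad(theta, X[b], Y[b])
        g0, g1 = g0 + EPS * gb[0], g1 + EPS * gb[1]
    return (g0, g1)


def simulate(theta, b=None):
    """Run STEPS gradient updates (the training-time simulation)."""
    for _ in range(STEPS):
        g = train_grad(theta, b)
        theta = (theta[0] - ETA * g[0], theta[1] - ETA * g[1])
    return theta


# Damped Hessian (entries h00, h01, h11) and the trained ridge solution.
H = (sum(x[0] * x[0] for x in X) / N + LAM,
     sum(x[0] * x[1] for x in X) / N,
     sum(x[1] * x[1] for x in X) / N + LAM)
THETA_STAR = solve(H, (sum(x[0] * y for x, y in zip(X, Y)) / N,
                       sum(x[1] * y for x, y in zip(X, Y)) / N))

# Done once, before any query arrives: baseline and per-example trajectories.
THETA_BASE = simulate(THETA_STAR)
RESPONSES = [simulate(THETA_STAR, b) for b in range(N)]


def forward_scores(x_q, y_q):
    """Two-trajectory readout: difference of forward losses, divided by EPS."""
    base = loss(THETA_BASE, x_q, y_q)
    return [(loss(t, x_q, y_q) - base) / EPS for t in RESPONSES]


def exact_scores(x_q, y_q):
    """Reference target -g_q^T H^{-1} g_b, computed with explicit gradients."""
    g_q = grad(THETA_STAR, x_q, y_q)
    return [-dot(g_q, solve(H, grad(THETA_STAR, x, y))) for x, y in zip(X, Y)]


FORWARD = forward_scores(X_Q, Y_Q)
EXACT = exact_scores(X_Q, Y_Q)
for b, (f, e) in enumerate(zip(FORWARD, EXACT)):
    print(f"train example {b}: forward readout {f:+.6f}, exact target {e:+.6f}")
```

In this toy setting the two columns should agree up to an error of order `EPS`, because the perturbed trajectory settles at a parameter shift of approximately `-EPS * H^{-1} g_b`. The per-example simulations are paid for once, while each new query costs only forward loss evaluations, which is the cost shift the abstract describes. How the paper handles non-convex networks, truncated horizons, and the single-trajectory variant is not covered by this sketch.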
