Daunce: Data Attribution through Uncertainty Estimation

Xingyuan Pan, Chenlu Ye, Joseph Melkonian, Jiaqi W. Ma, Tong Zhang

TL;DR

Daunce introduces a scalable, uncertainty-driven approach to training data attribution that avoids explicit second-order inversions by perturbing a target model and measuring the covariance of per-example losses across perturbations. Grounded in a theoretical link to influence functions, it yields an unbiased, Hessian- or Fisher-based interpretation and extends to black-box access, enabling attribution for proprietary LLMs. Empirically, Daunce outperforms state-of-the-art baselines on vision tasks and large-language-model fine-tuning, including challenging black-box and backdoor scenarios, with strong performance that scales rapidly with the number of perturbations. This work advances practical data debugging, curation, and valuation by providing a robust, scalable, and broadly applicable data attribution framework.

Abstract

Training data attribution (TDA) methods aim to identify which training examples most influence a model's predictions on specific test data. By quantifying these influences, TDA supports critical applications such as data debugging, curation, and valuation. Gradient-based TDA methods rely on gradients and second-order information, which limits their applicability at scale. While recent random projection-based methods improve scalability, they often suffer from degraded attribution accuracy. Motivated by connections between uncertainty and influence functions, we introduce Daunce, a simple yet effective data attribution approach based on uncertainty estimation. Our method fine-tunes a collection of perturbed models and uses the covariance of per-example losses across these models as the attribution score. Daunce scales to large language models (LLMs) and achieves more accurate attribution than existing TDA methods. We validate Daunce on tasks ranging from vision to LLM fine-tuning, and further demonstrate its compatibility with black-box model access. Applied to OpenAI's GPT models, our method achieves, to our knowledge, the first instance of data attribution on proprietary LLMs.
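
The scoring step described in the abstract can be illustrated with a minimal NumPy sketch. It assumes the per-example losses of K perturbed models have already been collected into two arrays; how the perturbed models are fine-tuned is not shown. The function names `covariance_attribution` and `correlation_attribution`, the array layout, and the sample (K - 1) normalization are assumptions made for illustration, not details taken from the paper. The correlation variant is included only because Figure 5 reports both correlation and covariance as uncertainty measures.

```python
import numpy as np


def covariance_attribution(train_losses, test_losses):
    """Covariance of per-example losses across perturbed models.

    train_losses: array of shape (K, n_train), loss of each of K perturbed
                  models on each training example (assumed layout).
    test_losses:  array of shape (K, n_test), loss of the same K models on
                  each test example.
    Returns an (n_test, n_train) matrix whose entry [j, i] is the sample
    covariance between test loss j and training loss i across the K models.
    """
    train_losses = np.asarray(train_losses, dtype=float)
    test_losses = np.asarray(test_losses, dtype=float)
    num_models = train_losses.shape[0]
    if num_models < 2 or test_losses.shape[0] != num_models:
        raise ValueError("need the same K >= 2 models in both loss arrays")

    # Center each example's losses over the model axis.
    train_centered = train_losses - train_losses.mean(axis=0, keepdims=True)
    test_centered = test_losses - test_losses.mean(axis=0, keepdims=True)

    # Sample covariance (dividing by K - 1 is an assumption of this sketch).
    return test_centered.T @ train_centered / (num_models - 1)


def correlation_attribution(train_losses, test_losses, eps=1e-12):
    """Same as above, but normalized by the per-example standard deviations."""
    train_losses = np.asarray(train_losses, dtype=float)
    test_losses = np.asarray(test_losses, dtype=float)
    cov = covariance_attribution(train_losses, test_losses)
    train_std = train_losses.std(axis=0, ddof=1)
    test_std = test_losses.std(axis=0, ddof=1)
    # eps guards against division by zero when an example's loss never varies.
    return cov / (np.outer(test_std, train_std) + eps)
```

Under these assumptions, ranking the entries of one row of the returned matrix gives a candidate ordering of training examples for that test example. Only per-example loss values are needed at scoring time, which is consistent with the black-box setting mentioned above.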

Paper Structure

This paper contains 38 sections, 1 theorem, 29 equations, 6 figures, 3 tables, 1 algorithm.

Key Result

Theorem 1

For each $i,j=1,\ldots,n$, under Algorithm 1, we have

Figures (6)

  • Figure 1: LDS results for our method variants and baselines. (a) Comparison in the white-box setting. (b) Results under API-based black-box access.
  • Figure 2: Most Influential Subset Removal results on MATH and IFEval benchmarks, comparing Daunce with LoGra. A higher score indicates more accurate identification of influential examples.
  • Figure 3: Batch-wise gradient norms during training. We plot the gradient norms of the empirical risk term, the first-order perturbation term, and the full objective as functions of training steps.
  • Figure 4: Example queries and their top retrieved influential training examples in the black-box setting. Backdoor triggers and outputs are highlighted in red, and semantically similar text between the query and retrieved examples is highlighted in green. Irrelevant content is omitted and replaced with <......>.
  • Figure 5: Example query image and top influential training images (positive and negative) identified by Daunce. Results are shown using both correlation and covariance as the uncertainty measures. Labels for each retrieved image are also provided.
  • ...and 1 more figure

Theorems & Definitions (2)

  • Theorem 1
  • Proof of Theorem 1