Evaluating Data Influence in Meta Learning

Chenyang Ren, Huanyi Xie, Shu Yang, Meng Ding, Lijie Hu, Di Wang

TL;DR

This work introduces a data attribution framework for meta learning under bilevel optimization (BLO). It derives task-IF and instance-IF to quantify how tasks and individual data points influence both meta-parameters and task-specific parameters. It formulates the total gradient and Hessian to capture direct and indirect effects across the inner and outer levels, and proposes acceleration techniques, including Neumann-series approximations and a Γ-based total Hessian surrogate, to scale to large models. Empirical results on Omniglot, MNIST, and Mini-ImageNet show that task-IF can achieve near-retraining accuracy with substantial runtime savings, while instance-IF enables efficient evaluation and editing of training data, including effective identification and removal of harmful data. The framework thus improves data quality, interpretability, and efficiency in meta learning, and applies to data cleaning, task selection, and model editing in BLO-based systems.
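The Neumann-series idea mentioned above can be illustrated in isolation. The sketch below is not the paper's algorithm: it is a generic truncated Neumann series for an inverse-Hessian-vector product, applied to a small synthetic positive-definite matrix. The function name `neumann_inverse_hvp`, the step size `alpha`, and the toy matrix `H` are all assumptions made for illustration.

```python
import numpy as np

def neumann_inverse_hvp(hvp, v, alpha, num_terms):
    """Approximate H^{-1} v with a truncated Neumann series.

    Uses H^{-1} = alpha * sum_k (I - alpha * H)^k, which converges when the
    spectral radius of (I - alpha * H) is below 1, e.g. for a positive
    definite H and 0 < alpha < 2 / lambda_max(H).
    Only Hessian-vector products are needed; H is never inverted.
    """
    term = v.copy()    # (I - alpha * H)^0 v
    total = v.copy()   # running sum of the series
    for _ in range(num_terms):
        term = term - alpha * hvp(term)   # multiply by (I - alpha * H)
        total = total + term
    return alpha * total

# Toy positive-definite "Hessian" (hypothetical, for illustration only).
rng = np.random.default_rng(0)
A = rng.normal(size=(5, 5))
H = A @ A.T + 5.0 * np.eye(5)
v = rng.normal(size=5)

alpha = 1.0 / np.linalg.eigvalsh(H).max()
approx = neumann_inverse_hvp(lambda x: H @ x, v, alpha, num_terms=500)
exact = np.linalg.solve(H, v)
neumann_error = np.linalg.norm(approx - exact)
print("Neumann approximation error:", neumann_error)
```

In a large model the lambda would be replaced by an automatic-differentiation Hessian-vector product, so memory scales with the parameter count rather than its square.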

Abstract

As one of the most fundamental models, meta learning aims to address few-shot learning challenges effectively. However, it still faces significant issues related to its training data, such as training inefficiency caused by numerous low-contribution tasks in large datasets and substantial noise from incorrect labels. Training data attribution methods are therefore needed for meta learning. The dual-layer structure of meta learning complicates the modeling of training data contributions, because meta-parameters and task-specific parameters influence each other, which makes existing data influence evaluation tools inapplicable or inaccurate. To address these challenges, we build on the influence function and propose a general data attribution evaluation framework for meta learning within the bilevel optimization framework. Our approach introduces task influence functions (task-IF) and instance influence functions (instance-IF) to accurately assess the impact of specific tasks and individual data points in closed form. The framework models data contributions across both the inner and outer training processes, capturing the direct effects of data points on meta-parameters as well as their indirect influence through task-specific parameters. We also provide several strategies to improve computational efficiency and scalability. Experimental results on several downstream tasks demonstrate the framework's effectiveness in training data evaluation.
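For readers unfamiliar with the influence function the abstract builds on, the following minimal sketch shows the classical single-level version, not the task-IF or instance-IF derived in the paper. It estimates how removing one training point changes the parameters of a ridge regression, and compares the estimate with exact retraining. The data, the regularization strength `reg`, and the helper `fit` are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
n, d, reg = 200, 3, 1e-2
X = rng.normal(size=(n, d))
y = X @ np.array([1.0, -2.0, 0.5]) + 0.1 * rng.normal(size=n)

def fit(X, y):
    # Minimizer of sum_i 0.5 * (x_i @ theta - y_i)**2 + 0.5 * reg * ||theta||^2
    return np.linalg.solve(X.T @ X + reg * np.eye(X.shape[1]), X.T @ y)

theta = fit(X, y)
H = X.T @ X + reg * np.eye(d)            # Hessian of the full objective

j = 0                                    # index of the point to remove
grad_j = X[j] * (X[j] @ theta - y[j])    # gradient of point j's loss at theta

# Influence-function estimate of the parameter change from removing point j:
# one Newton-style step that reuses the full-data Hessian.
delta_if = np.linalg.solve(H, grad_j)

# Ground truth: retrain without point j.
theta_loo = fit(np.delete(X, j, axis=0), np.delete(y, j))
delta_exact = theta_loo - theta

rel_error = np.linalg.norm(delta_if - delta_exact) / np.linalg.norm(delta_exact)
print("relative error of the influence estimate:", rel_error)
```

For this quadratic objective the estimate differs from retraining only because it reuses the full-data Hessian, so the relative error equals the leverage of the removed point and shrinks as the dataset grows. In meta learning the parameters sit at two levels, which is why the abstract argues that this single-level form does not apply directly.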

Paper Structure

This paper contains 22 sections, 13 theorems, 95 equations, 4 figures, 1 table, 6 algorithms.

Key Result

Theorem 4.1

The total gradient of the $i$-th task-related outer loss $L_O\left(\lambda, \theta_i(\lambda); D_i^{val}\right)$ with respect to $\lambda$ can be written in closed form (the displayed equations are not reproduced in this summary). The term $\frac{\mathrm{d}\theta_i(\lambda)}{\mathrm{d}\lambda}$ is computed from an expression involving $H_{i,\text{in}}$, the $i$-th inner-level Hessian matrix, defined as $H_{i,\text{in}} = \partial_{\theta_i}\partial_{\theta_i}L_I\left(\lambda, \ldots\right)$ (the statement is truncated in the source).
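The structure described in Theorem 4.1 resembles the standard implicit-function-theorem hypergradient used in bilevel optimization. The sketch below assumes that standard form, $\partial_\lambda L_O + \left(\frac{\mathrm{d}\theta}{\mathrm{d}\lambda}\right)^{\top} \partial_\theta L_O$ with $\frac{\mathrm{d}\theta}{\mathrm{d}\lambda} = -H_{\text{in}}^{-1}\,\partial_\lambda\partial_\theta L_I$, which may differ in notation or detail from the paper's exact statement. It checks the formula against finite differences on a toy quadratic problem; the losses, matrices `A` and `B`, and all function names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
d_theta, d_lam = 4, 3
M = rng.normal(size=(d_theta, d_theta))
A = M @ M.T + np.eye(d_theta)     # inner Hessian H_in (positive definite)
B = rng.normal(size=(d_theta, d_lam))
target = rng.normal(size=d_theta)
c = 0.1

def inner_solution(lam):
    # theta(lam) = argmin_theta 0.5 * theta^T A theta - theta^T B lam
    return np.linalg.solve(A, B @ lam)

def outer_loss(lam):
    # L_O(lam, theta(lam)) = 0.5 * ||theta - target||^2 + 0.5 * c * ||lam||^2
    theta = inner_solution(lam)
    return 0.5 * np.sum((theta - target) ** 2) + 0.5 * c * np.sum(lam ** 2)

def total_gradient(lam):
    theta = inner_solution(lam)
    direct = c * lam                           # partial_lambda L_O
    grad_theta = theta - target                # partial_theta L_O
    cross = -B                                 # partial_lambda partial_theta L_I
    dtheta_dlam = -np.linalg.solve(A, cross)   # implicit function theorem
    return direct + dtheta_dlam.T @ grad_theta

lam0 = rng.normal(size=d_lam)
analytic = total_gradient(lam0)

# Central finite differences through the exact inner solution.
eps = 1e-6
numeric = np.array([
    (outer_loss(lam0 + eps * e) - outer_loss(lam0 - eps * e)) / (2 * eps)
    for e in np.eye(d_lam)
])
hypergrad_error = np.max(np.abs(analytic - numeric))
print("max abs difference:", hypergrad_error)
```

The second term of `total_gradient` is the indirect effect of $\lambda$ through the task-specific parameters; dropping it would leave only the direct partial derivative.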

Figures (4)

  • Figure 1: Harmful tasks removal experiment. IS means using influence score to determine which tasks to remove. Random refers to randomly removing tasks.
  • Figure 2: Data attribution effectiveness on MNIST dataset at the task level.
  • Figure 3: Data attribution effectiveness on MNIST dataset at the instance level.
  • Figure :

Theorems & Definitions (27)

  • Theorem 4.1
  • Definition 4.2
  • Theorem 4.3
  • Remark 4.4
  • Theorem 4.5: Instance-IF for Validation Data
  • Theorem 4.6
  • Theorem 4.7
  • Proposition 4.8: Instance-IF for Training Data
  • Theorem 5.1
  • Definition 5.2: Evaluation Function for Meta Learning
  • ...and 17 more