Data-Efficient and Robust Task Selection for Meta-Learning

Donglin Zhan; James Anderson

TL;DR

The Data-Efficient and Robust Task Selection (DERTS) algorithm can be incorporated into both gradient-based and metric-based meta-learning algorithms. It outperforms existing task sampling strategies on both families of algorithms in limited data budget and noisy task settings.

Abstract

Meta-learning methods typically learn tasks under the assumption that all tasks are equally important. However, this assumption is often not valid. In real-world applications, tasks can vary both in their importance at different training stages and in whether they contain noisily labeled data, making a uniform approach suboptimal. To address these issues, we propose the Data-Efficient and Robust Task Selection (DERTS) algorithm, which can be incorporated into both gradient-based and metric-based meta-learning algorithms. DERTS selects weighted subsets of tasks from task pools by minimizing the error in approximating the full gradient of the task pools during the meta-training stage. The selected tasks enable rapid training and are robust to noisy labels. Unlike existing algorithms, DERTS does not require any architecture modification for training and can handle noisy label data in both the support and query sets. Analysis of DERTS shows that the algorithm follows training dynamics similar to those of learning on the full task pools. Experiments show that DERTS outperforms existing sampling strategies for meta-learning on both gradient-based and metric-based meta-learning algorithms in limited data budget and noisy task settings.
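The idea of picking a weighted subset of tasks whose combined gradient approximates the full task-pool gradient can be illustrated with a small sketch. The code below is not the authors' implementation: it assumes per-task gradient vectors are already available and uses a greedy facility-location heuristic (a standard submodular surrogate from coreset selection) as a stand-in for the paper's objective. The function name `select_weighted_tasks` and its parameters are hypothetical.

```python
import numpy as np


def select_weighted_tasks(task_grads, budget):
    """Greedy facility-location selection over per-task gradient vectors.

    Assumption (not from the paper): gradient similarity is measured with
    Euclidean distance, and each selected task is weighted by the number
    of pool tasks for which it is the nearest selected task.
    """
    g = np.asarray(task_grads, dtype=float)
    n = g.shape[0]

    # Pairwise Euclidean distances between task gradients.
    dist = np.linalg.norm(g[:, None, :] - g[None, :, :], axis=-1)
    # Turn distances into non-negative similarities.
    sim = dist.max() - dist

    chosen = np.zeros(n, dtype=bool)
    selected = []
    best_cover = np.zeros(n)  # best similarity of each task to the chosen set

    for _ in range(min(budget, n)):
        # Marginal gain of adding candidate j to the chosen set.
        gains = np.maximum(sim, best_cover[:, None]).sum(axis=0) - best_cover.sum()
        gains[chosen] = -np.inf  # never pick the same task twice
        j = int(np.argmax(gains))
        chosen[j] = True
        selected.append(j)
        best_cover = np.maximum(best_cover, sim[:, j])

    # Assign every pool task to its nearest selected task and count.
    nearest = np.argmin(dist[:, selected], axis=1)
    weights = np.bincount(nearest, minlength=len(selected))
    return np.array(selected), weights


# Toy usage: 30 synthetic "task gradients" drawn around three centers.
rng = np.random.default_rng(0)
centers = np.array([[5.0, 0.0], [0.0, 5.0], [-5.0, -5.0]])
grads = np.concatenate([c + 0.1 * rng.standard_normal((10, 2)) for c in centers])
idx, w = select_weighted_tasks(grads, budget=3)

# Weighted subset gradient versus the full pool gradient.
approx_grad = (w[:, None] * grads[idx]).sum(axis=0)
full_grad = grads.sum(axis=0)
print(idx, w, np.linalg.norm(approx_grad - full_grad))
```

In this sketch the weights sum to the pool size, so the weighted subset gradient is on the same scale as the full pool gradient. A meta-training loop would then take its outer-loop step on the weighted subset rather than on every task in the pool.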

Paper Structure (31 sections, 2 theorems, 32 equations, 3 figures, 8 tables, 2 algorithms)

This paper contains 31 sections, 2 theorems, 32 equations, 3 figures, 8 tables, 2 algorithms.

Introduction
Related Work
Background
Efficient and Robust Task Selection
Full Gradient Approximation for Episodic Task Pools
Extracting Subsets Efficiently
DERTS with Noisy Tasks
Analysis for DERTS
Experiment Results
Dataset and Baseline:
Implementation Details:
Meta-Learning with Limited Budget
Experiment Setup:
Experiment Results with Fewer Tasks:
Experiment Results with Fewer Class for Generating Tasks:
...and 16 more sections

Key Result

Theorem 1

Assume that the loss function $\mathcal{L}(f,\mathcal{D})$ satisfies assumptions a1–a3 and that $\epsilon$ is an upper bound for the RHS of Eq. approx. Then, with proper constant learning rates $\eta$ and $\eta'$ for the outer and inner loop updates and an initialization point $\theta^0$, applying DERTS h where and

Figures (3)

Figure 1: DERTS requires task pools to store episodic tasks sampled from task distributions. Using the efficient gradient estimation in Sec. \ref{sub2}, the gradients of all tasks stored in the task pool are computed. Following the approximation formulated in Sec. \ref{sub1} and the optimization objective in Sec. \ref{sub2}, a subset of tasks with corresponding weights is constructed to approximate the task pool gradient. The meta-model is then trained on these subsets instead of the full task pools.
Figure 2: Loss Residual and Accuracy for Noisy Task Settings (Early Stage). (a) Test Accuracy of $25\%$ Noise Setting on Mini-ImageNet of ANIL. (b) Training Loss of $25\%$ Noise Setting on Mini-ImageNet of ANIL. (c) Test Accuracy of $40\%$ Noise Setting on Mini-ImageNet of PN. (d) Training Loss of $40\%$ Noise Setting on Mini-ImageNet of PN.
Figure 3: Typical examples of selected tasks by DERTS and unselected tasks.

Theorems & Definitions (3)

Definition 1: Submodularity
Theorem 1: Training Dynamics
Proposition 1: Gradient Norm Upper Bound
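
For reference, the standard notion of submodularity named in Definition 1 can be written as a diminishing-returns condition. This is the textbook form; the symbols $F$, $V$, $A$, $B$, and $e$ are notation assumed here and may differ from the paper's own statement.

$$
F(A \cup \{e\}) - F(A) \;\ge\; F(B \cup \{e\}) - F(B)
\quad \text{for all } A \subseteq B \subseteq V,\; e \in V \setminus B,
$$

where $F : 2^{V} \to \mathbb{R}$ is a set function over a ground set $V$ (here, plausibly the tasks in a task pool). Adding a task to a smaller selected set helps at least as much as adding it to a larger one, which is the property that makes greedy subset selection effective.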
