LiLAW: Lightweight Learnable Adaptive Weighting to Meta-Learn Sample Difficulty, Improve Noisy Training, Increase Fairness, and Effectively Use Synthetic Data

Abhishek Moturu, Muhammad Muzammil, Anna Goldenberg, Babak Taati

Abstract

Training deep neural networks on noisy, heterogeneous data is a major challenge. We introduce Lightweight Learnable Adaptive Weighting (LiLAW), a method that dynamically adjusts the loss weight of each training sample based on its evolving difficulty, categorized as easy, moderate, or hard. Using only three learnable parameters, LiLAW adaptively prioritizes informative samples during training by updating these parameters with a single gradient descent step on a validation mini-batch after each training mini-batch. Experiments across multiple general and medical imaging datasets, noise levels and types, loss functions, and architectures with and without pretraining (with linear probing and full fine-tuning) demonstrate LiLAW's effectiveness, even in high-noise environments, without excessive tuning. We also apply LiLAW to two recently introduced synthetic datasets: SynPAIN (synthetic facial expressions for automated pain detection) and GAITGen (synthetic gait sequences for Parkinson's disease severity estimation). We further validate on ECG5000, a time-series dataset for heartbeat classification, with simple augmentations. We obtain state-of-the-art results on these three datasets. We then use LiLAW on the Adult dataset to show improved fairness. LiLAW is effective without heavy reliance on advanced training techniques or data augmentations, highlighting its practicality, especially in resource-constrained settings. It offers a computationally efficient way to boost generalization and robustness in any neural network training setup.
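The update described in the abstract (three weighting parameters, adjusted by one gradient descent step on a validation mini-batch) can be illustrated with a minimal sketch. The exact weight functions are not given in this excerpt, so the forms below are assumptions: two sigmoid terms and one radial basis term, matching the shapes named in the Figure 2 caption, each comparing the softmax probability of the observed label (`p_label`) with the largest softmax probability (`p_max`). The function names, the learning rate, the starting values, and the choice of the weighted loss itself as the validation objective are likewise hypothetical and are not taken from the paper.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def sample_weight(p_label, p_max, alpha, beta, delta):
    """Sum of three ASSUMED weight terms for one sample (not the paper's exact equations)."""
    w_easy = sigmoid(alpha * p_label - p_max)                 # assumed sigmoid term
    w_hard = sigmoid(-(beta * p_label - p_max))               # assumed sigmoid term
    w_mod = math.exp(-0.5 * (delta * p_label - p_max) ** 2)   # assumed radial basis term
    return w_easy + w_hard + w_mod

def weighted_loss(batch, alpha, beta, delta):
    """Mean weighted loss over (p_label, p_max, loss) triples."""
    total = sum(sample_weight(p, m, alpha, beta, delta) * loss for p, m, loss in batch)
    return total / len(batch)

def meta_gradient(batch, alpha, beta, delta):
    """Analytic gradient of weighted_loss with respect to (alpha, beta, delta).

    Model outputs (p_label, p_max, loss) are treated as constants here.
    """
    g_a = g_b = g_d = 0.0
    for p, m, loss in batch:
        s_a = sigmoid(alpha * p - m)
        s_b = sigmoid(-(beta * p - m))
        d = delta * p - m
        g_a += loss * s_a * (1.0 - s_a) * p
        g_b += -loss * s_b * (1.0 - s_b) * p
        g_d += -loss * math.exp(-0.5 * d * d) * d * p
    n = len(batch)
    return g_a / n, g_b / n, g_d / n

def meta_step(val_batch, alpha, beta, delta, lr=0.1):
    """One gradient descent step on the three parameters using a validation mini-batch."""
    g_a, g_b, g_d = meta_gradient(val_batch, alpha, beta, delta)
    return alpha - lr * g_a, beta - lr * g_b, delta - lr * g_d

# Illustrative validation mini-batch: (p_label, p_max, per-sample loss).
val_batch = [(0.90, 0.90, 0.105), (0.30, 0.60, 1.204), (0.05, 0.80, 2.996)]
alpha, beta, delta = 1.0, 1.0, 1.0  # arbitrary starting values
alpha, beta, delta = meta_step(val_batch, alpha, beta, delta)
print(alpha, beta, delta)
```

In a full training loop, a step like `meta_step` would follow each ordinary optimizer step on the model parameters. Because only three scalars are updated, the extra cost per iteration is roughly one additional forward pass on the validation mini-batch.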

Paper Structure

This paper contains 48 sections, 2 theorems, 40 equations, 5 figures, 22 tables, and 1 algorithm.

Key Result

Corollary 1.1

Let $c\ge 2$ and assume label noise with transition matrix $T\in[0,1]^{c\times c}$, where $T_{ij} := \mathbb{P}(\tilde{y}=j \mid y=i)$. Assume the diagonal-dominance condition used in Theorems 1 and 2 of [gui2021towards]: let $(x,\tilde{y})$ be an input and observed label pair, let $f^*(x)\in\{0,\dots,c-1\}$ be the true label function (referred to as the target concept in [gui2021towards]), let $g^*(

Figures (5)

  • Figure 1: Given noisy training and validation data, LiLAW learns to adaptively weight the loss of each sample based on three trainable parameters, $\alpha, \delta, \beta$, pertaining to learning easy, moderate, and hard samples, respectively, at different stages of training using meta-learning on the validation data.
  • Figure 2: A graphical representation of the LiLAW weighting method. Darker areas correspond to high-loss regions (due to disagreement and/or low confidence) and lighter areas correspond to low-loss regions (due to agreement and/or high confidence). The three weight functions in Equations (2), (3), and (4) correspond to the green, red, and orange lines, respectively, with the colored arrows showing the direction of descent for each weight function. Equations (2) and (3) use a sigmoid, and Equation (4) uses a radial basis function. We also indicate whether each weight is high or low in a given region.
  • Figure 3: Accuracy with and without LiLAW on ten 2D datasets from MedMNISTv2.
  • Figure 4: Plots showing how $\alpha,\beta,\delta$ change during training with 0% symmetric noise and 50% symmetric noise on CIFAR-100-M.
  • Figure 5: AUROC with and without LiLAW on ten 2D datasets from MedMNISTv2.

Theorems & Definitions (4)

  • Corollary 1.1
  • proof
  • Proposition 1.2
  • proof: Proof sketch