Optimal Layer Selection for Latent Data Augmentation

Tomoumi Takase; Ryo Karakida

TL;DR

This work addresses how to choose the hidden-layer positions at which data augmentation (DA) is applied in neural networks. It introduces AdaLASE, a gradient-based method that assigns each candidate layer an acceptance ratio $q_i$ and updates it during training to determine where latent-DA is applied, guided by a proxy loss $L_{DA}$ and a pseudo-validation objective. Across the datasets and models tested, AdaLASE achieves accuracy comparable to or better than Uniform DA, and the results show that the best layers for augmentation depend on the data regime and the task. The framework reduces reliance on heuristic layer selection and lowers computational cost, and it could be extended to jointly optimize augmentation types and multiple augmentation methods, supporting automated latent-DA policy search in transfer learning and other settings.

Abstract

While data augmentation (DA) is generally applied to input data, several studies have reported that applying DA to hidden layers in neural networks, i.e., feature augmentation, can improve performance. However, previous studies have not carefully considered which layers DA should be applied to: DA is typically applied to layers chosen uniformly at random or only to one specific layer, which leaves the choice arbitrary. In this study, we therefore investigate which layers are suitable for DA under various experimental configurations, e.g., training from scratch, transfer learning, various dataset settings, and different models. In addition, to adjust the layers used for DA automatically, we propose the adaptive layer selection (AdaLASE) method, which updates the ratio at which DA is performed at each layer by gradient descent during training. Experimental results on several image classification datasets indicate that AdaLASE altered the ratios as expected and achieved high overall test accuracy.
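The idea of per-layer ratios can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `sample_layer` and `update_ratios`, the learning rate, the floor value, and the form of the update are all assumptions made for illustration, and the gradient passed in is a stand-in for whatever signal the method derives from its proxy loss. The sketch only shows the two ingredients the abstract names: drawing one DA position per iteration according to the current ratios, and moving those ratios by a gradient-descent step.

```python
import numpy as np


def sample_layer(q, rng):
    """Pick one candidate position index with probability proportional to q."""
    q = np.asarray(q, dtype=float)
    return int(rng.choice(len(q), p=q / q.sum()))


def update_ratios(q, grad, lr=0.1, lower=0.05):
    """Hypothetical update rule (an assumption, not taken from the paper).

    Takes one gradient-descent step on the ratios, applies a floor so that
    no position drops to zero before renormalization, then rescales the
    ratios to sum to one. After rescaling, a value can end up slightly
    below `lower`.
    """
    q = np.asarray(q, dtype=float) - lr * np.asarray(grad, dtype=float)
    q = np.maximum(q, lower)
    return q / q.sum()


rng = np.random.default_rng(0)
q = np.full(3, 1.0 / 3.0)                 # uniform start over P0, P1, P2
stand_in_grad = np.array([1.0, 0.0, -1.0])  # placeholder gradient values
q = update_ratios(q, stand_in_grad)
position = sample_layer(q, rng)           # index of the position to augment
```

With the placeholder gradient above, the ratio for the first position decreases and the ratio for the last one increases, so the last position is sampled more often in later iterations.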

Paper Structure (10 sections, 7 equations, 7 figures, 2 tables, 1 algorithm)

Introduction
Related Work
Latent Data Augmentation
Proposed AdaLASE Method
Experiments
Accuracy Evaluation of Latent-DA
Verification of AdaLASE Method based on Acceptance Ratio
Comparison of Layer Selection between AdaLASE and Uniform DA
Sample Size Dependency in Transfer Learning
Conclusion

Figures (7)

Figure 1: Positions to apply DA in several neural networks.
Figure 2: Examples of latent-DA. The sample in Example 1 was selected from the COIL-20 dataset, and the sample in Example 2 was selected from the STL-10 dataset. These samples were augmented using cutout or translation at P0, P1, and P2 in ResNet18.
Figure 3: Transitions of the acceptance ratio for P0 in the proposed AdaLASE method when training an MLP on the CIFAR-10 dataset. The results of 20 runs with different initializations are shown. Test data were used rather than the pseudo-validation data in the AdaLASE method. The models were trained from scratch.
Figure 4: Difference in acceptance ratios when the lower limit was varied. An MLP was trained on the CIFAR-10 dataset with cutout. Test data were used rather than the pseudo-validation data in the AdaLASE method. The models were trained from scratch.
Figure 5: Difference in the number of iterations that selected the worst layer between the AdaLASE method and the uniform ratio method. Test data were used rather than the pseudo-validation data in the AdaLASE method. The models were trained from scratch.
...and 2 more figures
