Are Sparse Neural Networks Better Hard Sample Learners?

Qiao Xiao; Boqian Wu; Lu Yin; Christopher Neil Gadzinski; Tianjin Huang; Mykola Pechenizkiy; Decebal Constantin Mocanu

TL;DR

This paper addresses the challenge of learning from hard samples by evaluating unstructured Sparse Neural Networks (SNNs) on samples with intrinsic complexity and on samples subject to external perturbations. It systematically compares diverse sparsification methods (GMP, LTH, OMP, SNIP, SET) on CIFAR-100 and TinyImageNet, using EL2N-based hard-sample filtering and varying data volumes. The main finding is that many SNNs can match or exceed dense-model accuracy at certain sparsity levels, with notable gains under limited data, and that keeping higher density in shallow layers is beneficial, especially when training from scratch. The work also shows robustness benefits under common corruptions and adversarial attacks, highlighting SET and SNIP as strong, compute-efficient options. Overall, the study provides practical insights into designing sparse learners for hard data and contributes to data-centric AI.
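The EL2N-based filtering mentioned above can be illustrated with a minimal sketch. It assumes the commonly used definition of the EL2N score as the L2 norm of the difference between the softmax output and the one-hot label, and it assumes that "top" filtered samples means the samples with the highest scores. The helper names `el2n_score` and `select_hardest` are hypothetical and are not taken from the paper's code. In practice the score is usually averaged over several models early in training, which this sketch omits.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def el2n_score(logits, label):
    """L2 norm of (softmax(logits) - one_hot(label)); an assumed EL2N definition."""
    probs = softmax(logits)
    return math.sqrt(sum(
        (p - (1.0 if k == label else 0.0)) ** 2
        for k, p in enumerate(probs)
    ))

def select_hardest(scores, keep_ratio):
    """Return sorted indices of the keep_ratio fraction with the highest scores."""
    n_keep = max(1, int(round(keep_ratio * len(scores))))
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return sorted(order[:n_keep])

# Illustrative usage with made-up logits for four samples of a 3-class problem.
example_logits = [[10.0, 0.0, 0.0], [0.0, 0.0, 0.0], [0.0, 5.0, 0.0], [2.0, 1.0, 0.0]]
example_labels = [0, 0, 0, 1]
example_scores = [el2n_score(z, y) for z, y in zip(example_logits, example_labels)]
hard_subset = select_hardest(example_scores, keep_ratio=0.5)
```

A confidently correct prediction yields a score near 0, while a confidently wrong one approaches the square root of 2, so a higher score marks a harder sample under this definition.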

Abstract

While deep learning has demonstrated impressive progress, it remains a daunting challenge to learn from hard samples as these samples are usually noisy and intricate. These hard samples play a crucial role in the optimal performance of deep neural networks. Most research on Sparse Neural Networks (SNNs) has focused on standard training data, leaving gaps in understanding their effectiveness on complex and challenging data. This paper's extensive investigation across scenarios reveals that most SNNs trained on challenging samples can often match or surpass dense models in accuracy at certain sparsity levels, especially with limited data. We observe that layer-wise density ratios tend to play an important role in SNN performance, particularly for methods that train from scratch without pre-trained initialization. These insights enhance our understanding of SNNs' behavior and potential for efficient learning approaches in data-centric AI. Our code is publicly available at: https://github.com/QiaoXiao7282/hard_sample_learners.
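To make the notion of layer-wise density ratios concrete, the following sketch applies one-shot magnitude pruning with a single global threshold and then reports the fraction of weights kept in each layer. Whether the paper's OMP setup uses a global or a per-layer threshold is not stated here, so the global threshold is an assumption, and the function names, layer names, and weight values are all hypothetical.

```python
def global_magnitude_masks(layers, sparsity):
    """Build 0/1 masks that prune the smallest-magnitude weights across all layers.

    layers: dict mapping a layer name to a flat list of weights.
    sparsity: fraction of all weights to remove (0.0 to 1.0).
    Ties at the threshold are pruned too, so slightly more may be removed.
    """
    magnitudes = sorted(abs(w) for ws in layers.values() for w in ws)
    n_prune = int(sparsity * len(magnitudes))
    # Weights with magnitude <= threshold are pruned; -1.0 prunes nothing.
    threshold = magnitudes[n_prune - 1] if n_prune > 0 else -1.0
    return {
        name: [1 if abs(w) > threshold else 0 for w in ws]
        for name, ws in layers.items()
    }

def layer_densities(masks):
    """Fraction of weights kept in each layer."""
    return {name: sum(m) / len(m) for name, m in masks.items()}

# Made-up weights for two layers, pruned to 50% overall sparsity.
toy_layers = {
    "conv1": [0.9, -0.8, 0.7, 0.05],
    "fc": [0.01, -0.02, 0.03, 0.6],
}
toy_masks = global_magnitude_masks(toy_layers, sparsity=0.5)
toy_density = layer_densities(toy_masks)
```

In this toy case the overall sparsity is 50%, yet `conv1` keeps 75% of its weights and `fc` keeps 25%, which shows how a single sparsity level can hide very different per-layer densities.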

Paper Structure (26 sections, 1 equation, 14 figures)

Introduction
Related work
Sparse Neural Networks
Learning on Hard Samples
Methodology and Evaluation
Sparse Neural Networks
Experiments on Samples with Intrinsic Complexity
EL2N Scores Based Measurements
Experiments on Samples with External Perturbing
Samples with Common Corruptions
Samples with Adversarial Attack
Empirical Analysis and Discussion
Which Layers Are Getting Sparsified?
The Role of Layer-wise Density Ratios in SNN Performance
SNNs Win Twice When Learned from Hard Examples
...and 11 more sections

Figures (14)

Figure 1: Examples of five CIFAR-100 training images for two randomly selected classes (apple and bus), showcasing those with the higher and lower EL2N scores. Images with lower scores typically feature simpler backgrounds and clear objects, whereas those with higher scores frequently display complex backgrounds or color biases.
Figure 2: Comparison of dense models and SNNs trained with EL2N score-filtered samples on CIFAR-100 and TinyImageNet, with sparsity ratios from 10% to 90%. Sub-figures (a) and (b) display results from training with the top 50% filtered samples, while sub-figures (c) and (d) show results from the top 30% filtered samples.
Figure 3: Comparison across density ratios and data ratios, each ranging from 10% to 90%, on the CIFAR-100 dataset using ResNet18.
Figure 4: Comparison of dense models and SNNs trained on samples with common corruptions on the CIFAR-100 and TinyImageNet datasets, with sparsity ratios ranging from 10% to 90%. Sub-figures (a) and (b) show experiments conducted on the full data volume, while sub-figures (c) and (d) are conducted on a 30% data ratio.
Figure 5: Comparison of dense models and SNNs trained with samples with common corruptions using VGG19 on CIFAR-100, under full data volume (a) and 30% data volume (b). This comparison spans a range of sparsity ratios, from 10% to 90%.
...and 9 more figures
