LCANets++: Robust Audio Classification using Multi-layer Neural Networks with Lateral Competition

Sayanton V. Dibbo, Juston S. Moore, Garrett T. Kenyon, Michael A. Teti

TL;DR

This work introduces LCANets++, CNNs that perform sparse coding in multiple layers via LCA. They are more robust than standard CNNs and LCANets against perturbations (e.g., background noise) as well as black-box and white-box attacks.

Abstract

Audio classification aims to recognize audio signals such as speech commands or sound events. However, current audio classifiers are susceptible to perturbations and adversarial attacks. In addition, real-world audio classification tasks often suffer from limited labeled data. To help bridge these gaps, previous work developed neuro-inspired convolutional neural networks (CNNs) with sparse coding via the Locally Competitive Algorithm (LCA) in the first layer (i.e., LCANets) for computer vision. LCANets learn through a combination of supervised and unsupervised learning, reducing their dependency on labeled samples. Motivated by the fact that activity in the auditory cortex is also sparse, we extend LCANets to audio recognition tasks and introduce LCANets++, which are CNNs that perform sparse coding in multiple layers via LCA. We demonstrate that LCANets++ are more robust than standard CNNs and LCANets against perturbations, e.g., background noise, as well as black-box and white-box attacks, e.g., evasion and fast gradient sign method (FGSM) attacks.
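
To make the idea of sparse coding with lateral competition concrete, the following is a minimal sketch of the standard LCA dynamics (membrane potentials driven by the input, inhibited by other active units, and soft-thresholded into activations). It is an illustration only and not the authors' implementation: it uses a small fully connected dictionary rather than the convolutional layers of LCANets++, and the function names and the parameters `lam`, `tau`, and `n_steps` are assumptions chosen for this example.

```python
def soft_threshold(u, lam):
    """Shrink u toward zero by lam; values within [-lam, lam] become 0."""
    if u > lam:
        return u - lam
    if u < -lam:
        return u + lam
    return 0.0


def lca_sparse_code(x, dictionary, lam=0.1, tau=10.0, n_steps=500):
    """Sparse-code x with Euler-discretized LCA dynamics (toy sketch).

    x: input vector as a list of floats.
    dictionary: list of unit-norm atoms, each a list as long as x.
    lam, tau, n_steps: sparsity threshold, time constant, and number of
    iterations (hypothetical defaults for illustration).
    Returns one activation per atom.
    """
    def dot(p, q):
        return sum(pi * qi for pi, qi in zip(p, q))

    n_atoms = len(dictionary)
    # Feed-forward drive: how well each atom matches the input.
    drive = [dot(atom, x) for atom in dictionary]
    # Lateral weights: overlap between every pair of atoms.
    gram = [[dot(dictionary[i], dictionary[j]) for j in range(n_atoms)]
            for i in range(n_atoms)]

    u = [0.0] * n_atoms  # membrane potentials
    for _ in range(n_steps):
        a = [soft_threshold(ui, lam) for ui in u]
        new_u = []
        for i in range(n_atoms):
            # Active units inhibit others in proportion to their overlap.
            inhibition = sum(gram[i][j] * a[j]
                             for j in range(n_atoms) if j != i)
            new_u.append(u[i] + (drive[i] - u[i] - inhibition) / tau)
        u = new_u
    return [soft_threshold(ui, lam) for ui in u]
```

For instance, with the atoms `[1, 0]`, `[0, 1]`, and `[0.6, 0.8]` and the input `[0.6, 0.8]`, all three atoms receive positive drive, but the competition should leave only the third atom active once the dynamics settle.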

Paper Structure

This paper contains 19 sections, 2 equations, 3 figures, and 4 tables.

Figures (3)

  • Figure 1: An overview of (a) the LCA frontend and (b) the pipeline of our proposed LCANets++, which uses sparse coding via multiple LCA layers in a state-of-the-art (SOTA) CNN backbone and thereby lowers misclassification on perturbed test sets and under attacks.
  • Figure 2: Comparison of LCANets++ and other SOTA models under perturbations with background noise.
  • Figure 3: Comparisons of LCANets++ and SOTA models on $L_\infty$ norm white-box attacks.
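
The $L_\infty$ white-box setting referred to in Figure 3 can be illustrated with the fast gradient sign method named in the abstract, which moves every input feature by a fixed step `epsilon` in the direction that increases the loss. The sketch below applies it to a toy logistic-regression classifier rather than to the audio models evaluated in the paper; the model, the function names, and all parameter values are assumptions made for this example.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fgsm_perturb(x, y, weights, bias, epsilon):
    """Return x + epsilon * sign(dLoss/dx) for a toy logistic-regression model.

    x: input features; y: true label (0 or 1).
    weights, bias: parameters of the hypothetical classifier.
    epsilon: L-infinity budget, i.e. the largest change to any one feature.
    """
    p = sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)
    # For binary cross-entropy, the gradient w.r.t. the input is (p - y) * w.
    grad = [(p - y) * w for w in weights]
    # Integer sign of each gradient entry: -1, 0, or +1.
    signs = [(g > 0) - (g < 0) for g in grad]
    return [xi + epsilon * s for xi, s in zip(x, signs)]
```

Because only the sign of the gradient is used, no feature changes by more than `epsilon`, which is what makes this an $L_\infty$-bounded perturbation.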