Property Neurons in Self-Supervised Speech Transformers

Tzu-Quan Lin; Guan-Ting Lin; Hung-yi Lee; Hao Tang

TL;DR

This work identifies a set of property neurons in the feedforward layers of Transformers to study how speech-related properties, such as phones, gender, and pitch, are stored, and shows that protecting property neurons during pruning is significantly more effective than norm-based pruning.

Abstract

Many studies have analyzed self-supervised speech Transformers, in particular through layer-wise analysis. It is, however, desirable to have an approach that can pinpoint exactly the subset of neurons responsible for a particular property of speech, since such an approach lends itself to model pruning and model editing. In this work, we identify a set of property neurons in the feedforward layers of Transformers to study how speech-related properties, such as phones, gender, and pitch, are stored. When the neurons of a particular property are removed (a simple form of model editing), the respective downstream performance degrades significantly, showing the importance of the property neurons. We apply this approach to pruning the feedforward layers of Transformers, where most of the model parameters reside. We show that protecting property neurons during pruning is significantly more effective than norm-based pruning. The code for identifying property neurons is available at https://github.com/nervjack2/PropertyNeurons.
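The idea of protecting property neurons during norm-based pruning can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `select_neurons_to_prune`, the use of one scalar norm per feedforward neuron, and the toy values are all assumptions made for illustration.

```python
from typing import Iterable, List, Set


def select_neurons_to_prune(
    norms: List[float], protected: Iterable[int], num_to_prune: int
) -> Set[int]:
    """Pick the lowest-norm neurons to prune, never touching protected ones.

    Hypothetical helper: `norms` holds one scalar per feedforward neuron,
    and `protected` holds the indices treated as property neurons.
    """
    protected_set = set(protected)
    # Only neurons that are not marked as property neurons are candidates.
    candidates = [i for i in range(len(norms)) if i not in protected_set]
    # Norm-based criterion: the smallest norm is pruned first.
    candidates.sort(key=lambda i: norms[i])
    return set(candidates[:num_to_prune])


# Toy example with five neurons (values are made up).
norms = [0.9, 0.1, 0.5, 0.2, 0.7]
plain = select_neurons_to_prune(norms, protected=[], num_to_prune=2)
guarded = select_neurons_to_prune(norms, protected=[1], num_to_prune=2)
print(sorted(plain))    # [1, 3]
print(sorted(guarded))  # [2, 3]
```

In the toy example, neuron 1 has the smallest norm and would be pruned first by a plain norm-based criterion; marking it as protected shifts the pruning budget to the next lowest-norm neuron.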

Paper Structure (17 sections, 5 equations, 9 figures, 1 table)

Introduction
Feedforward layers of Transformers
Neuron activations
A definition of activation
Activation patterns of properties
Layer-wise analysis of activation patterns
Layer-wise analysis of other speech models
Property Neurons
Finding Property Neurons
Do property neurons really encode property?
How many property neurons are there?
Some neurons encode more than one property
Application of Property Neurons
Improving task-specific pruning
Erase speaker information for privacy
...and 2 more sections

Figures (9)

Figure 1: An illustration of how feedforward networks in Transformers can be regarded as a type of neural memory.
Figure 2: The probability that each neuron is activated when the phone [ah] is present. The neurons are sorted by this probability.
Figure 3: The results of multidimensional scaling on the activation patterns of phones conditioned on broad phone classes, gender, and pitch. Different colors represent different groups. For each condition, we show the layer with the highest silhouette score (Rousseeuw, 1987), i.e., the 8th layer, the 1st layer, and the 1st layer, respectively. We consider [r], [y], [w], and [l] as voiced consonants here.
Figure 4: The result of performing multidimensional scaling on the activation patterns of phones for different properties of speech. We report the silhouette score to measure cluster tightness.
Figure 5: The silhouette score of multidimensional scaling on the activation patterns of phones for different speech models. We report the highest score among all layers for each model and each property. MelHuBERT-PR and MelHuBERT-SID denote MelHuBERT fine-tuned on phoneme recognition and speaker identification, respectively.
...and 4 more figures
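The quantity plotted in Figure 2 can be sketched as follows. This is a hedged illustration, not the paper's code: it assumes a neuron counts as "activated" on a frame when its value is greater than zero (the paper's own definition of activation may differ), and the function name `activation_probability` and the toy data are hypothetical.

```python
from typing import List, Sequence


def activation_probability(
    activations: Sequence[Sequence[float]], labels: Sequence[str], phone: str
) -> List[float]:
    """Fraction of frames labeled `phone` on which each neuron is active.

    Assumption: a neuron is "active" on a frame when its value is > 0.
    """
    frames = [a for a, lab in zip(activations, labels) if lab == phone]
    if not frames:
        raise ValueError(f"no frames labeled {phone!r}")
    num_neurons = len(frames[0])
    return [
        sum(1 for f in frames if f[j] > 0) / len(frames)
        for j in range(num_neurons)
    ]


# Toy data: four frames, three neurons (values are made up).
acts = [
    [0.5, 0.0, 1.2],
    [0.3, 0.0, 0.4],
    [0.0, 2.0, 0.1],
    [0.7, 0.4, 0.0],
]
labels = ["ah", "ah", "s", "ah"]

probs = activation_probability(acts, labels, "ah")
# Sort neuron indices from most to least frequently active.
ranked = sorted(range(len(probs)), key=lambda j: probs[j], reverse=True)
print(ranked)  # [0, 2, 1]
```

Sorting neurons by this per-phone probability gives the kind of ordering the caption describes; the resulting vector of probabilities is one possible reading of an "activation pattern" for a phone.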
