Robust Text Classification: Analyzing Prototype-Based Networks

Zhivar Sourati, Darshan Deshpande, Filip Ilievski, Kiril Gashteovski, Sascha Saralajew

TL;DR

This paper studies whether the robustness properties of Prototype-Based Networks (PBNs) transfer to text classification under both targeted and static adversarial attack settings. It finds that PBNs, as a mere architectural variation of vanilla LMs, are more robust than vanilla LMs in both settings.

Abstract

Downstream applications often require text classification models to be accurate and robust. While the accuracy of state-of-the-art Language Models (LMs) approximates human performance, they often exhibit a drop in performance on noisy data found in the real world. This lack of robustness can be concerning, as even small perturbations in the text, irrelevant to the target task, can cause classifiers to incorrectly change their predictions. A potential solution is the family of Prototype-Based Networks (PBNs), which classify examples based on their similarity to prototypical examples of a class (prototypes) and have been shown to be robust to noise for computer vision tasks. In this paper, we study whether the robustness properties of PBNs transfer to text classification tasks under both targeted and static adversarial attack settings. Our results show that PBNs, as a mere architectural variation of vanilla LMs, offer more robustness than vanilla LMs under both targeted and static settings. We showcase how PBNs' interpretability can help us understand PBNs' robustness properties. Finally, our ablation studies reveal the sensitivity of PBNs' robustness to how strictly clustering is done in the training phase, as tighter clustering results in less robust PBNs.

Paper Structure (52 sections, 5 equations, 16 figures, 10 tables)

Figures (16)

  • Figure 1: Classification by a PBN. The model computes distances between the new point and the prototypes, $d(e_j, P_k)$, and distances between prototypes, $d(P_k, P_l)$, for use in inference and training. During training, the model minimizes the loss $\mathcal{L}$, which consists of the terms $\mathcal{L}_{ce}$, $\lambda_c \mathcal{L}_{c}$, $\lambda_i \mathcal{L}_{i}$, and $\lambda_s \mathcal{L}_{s}$, controlling the importance of accuracy, clustering, interpretability, and separation of prototypes, respectively, based on all the computed distances; during inference, a fully connected layer classifies the new point from its distances to the prototypes.
  • Figure 2: Attack Success Rate (ASR %) of PBNs with different $\lambda_c$ values adjusting the importance of clustering in the trained PBNs, with other hyperparameters set to their best values, and averaged across other possible variables (e.g., backbone and attack type). The dashed line represents the ASR for the non-PBN model.
  • Figure 3: Attack Success Rate (ASR %) of PBNs with different numbers of prototypes, with other hyperparameters set to their best values, and averaged across other possible variables (e.g., backbone and attack type). The dashed line represents the ASR for the non-PBN model.
  • Figure 4: Attack Success Rate (ASR %) of PBNs with different distance functions, with other hyperparameters set to their best values, and averaged across other possible variables (e.g., backbone and attack type). The dashed line represents the ASR for the vanilla LMs.
  • Figure 5: Average Percentage of Words Perturbed (APWP) of PBNs with different distance functions, with other hyperparameters set to their best values, and averaged across other possible variables (e.g., backbone and attack type). The dashed line represents the APWP for the vanilla LMs.
  • ...and 11 more figures
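
The pipeline described in the Figure 1 caption can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the caption only names the loss terms, so the Euclidean distance, the concrete forms of $\mathcal{L}_{c}$, $\mathcal{L}_{i}$, and $\mathcal{L}_{s}$, the plain weighted sum for $\mathcal{L}$, and all function names and toy values below are hypothetical choices. Embeddings are taken as given; in the paper they would come from an LM backbone.

```python
import math

def distance(a, b):
    """Euclidean distance (an assumed choice of d) between two vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def classify(embedding, prototypes, weights, bias):
    """Inference: a fully connected layer over the distances d(e_j, P_k)."""
    dists = [distance(embedding, p) for p in prototypes]
    logits = [sum(w * d for w, d in zip(row, dists)) + b
              for row, b in zip(weights, bias)]
    predicted = max(range(len(logits)), key=lambda c: logits[c])
    return predicted, logits

def cross_entropy(logits, label):
    """L_ce for one example, computed with a stable log-sum-exp."""
    m = max(logits)
    log_norm = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_norm - logits[label]

def clustering_loss(embeddings, prototypes):
    """Assumed L_c: every example should lie near some prototype."""
    return sum(min(distance(e, p) for p in prototypes)
               for e in embeddings) / len(embeddings)

def interpretability_loss(embeddings, prototypes):
    """Assumed L_i: every prototype should lie near some example."""
    return sum(min(distance(p, e) for e in embeddings)
               for p in prototypes) / len(prototypes)

def separation_loss(prototypes):
    """Assumed L_s: negative mean pairwise distance d(P_k, P_l)."""
    pairs = [(k, l) for k in range(len(prototypes))
             for l in range(k + 1, len(prototypes))]
    if not pairs:
        return 0.0
    return -sum(distance(prototypes[k], prototypes[l])
                for k, l in pairs) / len(pairs)

def total_loss(embeddings, labels, prototypes, weights, bias,
               lambda_c, lambda_i, lambda_s):
    """Assumed combination: L = L_ce + l_c*L_c + l_i*L_i + l_s*L_s."""
    ce = sum(cross_entropy(classify(e, prototypes, weights, bias)[1], y)
             for e, y in zip(embeddings, labels)) / len(embeddings)
    return (ce
            + lambda_c * clustering_loss(embeddings, prototypes)
            + lambda_i * interpretability_loss(embeddings, prototypes)
            + lambda_s * separation_loss(prototypes))

# Toy setup: two prototypes, one per class; negative weights mean
# "a smaller distance to a class's prototype gives a larger logit".
prototypes = [[0.0, 0.0], [4.0, 0.0]]
weights = [[-1.0, 0.0], [0.0, -1.0]]
bias = [0.0, 0.0]
embeddings = [[1.0, 0.0], [3.0, 0.0]]
labels = [0, 1]

predicted, logits = classify(embeddings[0], prototypes, weights, bias)
loss = total_loss(embeddings, labels, prototypes, weights, bias,
                  lambda_c=0.5, lambda_i=0.5, lambda_s=0.1)
```

In this toy setup the first embedding is closer to the first prototype, so it is assigned class 0. Raising `lambda_c` in such a sketch pulls examples more tightly around prototypes, which is the knob the abstract's clustering ablation varies.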