BinarySelect to Improve Accessibility of Black-Box Attack Research

Shatarupa Ghosh, Jonathan Rusert

TL;DR

The paper tackles the high query cost of black-box adversarial text attacks by introducing BinarySelect, a binary-search-inspired word selection method with a memory-augmented tree structure to reduce queries. Theoretical analysis shows best-case query complexity of $2\log_2(n)$, while empirical validation across Yelp, IMDB, and AG News demonstrates substantial query reductions with modest drops in attack effectiveness. Results indicate BinarySelect is especially beneficial for longer texts and resource-constrained researchers, and the authors also explore character-level attacks and hybrid strategies with GreedySelect. The work highlights a practical path toward more accessible adversarial research and suggests avenues for combining selection methods and improving replacement strategies.

Abstract

Adversarial text attack research is useful for testing the robustness of NLP models; however, the rise of transformers has greatly increased the time required to test attacks, especially when researchers do not have access to adequate resources (e.g., GPUs). This can hinder attack research, as modifying one example for an attack can require hundreds of queries to a model, especially for black-box attacks. Often these attacks remove one token at a time to find the ideal one to change, requiring $n$ queries (where $n$ is the length of the text) right away. We propose a more efficient selection method called BinarySelect, which combines binary search and attack selection methods to greatly reduce the number of queries needed to find a token. We find that BinarySelect needs only $2\log_2(n)$ queries to find the first token, compared to $n$ queries. We also test BinarySelect in an attack setting against 5 classifiers across 3 datasets and find a viable tradeoff between number of queries saved and attack effectiveness. For example, on the Yelp dataset, the number of queries is reduced by 32% (72 fewer) with a drop in attack effectiveness of only 5 points. We believe that BinarySelect can help future researchers study adversarial attacks and black-box problems more efficiently and opens the door for researchers with access to fewer resources.
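
To make the query counts in the abstract concrete, the short sketch below compares the $n$ queries of one-token-at-a-time removal with the two-queries-per-split count of a binary search over the text. The function names are hypothetical, and rounding the depth up to $\lceil \log_2(n) \rceil$ for lengths that are not powers of two is an assumption made here for illustration, not something the abstract specifies.

```python
def greedy_first_token_queries(n: int) -> int:
    """Queries needed when each of the n tokens is removed once and scored."""
    if n < 1:
        raise ValueError("text must contain at least one token")
    return n


def binary_first_token_queries(n: int) -> int:
    """Queries needed when the text is halved until one token remains.

    Each split costs 2 queries (one per excluded half).
    Assumption: the number of splits is ceil(log2(n)).
    """
    if n < 1:
        raise ValueError("text must contain at least one token")
    depth = (n - 1).bit_length()  # integer form of ceil(log2(n))
    return 2 * depth


if __name__ == "__main__":
    for n in (16, 128, 1024):
        print(n, greedy_first_token_queries(n), binary_first_token_queries(n))
```

For a 128-token text this gives 128 queries for the one-at-a-time approach against 14 for the binary approach, which illustrates why the gap widens for longer texts.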

Paper Structure

This paper contains 33 sections, 6 equations, 5 figures, 11 tables, 2 algorithms.

Figures (5)

  • Figure 1: Visualization of GreedySelect versus BinarySelect. GreedySelect removes 1 word at a time and checks the change in probability. BinarySelect continuously splits the text in 2 and excludes the segments from the query. The excluded segment which causes the highest drop in target class probability is split again and so on. Eventually, the splitting leaves only 1 word which is chosen.
  • Figure 2: Visualization of Binary Tree leveraged to store the probabilities returned from BinarySelect. Note that the values indicate probability of target class. After the root node, the probabilities are calculated by removing the text at that node from the original text.
  • Figure 3: Number of queries required to find a word leveraged by the classifier. GreedySelect's queries increase linearly, while BinarySelect shows a log trend.
  • Figure 4: Effect of $k$ values on EDR for the successful attacks. Positive values indicate that BinarySelect (BS) offers a better trade-off between the reduction in queries and the loss in accuracy drop.
  • Figure 5: Effect of $k$ values on EDR for all attacks. Positive values indicate that BinarySelect (BS) offers a better trade-off between the reduction in queries and the loss in accuracy drop.
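
The selection procedures described in the Figure 1 caption can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: `toy_prob` is a made-up stand-in for a black-box classifier's target-class probability, the function names are hypothetical, and the probability-storing binary tree from Figure 2 is omitted, so the sketch only finds the first word.

```python
from typing import Callable, List, Tuple

ProbFn = Callable[[List[str]], float]


def greedy_select(words: List[str], prob_fn: ProbFn) -> Tuple[int, int]:
    """Remove one word at a time; return (index of most influential word, queries)."""
    best_idx, best_prob, queries = 0, float("inf"), 0
    for i in range(len(words)):
        p = prob_fn(words[:i] + words[i + 1:])  # query with word i removed
        queries += 1
        if p < best_prob:  # lowest remaining probability = largest drop
            best_idx, best_prob = i, p
    return best_idx, queries


def binary_select(words: List[str], prob_fn: ProbFn) -> Tuple[int, int]:
    """Repeatedly halve the active segment, following the half whose
    exclusion lowers the target-class probability the most."""
    lo, hi, queries = 0, len(words), 0
    while hi - lo > 1:
        mid = (lo + hi) // 2
        p_without_left = prob_fn(words[:lo] + words[mid:])   # exclude [lo, mid)
        p_without_right = prob_fn(words[:mid] + words[hi:])  # exclude [mid, hi)
        queries += 2
        if p_without_left <= p_without_right:
            hi = mid  # the left half matters more; search inside it
        else:
            lo = mid  # the right half matters more; search inside it
    return lo, queries


def toy_prob(words: List[str]) -> float:
    """Hypothetical classifier: confident only while the word 'awful' is present."""
    return 0.9 if "awful" in words else 0.2


words = "the food was fine but awful service today".split()
greedy_result = greedy_select(words, toy_prob)
binary_result = binary_select(words, toy_prob)
```

On this 8-word toy input both procedures pick the word at index 5 ("awful"), with 8 queries for the greedy loop and 6 for the binary search. A real classifier spreads influence across many words, so the two methods need not agree in general.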