PivotAttack: Rethinking the Search Trajectory in Hard-Label Text Attacks via Pivot Words

Yuzhi Liang; Shiliang Xiao; Jingsong Wei; Qiliang Lin; Xia Li

TL;DR

PivotAttack is a query-efficient "inside-out" framework for hard-label text attacks. It uses a Multi-Armed Bandit algorithm to identify Pivot Sets (combinatorial token groups that act as prediction anchors) and strategically perturbs them to induce label flips, capturing inter-word dependencies while minimizing query costs.

Abstract

Existing hard-label text attacks often rely on inefficient "outside-in" strategies that traverse vast search spaces. We propose PivotAttack, a query-efficient "inside-out" framework. It employs a Multi-Armed Bandit algorithm to identify Pivot Sets, combinatorial token groups acting as prediction anchors, and strategically perturbs them to induce label flips. This approach captures inter-word dependencies and minimizes query costs. Extensive experiments across traditional models and Large Language Models demonstrate that PivotAttack consistently outperforms state-of-the-art baselines in both Attack Success Rate and query efficiency.
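To make the idea of bandit-driven pivot identification concrete, here is a minimal Python sketch. It is not the authors' implementation: the abstract does not say which bandit algorithm, reward, or masking scheme PivotAttack uses, so UCB1, the `[MASK]`-based retention reward, the exhaustive enumeration of candidate sets, and every name below (`toy_classifier`, `retention_reward`, `ucb1_pivot_search`) are illustrative assumptions. Each candidate token set is treated as an arm, and one hard-label query yields a reward of 1 if keeping that set (while randomly masking the rest) preserves the original label.

```python
import math
import random
from itertools import combinations


def toy_classifier(tokens):
    """Hypothetical hard-label victim: returns a label only, never scores.

    Predicts positive (1) only when "not" and "bad" co-occur, so the
    prediction depends on a word pair rather than on any single word.
    """
    return 1 if ("not" in tokens and "bad" in tokens) else 0


def retention_reward(tokens, pivot, label, classify, rng, keep_prob=0.5):
    """Spend one query: keep the pivot tokens, randomly mask the rest,
    and return 1.0 if the original label survives, else 0.0."""
    sample = [
        tok if (i in pivot or rng.random() < keep_prob) else "[MASK]"
        for i, tok in enumerate(tokens)
    ]
    return 1.0 if classify(sample) == label else 0.0


def ucb1_pivot_search(tokens, classify, max_size=2, budget=600, seed=0):
    """Treat each candidate token set as a bandit arm and use UCB1 (an
    assumed stand-in) to find the set that most reliably keeps the label."""
    rng = random.Random(seed)
    label = classify(tokens)  # one extra query for the original prediction
    arms = [
        frozenset(c)
        for k in range(1, max_size + 1)
        for c in combinations(range(len(tokens)), k)
    ]
    if budget < len(arms):
        raise ValueError("budget must cover at least one pull per arm")
    pulls = [0] * len(arms)
    wins = [0.0] * len(arms)

    for t in range(1, budget + 1):
        if t <= len(arms):
            arm = t - 1  # initialization: pull every arm once
        else:
            # UCB1 index: empirical mean plus exploration bonus
            arm = max(
                range(len(arms)),
                key=lambda i: wins[i] / pulls[i]
                + math.sqrt(2.0 * math.log(t) / pulls[i]),
            )
        wins[arm] += retention_reward(tokens, arms[arm], label, classify, rng)
        pulls[arm] += 1

    # Highest empirical retention precision; ties go to the most-pulled arm.
    best = max(range(len(arms)), key=lambda i: (wins[i] / pulls[i], pulls[i]))
    return sorted(arms[best]), wins[best] / pulls[best]


if __name__ == "__main__":
    sentence = "the film was not bad at all".split()
    pivot, precision = ucb1_pivot_search(sentence, toy_classifier)
    print([sentence[i] for i in pivot], precision)
```

In this toy setting no single word anchors the prediction, so only a two-token arm can reach full retention precision, which is the kind of inter-word dependency the abstract refers to. Enumerating every candidate set is only viable for very short inputs; a practical attack would need to prune the candidate space before spending queries on it.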

Paper Structure (27 sections, 13 equations, 7 figures, 8 tables, 1 algorithm)

Introduction
Related Work
White-box and Soft-label Attacks.
Hard-label Black-box Attacks.
Problem Formulation
Methodology
Pivot Set Identification
Non-Actionable Attack Culling
Construction of Pivot Set
Perturbation Execution
Experiment
Experimental Setup
Overall Result (RQ1)
Query Budget (RQ2)
Transferability (RQ3)
...and 12 more sections

Figures (7)

Figure 1: The overall workflow of PivotAttack. The framework first identifies Pivot Sets that anchor the model’s prediction, selecting sets with high retention precision and refining them via a multi-armed bandit. It then generates substitutions for the pivot words and selects the variant most similar to the original sentence as the final adversarial example.
Figure 2: Pivot Set Identification on MR
Figure 3: ASR vs. Query Budget: MR (Qwen2.5-FT)
Figure 4: Human Evaluation
Figure 5: Zero-shot LLM prompts used in hard-label evaluation on different classes. It shows the exact prompts we use during evaluation.
...and 2 more figures
