EMPRA: Embedding Perturbation Rank Attack against Neural Ranking Models

Amin Bigdeli, Negar Arabzadeh, Ebrahim Bagheri, Charles L. A. Clarke

TL;DR

The paper addresses the vulnerability of neural ranking models to adversarial manipulation by proposing EMPRA, a surrogate-agnostic black-box attack that perturbs sentence-level embeddings to elevate targeted documents in rankings. EMPRA operates in two stages: first generating adversarial text through embedding-space transport and transformation toward query-related anchors, then constructing the adversarial document by strategically inserting these texts and selecting candidates via a coherence-and-relevance interpolated score. Across MS MARCO and additional out-of-distribution datasets, EMPRA achieves high attack efficacy (ASR > 99% in many settings, with substantial top-10/top-50 boosts) while preserving fluency and readability, outperforming baselines that rely on surrogate models. The results demonstrate EMPRA’s robustness and practicality in real-world black-box scenarios, highlighting the need for defenses against embedding-level perturbations in neural ranking systems.
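The second stage described above, selecting among candidate adversarial documents with a score that interpolates coherence and relevance, can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names, the convex-combination form, the choice of which term $\alpha$ weights, and the assumption that both scores are already normalized to a comparable range are all hypothetical. The summary only states that the score interpolates the two signals.

```python
from typing import List, Tuple

# Each candidate is (text, coherence_score, relevance_score).
# Both scores are assumed to be pre-computed and on a comparable scale.
Candidate = Tuple[str, float, float]


def interpolated_score(coherence: float, relevance: float, alpha: float) -> float:
    """Convex combination of the two signals.

    Assumption: alpha weights coherence and (1 - alpha) weights relevance.
    The source does not specify the exact form.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    return alpha * coherence + (1.0 - alpha) * relevance


def select_candidate(candidates: List[Candidate], alpha: float = 0.5) -> str:
    """Return the text of the candidate with the highest interpolated score."""
    if not candidates:
        raise ValueError("no candidates to select from")
    best = max(candidates, key=lambda c: interpolated_score(c[1], c[2], alpha))
    return best[0]


# Toy candidates with made-up scores, for illustration only.
toy = [
    ("fluent but off-topic", 0.9, 0.2),
    ("balanced", 0.5, 0.8),
    ("on-topic but awkward", 0.1, 0.9),
]
choice = select_candidate(toy, alpha=0.5)
```

Under this sketch, moving $\alpha$ toward 1 favors candidates that read naturally in context, while moving it toward 0 favors candidates that score higher on query relevance.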

Abstract

Recent research has shown that neural information retrieval techniques may be susceptible to adversarial attacks. Adversarial attacks seek to manipulate the ranking of documents, with the intention of exposing users to targeted content. In this paper, we introduce the Embedding Perturbation Rank Attack (EMPRA) method, a novel approach designed to perform adversarial attacks on black-box Neural Ranking Models (NRMs). EMPRA manipulates sentence-level embeddings, guiding them towards pertinent context related to the query while preserving semantic integrity. This process generates adversarial texts that seamlessly integrate with the original content and remain imperceptible to humans. Our extensive evaluation, conducted on the widely used MS MARCO V1 passage collection, demonstrates the effectiveness of EMPRA against a wide range of state-of-the-art baselines in promoting a specific set of target documents within a given ranked list. Specifically, EMPRA successfully re-ranks almost 96% of target documents originally ranked between 51 and 100 into the top 10. Furthermore, EMPRA does not depend on surrogate models for adversarial text generation, enhancing its robustness against different NRMs in realistic settings.
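To make the reported figures concrete, the sketch below shows one way such numbers could be computed from target-document ranks before and after an attack. It is an illustration under stated assumptions: the function name is hypothetical, the attack success rate is assumed to be the percentage of target documents whose rank improves, and the top-k figure is assumed to be the percentage that end up at rank k or better. The text above does not define these metrics, and the ranks in the example are invented.

```python
from typing import Dict, Sequence


def attack_metrics(
    ranks_before: Sequence[int],
    ranks_after: Sequence[int],
    k: int = 10,
) -> Dict[str, float]:
    """Summarize an attack over a set of target documents.

    Ranks are 1-based, so a smaller number means a better position.
    Assumed definitions (not taken from the source):
      asr   = percentage of targets whose rank strictly improved
      top_k = percentage of targets ranked at position k or better afterwards
    """
    if len(ranks_before) != len(ranks_after):
        raise ValueError("rank lists must have the same length")
    if not ranks_before:
        raise ValueError("at least one target document is required")

    n = len(ranks_before)
    improved = sum(1 for b, a in zip(ranks_before, ranks_after) if a < b)
    in_top_k = sum(1 for a in ranks_after if a <= k)
    return {"asr": 100.0 * improved / n, "top_k": 100.0 * in_top_k / n}


# Invented ranks for four targets that start between positions 51 and 100.
metrics = attack_metrics([60, 75, 90, 100], [3, 8, 40, 100], k=10)
```

In this toy run, three of the four targets move up and two of them land in the top 10, so the two values are 75.0 and 50.0 respectively.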

Paper Structure

This paper contains 31 sections, 13 equations, 3 figures, 7 tables, 1 algorithm.

Figures (3)

  • Figure 1: Prompt used by GPT-4 for generating adversarially perturbed documents.
  • Figure 2: Impact of the number of iterations.
  • Figure 3: Impact of the interpolation coefficient $\alpha$.