Ranking Manipulation for Conversational Search Engines

Samuel Pfrommer, Yatong Bai, Tanmay Gautam, Somayeh Sojoudi

TL;DR

This work introduces a focused dataset of real-world consumer product websites, formalizes conversational search ranking as an adversarial problem, and presents a tree-of-attacks-based jailbreaking technique that reliably promotes low-ranked products.

Abstract

Major search engine providers are rapidly incorporating Large Language Model (LLM)-generated content in response to user queries. These conversational search engines operate by loading retrieved website text into the LLM context for summarization and interpretation. Recent research demonstrates that LLMs are highly vulnerable to jailbreaking and prompt injection attacks, which disrupt the safety and quality goals of LLMs using adversarial strings. This work investigates the impact of prompt injections on the ranking order of sources referenced by conversational search engines. To this end, we introduce a focused dataset of real-world consumer product websites and formalize conversational search ranking as an adversarial problem. Experimentally, we analyze conversational search rankings in the absence of adversarial injections and show that different LLMs vary significantly in prioritizing product name, document content, and context position. We then present a tree-of-attacks-based jailbreaking technique which reliably promotes low-ranked products. Importantly, these attacks transfer effectively to state-of-the-art conversational search engines such as perplexity.ai. Given the strong financial incentive for website owners to boost their search ranking, we argue that our problem formulation is of critical importance for future robustness work.
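To make the notion of a ranking order over referenced sources concrete, the sketch below estimates per-product rank marginals and average ranks from repeated responses. It is an illustration only, not the authors' pipeline: the function names `rank_marginals` and `average_rank` are hypothetical, and it assumes each response has already been parsed into a complete ordering of the same products.

```python
def rank_marginals(responses, products):
    """Estimate P(product appears at rank i) from many sampled responses.

    responses: list of orderings, each a list of product names (first = top).
    Assumption: every ordering contains every product exactly once.
    """
    n = len(products)
    counts = {p: [0] * n for p in products}
    for order in responses:
        for position, product in enumerate(order):
            counts[product][position] += 1
    total = len(responses)
    return {p: [c / total for c in counts[p]] for p in products}


def average_rank(marginals):
    """Expected rank per product (1 = listed first)."""
    return {
        p: sum((i + 1) * prob for i, prob in enumerate(dist))
        for p, dist in marginals.items()
    }


# Toy data: four hypothetical responses over three placeholder products.
responses = [
    ["A", "B", "C"],
    ["B", "A", "C"],
    ["A", "C", "B"],
    ["A", "B", "C"],
]
marginals = rank_marginals(responses, ["A", "B", "C"])
averages = average_rank(marginals)
```

Under these assumptions, a successful injection for a promoted product would show up as probability mass shifting toward rank 1 in its marginal and a lower average rank.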

Paper Structure (32 sections, 7 equations, 14 figures, 2 tables)

Figures (14)

  • Figure 1: An overview of prompt injection for conversational search engines. By injecting an adversarial prompt into Product B's website content (left), the LLM context can be directly hijacked (center left). This leads to responses which tend to list Product B first (center right). Over many randomized responses, this means Product B is at the top of the ranking distribution (right).
  • Figure 2: Experiments regarding conversational search engine ranking tendencies. (a) Marginals of ranking distributions for tablets (GPT-4 Turbo). The Huawei and Samsung tablets tend to rank highly, whereas the CHUWI tablet ranks the lowest. Orange bars plot the adversarially manipulated distribution (see the section on ranking manipulation). (b) Average rankings of combinations of product name and supporting document (GPT-4 Turbo). The CHUWI document ranks poorly for most featured products, whereas the Samsung product is highly ranked when paired with any document besides the CHUWI document. (c) F-statistics for grouping by product and grouping by document, one scatter point per product category (GPT-4 Turbo). The model-wise upper $5$th percentile of points along either axis is excluded for readability. (d) Importance of product model and brand name, document content, and input context position in determining rank. The dot denotes the median F-statistic over $50$ product categories, with the range covering the first-to-third quartiles. The context position median ($\sim 127$) and upper quartile ($\sim 252$) for Mixtral 8x22 exceed the plot bounds.
  • Figure 3: Average rankings of promoted products before and after prompt injection. Sonar Large Online prompts are transferred from GPT-4 Turbo. For plotting purposes, $x$-axis natural scores are rounded to the nearest integer, with the center line reflecting the mean and the shaded area displaying half the standard deviation for readability.
  • Figure 4: Histogram of cosine similarities between arbitrary unperturbed document pairs and original-adversarial document pairs.
  • Figure 5: Average ranking scores for various combinations of document and product brand / model name. The product categories are beard trimmers (first column), shampoo (second column), and blenders (third column).
  • ...and 9 more figures
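
The Figure 2 caption compares F-statistics obtained by grouping rankings by product versus by document. This listing does not state how the statistic is computed, so the sketch below assumes a standard one-way ANOVA F-statistic; the helper names `f_statistic` and `group_scores` and the toy records are hypothetical.

```python
from collections import defaultdict


def f_statistic(groups):
    """One-way ANOVA F-statistic: between-group over within-group variance."""
    k = len(groups)
    n = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n
    means = [sum(g) / len(g) for g in groups]
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    return (ss_between / (k - 1)) / (ss_within / (n - k))


def group_scores(records, key_index):
    """Group rank scores by product name (index 0) or document (index 1)."""
    groups = defaultdict(list)
    for record in records:
        groups[record[key_index]].append(record[2])
    return list(groups.values())


# Toy records: (product name, document, average rank). Values are made up.
records = [
    ("P1", "D1", 1.0),
    ("P1", "D2", 1.5),
    ("P2", "D1", 3.0),
    ("P2", "D2", 3.5),
]
f_by_product = f_statistic(group_scores(records, 0))
f_by_document = f_statistic(group_scores(records, 1))
```

In this toy data the product name explains most of the variation in rank, so the F-statistic for grouping by product is much larger than the one for grouping by document. A large F-statistic along one axis would be read the same way in the caption's scatter plot, assuming the ANOVA interpretation holds.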

Theorems & Definitions (2)

  • Example 1
  • Remark 2