Exposing Citation Vulnerabilities in Generative Engines

Riku Mochizuki, Shusuke Komatsu, Souta Noguchi, Kazuto Ataka

TL;DR

This work investigates poisoning vulnerabilities in Generative Engines (GEs) that cite web content, introducing publisher-centric evaluation criteria based on the content-injection barrier to assess poisoning risk. It proposes a two-part methodology: (i) classify cited sources by publisher attributes (primary vs secondary categories) and (ii) measure how faithfully those publisher categories are reflected in generated answers using a similarity-based reflection metric. Applying the method to political questions in Japan and the U.S. across three GE models, the study finds a higher reliance on primary sources in Japan ($60$–$65\%$) than in the U.S. ($25$–$45\%$), and shows that low-barrier sources are frequently cited yet poorly reflected in answers, increasing poisoning risk. The authors discuss strategies for primary information providers to increase citation exposure, note language-specific differences in GEO patterns, and propose a manifest-based approach to tailor citation balance by domain and task to mitigate disinformation risks.

Abstract

We analyze answers generated by generative engines (GEs) from the perspectives of citation publishers and the content-injection barrier, defined as the difficulty for attackers to manipulate answers to user prompts by placing malicious content on the web. GEs integrate two functions: web search and answer generation that cites web pages using large language models. Because anyone can publish information on the web, GEs are vulnerable to poisoning attacks. Existing studies of citation evaluation focus on how faithfully answer content reflects cited sources, leaving unexamined which web sources should be selected as citations to defend against poisoning attacks. To fill this gap, we introduce evaluation criteria that assess poisoning threats using the citation information contained in answers. Our criteria classify the publisher attributes of citations to estimate the content-injection barrier, thereby revealing the threat of poisoning attacks in current GEs. We conduct experiments in political domains in Japan and the United States (U.S.) using our criteria and show that official party websites (primary sources) account for approximately \(25\%\)--\(45\%\) of citations in the U.S. and \(60\%\)--\(65\%\) in Japan, indicating that U.S. political answers are at higher risk of poisoning attacks. We also find that sources with low content-injection barriers are frequently cited yet poorly reflected in answer content. To mitigate this threat, we discuss how publishers of primary sources can increase exposure of their web content in answers and show that well-known techniques are limited by language differences.
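As a rough illustration of how a primary-source share like the ones quoted above could be computed, the sketch below tallies citations by publisher category. The category labels and the toy data are assumptions made for illustration only; they are not the authors' labeling scheme or results.

```python
from collections import Counter


def category_shares(labels):
    """Return the fraction of citations that fall in each publisher category."""
    counts = Counter(labels)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {category: n / total for category, n in counts.items()}


# Hypothetical labels for the citations found in a set of answers (toy data).
labels = ["primary", "secondary", "primary", "opponent"]
shares = category_shares(labels)
print(shares)
```

A lower `primary` share means more of the cited material comes from publishers other than the official party site, which is the situation the abstract associates with higher poisoning risk.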

Paper Structure

This paper contains 19 sections, 1 equation, 3 figures, 4 tables.

Figures (3)

  • Figure 1: Distribution of citation sources by party in the U.S. (a) and Japan (b): Each stacked bar shows the proportion of cited sources (primary, opponent, and secondary information sources categorized by attack cost) for responses generated by different APIs. For each party, the three adjacent bars correspond to results from OpenAI (left), Gemini (middle), and Claude (right). The three bars on the far right of each panel show aggregated results across all parties. The "opponent source" in the All Parties chart refers to citation sources from parties that are not the target party for each question. The numbers under each chart show the total number of citations it comprises.
  • Figure 2: Citation coverage by similarity level for Japanese and English political prompts across models: Each chart shows the distribution of citation source types (primary, opponent, and secondary information sources categorized by attack cost) across three similarity levels (low = [-1.0, 0.8], mid = (0.8, 0.9], and high = (0.9, 1.0]).
  • Figure 3: Comparison of the distributions of web structure features between Citations (left) and Sources (right) for U.S. parties (left two columns) and Japanese parties (right two columns), showing significant differences between cited and non-cited sources.
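
The similarity levels in the Figure 2 caption can be read as a simple binning rule. The following is a minimal sketch that assumes a similarity score in [-1.0, 1.0] (for example, a cosine similarity) has already been computed for each citation; the function name and the sample records are hypothetical, and how the paper computes the score itself is not shown here.

```python
from collections import Counter


def similarity_level(score):
    """Map a similarity score to the low/mid/high levels of the Figure 2 caption."""
    if not -1.0 <= score <= 1.0:
        raise ValueError("similarity score must lie in [-1.0, 1.0]")
    if score <= 0.8:
        return "low"   # [-1.0, 0.8]
    if score <= 0.9:
        return "mid"   # (0.8, 0.9]
    return "high"      # (0.9, 1.0]


# Hypothetical (source type, similarity score) pairs, for illustration only.
records = [("primary", 0.93), ("secondary", 0.72), ("opponent", 0.85)]
by_level = Counter((similarity_level(score), source) for source, score in records)
print(by_level)
```

Counting source types within each level in this way gives the kind of per-level breakdown that the Figure 2 charts display.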