Exposing Citation Vulnerabilities in Generative Engines
Riku Mochizuki, Shusuke Komatsu, Souta Noguchi, Kazuto Ataka
TL;DR
This work investigates poisoning vulnerabilities in Generative Engines (GEs) that cite web content, introducing publisher-centric evaluation criteria based on the content-injection barrier to assess poisoning risk. It proposes a two-part methodology: (i) classify cited sources by publisher attributes (primary vs. secondary categories) and (ii) measure how faithfully those publisher categories are reflected in generated answers using a similarity-based reflection metric. Applying the method to political questions in Japan and the U.S. across three GE models, the study finds a higher reliance on primary sources in Japan ($60$–$65\%$) than in the U.S. ($25$–$45\%$), and shows that low-barrier sources are frequently cited yet poorly reflected in answers, which raises poisoning risk. The authors discuss strategies for primary information providers to increase citation exposure, note language-specific differences in GEO patterns, and propose a manifest-based approach to tailor citation balance by domain and task to mitigate disinformation risks.
Abstract
We analyze answers generated by generative engines (GEs) from the perspectives of citation publishers and the content-injection barrier, defined as the difficulty for attackers to manipulate answers to user prompts by placing malicious content on the web. GEs integrate two functions: web search and answer generation that cites web pages using large language models. Because anyone can publish information on the web, GEs are vulnerable to poisoning attacks. Existing studies of citation evaluation focus on how faithfully answer content reflects cited sources, leaving unexamined which web sources should be selected as citations to defend against poisoning attacks. To fill this gap, we introduce evaluation criteria that assess poisoning threats using the citation information contained in answers. Our criteria classify the publisher attributes of citations to estimate the content-injection barrier, thereby revealing the threat of poisoning attacks in current GEs. We conduct experiments in political domains in Japan and the United States (U.S.) using our criteria and show that official party websites (primary sources) account for approximately \(25\%\)--\(45\%\) of citations in the U.S. and \(60\%\)--\(65\%\) in Japan, indicating that U.S. political answers are at higher risk of poisoning attacks. We also find that sources with low content-injection barriers are frequently cited yet are poorly reflected in answer content. To mitigate this threat, we discuss how publishers of primary sources can increase exposure of their web content in answers and show that well-known techniques are limited by language differences.
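
To make the publisher-based criterion concrete, the following is a minimal sketch of how the share of primary-source citations in a single answer might be computed. It is not the authors' implementation: the domain-to-category table `PUBLISHER_CATEGORY`, the `.example` domains, and the helper names `classify` and `primary_share` are all hypothetical, and a real study would need a curated publisher taxonomy rather than a hard-coded dictionary.

```python
from collections import Counter
from urllib.parse import urlparse

# Hypothetical mapping from web domain to publisher category.
# Assumption for illustration only; not taken from the paper.
PUBLISHER_CATEGORY = {
    "party-a.example": "primary",    # e.g. an official party website
    "party-b.example": "primary",
    "news.example": "secondary",     # e.g. a news outlet
    "blog.example": "secondary",     # e.g. a personal blog
}


def classify(url: str) -> str:
    """Return the publisher category of a cited URL, or 'unknown'."""
    host = urlparse(url).hostname or ""
    if host.startswith("www."):
        host = host[4:]
    return PUBLISHER_CATEGORY.get(host, "unknown")


def primary_share(urls) -> float:
    """Fraction of cited URLs whose publisher is classified as primary."""
    counts = Counter(classify(u) for u in urls)
    total = sum(counts.values())
    return counts["primary"] / total if total else 0.0


# Made-up citation list for one generated answer.
citations = [
    "https://www.party-a.example/policy",
    "https://news.example/article/1",
    "https://blog.example/post",
    "https://party-b.example/manifesto",
]
print(f"primary share: {primary_share(citations):.0%}")
```

Under these assumptions, averaging `primary_share` over many answers would give a figure comparable in kind to the percentages reported above, although the sketch ignores how well each cited source is actually reflected in the answer text.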
