When LLMs Go Online: The Emerging Threat of Web-Enabled LLMs
Hanna Kim, Minkyoo Song, Seung Ho Na, Seungwon Shin, Kimin Lee
TL;DR
This work investigates the misuse potential of web-enabled LLM agents in cyberattacks that leverage personal data. It systematically evaluates three hallmark attacks—PII collection, impersonation post generation, and spear-phishing emails—across public models (GPT-4o, Claude 3.5 Sonnet, Gemini 1.5 Flash) using WebSearch and WebNav tools. Key findings show substantial capabilities when web tools are enabled (e.g., up to 535.6 PII items for professors, impersonation authenticity up to 93.9%, and phishing link CTRs up to 46.67%), while existing safeguards can be bypassed in some configurations, revealing significant security gaps. The paper underscores the urgent need for robust, scalable defenses and policy improvements to curb the malicious use of LLM agents.
Abstract
Recent advancements in Large Language Models (LLMs) have established them as agentic systems capable of planning and interacting with various tools. These LLM agents are often paired with web-based tools, enabling access to diverse sources and real-time information. Although these advancements offer significant benefits across various applications, they also increase the risk of malicious use, particularly in cyberattacks involving personal information. In this work, we investigate the risks associated with the misuse of LLM agents in cyberattacks involving personal data. Specifically, we aim to understand: 1) how potent LLM agents can be when directed to conduct cyberattacks, 2) how cyberattacks are enhanced by web-based tools, and 3) how affordable and easy it becomes to launch cyberattacks using LLM agents. We examine three attack scenarios: the collection of Personally Identifiable Information (PII), the generation of impersonation posts, and the creation of spear-phishing emails. Our experiments reveal the effectiveness of LLM agents in these attacks: they achieved a precision of up to 95.9% in collecting PII, generated impersonation posts of which up to 93.9% were deemed authentic, and produced spear-phishing emails whose links reached a click rate of up to 46.67%. Additionally, our findings underscore the limitations of existing safeguards in contemporary commercial LLMs, emphasizing the urgent need for robust security measures to prevent the misuse of LLM agents.
