ScamAgents: How AI Agents Can Simulate Human-Level Scam Calls

Sanket Badhe

TL;DR

This work demonstrates that autonomous, memory-enabled language-agent systems can orchestrate multi-turn scam calls that evade traditional safety guardrails and can be rendered as realistic audio via text-to-speech (TTS). It introduces ScamAgent, a modular architecture with goal decomposition, planning, deception strategies, and memory, and shows how such agents can sustain persuasive, contextually coherent interactions across several turns. The study reveals significant safety vulnerabilities, including lower refusal rates and higher completion rates under agentic control, and discusses latency implications for real-time deployment. It further proposes multi-layered defense strategies and highlights the need for agent-level auditing, multi-turn moderation, and regulatory considerations to mitigate emergent risks in multimodal, autonomous AI systems.

Abstract

Large Language Models (LLMs) have demonstrated impressive fluency and reasoning capabilities, but their potential for misuse has raised growing concern. In this paper, we present ScamAgent, an autonomous multi-turn agent built on top of LLMs, capable of generating highly realistic scam call scripts that simulate real-world fraud scenarios. Unlike prior work focused on single-shot prompt misuse, ScamAgent maintains dialogue memory, adapts dynamically to simulated user responses, and employs deceptive persuasion strategies across conversational turns. We show that current LLM safety guardrails, including refusal mechanisms and content filters, are ineffective against such agent-based threats. Even models with strong prompt-level safeguards can be bypassed when prompts are decomposed, disguised, or delivered incrementally within an agent framework. We further demonstrate the transformation of scam scripts into lifelike voice calls using modern text-to-speech systems, completing a fully automated scam pipeline. Our findings highlight an urgent need for multi-turn safety auditing, agent-level control frameworks, and new methods to detect and disrupt conversational deception powered by generative AI.
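
The call for multi-turn safety auditing can be illustrated from the defender's side. The following is a minimal sketch, not the paper's method: it assumes a hypothetical upstream classifier has already tagged each turn with risk signals, and the signal names, weights, and thresholds are all illustrative assumptions. It shows why a check applied to each turn in isolation can miss a conversation whose risk only becomes visible when turns are considered together.

```python
from dataclasses import dataclass, field

# Hypothetical risk weights per signal type; the labels and values are
# illustrative assumptions, not taken from the paper.
SIGNAL_WEIGHTS = {
    "authority_claim": 0.3,
    "urgency": 0.3,
    "credential_request": 0.5,
}


def turn_score(signals):
    """Risk of a set of signals, summed from the hypothetical weights."""
    return sum(SIGNAL_WEIGHTS[s] for s in signals)


@dataclass
class ConversationAuditor:
    """Accumulates distinct signals over the whole dialogue."""
    threshold: float = 1.0
    seen: set = field(default_factory=set)

    def observe(self, signals):
        """Record one turn; return True once cumulative risk reaches the threshold."""
        self.seen.update(signals)
        return turn_score(self.seen) >= self.threshold


# Each turn carries signals assumed to come from an upstream classifier.
dialogue = [{"authority_claim"}, {"urgency"}, {"credential_request"}]
PER_TURN_THRESHOLD = 0.6

# Per-turn moderation: every turn is judged on its own.
per_turn_flags = [turn_score(t) >= PER_TURN_THRESHOLD for t in dialogue]

# Conversation-level moderation: signals accumulate across turns.
auditor = ConversationAuditor()
cumulative_flags = [auditor.observe(t) for t in dialogue]

print(per_turn_flags)
print(cumulative_flags)
```

In this toy dialogue no single turn reaches the per-turn threshold, while the conversation-level auditor flags the exchange on the third turn. A real system would need a learned classifier and calibrated thresholds in place of the fixed weights used here.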

Paper Structure

This paper contains 20 sections, 3 figures, 1 table.

Figures (3)

  • Figure 1: ScamAgent System Architecture: The architecture consists of a Central Orchestrator that coordinates multi-turn dialogue planning, memory, state management, and goal tracking. User input is processed through modules for Goal Decomposition and Context Memory, enabling the transformation of high-level malicious objectives into innocuous subtasks. The Deception Layer constructs safety-evasive prompts via fictional framing, persona anchoring, and prompt manipulation, which are then passed to a foundational LLM for dialogue generation. Generated responses are synthesized into speech via the TTS module, with dynamic voice control. The output is delivered as a simulated scam call, enabling realistic, adaptive adversarial interactions.
  • Figure 2: Comparison of human evaluation scores for ScamAgent and real-world scam dialogues. The average plausibility score was 3.4 for ScamAgent and 3.6 for real-world transcripts. The average persuasiveness score was 3.6 for ScamAgent and 3.9 for real-world transcripts. All scores were assigned on a 5-point Likert scale by five independent human raters.
  • Figure 3: Comparison of refusal rates across three frontier models (GPT-4, Claude 3.7, and LLaMA3-70B) under single-prompt and ScamAgent scenarios. Single-prompt queries yielded high refusal rates (84–100%), whereas the ScamAgent framework lowered them substantially (to 17–32%), demonstrating the effectiveness of agent-based evasion strategies.
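
The numbers quoted in the captions for Figures 2 and 3 can be put side by side with a little arithmetic. The sketch below uses only the values stated in those captions. It assumes that the 17–32% range in Figure 3 gives the refusal rates observed under ScamAgent, not the size of the reduction. Because the captions report ranges and not per-model values, only bounds on the per-model drop can be derived. The variable names are illustrative.

```python
# Figure 2: mean Likert scores as (ScamAgent, real-world transcripts).
fig2_scores = {
    "plausibility": (3.4, 3.6),
    "persuasiveness": (3.6, 3.9),
}

# Gap between real-world and ScamAgent means on the 5-point scale.
score_gaps = {
    metric: round(real - agent, 1)
    for metric, (agent, real) in fig2_scores.items()
}

# Figure 3: refusal-rate ranges in percent, as (low, high).
# Assumption: the second range is the rate under ScamAgent.
single_prompt = (84, 100)
scam_agent = (17, 32)

# Bounds on the per-model drop in percentage points.
min_drop = single_prompt[0] - scam_agent[1]  # smallest possible drop
max_drop = single_prompt[1] - scam_agent[0]  # largest possible drop

print(score_gaps)
print(min_drop, max_drop)
```

Under these assumptions, ScamAgent dialogues trail real transcripts by 0.2 points on plausibility and 0.3 on persuasiveness, and each model's refusal rate falls by somewhere between 52 and 83 percentage points.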