Can MLLMs Detect Phishing? A Comprehensive Security Benchmark Suite Focusing on Dynamic Threats and Multimodal Evaluation in Academic Environments

Jingzhuo Zhou

TL;DR

This work introduces AdapT-Bench, a benchmark for evaluating how well Multimodal Large Language Models (MLLMs) defend against realistic phishing threats in academic settings. It assembles synthetic scholar profiles and three datasets (Kimi 1000, Strategies Batch 1, Filtered Clean) to simulate dynamic, multimodal, and cross-lingual attacks across varied emotional states. The study demonstrates a paradox where heavily filtered, well-crafted attacks can be easier for AI defenses to detect than unstructured ones, and it reveals strong cross-lingual and jailbreak vulnerabilities in leading models. By providing a standardized evaluation framework and rich, context-aware scenarios, AdapT-Bench enables more robust defense strategies and safer deployment of MLLMs in academic environments.

Abstract

The rapid proliferation of Multimodal Large Language Models (MLLMs) has introduced unprecedented security challenges, particularly in phishing detection within academic environments. Academic institutions and researchers are high-value targets, facing dynamic, multilingual, and context-dependent threats that leverage research backgrounds, academic collaborations, and personal information to craft highly tailored attacks. Existing security benchmarks largely rely on datasets that do not incorporate specific academic background information, making them inadequate for capturing the evolving attack patterns and human-centric vulnerability factors specific to academia. To address this gap, we present AdapT-Bench, a unified methodological framework and benchmark suite for systematically evaluating MLLM defense capabilities against dynamic phishing attacks in academic settings.

Paper Structure

This paper contains 28 sections, 3 figures, 4 tables.

Figures (3)

  • Figure 1: AdapT-Bench, showcasing both text-based and image-based attack scenarios.
  • Figure 2: Distribution of the 1,000 generated text-based phishing attacks (Strategies Batch 1).
  • Figure 3: Strategy distribution for the 1,054 instances in the Filtered Clean dataset.