Research about the Ability of LLM in the Tamper-Detection Area
Xinyu Yang, Jizhe Zhou
TL;DR
This work assesses whether large language models can assist in tamper detection amid rising AI-generated content and sophisticated image forgery. The study evaluates five LLMs (GPT-4, LLaVA, Bard, ERNIE Bot 4.0, Tongyi Qianwen) on two tasks, AI-generated image detection and tamper detection, using 100 AI-generated and 100 manipulated images and a separate chat per image to determine accuracy. The findings show limited effectiveness overall: only GPT-4 reaches up to about 70% accuracy on a random subset, while all models struggle with visually realistic tampering and deepfakes. These results highlight the current limitations of LLMs in tamper detection and reinforce the need to rely on traditional detection methods and continued DL-based research for robust security solutions.
Abstract
In recent years, particularly since the early 2020s, Large Language Models (LLMs) have emerged as some of the most powerful AI tools for addressing a diverse range of challenges, from natural language processing to complex problem-solving in various domains. In the field of tamper detection, LLMs are capable of identifying basic tampering activities. To assess the capabilities of LLMs in more specialized domains, we selected five LLMs developed by different companies: GPT-4, LLaMA, Bard, ERNIE Bot 4.0, and Tongyi Qianwen. This diverse range of models allows for a comprehensive evaluation of their performance in detecting sophisticated tampering instances. We devised two detection tasks: AI-Generated Content (AIGC) detection and manipulation detection. AIGC detection tests the ability to distinguish whether an image is real or AI-generated. Manipulation detection, on the other hand, focuses on identifying tampered images. According to our experiments, most LLMs can identify composite pictures that are logically inconsistent, but only the more powerful LLMs can identify tampering that is logically consistent yet still leaves traces visible to the human eye. None of the LLMs can identify carefully forged images or highly realistic AI-generated images. In the area of tamper detection, LLMs still have a long way to go, particularly in reliably identifying highly sophisticated forgeries and AI-generated images that closely mimic reality.
