BounTCHA: A CAPTCHA Utilizing Boundary Identification in Guided Generative AI-extended Videos
Lehao Lin, Ke Wang, Maha Abdallah, Wei Cai
TL;DR
The paper addresses the growing ability of AI-powered bots to defeat traditional CAPTCHAs by proposing BounTCHA, a boundary-identification CAPTCHA built on guided AI-extended videos. Its data-generation pipeline uses content understanding, last-frame prompts, and AI video extension to create a recognizable boundary; the authors then measure human performance and evaluate security against random, database, and multimodal attacks. Key contributions include (i) a practical data-generation and prototype pipeline, (ii) an empirical characterization of human time biases in boundary detection, and (iii) a security analysis showing resilience against several attacker classes. The work demonstrates that human perception of video boundaries, amplified by controlled AI extensions, can separate humans from bots, offering a scalable defense for web services in the AI-enhanced era.
Abstract
In recent years, the rapid development of artificial intelligence (AI), especially multi-modal Large Language Models (MLLMs), has enabled AI systems to understand text, images, videos, and other multimedia data and to execute various tasks based on human-provided prompts. However, AI-powered bots are increasingly able to bypass most existing CAPTCHA systems, posing significant security threats to web applications. This makes the design of new CAPTCHA mechanisms an urgent priority. We observe that humans are highly sensitive to shifts and abrupt changes in videos, while current AI systems still struggle to comprehend and respond to such situations effectively. Based on this observation, we design and implement BounTCHA, a CAPTCHA mechanism that leverages human perception of boundaries in video transitions and disruptions. Using generative AI's capability to extend original videos with prompts, we introduce unexpected twists and changes, and we build a pipeline for generating guided short videos for CAPTCHA purposes. We develop a prototype and conduct experiments to collect data on humans' time biases in boundary identification. These data serve as a basis for distinguishing between human users and bots. Additionally, we perform a detailed security analysis of BounTCHA, demonstrating its resilience against various types of attacks. We hope that BounTCHA will act as a robust defense, safeguarding millions of web applications in the AI-driven era.
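The idea of using time bias to separate humans from bots can be illustrated with a minimal Python sketch. It is not the authors' implementation: the names `BoundaryChallenge`, `time_bias`, and `looks_human`, and the fixed symmetric tolerance, are assumptions made for illustration. In practice the accepted range would presumably be derived from the collected human data.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BoundaryChallenge:
    """One hypothetical challenge: a video whose AI-extended part starts at boundary_s."""
    boundary_s: float   # ground-truth boundary time, in seconds
    tolerance_s: float  # assumed accepted |time bias|; would be fit from human data


def time_bias(challenge: BoundaryChallenge, selected_s: float) -> float:
    """Signed difference between the user's selected time and the true boundary."""
    return selected_s - challenge.boundary_s


def looks_human(challenge: BoundaryChallenge, selected_s: float) -> bool:
    """Accept the response if its absolute time bias falls within the tolerance."""
    return abs(time_bias(challenge, selected_s)) <= challenge.tolerance_s


# Illustrative values only, not taken from the paper.
challenge = BoundaryChallenge(boundary_s=5.0, tolerance_s=0.5)
```

With these illustrative values, a response at 5.3 s would be accepted and a response at 6.0 s rejected. A fixed symmetric threshold is the simplest choice; an asymmetric or distribution-based range may fit human responses better.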
