AI-Generated Faces in the Real World: A Large-Scale Case Study of Twitter Profile Images
Jonas Ricker, Dennis Assenmacher, Thorsten Holz, Asja Fischer, Erwin Quiring
TL;DR
This paper investigates the real-world prevalence and usage of AI-generated profile images on Twitter using a multi-stage detection pipeline trained on Twitter-processed data. It reports that about $0.052\%$ of nearly 15 million profile pictures are AI-generated, identifies 7,723 accounts with fake profile images, and reveals coordinated inauthentic behavior, including spamming, cryptocurrency giveaways, and participation in political discourse. The methodology combines a fast pre-filter, a CNN classifier tuned to processed profile images, manual labeling aided by alignment and GAN inversion, and multiple labeled data sources used to estimate error rates. The analysis of accounts and tweets shows that fake-image users tend to have fewer followers, shorter lifespans, and higher suspension rates, and that they form large clusters with homogeneous patterns, indicating orchestrated networks. The work provides a scalable framework for real-world detection, releases data and code, and offers insights into the threats and topics tied to AI-generated social media content that can inform the design of mitigation strategies.
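As a quick sanity check on the figures above, the following sketch back-computes the dataset size implied by 7,723 accounts and a $0.052\%$ share. It assumes that each account contributes exactly one profile image and that the share is simply fake accounts divided by all analyzed images; the variable names are illustrative and not taken from the authors' code.

```python
# Figures quoted in the summary above.
fake_accounts = 7_723
reported_share_percent = 0.052  # reported value, rounded to three decimal places

# Assumption: share = fake accounts / all analyzed profile images.
implied_total = fake_accounts / (reported_share_percent / 100)

# Rounding to three decimals means the unrounded share lies in [0.0515, 0.0525).
total_upper = fake_accounts / (0.0515 / 100)
total_lower = fake_accounts / (0.0525 / 100)

print(f"implied dataset size: {implied_total:,.0f}")
print(f"range consistent with rounding: {total_lower:,.0f} to {total_upper:,.0f}")
```

Under these assumptions, the whole range falls between 14.7 and 15 million images, which agrees with the stated "nearly 15 million" profile pictures.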
Abstract
Recent advances in the field of generative artificial intelligence (AI) have blurred the line between authentic and machine-generated content, making it almost impossible for humans to tell the two apart. One notable consequence is the use of AI-generated images for fake profiles on social media. While several types of disinformation campaigns and similar incidents have been reported in the past, a systematic analysis has been lacking. In this work, we conduct the first large-scale investigation of the prevalence of AI-generated profile pictures on Twitter. We tackle the challenges of a real-world measurement study by carefully integrating various data sources and designing a multi-stage detection pipeline. Our analysis of nearly 15 million Twitter profile pictures shows that 0.052% were artificially generated, confirming their notable presence on the platform. We comprehensively examine the characteristics of these accounts and their tweet content, and uncover patterns of coordinated inauthentic behavior. The results also reveal several motives, including spamming and political amplification campaigns. Our research reaffirms the need for effective detection and mitigation strategies to cope with the potential negative effects of generative AI in the future.
