GEO-Detective: Unveiling Location Privacy Risks in Images with LLM Agents

Xinyu Zhang; Yixin Wu; Boyang Zhang; Chenhao Lin; Chao Shen; Michael Backes; Yang Zhang

TL;DR

The paper investigates privacy risks from geolocating common social media images using an autonomous agent, GEO-Detective, which combines LVLM reasoning with external tools through a four-stage pipeline (visual analysis, strategy execution, results synthesis, iterative refinement). It introduces a difficulty-based strategy selection mechanism and modules like visual feature segmentation and visual reverse search, achieving higher accuracy than strong LVLM baselines, especially on challenging images, and significantly reducing unknown predictions when external clues are available. It provides extensive ablations, assesses generalizability with multiple models, and evaluates defense strategies, finding watermarking to be the most effective at suppressing geolocation, while highlighting the need for robust privacy safeguards. Overall, the work demonstrates the amplified privacy risks posed by agentic geolocation and offers a foundation for developing and evaluating defenses against such tooling.

Abstract

Images shared on social media often expose geographic cues. While early geolocation methods required expert effort and lacked generalization, the rise of Large Vision Language Models (LVLMs) now enables accurate geolocation even for ordinary users. However, existing approaches are not optimized for this task. To explore the full potential and associated privacy risks, we present GEO-Detective, an agent that mimics human reasoning and tool use for image geolocation inference. It follows a four-step procedure that adaptively selects strategies based on image difficulty, and it is equipped with specialized tools such as visual reverse search, which emulates how humans gather external geographic clues. Experimental results show that GEO-Detective outperforms baseline LVLMs overall, particularly on images lacking visible geographic features. In country-level geolocation tasks, it achieves an improvement of over 11.1% compared to baseline LVLMs, and even at finer-grained levels, it still provides around a 5.2% performance gain. Meanwhile, when equipped with external clues, GEO-Detective becomes more likely to produce accurate predictions, reducing the "unknown" prediction rate by more than 50.6%. We further explore multiple defense strategies and find that GEO-Detective exhibits stronger robustness, highlighting the need for more effective privacy safeguards.
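The four-step procedure with difficulty-based strategy selection can be pictured with a small toy sketch. This is an illustration under stated assumptions, not the authors' implementation: the LVLM and the visual reverse search are replaced by dictionary lookups, the two-level "easy"/"hard" difficulty scale is an assumption, and every name here (`visual_analysis`, `make_direct_reasoning`, `make_reverse_search`, `geolocate`) is hypothetical.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Analysis:
    clues: List[str]   # visible cues extracted from the image
    difficulty: str    # "easy" or "hard" (assumed two-level scale)


# A tool maps an analysis to a list of candidate locations.
Tool = Callable[[Analysis], List[str]]


def visual_analysis(visible_clues: List[str]) -> Analysis:
    # Step 1: stand-in for an LVLM pass over the image.
    # Assumption: an image with no visible cues is treated as "hard".
    difficulty = "easy" if visible_clues else "hard"
    return Analysis(clues=list(visible_clues), difficulty=difficulty)


def make_direct_reasoning(knowledge: Dict[str, str]) -> Tool:
    # Stand-in for model-internal reasoning: map known cues to locations.
    def tool(analysis: Analysis) -> List[str]:
        return [knowledge[c] for c in analysis.clues if c in knowledge]
    return tool


def make_reverse_search(index: Dict[str, str], image_id: str) -> Tool:
    # Stand-in for an external visual reverse search tool.
    def tool(analysis: Analysis) -> List[str]:
        hit = index.get(image_id)
        return [hit] if hit else []
    return tool


def execute_strategy(analysis: Analysis, easy_tools: List[Tool],
                     hard_tools: List[Tool]) -> List[str]:
    # Step 2: easy images use internal reasoning only;
    # hard images additionally call external tools.
    tools = easy_tools if analysis.difficulty == "easy" else easy_tools + hard_tools
    evidence: List[str] = []
    for tool in tools:
        evidence.extend(tool(analysis))
    return evidence


def synthesize(evidence: List[str]) -> str:
    # Step 3: pick the most frequently supported location, else "unknown".
    if not evidence:
        return "unknown"
    return Counter(evidence).most_common(1)[0][0]


def geolocate(visible_clues: List[str], easy_tools: List[Tool],
              hard_tools: List[Tool], max_rounds: int = 2) -> str:
    analysis = visual_analysis(visible_clues)
    prediction = "unknown"
    for _ in range(max_rounds):
        prediction = synthesize(execute_strategy(analysis, easy_tools, hard_tools))
        if prediction != "unknown":
            break
        # Step 4: refine by escalating to the tool-assisted strategy.
        analysis.difficulty = "hard"
    return prediction
```

In this toy version, an image whose visible cues are uninformative starts on the cheap path, returns "unknown", and is retried with the external tool enabled. That mirrors, in a very simplified way, how access to external clues can reduce "unknown" predictions.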
