Towards Context-Aware Image Anonymization with Multi-Agent Reasoning

Robert Aufschläger, Jakob Folz, Gautam Savaliya, Manjitha D Vidanalage, Michael Heigl, Martin Schramm

Abstract

Street-level imagery contains personally identifiable information (PII), some of which is context-dependent. Existing anonymization methods either over-process images or miss subtle identifiers, while API-based solutions compromise data sovereignty. We present CAIAMAR (Context-Aware Image Anonymization with Multi-Agent Reasoning), an agentic framework for context-aware PII segmentation with diffusion-based anonymization, combining pre-defined processing for high-confidence cases with multi-agent reasoning for indirect identifiers. Three specialized agents coordinate via round-robin speaker selection in a Plan-Do-Check-Act (PDCA) cycle, enabling large vision-language models to classify PII based on spatial context (private vs. public property) rather than rigid category rules. The agents implement spatially-filtered coarse-to-fine detection, where a scout-and-zoom strategy identifies candidates, open-vocabulary segmentation processes localized crops, and IoU-based deduplication (30% threshold) prevents redundant processing. Modal-specific diffusion guidance with appearance decorrelation substantially reduces re-identification (Re-ID) risks. On CUHK03-NP, our method reduces person Re-ID risk by 73% (R1: 16.9% vs. 62.4% baseline). For image quality preservation on CityScapes, we achieve KID: 0.001 and FID: 9.1, significantly outperforming existing anonymization methods. The agentic workflow detects non-direct PII instances across object categories, and downstream semantic segmentation is preserved. Operating entirely on-premise with open-source models, the framework generates human-interpretable audit trails supporting the EU GDPR's transparency requirements while flagging failed cases for human review.
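The IoU-based deduplication with a 30% threshold mentioned above can be illustrated with a minimal sketch. This is not the paper's implementation; the greedy keep-first strategy and function names are illustrative assumptions, showing only how an overlap threshold suppresses redundant detections.

```python
def iou(a, b):
    """Intersection-over-union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union else 0.0

def deduplicate(boxes, threshold=0.3):
    """Greedily keep a box only if it overlaps no kept box above the threshold."""
    kept = []
    for box in boxes:
        if all(iou(box, k) < threshold for k in kept):
            kept.append(box)
    return kept

# Two near-identical candidate detections collapse to one; a distant box survives.
boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30)]
print(deduplicate(boxes))  # [(0, 0, 10, 10), (20, 20, 30, 30)]
```

In a coarse-to-fine pipeline, this kind of filter matters because the same object can be detected twice: once in the full-frame scout pass and again in a zoomed-in crop.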

Paper Structure

This paper contains 97 sections, 3 equations, 12 figures, 9 tables.

Figures (12)

  • Figure 1: Two-phase agentic anonymization architecture. Phase 1 employs specialized models for direct PII (full body, license plates). Phase 2 implements multi-agent orchestration with round-robin coordination, where specialized agents handle classification (Auditor), synthesis (Generative), and workflow management (Orchestrator), implementing PDCA cycles.
  • Figure 2: Round-robin PDCA coordination with three specialized agents. Phase 1 applies deterministic preprocessing for direct PII. Phase 2 implements bounded iterative refinement.
  • Figure 3: Qualitative comparison of anonymization methods on a CUHK03-NP test example (bounding_box_test/0069_c2_658.png).
  • Figure 4: Pipeline output on CityScapes test images. Each row shows one street scene. Left: original; middle: detected PII with color-coded overlays (blue=persons, yellow=indirect PII vehicles, green=traffic signs, gray=license plates); right: anonymized output. Top (berlin_000002): 8 persons + 1 police vehicle + 2 traffic signs, 4.97% PII coverage, 3 PDCA iterations. Bottom (berlin_000472): multiple PII categories across two-phase pipeline.
  • Figure S1: Qualitative comparison of person anonymization methods on CUHK03-NP test examples. Each row shows the same person processed by different methods (left to right): Original, Gaussian Blur, DeepPrivacy2 (DP2), FADM, and our approach. Our method preserves pose and scene structure while effectively anonymizing identities with photorealistic results. Blur destroys facial details and overall image quality. DP2 produces synthetic faces with visible artifacts. FADM maintains high similarity to originals (privacy risk). Our method achieves the optimal balance between privacy protection and visual quality.
  • ...and 7 more figures