Controlled Territory and Conflict Tracking (CONTACT): (Geo-)Mapping Occupied Territory from Open Source Intelligence
Paul K. Mandal, Cole Leo, Connor Hurley
TL;DR
CONTACT addresses real-time territorial control inference from noisy OSINT by comparing SetFit and prompt-tuned BLOOMZ-560m on a small, hand-labeled, VIINA-style dataset of news articles covering ISIS activity in Syria and Iraq. The results indicate that a prompt-tuned, decoder-style model with label definitions embedded in the prompt can outperform an embedding-based few-shot classifier in low-resource settings. Key contributions include a lightweight, open-source data pipeline, the hand-labeled dataset, and evidence that prompt-based supervision reduces annotation burden while enabling structured multi-label inference. The work suggests practical value for conflict-monitoring workflows, provided that larger-scale validation confirms the results generalize.
Abstract
Open-source intelligence provides a stream of unstructured textual data that can inform assessments of territorial control. We present CONTACT, a framework for territorial control prediction using large language models (LLMs) and minimal supervision. We evaluate two approaches: SetFit, an embedding-based few-shot classifier, and a prompt tuning method applied to BLOOMZ-560m, a multilingual generative LLM. Our model is trained on a small hand-labeled dataset of news articles covering ISIS activity in Syria and Iraq, using prompt-conditioned extraction of control-relevant signals such as military operations, casualties, and location references. We show that the BLOOMZ-based model outperforms the SetFit baseline, and that prompt-based supervision improves generalization in low-resource settings. CONTACT demonstrates that LLMs fine-tuned using few-shot methods can reduce annotation burdens and support structured inference from open-ended OSINT streams. Our code is available at https://github.com/PaulKMandal/CONTACT/.
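To make "prompt-conditioned extraction" concrete, the following is a minimal sketch of how label definitions might be embedded in a prompt and how a generated answer might be mapped back to a multi-label vector. It is an illustration under stated assumptions, not the authors' implementation: the label names, their definitions, the prompt wording, and the helper functions `build_prompt` and `parse_labels` are all hypothetical. The model call itself is omitted, so the sketch runs with the Python standard library alone.

```python
from typing import Dict, List

# Hypothetical label set for illustration only; the labels and definitions
# used in the actual CONTACT dataset may differ.
LABEL_DEFINITIONS: Dict[str, str] = {
    "military_operation": "The article reports an attack, offensive, raid, or airstrike.",
    "casualties": "The article reports people killed or wounded.",
    "location_reference": "The article names a specific town, city, or district.",
}


def build_prompt(article: str, definitions: Dict[str, str]) -> str:
    """Assemble a prompt that states each label definition before the article."""
    lines = ["Label the news article using the definitions below."]
    for name, definition in definitions.items():
        lines.append(f"- {name}: {definition}")
    lines.append("Answer with a comma-separated list of labels, or 'none'.")
    lines.append(f"Article: {article.strip()}")
    lines.append("Labels:")
    return "\n".join(lines)


def parse_labels(generated: str, definitions: Dict[str, str]) -> List[int]:
    """Turn the first line of generated text into a 0/1 vector over the labels.

    Tokens that are not known label names (including 'none') are ignored.
    """
    text = generated.strip()
    first_line = text.splitlines()[0] if text else ""
    predicted = {token.strip().lower() for token in first_line.split(",")}
    return [1 if name in predicted else 0 for name in definitions]


if __name__ == "__main__":
    prompt = build_prompt("ISIS fighters attacked a checkpoint near Mosul.", LABEL_DEFINITIONS)
    print(prompt)
    # A generative model would complete the prompt; here the completion is hard-coded.
    print(parse_labels("military_operation, location_reference", LABEL_DEFINITIONS))
```

In a setup like this, the same prompt template would serve both for tuning and for inference, and the parsed vector could be compared directly against hand-assigned labels. Ignoring unknown tokens is one possible design choice; a stricter parser could instead flag malformed generations for review.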
