LINK-KG: LLM-Driven Coreference-Resolved Knowledge Graphs for Human Smuggling Networks
Dipak Meher, Carlotta Domeniconi, Guadalupe Correa-Cabrera
TL;DR
LINK-KG addresses coreference resolution and knowledge graph (KG) construction in long, narrative legal texts about human smuggling. It introduces a memory-based, type-aware, three-stage coreference pipeline guided by a persistent Prompt Cache. The approach combines NER with stage-wise cache construction and chunk-wise coreference resolution, followed by GraphRAG-based KG extraction that uses type-specific prompts and selective filtering to produce coherent graphs. Empirical results on 16 court cases show average reductions of 45.21% in node duplication and 32.22% in noisy nodes, yielding cleaner, more actionable graphs for both short and long documents. The framework enables robust analysis of complex criminal networks and supports downstream tasks such as role attribution, temporal analysis, and event prediction in legal domains.
Abstract
Human smuggling networks are complex and constantly evolving, making them difficult to analyze comprehensively. Legal case documents offer rich factual and procedural insights into these networks but are often long, unstructured, and filled with ambiguous or shifting references, posing significant challenges for automated knowledge graph (KG) construction. Existing methods either overlook coreference resolution or fail to scale beyond short text spans, leading to fragmented graphs and inconsistent entity linking. We propose LINK-KG, a modular framework that integrates a three-stage, LLM-guided coreference resolution pipeline with downstream KG extraction. At the core of our approach is a type-specific Prompt Cache, which consistently tracks and resolves references across document chunks, enabling clean and disambiguated narratives for structured knowledge graph construction from both short and long legal texts. LINK-KG reduces average node duplication by 45.21% and noisy nodes by 32.22% compared to baseline methods, resulting in cleaner and more coherent graph structures. These improvements establish LINK-KG as a strong foundation for analyzing complex criminal networks.
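To make the idea of a type-specific cache that persists across document chunks more concrete, the following is a minimal sketch in Python. It is not the LINK-KG implementation: the class `PromptCache`, its methods, and the example entities are all hypothetical, and the LLM-guided resolution step is replaced here by a plain alias lookup so that the example runs on its own.

```python
import re
from collections import defaultdict


class PromptCache:
    """Toy per-type cache (hypothetical): maps surface mentions to a canonical name."""

    def __init__(self):
        # entity_type -> {lowercased alias: canonical name}
        self._aliases = defaultdict(dict)

    def register(self, entity_type, canonical, aliases=()):
        """Store a canonical name and its known aliases under one entity type."""
        for name in (canonical, *aliases):
            self._aliases[entity_type][name.lower()] = canonical

    def resolve_chunk(self, entity_type, chunk):
        """Rewrite every known alias in one chunk to its canonical name.

        Assumption: a real system would ask an LLM to do this, with the cache
        contents embedded in the prompt; here a regex lookup stands in for it.
        """
        mapping = self._aliases[entity_type]
        if not mapping:
            return chunk
        # Try longer aliases first so "John Smith" wins over "Smith".
        ordered = sorted(mapping, key=len, reverse=True)
        pattern = re.compile(
            r"\b(" + "|".join(re.escape(a) for a in ordered) + r")\b",
            re.IGNORECASE,
        )
        return pattern.sub(lambda m: mapping[m.group(0).lower()], chunk)


# The same cache instance is reused for every chunk of the document.
cache = PromptCache()
cache.register("PERSON", "John Smith", aliases=["Smith", "the defendant"])

chunks = [
    "John Smith met the driver.",
    "The defendant paid him; Smith then left.",
]
resolved = [cache.resolve_chunk("PERSON", c) for c in chunks]
```

Because the cache outlives any single chunk, a mention such as "the defendant" in a later chunk maps to the same canonical name established earlier, which is the property that keeps one entity from appearing as several duplicate nodes in the resulting graph.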
