LINK-KG: LLM-Driven Coreference-Resolved Knowledge Graphs for Human Smuggling Networks

Dipak Meher; Carlotta Domeniconi; Guadalupe Correa-Cabrera

TL;DR

LINK-KG addresses coreference resolution and knowledge graph (KG) construction in long, narrative legal texts about human smuggling. It introduces a memory-based, type-aware, three-stage coreference pipeline guided by a persistent Prompt Cache. The approach combines NER with stage-wise cache construction and chunk-wise coreference resolution, followed by GraphRAG-based KG extraction that uses type-specific prompts and selective filtering to produce coherent graphs. Results on 16 court cases show average reductions of 45.21% in node duplication and 32.22% in noisy nodes relative to baseline methods, yielding cleaner, more actionable graphs for both short and long documents. The framework supports analysis of complex criminal networks and downstream tasks such as role attribution, temporal analysis, and event prediction in legal domains.

Abstract

Human smuggling networks are complex and constantly evolving, making them difficult to analyze comprehensively. Legal case documents offer rich factual and procedural insights into these networks but are often long, unstructured, and filled with ambiguous or shifting references, posing significant challenges for automated knowledge graph (KG) construction. Existing methods either overlook coreference resolution or fail to scale beyond short text spans, leading to fragmented graphs and inconsistent entity linking. We propose LINK-KG, a modular framework that integrates a three-stage, LLM-guided coreference resolution pipeline with downstream KG extraction. At the core of our approach is a type-specific Prompt Cache, which consistently tracks and resolves references across document chunks, enabling clean and disambiguated narratives for structured knowledge graph construction from both short and long legal texts. LINK-KG reduces average node duplication by 45.21% and noisy nodes by 32.22% compared to baseline methods, resulting in cleaner and more coherent graph structures. These improvements establish LINK-KG as a strong foundation for analyzing complex criminal networks.
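
To make the role of a type-specific cache concrete, the following is a minimal sketch of chunk-wise reference resolution against a persistent alias table. It is not the authors' implementation: the names `PromptCache`, `split_into_chunks`, and `resolve_chunk` are hypothetical, the cache is filled by hand rather than by NER and an LLM, and a plain dictionary lookup stands in for the LLM-guided resolution step.

```python
import re
from collections import defaultdict


class PromptCache:
    """Hypothetical type-specific cache: entity type -> alias -> canonical name."""

    def __init__(self):
        self._aliases = defaultdict(dict)

    def register(self, entity_type, canonical, aliases=()):
        # Store every surface form in lowercase so lookups are case-insensitive.
        for form in (canonical, *aliases):
            self._aliases[entity_type][form.lower()] = canonical

    def alias_map(self):
        # Flatten all entity types into one alias -> canonical mapping.
        merged = {}
        for table in self._aliases.values():
            merged.update(table)
        return merged


def split_into_chunks(text, max_sentences=2):
    # Naive sentence splitter; an assumption made only for this sketch.
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    return [
        " ".join(sentences[i:i + max_sentences])
        for i in range(0, len(sentences), max_sentences)
    ]


def resolve_chunk(chunk, cache):
    # Stand-in for the LLM step: rewrite each known alias to its canonical name.
    alias_map = cache.alias_map()
    if not alias_map:
        return chunk
    # Try longer forms first so "juan lopez" is matched before "lopez".
    forms = sorted(alias_map, key=len, reverse=True)
    pattern = re.compile(
        r"\b(" + "|".join(re.escape(f) for f in forms) + r")\b",
        re.IGNORECASE,
    )
    return pattern.sub(lambda m: alias_map[m.group(0).lower()], chunk)


# Invented example text; the cache persists across both chunks.
text = (
    "Juan Lopez drove a van to Laredo. The defendant was paid in cash. "
    "Lopez later returned to Laredo."
)
cache = PromptCache()
cache.register("PERSON", "Juan Lopez", aliases=["Lopez", "the defendant"])
cache.register("LOCATION", "Laredo")

resolved = [resolve_chunk(chunk, cache) for chunk in split_into_chunks(text)]
for chunk in resolved:
    print(chunk)
```

Because the same cache object is reused for every chunk, "Lopez" in the second chunk resolves to the same canonical name as "The defendant" in the first, which is the property that keeps a downstream graph from splitting one person into several nodes.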
