D4R -- Exploring and Querying Relational Graphs Using Natural Language and Large Language Models -- the Case of Historical Documents
Michel Boeglin, David Kahn, Josiane Mothe, Diego Ortiz, David Panzoli
TL;DR
D4R addresses the challenge of enabling non-technical historians to interrogate large textual corpora via graph representations and natural-language interfaces. It combines graph-based data modeling in Neo4j with LLM-driven translation of natural language into Cypher and an intuitive visualization interface. The paper demonstrates cross-domain adaptability (e.g., on CoNLL04) and provides an expert mode for direct Cypher editing, indicating practical utility beyond historical documents. The approach supports incremental knowledge discovery while keeping results anchored to their source texts, positioning AI as an augmentation of, rather than a replacement for, scholarly workflows.
Abstract
D4R is a digital platform designed to assist non-technical users, particularly historians, in exploring textual documents through advanced graphical tools for text analysis and knowledge extraction. By leveraging a large language model, D4R translates natural language questions into Cypher queries, which are then used to retrieve data from a Neo4j database. A user-friendly graphical interface lets users navigate and analyse complex relational data extracted from unstructured textual documents. Originally designed to bridge the gap between AI technologies and historical research, D4R's capabilities extend to various other domains. A demonstration video and a live software demo are available.
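To make the translation step concrete, the following is a minimal sketch of a natural-language-to-Cypher pipeline, not D4R's actual implementation. The graph schema, the prompt wording, the `fake_llm` stand-in, and the read-only guard are all assumptions introduced for illustration; the text above does not describe the prompt or the graph model used by D4R.

```python
import re
from typing import Callable

# Hypothetical schema summary; the real D4R graph model is not given here.
SCHEMA = "(:Person {name})-[:MENTIONED_IN]->(:Document {title})"

# Conservative guard (an assumption, not a D4R feature): reject any query
# containing a clause keyword that can modify the graph. It may also reject
# harmless queries that merely contain one of these words.
WRITE_CLAUSES = re.compile(
    r"\b(CREATE|MERGE|DELETE|DETACH|SET|REMOVE|DROP)\b", re.IGNORECASE
)


def build_prompt(question: str, schema: str = SCHEMA) -> str:
    """Assemble the text sent to the language model."""
    return (
        "Translate the question into a single read-only Cypher query.\n"
        f"Graph schema: {schema}\n"
        f"Question: {question}\n"
        "Cypher:"
    )


def is_read_only(cypher: str) -> bool:
    """Return True when no write clause keyword appears in the query."""
    return WRITE_CLAUSES.search(cypher) is None


def translate(question: str, llm: Callable[[str], str]) -> str:
    """Ask the model for Cypher and refuse anything that looks like a write."""
    cypher = llm(build_prompt(question)).strip()
    if not is_read_only(cypher):
        raise ValueError("refusing to run a query that modifies the graph")
    return cypher


def fake_llm(prompt: str) -> str:
    """Stand-in for a real model call; always returns the same query."""
    return (
        "MATCH (p:Person)-[:MENTIONED_IN]->(d:Document) "
        "RETURN p.name, d.title LIMIT 25"
    )


if __name__ == "__main__":
    print(translate("Which people are mentioned in which documents?", fake_llm))
```

In a real deployment, `fake_llm` would be replaced by a call to an actual model, and the returned string would be passed to a Neo4j session for execution. Showing the generated Cypher to the user before running it would fit naturally with the expert mode for direct Cypher editing mentioned in the TL;DR.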
