Graphilosophy: Graph-Based Digital Humanities Computing with The Four Books

Minh-Thu Do, Quynh-Chau Le-Tran, Duc-Duy Nguyen-Mai, Thien-Trang Nguyen, Khanh-Duy Le, Minh-Triet Tran, Tam V. Nguyen, Trung-Nghia Le

Abstract

The Four Books have shaped East Asian intellectual traditions, yet their multi-layered interpretive complexity limits their accessibility in the digital age. While traditional bilingual commentaries provide a vital pedagogical bridge, computational frameworks are needed to preserve and explore this wisdom. This paper bridges AI and classical philosophy by introducing Graphilosophy, an ontology-guided, multi-layered knowledge graph framework for modeling and interpreting The Four Books. Integrating natural language processing, multilingual semantic embeddings, and humanistic analysis, the framework transforms a bilingual Chinese-Vietnamese corpus into an interpretively grounded resource. Graphilosophy encodes linguistic, conceptual, and interpretive relationships across interconnected layers, enabling cross-lingual retrieval and AI-assisted reasoning while explicitly preserving scholarly nuance and interpretive plurality. Through an interactive interface, non-expert users can trace the evolution of ethical concepts across borders and languages, so that ancient wisdom remains a living resource for modern moral discourse rather than a static relic of the past. A preliminary user study suggests the system's capacity to enhance conceptual understanding and cross-cultural learning. By linking algorithmic representation with ethical inquiry, this research exemplifies how AI can serve as a methodological bridge, accommodating the ambiguity of cultural heritage rather than reducing it to static data. The source code and data are released at https://github.com/ThuDoMinh1102/confucian-texts-knowledge-graph.

Paper Structure

This paper contains 58 sections, 1 equation, 10 figures, 5 tables.

Figures (10)

  • Figure 1: The multi-layered ontology architecture of the Graphilosophy knowledge graph. The schema models the corpus across six distinct but interconnected layers to preserve structural, linguistic, and interpretive dimensions. Solid lines indicate intra-layer relationships, while dashed lines represent cross-layer unifications that enable complex, multi-hop reasoning.
  • Figure 2: User interface of the proposed system integrating layered KG visualization, semantic search, and commentary exploration. The interface enables cross-lingual retrieval and interpretive analysis of The Four Books through Gemini-powered natural language querying. Some instructions in the interface are displayed in Vietnamese to enhance usability for local users.
  • Figure 3: The high density of the Linguistic Layer reflects the substantial volume of nodes and edges within this layer, establishing a robust foundation for subsequent semantic analysis.
  • Figure 4: Semantic Layer structure. Visualization of the dense network formed by EMBEDDING nodes and SEMANTIC_CLUSTERs, confirming the reliance on the semantic layer for similarity-based retrieval.
  • Figure 5: Query-based focused visualization illustrating the BFS-based (depth = 1) search mechanism through the Gemini model. Unlike the full-layer visualizations (Figures 3 and 4), which display the entire graph structure, these focused subgraphs present only the immediate neighborhood of query-matched nodes, reducing visual complexity while preserving interpretive context. (a) An exact single-sentence query produces a star-shaped subgraph centered on the matched passage. (b) A semantic query over Vietnamese text retrieves multiple thematically related passages, forming distinct clusters connected through shared linguistic and conceptual nodes.
  • ...and 5 more figures
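
The depth-limited neighborhood extraction described in the Figure 5 caption can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `focused_subgraph`, the adjacency-dictionary representation, and the node identifiers are all hypothetical, chosen only to show how a breadth-first search with depth = 1 yields a star-shaped subgraph around a matched passage.

```python
from collections import deque


def focused_subgraph(adjacency, seeds, depth=1):
    """Collect nodes and edges within `depth` hops of the seed nodes.

    `adjacency` maps a node id to an iterable of neighbor ids
    (an assumed, simplified stand-in for the knowledge graph).
    """
    hops = {seed: 0 for seed in seeds}  # node -> BFS distance from a seed
    queue = deque(hops)
    edges = set()
    while queue:
        node = queue.popleft()
        if hops[node] == depth:
            continue  # do not expand beyond the depth limit
        for neighbor in adjacency.get(node, ()):
            edges.add(frozenset((node, neighbor)))  # undirected edge
            if neighbor not in hops:
                hops[neighbor] = hops[node] + 1
                queue.append(neighbor)
    return set(hops), edges


# Hypothetical toy graph: one matched sentence and its neighbors.
toy_graph = {
    "sentence:1": ["word:ren", "concept:benevolence", "commentary:1"],
    "word:ren": ["sentence:1", "sentence:2"],
    "concept:benevolence": ["sentence:1"],
    "commentary:1": ["sentence:1"],
    "sentence:2": ["word:ren"],
}

nodes, edges = focused_subgraph(toy_graph, ["sentence:1"], depth=1)
```

With a single seed and depth = 1, every returned edge touches the seed, which gives the star shape of panel (a). Passing several seeds, as a semantic query returning multiple passages would, produces several such neighborhoods that join wherever they share a node, as in panel (b).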