Ontological differentiation as a measure of semantic accuracy
Pablo Garcia-Cuadrillero, Fabio Revuelta, Jose Angel Capitan
TL;DR
This work introduces Ontological Differentiation (OD), a first-principles, definition-based measure of semantic distance built from recursive definitional expansions and cross-branch cancellations. It systematically compares SOD to cosine similarity derived from Random Inheritance Method embeddings on three Wiktionary-derived networks and shows that SOD is largely orthogonal to embedding-based similarity. Using cumulative SOD scores to evaluate navigation, the study finds that Semantic Navigation (SN) paths are consistently more coherent with definitional structure than Shortest-Path (SP) routes across diverse corpus-processing schemes, supporting OD as both an evaluative benchmark and a potential navigation heuristic. The results advocate integrating symbolic, definition-grounded metrics with vector-based approaches to analyze, validate, and construct semantics-driven navigation in lexical networks and beyond.
Abstract
Understanding semantic relationships within complex networks derived from lexical resources is fundamental for network science and language modeling. While network embedding methods capture contextual similarity, quantifying semantic distance based directly on explicit definitional structure remains challenging. Accurate measures of semantic similarity enable navigation on lexical networks in which each jump maximizes semantic similarity to the target (Semantic Navigation, SN). This work introduces Ontological Differentiation (OD), a formal method for measuring divergence between concepts by analyzing overlap during recursive definition expansion. The methodology is applied to networks extracted from the Simple English Wiktionary, comparing OD scores with another measure of semantic similarity proposed in the literature (cosine similarity based on random-walk network exploration). We find weak correlations between direct pairwise OD scores and cosine similarities across approximately 2 million word pairs, sampled from a pool representing over 50% of the entries in the Wiktionary lexicon. This establishes OD as a largely independent, definition-based semantic metric, whose orthogonality to cosine similarity becomes more pronounced when low-semantic-content terms are removed from the dataset. Additionally, we use cumulative OD scores to evaluate paths generated by vector-based SN and structurally optimal Shortest Paths (SP) across networks. We find that SN paths consistently exhibit significantly lower cumulative OD scores than shortest paths, suggesting that SN produces trajectories more coherent with the dictionary's definitional structure, as measured by OD. Ontological Differentiation thus provides a novel, definition-grounded tool for analyzing, validating, and potentially constructing navigation processes in lexical networks.
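The path comparison described in the abstract can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the four-word graph, the two-dimensional vectors, and the definition sets are toy data, and `toy_definition_distance` is a simple Jaccard distance over definition words that stands in for a pairwise score. It is not the OD measure, whose recursive expansion and cancellation rules are not specified in this section. The sketch only shows the evaluation pattern: build an SN path greedily by cosine similarity to the target, build an SP path by breadth-first search, and sum a pairwise score along each.

```python
import math
from collections import deque

# Toy data (hypothetical): an undirected word graph, 2-D vectors, and definitions.
graph = {"cat": ["thing", "pet"], "thing": ["cat", "dog"],
         "pet": ["cat", "dog"], "dog": ["thing", "pet"]}
vectors = {"cat": (1.0, 0.2), "pet": (0.9, 0.4),
           "dog": (0.8, 0.6), "thing": (0.0, 1.0)}
definitions = {"cat": {"small", "pet", "animal"}, "pet": {"animal", "kept", "home"},
               "dog": {"pet", "animal", "home"}, "thing": {"object", "entity"}}

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.hypot(*u) * math.hypot(*v))

def semantic_navigation(graph, vectors, source, target, max_steps=50):
    """Greedy walk: each jump goes to the unvisited neighbor most similar to the target."""
    path, visited = [source], {source}
    while path[-1] != target and len(path) <= max_steps:
        candidates = [n for n in graph[path[-1]] if n not in visited]
        if not candidates:
            return None  # dead end: the greedy walk failed
        nxt = max(candidates, key=lambda n: cosine(vectors[n], vectors[target]))
        path.append(nxt)
        visited.add(nxt)
    return path if path[-1] == target else None

def shortest_path(graph, source, target):
    """Breadth-first search; returns one path with the fewest jumps."""
    parent = {source: None}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if node == target:
            path = []
            while node is not None:
                path.append(node)
                node = parent[node]
            return path[::-1]
        for n in graph[node]:
            if n not in parent:
                parent[n] = node
                queue.append(n)
    return None

def toy_definition_distance(a, b):
    """Stand-in pairwise score (Jaccard distance on definition words); NOT the OD measure."""
    da, db = definitions[a], definitions[b]
    return 1.0 - len(da & db) / len(da | db)

def cumulative_score(path, distance):
    """Sum a pairwise distance over consecutive jumps of a path."""
    return sum(distance(a, b) for a, b in zip(path, path[1:]))

sn_path = semantic_navigation(graph, vectors, "cat", "dog")
sp_path = shortest_path(graph, "cat", "dog")
sn_score = cumulative_score(sn_path, toy_definition_distance)
sp_score = cumulative_score(sp_path, toy_definition_distance)
print(sn_path, sn_score)
print(sp_path, sp_score)
```

In this toy setup both paths have two jumps, but the SN path passes through "pet" while the BFS path passes through "thing", so the stand-in cumulative score is lower for the SN path. Replacing `toy_definition_distance` with an actual OD implementation would give the kind of comparison the abstract reports.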
