Misalignment of Semantic Relation Knowledge between WordNet and Human Intuition

Zhihan Cao, Hiroaki Yamada, Simone Teufel, Takenobu Tokunaga

TL;DR

This study systematically assesses how well WordNet's semantic relations align with human intuition by eliciting relational triplets from language users across six relations. Using carefully crafted templates and MTurk crowdsourcing, it computes match statuses, elicitation frequencies, and mismatch likelihood. The analysis reveals widespread misalignment and shows that WordNet path length is not a reliable indicator of human intuition about hypernymy or hyponymy. A large fraction of elicited triplets are missing from WordNet, with notable mismatches across relations, particularly for synonymy and taxonomic links, which suggests concrete directions for WordNet augmentation. The work provides a scalable methodology and concrete metrics to guide structured enrichment of lexical semantics beyond expert-constructed resources.

Abstract

WordNet provides a carefully constructed repository of semantic relations, created by specialists. But there is another source of information on semantic relations: the intuition of language users. We present the first systematic study of the degree to which these two sources are aligned. Investigating the cases of misalignment could support proper use of WordNet and facilitate its improvement. Our analysis, which uses templates to elicit responses from human participants, reveals a general misalignment of semantic relation knowledge between WordNet and human intuition. Further analyses find a systematic pattern of mismatch among synonymy and taxonomic relations (hypernymy and hyponymy), and show that WordNet path length does not serve as a reliable indicator of human intuition regarding hypernymy or hyponymy relations.

Paper Structure

This paper contains 23 sections, 2 equations, 6 figures, and 3 tables.

Figures (6)

  • Figure 1: Match status distribution per relation.
  • Figure 2: Human elicitation frequency vs. match rate.
  • Figure 3: Mismatch likelihood matrix.
  • Figure 4: Distances of matched triplets.
  • Figure 5: Gloss-based similarity of triplets per relation. UNR means unrelated triplets.
  • ...and 1 more figure