The quasi-semantic competence of LLMs: a case study on the part-whole relation

Mattia Proietti, Alessandro Lenci

TL;DR

The paper probes how far large language models understand the part–whole (meronymy) relationship, focusing on antisymmetry as a core inferential property. It combines behavioral prompting, probabilistic sentence plausibility, and representational geometry to evaluate meronymy knowledge using ConceptNet and McRae norms across LlaMA2-7b, LlaMA2-7b-chat, and GPT-4. Across tasks, results show strong surface-level knowledge but limited abstract generalization, with partial linear encoding in embeddings and substantial gaps relative to human meaning. The findings argue for a quasi-semantic competence in current LLMs and highlight the need for grounding or substructure-based representations to achieve robust meronymic reasoning and generalization.

Abstract

Understanding the extent and depth of the semantic competence of \emph{Large Language Models} (LLMs) is at the center of the current scientific agenda in Artificial Intelligence (AI) and Computational Linguistics (CL). We contribute to this endeavor by investigating their knowledge of the \emph{part-whole} relation, a.k.a. \emph{meronymy}, which plays a crucial role in lexical organization but is significantly understudied. We used data from ConceptNet relations \citep{speer2016conceptnet} and human-generated semantic feature norms \citep{McRae:2005} to explore the abilities of LLMs to deal with \textit{part-whole} relations. We employed several methods based on three levels of analysis: (i) \textbf{behavioral} testing via prompting, where we directly queried the models on their knowledge of meronymy; (ii) sentence \textbf{probability} scoring, where we tested models' abilities to discriminate correct (real) and incorrect (asymmetric counterfactual) \textit{part-whole} relations; and (iii) \textbf{concept representation} analysis in vector space, where we proved the linear organization of the \textit{part-whole} concept in the embedding and unembedding spaces. These analyses present a complex picture, revealing that the LLMs' knowledge of this relation is only partial. They have just a ``\emph{quasi}-semantic'' competence and still fall short of capturing deep inferential properties.
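
To make the sentence probability level concrete, the following is a minimal sketch of an antisymmetry check: a real part-whole sentence should score higher than its swapped counterfactual. The sentence template, the function names, and the toy scores are illustrative assumptions, not the paper's materials or results. In practice `score` would wrap a language model's sentence log-probability.

```python
from typing import Callable, Iterable, Tuple

Pair = Tuple[str, str]  # (part, whole)


def make_sentence(part: str, whole: str) -> str:
    # Hypothetical template, assumed for illustration only.
    return f"A {part} is part of a {whole}."


def antisymmetry_accuracy(pairs: Iterable[Pair],
                          score: Callable[[str], float]) -> float:
    """Fraction of pairs where the real sentence outscores the swapped one."""
    pairs = list(pairs)
    if not pairs:
        raise ValueError("need at least one (part, whole) pair")
    hits = sum(
        score(make_sentence(part, whole)) > score(make_sentence(whole, part))
        for part, whole in pairs
    )
    return hits / len(pairs)


# Toy scorer with made-up log-probabilities (NOT model output); it stands in
# for a real sentence scorer so the sketch runs without any model.
TOY_SCORES = {
    "A wheel is part of a car.": -5.0,
    "A car is part of a wheel.": -9.0,
    "A page is part of a book.": -4.0,
    "A book is part of a page.": -8.5,
    "A handle is part of a door.": -7.0,
    "A door is part of a handle.": -6.5,  # deliberately wrong ordering
}


def toy_score(sentence: str) -> float:
    return TOY_SCORES.get(sentence, float("-inf"))


PAIRS = [("wheel", "car"), ("page", "book"), ("handle", "door")]
accuracy = antisymmetry_accuracy(PAIRS, toy_score)
print(f"antisymmetry accuracy: {accuracy:.2f}")
```

With the toy scores above, two of the three pairs are ordered correctly, so the sketch reports an accuracy of about 0.67. Comparing each sentence only with its own swapped version keeps the vocabulary identical across the two, so the comparison isolates the direction of the relation.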

Paper Structure

This paper contains 19 sections, 3 equations, 12 figures, 7 tables.

Figures (12)

  • Figure 1: Prompt templates used for Task 1: meronymy understanding.
  • Figure 2: 5-shot prompt template used for LlaMA2-7b in Task 1.
  • Figure 3: Prompt template for Task 2: part generation.
  • Figure 4: Models' accuracy for Task 1: question answering (top row), statement verification (bottom row), original pairs (left column), swapped items (right column).
  • Figure 5: Global accuracy of the LLMs in satisfying the Meronymy Knowledge Criterion.
  • ...and 7 more figures