From the New World of Word Embeddings: A Comparative Study of Small-World Lexico-Semantic Networks in LLMs

Zhu Liu, Ying Liu, KangYang Luo, Cunliang Kong, Maosong Sun

TL;DR

Input embeddings from decoder-only LLMs can be used to construct lexico-semantic networks over the full vocabulary. These networks exhibit small-world properties, with high clustering and short paths, though larger models show longer and more complex semantic routes, indicating richer relational structure. The authors validate the approach in three scenarios (common concepts, WordNet-based relations, and cross-lingual qualitative words), finding partial alignment with human lexical knowledge and cross-language patterns. The study provides a scalable method for building conceptual spaces and has implications for cognitive science, language typology, and semantic mapping in AI systems.

Abstract

Lexico-semantic networks represent words as nodes and their semantic relatedness as edges. While such networks are traditionally constructed using embeddings from encoder-based models or static vectors, embeddings from decoder-only large language models (LLMs) remain underexplored. Unlike encoder models, LLMs are trained with a next-token prediction objective, which does not directly encode the meaning of the current token. In this paper, we construct lexico-semantic networks from the input embeddings of LLMs with varying parameter scales and conduct a comparative analysis of their global and local structures. Our results show that these networks exhibit small-world properties, characterized by high clustering and short path lengths. Moreover, larger LLMs yield more intricate networks with weaker small-world effects and longer paths, reflecting richer semantic structures and relations. We further validate our approach through analyses of common conceptual pairs, structured lexical relations derived from WordNet, and a cross-lingual semantic network for qualitative words.
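
The two quantities behind the small-world claim, clustering and shortest path length, can be illustrated on a toy graph. The following is a minimal sketch, not the authors' code: the function names and the six-node ring graph are assumptions made for illustration, and the paper may use different estimators or normalizations.

```python
from collections import deque


def average_clustering(adj):
    """Mean local clustering coefficient of an undirected graph.

    adj maps each node to the set of its neighbours. Nodes with fewer
    than two neighbours contribute a coefficient of 0.
    """
    total = 0.0
    for nbrs in adj.values():
        k = len(nbrs)
        if k < 2:
            continue
        nbr_list = list(nbrs)
        # Count edges that exist between pairs of neighbours.
        links = sum(
            1
            for i in range(k)
            for j in range(i + 1, k)
            if nbr_list[j] in adj[nbr_list[i]]
        )
        total += 2.0 * links / (k * (k - 1))
    return total / len(adj)


def average_shortest_path_length(adj):
    """Mean BFS distance over all pairs of distinct, mutually reachable nodes."""
    total, pairs = 0, 0
    for source in adj:
        dist = {source: 0}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        total += sum(dist.values())
        pairs += len(dist) - 1
    return total / pairs


# Hypothetical toy graph: a ring of 6 nodes, each linked to its two
# nearest neighbours on either side.
n = 6
ring = {i: {(i + d) % n for d in (-2, -1, 1, 2)} for i in range(n)}

clustering = average_clustering(ring)
path_length = average_shortest_path_length(ring)
```

A network is usually called small-world when its clustering is much higher than that of a random graph with the same number of nodes and edges while its average path length stays comparably short.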

Paper Structure

This paper contains 29 sections, 1 equation, 9 figures, 4 tables.

Figures (9)

  • Figure 1: Outline of our lexico-semantic network construction. First, we extract the input word embeddings ($\mathcal{E}$) for the LLM vocabulary ($\mathcal{V}$). Next, we build a complete graph $\mathcal{C}$ by calculating the cosine similarity between all embedding pairs. Finally, we retain edges based on similarity, from highest to lowest, until the graph $\mathcal{G}$ is connected. We then focus on specific connected subgraphs $\mathcal{G'}$ representing certain domains at a local level.
  • Figure 2: Logarithm of the number of connected components as the top $K$ ratio increases for Llama2-7B and Llama2-70B. A value of zero indicates a connected network (a single component), while the dotted line marks the first ratio at which both models become connected.
  • Figure 3: Shortest path lengths among semantic groups for Llama2-7B (left) and Llama2-70B (right).
  • Figure 4: Shortest path length difference (Llama2-70B minus Llama2-7B) across semantic groups. The number of stars indicates the significance level of the difference.
  • Figure 5: Averaged shortest path length across relation types for both models. The number of stars indicates the significance of the difference.
  • ...and 4 more figures
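
The construction outlined in the Figure 1 and Figure 2 captions (score all embedding pairs by cosine similarity, then keep edges from most to least similar until the graph is connected) can be sketched as follows. This is a rough illustration under stated assumptions, not the authors' implementation: the 2-D vectors, the four-word vocabulary, the function names, and the use of a union-find structure to track connected components are all choices made here. A real LLM vocabulary has tens of thousands of tokens, so a practical version would need vectorized similarity computation.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm


def build_connected_graph(embeddings):
    """Keep edges in descending similarity until one component remains.

    Returns the retained edges and the fraction of all pairs that were
    kept (the "top K ratio" at which the graph first becomes connected).
    """
    words = list(embeddings)
    pairs = sorted(
        (
            (cosine(embeddings[a], embeddings[b]), a, b)
            for i, a in enumerate(words)
            for b in words[i + 1:]
        ),
        key=lambda t: t[0],
        reverse=True,
    )

    # Union-find (an assumption of this sketch) tracks components cheaply.
    parent = {w: w for w in words}

    def find(w):
        while parent[w] != w:
            parent[w] = parent[parent[w]]  # path halving
            w = parent[w]
        return w

    components = len(words)
    edges = []
    for sim, a, b in pairs:
        if components == 1:
            break
        edges.append((a, b, sim))
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_a] = root_b
            components -= 1

    ratio = len(edges) / len(pairs) if pairs else 0.0
    return edges, ratio


# Hypothetical 2-D "embeddings"; real input embeddings have thousands of dimensions.
toy_embeddings = {
    "cat": (1.0, 0.1),
    "dog": (0.9, 0.2),
    "car": (0.1, 1.0),
    "bus": (0.2, 0.9),
}

edges, ratio = build_connected_graph(toy_embeddings)
```

In this toy case the two within-group pairs are kept first, and a single cross-group edge then joins the two groups, so the graph becomes connected after half of all possible edges are retained.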