Probing Neural Topology of Large Language Models

Yu Zheng, Yuan Yuan, Yue Zhuo, Yong Li, Gabriel Kreiman, Tomaso Poggio, Paolo Santi

TL;DR

This work introduces graph probing to study the neural topology of large language models by constructing dynamic connectivity graphs from token-by-token neuron time series and relating them to language generation performance. Using simple linear or MLP probes on flattened adjacency matrices, the authors show that neural topology universally predicts perplexity and semantic representations across model families, often outperforming activation-based probes by large margins, even when only 1% of connections are retained. They provide causal evidence via interventions that hub neurons and a stable default network are functionally leveraged by LLMs, and they demonstrate practical applications in pruning and hallucination detection, as well as domain-specific topology and model fingerprinting. The findings highlight that topology carries richer information than raw activations, with implications for more efficient, reliable, and interpretable AI systems, and they open avenues for extending graph probing to larger models and multimodal architectures. Generation performance is quantified by perplexity, $\mathrm{PPL}(X) = \exp\left(-\frac{1}{t} \sum_{i=1}^{t} \log p_\theta(x_i \mid x_{<i})\right)$, and the approach leverages topology-derived signals to guide pruning and safety improvements.
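
The perplexity formula above can be evaluated directly from per-token log-probabilities. The following is a minimal sketch, assuming the log-probabilities are natural logarithms already obtained from a model; the function name `perplexity` and the toy input are illustrative and not taken from the authors' toolbox.

```python
import math


def perplexity(token_log_probs):
    """Return exp of the mean negative log-likelihood over the tokens.

    token_log_probs: natural-log probabilities log p(x_i | x_<i), one per token.
    """
    if not token_log_probs:
        raise ValueError("need at least one token log-probability")
    mean_nll = -sum(token_log_probs) / len(token_log_probs)
    return math.exp(mean_nll)


# Toy input (not model output): every token is assigned probability 0.25,
# so the perplexity equals the effective branching factor of 4.
uniform = [math.log(0.25)] * 8
ppl_uniform = perplexity(uniform)
```

A lower value means the model assigned higher probability to the observed tokens, which is why perplexity serves as the regression target for the probes.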

Abstract

Probing large language models (LLMs) has yielded valuable insights into their internal mechanisms by linking neural activations to interpretable semantics. However, the complex mechanisms that link neurons' functional co-activation with emergent model capabilities remain largely unknown, hindering a deeper understanding and safer development of LLMs. In this work, we introduce graph probing, a method for uncovering the functional connectivity of LLM neurons and relating it to language generation performance. By probing models across diverse LLM families and scales, we discover a universal predictability of language generation and understanding performance using only neural topology, which persists even when retaining just 1% of neuron connections. Strikingly, probing on topology outperforms probing on activation by up to 130.4% and 67.7% on perplexity and space/time semantic regression, respectively, suggesting that neural topology contains substantially richer information about LLM performance than neural activation, and that this information can be easily extracted with simple linear or MLP probes. To explain the dependence between neural topology and language performance, we identify default networks and hub neurons in LLMs and provide causal evidence through interventional experiments on multiple benchmarks, showing that LLMs actually exploit this topological information. Further analyses suggest that graph probing can be effectively leveraged to improve the efficiency and reliability of LLMs through proof-of-concept applications in model pruning and hallucination detection. Code and data for the graph probing toolbox are available at https://github.com/DavyMorgan/llm-graph-probing.
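
To make "retaining just 1% of neuron connections" concrete, here is a minimal sketch of one way to sparsify a connectivity graph. It assumes connections are ranked by absolute weight and that the graph is an undirected correlation matrix; the abstract does not specify the ranking rule, and the helper `sparsify_topology` and the random input are hypothetical.

```python
import numpy as np


def sparsify_topology(adjacency, keep_fraction=0.01):
    """Keep the strongest off-diagonal connections by absolute weight; zero the rest."""
    adj = np.asarray(adjacency, dtype=float)
    n = adj.shape[0]
    if adj.ndim != 2 or adj.shape != (n, n):
        raise ValueError("adjacency must be a square matrix")
    # Work on the upper triangle so each undirected edge is counted once.
    rows, cols = np.triu_indices(n, k=1)
    weights = np.abs(adj[rows, cols])
    n_keep = max(1, int(round(keep_fraction * weights.size)))
    top = np.argsort(weights)[-n_keep:]
    sparse = np.zeros_like(adj)
    sparse[rows[top], cols[top]] = adj[rows[top], cols[top]]
    # Mirror the kept edges to restore symmetry; the diagonal stays zero.
    return sparse + sparse.T


# Synthetic stand-in for neuron time series: 64 tokens x 50 neurons.
rng = np.random.default_rng(0)
series = rng.standard_normal((64, 50))
adjacency = np.corrcoef(series, rowvar=False)
sparse_adjacency = sparsify_topology(adjacency, keep_fraction=0.01)
```

With 50 neurons there are 1225 possible undirected connections, so a 1% budget keeps 12 of them. A probe trained on the sparsified matrix sees only those surviving edges.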

Paper Structure

This paper contains 19 sections, 10 equations, 9 figures, 6 tables.

Figures (9)

  • Figure 1: Overview of graph probing. We extract the neuron activity time series in an LLM as it processes text token by token. We then compute temporal and functional correlations between neural activations to obtain topological connectivity graphs of neurons. Unlike existing probing methods that take neural activation as input, we train linear or MLP probes on flattened neural topology to predict the language generation performance for the input token sequence.
  • Figure 2: Out-of-sample performance of linear and MLP probing on the test set for (a) GPT-2 (b) Pythia-160M (c) Qwen2.5-0.5B. We compare activation-based probing and our topology-based probing. The correlation between the perplexity predicted by probing and the ground-truth perplexity reflects how well LLM performance can be inferred from neural activation or topology.
  • Figure 3: (a) Out-of-sample graph probing performance on neural topology of different sparsity levels. (b) Out-of-sample probing performance on LLMs of different sizes. (c) Out-of-sample performance of graph probing on different layers of LLMs.
  • Figure 4: (a-b) Occurrence frequency of hub nodes in (a) Qwen2.5-0.5B and (b) Qwen2.5-1.5B on the MMLU benchmark. (c) Accuracy on the MMLU benchmark of Qwen2.5 models (0.5B, 1.5B, 3B, 7B, 14B) under different interventions on the top 1% of neurons.
  • Figure 5: (a) Accuracy on the MMLU benchmark under different levels of model pruning based on neural topology and activation (WANDA). (b) Accuracy of hallucination detection for different approaches on the TruthfulQA dataset. (c) Coupling index of neural topology for hallucination on the TruthfulQA dataset.
  • ...and 4 more figures
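
The pipeline described in the Figure 1 caption (neuron time series, correlation graph, probe on the flattened topology) can be sketched as follows. This is an illustration under stated assumptions rather than the authors' implementation: Pearson correlation is used as the connectivity measure, only the upper triangle of the symmetric matrix is flattened, the linear probe is a closed-form ridge regression, and hubs are ranked by summed absolute connection strength. All function names are hypothetical, and the activations and perplexity targets are random placeholders, so the fitted probe carries no empirical meaning.

```python
import numpy as np


def neural_topology(activations):
    """Pearson correlation between neuron time series.

    activations: array of shape (tokens, neurons) for one input sequence.
    """
    return np.corrcoef(np.asarray(activations, dtype=float), rowvar=False)


def flatten_topology(adjacency):
    """Return the upper triangle (diagonal excluded) as a feature vector."""
    rows, cols = np.triu_indices(adjacency.shape[0], k=1)
    return adjacency[rows, cols]


def fit_linear_probe(features, targets, l2=1.0):
    """Closed-form ridge regression; returns (weights, bias)."""
    x_mean, y_mean = features.mean(axis=0), targets.mean()
    xc, yc = features - x_mean, targets - y_mean
    gram = xc.T @ xc + l2 * np.eye(xc.shape[1])
    weights = np.linalg.solve(gram, xc.T @ yc)
    return weights, y_mean - x_mean @ weights


def hub_neurons(adjacency, top_fraction=0.01):
    """Indices of neurons with the largest summed absolute connection strength."""
    magnitude = np.abs(adjacency)
    strength = magnitude.sum(axis=1) - np.diag(magnitude)  # drop self-connections
    k = max(1, int(round(top_fraction * adjacency.shape[0])))
    return np.argsort(strength)[-k:][::-1]


# Placeholder data: 40 sequences, 32 tokens each, 12 neurons in the probed layer.
rng = np.random.default_rng(0)
n_sequences, n_tokens, n_neurons = 40, 32, 12
graphs = [neural_topology(rng.standard_normal((n_tokens, n_neurons)))
          for _ in range(n_sequences)]
features = np.stack([flatten_topology(g) for g in graphs])
targets = rng.uniform(1.0, 50.0, size=n_sequences)  # stand-in perplexities

weights, bias = fit_linear_probe(features, targets)
predictions = features @ weights + bias
hubs = hub_neurons(graphs[0], top_fraction=0.01)
```

In a real setting the activations would come from a hidden layer of the model as it processes each sequence, the targets would be measured perplexities, and the probe would be evaluated on held-out sequences by correlating `predictions` with the ground truth, as in Figure 2.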