Efficient Knowledge Probing of Large Language Models by Adapting Pre-trained Embeddings

Kartik Sharma, Yiqiao Jin, Rakshit Trivedi, Srijan Kumar

TL;DR

The paper tackles the challenge of mapping what large language models actually know, given their stochastic training and the cost of probing via forward passes. It introduces PEEK, a framework that adapts pre-trained embedding models with a lightweight linear head to estimate an LLM's knowledge from a knowledge base without querying the model. The authors define four probing functions (binary generation, logits, activation, and fact generation) and two embedding families (sentence embeddings and knowledge-graph encoders), and show that adapted embeddings can predict LLM knowledge on held-out facts with up to about 90% accuracy across multiple LLMs and datasets. The work demonstrates that sentence-embedding proxies outperform graph-based encoders, highlights LoRA's limited advantage for tuning, and suggests practical uses in identifying knowledge gaps and guiding retrieval-augmented generation.

Abstract

Large language models (LLMs) acquire knowledge across diverse domains such as science, history, and geography encountered during generative pre-training. However, due to their stochasticity, it is difficult to predict what LLMs have acquired. Prior work has developed different ways to probe this knowledge by investigating the hidden representations, crafting specific task prompts, curating representative samples, and estimating their uncertainty. However, these methods require making forward passes through the underlying model to probe the LLM's knowledge about a specific fact, making them computationally expensive and time-consuming. To bridge this gap, we propose PEEK, or Proxy Embeddings to Estimate Knowledge of LLMs, by leveraging the pre-trained embedding models that effectively encode factual knowledge as text or graphs as proxies for LLMs. First, we identify a training set of facts known by LLMs through various probing strategies and then adapt embedding models to predict the LLM outputs with a linear decoder layer. Comprehensive evaluation on 3 Wikipedia-derived datasets, 4 LLMs, and 7 embedding models shows that embeddings can predict LLM knowledge on a held-out set with up to 90% accuracy. Furthermore, we find that sentence embedding models are more suitable than graph embeddings to predict LLM knowledge, shedding light on the underlying representation of the factual landscape. Thus, we believe that knowledge-adapted embeddings can be used to identify knowledge gaps in LLMs at scale and can provide deeper insights into LLMs' internal inductive bias. The code and data are made available at https://github.com/claws-lab/peek.
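
The adaptation step described in the abstract can be illustrated with a minimal sketch: a linear head is fit on top of fixed fact embeddings to predict a binary "the LLM knows this fact" label, and is then evaluated on held-out facts. Everything concrete here is an assumption made for illustration and is not taken from the paper: the embeddings and labels are synthetic stand-ins (random vectors and a noisy linear rule) rather than outputs of a real embedding model or a real probing function, the head is trained with plain gradient descent on a logistic loss, and the names `train_linear_probe`, `predict_known`, and `make_synthetic_data` are hypothetical. The sketch also freezes the embeddings, whereas the paper's adaptation may update the embedding model as well.

```python
import numpy as np


def make_synthetic_data(n=2000, d=32, seed=0):
    """Synthetic stand-in (assumption): random 'fact embeddings' and noisy
    binary 'known / not known' labels from a hidden linear rule."""
    rng = np.random.default_rng(seed)
    X = rng.normal(size=(n, d))
    hidden_w = rng.normal(size=d)
    y = ((X @ hidden_w + 0.5 * rng.normal(size=n)) > 0).astype(float)
    return X, y


def train_linear_probe(X, y, lr=0.5, epochs=500, l2=1e-3):
    """Fit a linear head (logistic regression) on frozen embeddings."""
    n, d = X.shape
    w, b = np.zeros(d), 0.0
    for _ in range(epochs):
        z = np.clip(X @ w + b, -30.0, 30.0)  # clip to avoid overflow in exp
        p = 1.0 / (1.0 + np.exp(-z))         # predicted P(fact is known)
        w -= lr * (X.T @ (p - y) / n + l2 * w)
        b -= lr * float(np.mean(p - y))
    return w, b


def predict_known(X, w, b):
    """Predict 1 if the head estimates that the LLM knows the fact, else 0."""
    return (X @ w + b > 0).astype(int)


# Train on 80% of the facts, evaluate on the held-out 20%.
X, y = make_synthetic_data()
split = int(0.8 * len(X))
w, b = train_linear_probe(X[:split], y[:split])
held_out_acc = float(np.mean(predict_known(X[split:], w, b) == y[split:]))
print(f"held-out accuracy: {held_out_acc:.3f}")
```

In a real setting, the rows of `X` would come from a sentence or knowledge-graph embedding model and `y` from one of the probing functions run once on the training facts; after that, predicting on new facts requires no further forward passes through the LLM.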

Paper Structure

This paper contains 42 sections, 1 theorem, 6 figures, 11 tables.

Key Result

Proposition 1

We are given a large language model $\mathcal{M}$ and a knowledge base (a set of facts) $\mathcal{K}$. Let $\mathcal{P}_{\mathcal{M}}: \mathcal{K} \rightarrow \mathcal{O}$ be a probing function that queries the LLM to determine if it knows a fact $f \in \mathcal{K}$ ...

Figures (6)

  • Figure 1: Comparison of our proposed approach, Proxy Embeddings to Estimate Knowledge (PEEK) with other knowledge probing approaches.
  • Figure 2: Proxy Embeddings to Estimate Knowledge (PEEK): In this framework, pre-trained embedding models are adapted to match the LLM knowledge for a training set of facts identified using different probing mechanisms. On a held-out set, we can then predict whether an LLM knows a fact or not by using the fact's embedding.
  • Figure 3: Effect of changing the number of negative samples in GPT models for knowledge graphs.
  • Figure 4: Effect of changing the number of negative samples in Llama3.1-8B for knowledge graphs.
  • Figure 5: LoRA vs. linear tuning on 1% of DBP100k
  • ...and 1 more figure

Theorems & Definitions (1)

  • Proposition 1: PEEK