Context versus Prior Knowledge in Language Models

Kevin Du, Vésteinn Snæbjarnarson, Niklas Stoehr, Jennifer C. White, Aaron Schein, Ryan Cotterell

TL;DR

The paper studies how language models integrate prior knowledge with contextual information. It introduces two information-theoretic metrics, the persuasion score $\psi$ and the susceptibility score $\chi$, to quantify the influence of context relative to prior knowledge. It develops a formal probabilistic framework, along with entity-independent extensions, and evaluates the metrics on a synthetic dataset of 122 relations derived from YAGO, using multiple Pythia models. Empirical validation shows that context relevance and assertiveness boost persuasion, while familiarity and training-data frequency relate to susceptibility, with real entities generally less susceptible than unfamiliar ones. The work also demonstrates practical applications to social science measurement and bias analysis, discusses limitations, and points to future work on retrieval-augmented generation and model control, arguing for transparent tools to analyze context–knowledge interactions in LLMs.

Abstract

To answer a question, language models often need to integrate prior knowledge learned during pretraining and new information presented in context. We hypothesize that models perform this integration in a predictable way across different questions and contexts: models will rely more on prior knowledge for questions about entities (e.g., persons, places, etc.) that they are more familiar with due to higher exposure in the training corpus, and be more easily persuaded by some contexts than others. To formalize this problem, we propose two mutual information-based metrics to measure a model's dependency on a context and on its prior about an entity: first, the persuasion score of a given context represents how much a model depends on the context in its decision, and second, the susceptibility score of a given entity represents how much the model can be swayed away from its original answer distribution about an entity. We empirically test our metrics for their validity and reliability. Finally, we explore and find a relationship between the scores and the model's expected familiarity with an entity, and provide two use cases to illustrate their benefits.
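
As a rough illustration of what such mutual information-based scores can look like, the sketch below makes two assumptions that the abstract does not state. First, the persuasion score of a context is taken to be the KL divergence between the answer distribution given that context and the context-averaged answer distribution. Second, the susceptibility score of an entity is taken to be the expectation of that quantity over contexts, which equals the mutual information between context and answer. The function names, the toy distributions, and the uniform context weights are all hypothetical.

```python
import math


def kl_divergence(p, q):
    """KL(p || q) in nats for two discrete distributions over the same answers."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def marginal_answer_dist(cond_dists, context_probs):
    """Context-averaged answer distribution: sum_c p(c) * p(a | c)."""
    n_answers = len(cond_dists[0])
    return [
        sum(pc * dist[a] for pc, dist in zip(context_probs, cond_dists))
        for a in range(n_answers)
    ]


def persuasion_scores(cond_dists, context_probs):
    """Assumed persuasion score per context: KL(p(a | c) || p(a))."""
    marginal = marginal_answer_dist(cond_dists, context_probs)
    return [kl_divergence(dist, marginal) for dist in cond_dists]


def susceptibility_score(cond_dists, context_probs):
    """Assumed susceptibility: expected persuasion score, i.e. I(context; answer)."""
    scores = persuasion_scores(cond_dists, context_probs)
    return sum(pc * s for pc, s in zip(context_probs, scores))


# Toy example (made-up numbers): three contexts, two candidate answers.
# Each row is p(answer | context) for one fixed query about one entity.
toy_dists = [[0.9, 0.1], [0.5, 0.5], [0.1, 0.9]]
toy_weights = [1 / 3, 1 / 3, 1 / 3]

print("persuasion:", persuasion_scores(toy_dists, toy_weights))
print("susceptibility:", susceptibility_score(toy_dists, toy_weights))
```

Under these assumptions, a context that leaves the answer distribution at its average gets a persuasion score of zero, and an entity whose answer distribution is the same under every context gets a susceptibility score of zero.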

Paper Structure

This paper contains 69 sections, 4 theorems, 13 equations, 18 figures, and 1 table.

Key Result

Proposition 1

Given random variable $X$ over the discrete space ${\mathcal{X}}$, random variable $Y$ over the discrete space ${\mathcal{Y}}$, and $x \in {\mathcal{X}}$, then:

Figures (18)

  • Figure 1: In answering a given query, a model may be more susceptible to context for some entities than others, while some contexts may be more persuasive than others (as indicated in this figure by color darkness in the rightmost column). We introduce mutual information-based metrics to evaluate how much impact the context has relative to the prior knowledge of a model.
  • Figure 2: The x-axis represents bins for whether the model's answer agreed with its prior, the context, or neither. For open (left) queries, the persuasion scores of contexts that persuaded the model to output an answer matching the context (Context) are higher than those of contexts that did not (Original, Other).
  • Figure 3: Susceptibility score ($y$-axis) against the MR divided into 5 bins between 0 and 1 ($x$-axis) for all entities and queries. The opacity represents the proportion of points in each bin. For both open queries ($\bullet$) and closed queries ($\bullet$), we see a decreasing upper bound between the MR and susceptibility score. While the quartiles of the open queries generally decrease (except for the lowest bin), the opposite occurs for closed queries.
  • Figure 4: Summarizing across all 122 queries, we display the variance of susceptibility scores (blue) and persuasion scores (orange), across three random seeds (left) and across two query forms (right), and stratified for both closed and open queries ($x$-axis). The variance is very low across random seeds for both query types, and, for closed queries, across the specific query form. Variance is high for the different open query forms.
  • Figure 5: The plots in \ref{fig:models_sig_prop} indicate the proportion of queries for which ($\color{MyBronze}{\blacktriangledown}$) relevant contexts are significantly more persuasive than irrelevant contexts, ($\color{MyBlue}{\bullet}$) unfamiliar entities are significantly more susceptible than familiar entities, ($\color{MyOrange}{\blacksquare}$) assertive contexts are significantly more persuasive than base contexts, and ($\color{MyRed}{\blacktriangle}$) negation contexts are significantly more persuasive than base contexts. We further provide the average effect size over queries of those comparisons in \ref{fig:models_effect_sz}. We highlight specific findings in \ref{sec:persuasion_pred_val} and \ref{sec:sus_pred_validity_real_vs_fake}.
  • ...and 13 more figures

Theorems & Definitions (8)

  • Proposition 1
  • proof
  • Proposition 2
  • proof
  • Proposition 3
  • proof
  • Corollary 1
  • proof