Rare Event Analysis of Large Language Models
Jake McAllister Dorman, Edward Gillman, Dominic C. Rose, Jamie F. Mair, Juan P. Garrahan
TL;DR
The paper addresses rare but impactful completions in large language models by presenting a practical Rare Event Analysis (REA) framework that combines stochastic-process modeling, importance sampling, exponential tilting, Transition Path Sampling (TPS), and the Multistate Bennett Acceptance Ratio (MBAR) to estimate tail probabilities and explore atypical outputs. It demonstrates the approach on the TinyStories-8M model using two observables, the Automated Readability Index (ARI) and the logarithm of the completion probability (Log-Prob), showing how biased sampling and MBAR can reveal tail behavior that is inaccessible to direct sampling. Key contributions include a complete end-to-end REA workflow, a practical implementation guide, tail-probability estimates for both observables, exploratory data analysis of rare completions, and a roadmap for extending these methods to other models and contexts. The work highlights the importance of robust tail analysis for safety and reliability in deployment, and points to future directions such as adaptive biases, parallel tempering, infilling proposals, and prompt-based exploration for red-teaming and safety evaluation.
Abstract
Because large language models (LLMs) are probabilistic models, they display rare events during inference: behaviour that is far from typical but highly significant. By definition, all rare events are hard to see, but the enormous scale of LLM usage means that events completely unobserved during development are likely to become prominent in deployment. Here we present an end-to-end framework for the systematic analysis of rare events in LLMs. We provide a practical implementation spanning theory, efficient generation strategies, probability estimation and error analysis, which we illustrate with concrete examples. We outline extensions and applications to other models and contexts, highlighting the generality of the concepts and techniques presented here.
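To make the idea of estimating a tail probability by biased sampling concrete, the following is a minimal sketch of importance sampling with an exponentially tilted proposal. It is a toy illustration, not the paper's implementation: a standard Gaussian stands in for the distribution of an observable, and the threshold `a`, tilt strength `s`, and function names are all assumptions chosen for the example. Tilting $N(0,1)$ by $e^{sx}$ gives $N(s,1)$, so each sample is reweighted by $e^{-sx + s^2/2}$.

```python
import math
import random


def tail_prob_exact(a):
    """Exact P(X >= a) for X ~ N(0, 1), used here only as a reference value."""
    return 0.5 * math.erfc(a / math.sqrt(2.0))


def tail_prob_tilted(a, s, n_samples, rng):
    """Importance-sampling estimate of P(X >= a) for X ~ N(0, 1).

    Samples are drawn from the tilted distribution N(s, 1) and reweighted by
    the likelihood ratio p(x) / p_s(x) = exp(-s * x + s**2 / 2).
    Returns the estimate and its estimated standard error.
    """
    total = 0.0
    total_sq = 0.0
    for _ in range(n_samples):
        x = rng.gauss(s, 1.0)  # draw from the biased (tilted) distribution
        w = math.exp(-s * x + 0.5 * s * s) if x >= a else 0.0
        total += w
        total_sq += w * w
    mean = total / n_samples
    var = max(total_sq / n_samples - mean * mean, 0.0)
    return mean, math.sqrt(var / n_samples)


rng = random.Random(0)  # fixed seed so the sketch is reproducible
a = 4.0                 # hypothetical tail threshold
estimate, stderr = tail_prob_tilted(a, s=a, n_samples=20000, rng=rng)
exact = tail_prob_exact(a)
print(f"tilted estimate: {estimate:.3e} +/- {stderr:.1e}, exact: {exact:.3e}")
```

Direct sampling with the same budget would rarely see even one event at this threshold, whereas the tilted proposal places roughly half of its samples in the tail. For an LLM the tilted distribution over completions cannot be sampled directly in this way, which is why the framework turns to methods such as TPS and MBAR.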
