QCD in Language Models: What do they really know about QCD?

Antonin Sulc; Patrick L. S. Connor

TL;DR

The study assesses whether open-weight LLMs encode QCD knowledge and can assist physics research by applying a perplexity-based probing framework to models such as Llama, Qwen, and Gemma. It probes numerical constants such as $\alpha_s$, spin classifications, mediator associations, and context-dependent quark masses, revealing both encoded knowledge and notable limitations. Key findings include a perceptible alignment with experimental values for $\alpha_s$ and correct force-mediator mappings, whereas mass knowledge is highly context-dependent and performance varies with model size. The work also introduces a Standard Model–grounded validation tool to support reliable scientific assistance, and outlines practical implications and paths for improving open-weight LLMs in high-energy physics.

Abstract

This study presents an analysis of modern open-source large language models (LLMs), including Llama, Qwen, and Gemma, to evaluate their encoded knowledge of Quantum Chromodynamics (QCD). Through reverse engineering of these models' representations, we uncover the naturally idiosyncratic patterns in how foundational QCD concepts are embedded within their parameter spaces. Our methodology combines targeted probing techniques and knowledge extraction protocols to assess the models' understanding of critical QCD principles such as color confinement, asymptotic freedom, and the running coupling constant. This work provides a tool for using LLMs as assistants in physics research, while also highlighting current limitations in their representation of advanced quantum field theory concepts that future model development should address.
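
The perplexity-based probing mentioned in the TL;DR can be illustrated with a minimal sketch. This is not the authors' implementation: the function names (`perplexity`, `rank_candidates`, `toy_token_logprobs`) are hypothetical, and the toy scorer uses arbitrary placeholder probabilities rather than outputs of any real model. The only assumption about the method is the standard definition of perplexity as the exponential of the negative mean token log-probability, with lower perplexity read as the model finding a completion more plausible.

```python
import math
from typing import Callable, List, Sequence, Tuple


def perplexity(token_logprobs: Sequence[float]) -> float:
    """Perplexity = exp(-mean of natural-log token probabilities)."""
    if not token_logprobs:
        raise ValueError("need at least one token log-probability")
    return math.exp(-sum(token_logprobs) / len(token_logprobs))


def rank_candidates(
    prompt: str,
    candidates: Sequence[str],
    score_fn: Callable[[str], List[float]],
) -> List[Tuple[str, float]]:
    """Score prompt + candidate for each candidate, lowest perplexity first."""
    scored = [(c, perplexity(score_fn(prompt + c))) for c in candidates]
    return sorted(scored, key=lambda pair: pair[1])


# Hypothetical stand-in for a language model. The probabilities below are
# arbitrary placeholders chosen for illustration, NOT measured model outputs.
_TOY_PROBS = {"gluon": 0.5, "photon": 0.1}
_DEFAULT_PROB = 0.2


def toy_token_logprobs(text: str) -> List[float]:
    """Return one log-probability per whitespace-separated token."""
    return [math.log(_TOY_PROBS.get(tok, _DEFAULT_PROB)) for tok in text.split()]


if __name__ == "__main__":
    ranking = rank_candidates(
        "The strong force is mediated by the ",
        ["gluon", "photon"],
        toy_token_logprobs,
    )
    for candidate, ppl in ranking:
        print(f"{candidate}: perplexity = {ppl:.3f}")
```

In practice, `toy_token_logprobs` would be replaced by a function that returns per-token log-probabilities from an actual open-weight model; the ranking logic itself would stay the same.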
