Notes on the Mathematical Structure of GPT LLM Architectures
Spencer Becker-Kahn
Abstract
An exposition of the mathematics underpinning the neural network architecture of a GPT-3-style LLM.
This paper contains 16 sections, 43 equations.