Transcendence: Generative Models Can Outperform The Experts That Train Them

Edwin Zhang; Vincent Zhu; Naomi Saphra; Anat Kleiman; Benjamin L. Edelman; Milind Tambe; Sham M. Kakade; Eran Malach

TL;DR

We theoretically prove that transcendence can be enabled by low-temperature sampling, and rigorously assess this claim experimentally.

Abstract

Generative models are trained with the simple objective of imitating the conditional probability distribution induced by the data they are trained on. Therefore, when trained on data generated by humans, we may not expect the artificial model to outperform the humans on their original objectives. In this work, we study the phenomenon of transcendence: when a generative model achieves capabilities that surpass the abilities of the experts generating its data. We demonstrate transcendence by training an autoregressive transformer to play chess from game transcripts, and show that the trained model can sometimes achieve better performance than all players in the dataset. We theoretically prove that transcendence can be enabled by low-temperature sampling, and rigorously assess this claim experimentally. Finally, we discuss other sources of transcendence, laying the groundwork for future investigation of this phenomenon in a broader setting.
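As a rough illustration of the low-temperature mechanism described in the abstract, the sketch below builds a toy imitation target and sharpens it with a temperature. Everything here is a hypothetical setup, not the paper's experiment: three made-up experts, three states, a single rewarded move per state, and an imitator assumed to equal the uniform average of the experts.

```python
import numpy as np

def apply_temperature(probs, tau):
    """Rescale probabilities to p**(1/tau) along the last axis and renormalize."""
    logp = np.log(np.clip(probs, 1e-12, None)) / tau
    logp -= logp.max(axis=-1, keepdims=True)  # subtract the max for numerical stability
    weights = np.exp(logp)
    return weights / weights.sum(axis=-1, keepdims=True)

# Hypothetical toy problem: 3 states, 3 moves per state, move 0 is the only rewarded move.
rewards = np.array([[1.0, 0.0, 0.0]] * 3)  # rewards[state, move]

# Each hypothetical expert plays well in two states and blunders in a different third one.
good, bad = [0.8, 0.1, 0.1], [0.1, 0.8, 0.1]
experts = np.array([
    [bad, good, good],
    [good, bad, good],
    [good, good, bad],
])  # experts[expert, state, move]

def expected_reward(policy):
    """Expected reward of policy[state, move], averaged uniformly over states."""
    return float((policy * rewards).sum(axis=1).mean())

# Assumption: perfect imitation of pooled data yields the average of the experts.
imitator = experts.mean(axis=0)

best_expert = max(expected_reward(e) for e in experts)
at_tau_one = expected_reward(apply_temperature(imitator, 1.0))
at_low_tau = expected_reward(apply_temperature(imitator, 0.1))
print(best_expert, at_tau_one, at_low_tau)
```

In this toy case the experts err in different states, so the averaged distribution still puts the most mass on the rewarded move everywhere. Lowering the temperature concentrates mass on that majority choice, which is why the sharpened imitator scores above every individual expert while the unsharpened one does not.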

Paper Structure (33 sections, 5 theorems, 19 equations, 14 figures, 2 tables)

Introduction
Definition of Transcendence
Conditions for Transcendence
Low-Temperature Sampling is Necessary for Transcendence
Transcendence with Low-Temperature Sampling
Denoising a Single Expert
Transcendence from Multiple Experts
Experiments
Experimental Setup
Training Details.
Evaluation.
Experimental Results
Main Result: Low-temperature sampling enables transcendence.
Lowering temperature increases rewards in expectation on specific states, leading to transcendence over the full game.
Dataset diversity is essential for transcendence.
...and 18 more sections

Key Result

Proposition 1

For every choice of $f_1, \dots, f_k$ and $p_{\mathrm{test}}$, there exists some $f_i$ such that $R_{p_{\mathrm{test}}}(f_i) \ge R_{p_{\mathrm{test}}}(\hat{f})$.
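One way to see why a statement of this form can hold is the averaging argument sketched below. It rests on two assumptions that this summary does not spell out: that without temperature scaling the learned model $\hat{f}$ is the uniform mixture of the experts, and that the reward $R_{p_{\mathrm{test}}}$ is an expectation over the model's outputs and is therefore linear in the model.

```latex
% Assumptions (not stated in this summary): \hat{f} is the uniform mixture of the
% experts, and R_{p_test} is linear in the model's output distribution.
\hat{f}(y \mid x) = \frac{1}{k} \sum_{i=1}^{k} f_i(y \mid x)
\quad\Longrightarrow\quad
R_{p_{\mathrm{test}}}(\hat{f})
  = \frac{1}{k} \sum_{i=1}^{k} R_{p_{\mathrm{test}}}(f_i)
  \le \max_{1 \le i \le k} R_{p_{\mathrm{test}}}(f_i).
```

Under these assumptions an average cannot exceed its largest term, so some expert does at least as well as $\hat{f}$.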

Figures (14)

Figure 1: Ratings of our autoregressive decoder-only transformer, ChessFormer, over several different temperatures. We refer to our models as "ChessFormer <Maximum Glicko-2 rating seen during training>" to easily distinguish between different models in subsequent sections. Each model is trained only on games with players up to a certain rating ($1000$, $1300$, $1500$, respectively). We report 95% confidence intervals computed as $\pm 1.96 \sigma$.
Figure 2: Visualizing the denoising effects of low temperature on the action distribution: an example of ChessFormer shifting probability mass towards the high-reward move of trapping the queen with the rook as the temperature $\tau$ decreases. The opacity of the red arrows represents the probability mass given to different moves. The color of each square represents the reward that would be given for taking the action that moves the given piece to that square. Purple here is high reward, while blue is low. For more visualizations, see \ref{app:denoising-viz}.
Figure 3: Inspired by \cite{mnih2015human}, we generate a t-SNE embedding \cite{van2008visualizing} of ChessFormer's last hidden layer latent representations of game transcripts during training time. The colors represent the probability of winning, with $+1$ corresponding to a state where White has won and $0$ to a state where Black has won. The probability of winning is computed through the Stockfish analysis engine. We also visualize several board states associated with different clusters in the t-SNE embedding, and their associated expected reward when following the expert Stockfish distribution. Note that the model distinguishes between states where the outcome has already been determined (the two left boards) and opening states that are extremely similar (the two right boards). See the full t-SNE in \ref{app:full-tsne}.
Figure 4: The favor probability distribution, or change in expected reward from setting the temperature lower than $\tau = 1.0$. We plot the favor distribution for two temperatures, $\tau = 0.75$ and $\tau = 0.001$, by running the Stockfish analysis engine across $100$ total ChessFormer $1000$ games played at $0.001$ temperature against Stockfish level $1$ (as theoretically justified by the performance difference lemma, PDL \cite{Kakade2002ApproximatelyOA}). We calculate favor by sampling $100$ counterfactual potential moves at $\tau=1.0$ per actual move made at $\tau=0.001$ to compute a baseline expected reward. In total, we gather an empirical probability distribution with $n = 382{,}000$ total samples per $\tau$ ($38.2$ moves on average per game). Note that we plot the distributions with transparency, so the brownish area is where the two overlap. We visualize several long-tail examples in \ref{app:denoising-viz}.
Figure 5: Action distribution diversity, as measured by the average normalized entropy over different chess rating dataset cutoffs with $n = 2681, 3037, 3169$ common states for ratings $1000, 1300, 1500$, respectively. These entropies are calculated directly from the empirical frequencies of our dataset, and are model-agnostic.
...and 9 more figures
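
The quantities plotted in Figures 4 and 5 can be approximated with a short sketch. This is an illustration under stated assumptions, not the authors' code. The `favor` helper compares the expected reward of a temperature-sharpened move distribution against a Monte Carlo baseline sampled at $\tau = 1.0$, with caller-supplied per-move rewards standing in for Stockfish evaluations. The `normalized_entropy` helper divides the empirical entropy by the log of the number of distinct observed moves, which is one possible normalization and may differ from the paper's. All function and parameter names are hypothetical.

```python
import math
from collections import Counter

import numpy as np

def favor(probs, rewards, tau, n_samples=100, rng=None):
    """Expected reward at temperature tau minus a sampled baseline at tau = 1."""
    rng = np.random.default_rng(0) if rng is None else rng
    probs = np.asarray(probs, dtype=float)
    rewards = np.asarray(rewards, dtype=float)
    # Baseline: sample counterfactual moves from the unmodified distribution.
    baseline_moves = rng.choice(len(probs), size=n_samples, p=probs)
    baseline = rewards[baseline_moves].mean()
    # Sharpened distribution: p**(1/tau), renormalized (max subtracted for stability).
    logp = np.log(np.clip(probs, 1e-12, None)) / tau
    sharpened = np.exp(logp - logp.max())
    sharpened /= sharpened.sum()
    return float(sharpened @ rewards - baseline)

def normalized_entropy(moves):
    """Entropy of observed move frequencies in one state, scaled into [0, 1]."""
    counts = Counter(moves)
    if len(counts) < 2:
        return 0.0  # a single observed move carries no diversity
    total = sum(counts.values())
    entropy = -sum((c / total) * math.log(c / total) for c in counts.values())
    # Assumption: normalize by the log of the number of distinct observed moves.
    return entropy / math.log(len(counts))

# Hypothetical usage: three candidate moves, only the first one is rewarded.
print(favor([0.5, 0.3, 0.2], [1.0, 0.0, 0.0], tau=0.001))
print(normalized_entropy(["e4", "e4", "d4", "c4"]))
```

A positive favor value means that sharpening the distribution raised the expected reward relative to sampling at $\tau = 1.0$ in that state. Averaging `normalized_entropy` over the states common to each rating cutoff would give a diversity score in the spirit of Figure 5.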

Theorems & Definitions (12)

Definition 1
Remark 1
Proposition 1
Proposition 2
Proposition 3
Proposition 4
proof
proof
proof
Proposition 5
...and 2 more
