Position: Stop Making Unscientific AGI Performance Claims

Patrick Altmeyer, Andrew M. Demetriou, Antony Bartlett, Cynthia C. S. Liem

TL;DR

This paper critiques the rising practice of making unscientific AGI performance claims driven by powerful AI models and public discourse. It argues that patterns found in latent spaces—whether via random projections, PCA, or linear probes—do not constitute evidence of genuine world understanding or general intelligence. Through a series of experiments and adversarial probes, the authors show that such signals can arise from data correlations or memorization, not intrinsic cognition, and they emphasize the influence of human biases like anthropomorphism and confirmation bias. The authors propose structural and cultural reforms—rigorous hypothesis testing, explicit bias handling, definitional clarity, and contributorship with open review and red-teaming—to foster more reliable and ethically responsible AI research and communication.

Abstract

Developments in the field of Artificial Intelligence (AI), and particularly large language models (LLMs), have created a 'perfect storm' for observing 'sparks' of Artificial General Intelligence (AGI) that are spurious. Like simpler models, LLMs distill meaningful representations in their latent embeddings that have been shown to correlate with external variables. Nonetheless, the correlation of such representations has often been linked to human-like intelligence in the latter but not the former. We probe models of varying complexity including random projections, matrix decompositions, deep autoencoders and transformers: all of them successfully distill information that can be used to predict latent or external variables and yet none of them have previously been linked to AGI. We argue and empirically demonstrate that the finding of meaningful patterns in latent spaces of models cannot be seen as evidence in favor of AGI. Additionally, we review literature from the social sciences that shows that humans are prone to seek such patterns and anthropomorphize. We conclude that both the methodological setup and common public image of AI are ideal for the misinterpretation that correlations between model representations and some variables of interest are 'caused' by the model's understanding of underlying 'ground truth' relationships. We, therefore, call for the academic community to exercise extra caution, and to be keenly aware of principles of academic integrity, in interpreting and communicating about AI research outcomes.
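
To make the abstract's central claim concrete, here is a minimal sketch rather than the authors' actual experiment. It assumes hypothetical synthetic data in which one latent variable `z` drives 20 noisy features. The "model" is a fixed random projection that is never trained, and the probe is ordinary least squares. All names, sizes, and noise levels are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical synthetic data: one latent variable z drives d observed features.
n, d, k = 500, 20, 8
z = rng.normal(size=n)
loadings = rng.normal(size=d)
X = np.outer(z, loadings) + 0.5 * rng.normal(size=(n, d))

# The "model": a fixed random projection that was never trained on anything.
R = rng.normal(size=(d, k)) / np.sqrt(k)
H = X @ R  # embeddings of shape (n, k)

# Linear probe: least squares from embeddings (plus intercept) to z,
# fit on the first half of the samples only.
half = n // 2
A_train = np.column_stack([H[:half], np.ones(half)])
coef, *_ = np.linalg.lstsq(A_train, z[:half], rcond=None)

# Evaluate the probe out-of-sample on the second half.
A_test = np.column_stack([H[half:], np.ones(n - half)])
pred = A_test @ coef
ss_res = np.sum((z[half:] - pred) ** 2)
ss_tot = np.sum((z[half:] - z[half:].mean()) ** 2)
r2 = 1.0 - ss_res / ss_tot
print(f"out-of-sample R^2 of the probe: {r2:.3f}")
```

Under these assumptions the probe recovers `z` well even though the projection has learned nothing. This shows why a successful probe, taken alone, says little about whether a model "understands" the variable.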

Paper Structure

This paper contains 29 sections, 1 theorem, 15 figures, 1 table.

Key Result

Proposition 2.1

Figures (15)

  • Figure 1: Predicted coordinate values (out-of-sample) from a linear probe on final-layer activations of an untrained neural network.
  • Figure 2: Top chart: The first two principal components of US Treasury yields over time at daily frequency. Bottom chart: Observed average level and 10yr-3mo spread of the yield curve. Vertical stalks roughly indicate the onset ($|$GFC) and the beginning of the aftermath (GFC$|$) of the Global Financial Crisis.
  • Figure 3: Out-of-sample root mean squared error (RMSE) for the linear probe plotted against FOMC-RoBERTa's $n$-th layer for different indicators. The values correspond to averages computed across cross-validation folds, where we have used an expanding window approach to split the time series. As expected, model performance tends to be higher (average prediction errors are lower) for layers near the end of the transformer model.
  • Figure 4: Probe predictions for sentences about inflation of prices (IP), deflation of prices (DP), inflation of birds (IB) and deflation of birds (DB). The vertical axis shows predicted inflation levels minus the probe's average predicted value for random noise.
  • Figure 5: The left chart shows the actual GDP growth and fitted values from the autoencoder model. The right chart shows the observed average level and spread of the yield curve (solid) along with the predicted values (in-sample) from the linear probe based on the latent embeddings (dashed).
  • ...and 10 more figures
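
The expanding-window split mentioned in the Figure 3 caption can be sketched as follows. This is an illustrative assumption about how such a split is typically built, not the paper's exact procedure. The function name `expanding_window_splits` and its parameters `n_folds` and `min_train` are hypothetical.

```python
def expanding_window_splits(n_samples, n_folds, min_train):
    """Yield (train_indices, test_indices) pairs for time-ordered data.

    The training window always starts at index 0 and grows with each fold;
    the test block is the next contiguous chunk, so no future data leaks
    into training.
    """
    fold_size = (n_samples - min_train) // n_folds
    for i in range(n_folds):
        train_end = min_train + i * fold_size
        test_end = train_end + fold_size
        yield list(range(train_end)), list(range(train_end, test_end))


# Toy usage: 10 time steps, 3 folds, at least 4 training points.
splits = list(expanding_window_splits(n_samples=10, n_folds=3, min_train=4))
for train_idx, test_idx in splits:
    print(f"train size {len(train_idx)}, test indices {test_idx}")
```

A probe's out-of-sample error would then be averaged over these folds, with each fold trained only on observations that precede its test block.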

Theorems & Definitions (1)

  • Proposition 2.1: Parrot Test