Normality and the Turing Test
Alexandre Kabbach
TL;DR
Reframes the Turing test as a test of normal/average intelligence judged by a jury, grounded in the concept of statistical normality ($N(\mu,\sigma^2)$) and the aggregation of judgments across interrogators. It distinguishes intelligence, understood as a developmental faculty, from 'smartness', understood as an ideal of exceptional performance, and argues that the test targets normal behavior rather than universal cognition. The author argues that large language models such as ChatGPT exemplify artificial smartness and are unlikely to pass the Turing test: they optimize for exceptional/ideal responses, whereas passing would require producing the mistakes a normal/average human makes. Historically, Turing predicted that machines would eventually play the game well enough that an average interrogator would have no more than a 70 per cent chance ($P \le 0.7$) of making the right identification after five minutes of questioning. The paper also critiques the test's game configuration, arguing that it ends up objectivizing normative ideals of normal behavior rather than normal behavior itself, and raises concerns about cultural sampling and the meaningfulness of a single universal 'normal'.
Abstract
This paper proposes to revisit the Turing test through the concept of normality. Its core argument is that the Turing test is a test of normal intelligence as assessed by a normal judge, in two senses. First, the Turing test targets normal/average rather than exceptional human intelligence, so that passing the test requires machines to "make mistakes" and display imperfect behavior just like normal/average humans. Second, the Turing test is a statistical test in which judgments of intelligence are never carried out by a single "average" judge (understood as non-expert) but always by a full jury. As such, the notion of "average human interrogator" that Turing talks about in his original paper should be understood primarily as referring to a mathematical abstraction made of the normalized aggregate of the individual judgments of multiple judges. The paper's conclusions are twofold. First, it argues that large language models such as ChatGPT are unlikely to pass the Turing test, because those models target exceptional rather than normal/average human intelligence. As such, they constitute models of what the paper proposes to call artificial smartness rather than artificial intelligence, insofar as they deviate from Turing's original goal for the modeling of artificial minds. Second, it argues that the objectivization of normal human behavior in the Turing test fails because of the game configuration of the test, which ends up objectivizing normative ideals of normal behavior rather than normal behavior per se.
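
The idea of an "average interrogator" as an aggregate over a jury can be made concrete with a small sketch. The Python below is only an illustration under stated assumptions, not the paper's procedure: the function names, the data layout (one list of boolean verdicts per judge), and the choice to average per-judge rates with equal weight are all hypothetical. The 0.7 threshold is taken from Turing's prediction that an average interrogator will not have more than a 70 per cent chance of making the right identification.

```python
from statistics import mean


def jury_identification_rate(verdicts):
    """Return the jury-level rate of correct identifications.

    verdicts: one list of booleans per judge; True means that judge
    correctly identified the machine in one session.
    Assumption: every judge is weighted equally, so the jury rate is the
    mean of the per-judge rates (one possible reading of "normalized
    aggregate").
    """
    per_judge = [mean([1.0 if v else 0.0 for v in sessions]) for sessions in verdicts]
    return mean(per_judge)


def passes_turing_criterion(verdicts, threshold=0.7):
    """True if the jury identifies the machine at most `threshold` of the time."""
    return jury_identification_rate(verdicts) <= threshold


# Hypothetical jury of three judges with four sessions each.
jury = [
    [True, True, False, False],   # judge 1: 0.50 correct
    [True, False, False, False],  # judge 2: 0.25 correct
    [True, True, True, False],    # judge 3: 0.75 correct
]
print(jury_identification_rate(jury))
print(passes_turing_criterion(jury))
```

In this sketch no single judge decides the outcome: judge 3 alone identifies the machine more than 70 per cent of the time, yet the aggregated jury does not, which is the sense in which the "average interrogator" is an abstraction rather than a person.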
