The Return of Pseudosciences in Artificial Intelligence: Have Machine Learning and Deep Learning Forgotten Lessons from Statistics and History?

Jérémie Sublime

TL;DR

This paper argues that modern ML/DL systems frequently misinterpret correlations as causation, effectively reviving pseudosciences under a polished AI veneer. It surveys high-stakes applications in justice, security, and sociology to show how high performance metrics can obscure real harms, particularly via false positives. It critiques the reliance on theory-free, data-driven approaches and biased datasets, advocating for harm-focused metrics and continuous human oversight. By drawing on statistical history, the author calls for rethinking AI models, evaluation criteria, and domain-aligned ethics training to prevent discriminatory or dangerous outcomes.

Abstract

In today's world, AI programs powered by Machine Learning are ubiquitous, and have achieved seemingly exceptional performance across a broad range of tasks, from medical diagnosis and credit rating in banking, to theft detection via video analysis, and even predicting political or sexual orientation from facial images. These predominantly deep learning methods excel due to their extraordinary capacity to process vast amounts of complex data and to extract complex correlations and relationships from different levels of features. In this paper, we contend that the designers and final users of these ML methods have forgotten a fundamental lesson from statistics: correlation does not imply causation. Not only do most state-of-the-art methods neglect this crucial principle, but by doing so they often produce nonsensical or flawed causal models, akin to social astrology or physiognomy. Consequently, we argue that current efforts to make AI models more ethical by merely reducing biases in the training data are insufficient. Through examples, we will demonstrate that the potential for harm posed by these methods can only be mitigated by completely rethinking their core models, improving quality assessment metrics and policies, and maintaining human oversight throughout the process.

Paper Structure

This paper contains 13 sections, 7 equations, 1 figure, 3 tables.

Table of Contents

  1. Introduction
  2. State of the Art on potentially misguided and harmful AI applications
  3. The reanimation of pseudosciences by ML methods and its ethical implications
  4. Basics of Deep Learning and ML inference
  5. The silent return of physiognomy, Lombrosianism, phrenology, distorted sociobiology, social astrology and other quackeries with a new AI polish
  6. Physiognomy is "the facility to identify, from the form and constitution of external parts of the human body, chiefly the face, exclusive of all temporary signs of emotions, the constitution of the mind and the heart." -- Georg Christoph Lichtenberg, 1778
  7. Phrenology (or craniology) involves the measurement of bumps on the skull to predict mental traits.
  8. Lombrosianism is a theory in criminology developed in the late 19th century by Italian physician Cesare Lombroso. This theory suggests that criminal behavior is innate and can be identified through physical traits. Lombroso believed that criminals were biologically different from non-criminals, often marked by "atavistic" features that resembled earlier stages of human evolution (such as certain facial structures or body types). This theory supports the idea that criminals are "born," not made, and can be distinguished by these primitive traits.
  9. Impact of ML quality metrics on the social harm potential of AI algorithms
  10. ML quality metrics for classification
  11. Assessing the real impact of error made by AI systems
  12. The Myth of theory-free inference and unbiased training data
  13. Conclusion

Figures (1)

  • Figure 1: On the left, a basic neural unit with 3 weighted inputs; on the right, a simple network with a 3-feature input layer, a 2-class output layer, and a single neural layer in the middle. This figure shows how complex linear combinations of the original input features can be computed using different layers with activation functions $f(\cdot)$.
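
The structure described in the Figure 1 caption can be illustrated with a minimal Python sketch. It is not taken from the paper: the logistic choice for $f(\cdot)$, the width of the middle layer (3 units), the softmax on the output, and all weight values are assumptions chosen purely for illustration.

```python
import math
from typing import Callable, List, Sequence


def sigmoid(z: float) -> float:
    """Logistic activation; an assumed choice for f, the caption does not fix one."""
    return 1.0 / (1.0 + math.exp(-z))


def neural_unit(inputs: Sequence[float], weights: Sequence[float], bias: float,
                f: Callable[[float], float] = sigmoid) -> float:
    """Single unit: weighted sum of the inputs plus a bias, passed through f."""
    z = sum(w * x for w, x in zip(weights, inputs)) + bias
    return f(z)


def dense_layer(inputs: Sequence[float], weight_rows: Sequence[Sequence[float]],
                biases: Sequence[float],
                f: Callable[[float], float] = sigmoid) -> List[float]:
    """A layer is several units that all read the same inputs."""
    return [neural_unit(inputs, w, b, f) for w, b in zip(weight_rows, biases)]


def softmax(scores: Sequence[float]) -> List[float]:
    """Turn raw scores into values that sum to 1 (assumed output normalisation)."""
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def tiny_network(x: Sequence[float]) -> List[float]:
    """3 input features -> one middle layer -> 2 class scores.

    All weights below are hypothetical, hand-picked numbers.
    """
    hidden_w = [[0.5, -0.2, 0.1], [-0.3, 0.8, 0.4], [0.2, 0.2, -0.6]]
    hidden_b = [0.0, 0.1, -0.1]
    out_w = [[1.0, -1.0, 0.5], [-0.5, 0.7, 0.3]]
    out_b = [0.0, 0.0]

    hidden = dense_layer(x, hidden_w, hidden_b)
    # Identity activation on the output layer; softmax normalises the two scores.
    scores = dense_layer(hidden, out_w, out_b, f=lambda z: z)
    return softmax(scores)


if __name__ == "__main__":
    print(tiny_network([1.0, 2.0, 3.0]))
```

Nothing in this computation encodes why an input should lead to a given class: the network only combines and rescales the features it is given, which is the mechanism the abstract has in mind when it speaks of extracting correlations from different levels of features.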