Hallucination, reliability, and the role of generative AI in science

Charles Rathkopf

TL;DR

The paper addresses hallucination as a reliability threat in scientific AI and argues that a data-centric view is insufficient. It advocates a phenomenon-centric assessment and demonstrates, through AlphaFold 3 and GenCast, how theory-guided training and confidence-based (or ensemble-based) uncertainty management can convert model-generated errors into bounded, manageable risk within theoretically mature domains. By defining hallucination as a non-strategic misrepresentation of the target phenomenon and situating reliability in the workflow rather than in internal parameters, the work extends computational reliabilism to generative AI in science. The result is a framework where opaque models can still support reliable inference and discovery when embedded in disciplined, theory-rooted, uncertainty-aware practices, although applicability depends on domain maturity and robust validation infrastructure.

Abstract

Generative AI increasingly supports scientific inference, from protein structure prediction to weather forecasting. Yet its distinctive failure mode, hallucination, raises epistemic alarm bells. I argue that this failure mode can be addressed by shifting from data-centric to phenomenon-centric assessment. Through case studies of AlphaFold and GenCast, I show how scientific workflows discipline generative models through theory-guided training and confidence-based error screening. These strategies convert hallucination from an unmanageable epistemic threat into bounded risk. When embedded in such workflows, generative models support reliable inference despite opacity, provided they operate in theoretically mature domains.

Paper Structure

This paper contains 14 sections, 1 equation, and 1 figure.

Figures (1)

  • Figure 1: Diagram illustrating the relationship between a target phenomenon, a dataset constructed from observations of the target (second box), a generative deep neural network (DNN), and the DNN's output. The DNN produces outputs that resemble samples from the training data but do not represent them. Whether an output functions as a representation depends on our inferential practices, and in scientific contexts, these practices are aimed at understanding the target phenomenon—not merely reconstructing the training data. The backward arrow ("similarity without representation") indicates that while the model output may exhibit statistical similarity to training data, it is not used as a representation of the training data itself. Rightward arrows indicate causal rather than representational relations.