Hallucination, reliability, and the role of generative AI in science

Charles Rathkopf

TL;DR

The paper treats hallucination as a reliability threat in scientific AI and argues that data-centric assessment is insufficient to address it. It advocates phenomenon-centric assessment instead and shows, through case studies of AlphaFold 3 and GenCast, how theory-guided training and confidence-based (or ensemble-based) uncertainty management can convert model-generated errors into bounded, manageable risk within theoretically mature domains. By defining hallucination as a non-strategic misrepresentation of the target phenomenon and locating reliability in the workflow rather than in the model's internal parameters, the work extends computational reliabilism to generative AI in science. The result is a framework in which opaque models can still support reliable inference and discovery when embedded in disciplined, theory-rooted, uncertainty-aware practices, although its applicability depends on domain maturity and robust validation infrastructure.

Abstract

Generative AI increasingly supports scientific inference, from protein structure prediction to weather forecasting. Yet its distinctive failure mode, hallucination, raises epistemic alarm bells. I argue that this failure mode can be addressed by shifting from data-centric to phenomenon-centric assessment. Through case studies of AlphaFold and GenCast, I show how scientific workflows discipline generative models through theory-guided training and confidence-based error screening. These strategies convert hallucination from an unmanageable epistemic threat into bounded risk. When embedded in such workflows, generative models support reliable inference despite opacity, provided they operate in theoretically mature domains.
