Unlearning as Ablation: Toward a Falsifiable Benchmark for Generative Scientific Discovery

Robert Yang

TL;DR

This paper reframes unlearning as an ablation that tests constructive knowledge generation in large language models. It defines a target $T$ and its forget-closure $\mathcal{F}(T)$ as the basis for a falsifiable rediscovery test, and outlines a formal framework in which strong unlearning of $\mathcal{F}(T)$ is followed by re-derivation from permitted axioms and tools, with results verifiable by external checkers. It proposes a minimal pilot in mathematics and algorithms, introduces the Ablation-to-Discovery (A2D) benchmark concept, and plans extensions to physics, chemistry, and biology. By tying unlearning fidelity to discovery feasibility, the work aims to establish a principled, falsifiable path toward measuring genuine AI-driven scientific discovery.

Abstract

Bold claims about AI's role in science, from "AGI will cure all diseases" to promises of radically accelerated discovery, raise a central epistemic question: do large language models (LLMs) truly generate new knowledge, or do they merely remix memorized fragments? We propose unlearning-as-ablation as a falsifiable probe of constructive scientific discovery. The idea is to systematically remove a target result together with its forget-closure (supporting lemmas, paraphrases, and multi-hop entailments) and then evaluate whether the model can re-derive the result from only permitted axioms and tools. Success would indicate generative capability beyond recall; failure would expose current limits. Unlike the prevailing motivations for unlearning (privacy, copyright, or safety), our framing repositions it as an epistemic probe for AI-for-Science. We outline a minimal pilot in mathematics and algorithms to illustrate feasibility, and sketch how the same approach could later be extended to domains such as physics or chemistry. This is a position paper: our contribution is conceptual and methodological, not empirical. We aim to stimulate discussion on how principled ablation tests could help distinguish models that reconstruct knowledge from those that merely retrieve it, and how such probes might guide the next generation of AI-for-Science benchmarks.
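
As a rough illustration of the forget-closure idea, the sketch below treats $\mathcal{F}(T)$ as everything reachable from the target in a hand-built graph of supporting lemmas, paraphrases, and entailments. This is an assumption made for illustration only: the graph, the item names, and the helper `forget_closure` are hypothetical, and the abstract does not prescribe how the closure is computed.

```python
from collections import deque


def forget_closure(target, related):
    """Collect the target plus every item reachable from it.

    `related` is a hypothetical mapping from an item to the items that would
    leak it (supporting lemmas, paraphrases, entailments). Reachability is
    an illustrative stand-in for the forget-closure F(T).
    """
    closure = {target}
    queue = deque([target])
    while queue:
        item = queue.popleft()
        for neighbor in related.get(item, ()):
            if neighbor not in closure:
                closure.add(neighbor)
                queue.append(neighbor)
    return closure


# Toy knowledge graph; every name here is invented for the example.
related = {
    "T": ["T_paraphrase", "lemma_A"],
    "lemma_A": ["lemma_A_paraphrase"],
    "T_paraphrase": [],
    "axiom_1": [],
}

# Axioms the model is still allowed to use after unlearning.
permitted = {"axiom_1", "axiom_2"}

closure = forget_closure("T", related)

# The rediscovery test is only meaningful if nothing permitted was forgotten.
leaked = closure & permitted
print(sorted(closure), sorted(leaked))
```

In this toy setting the multi-hop item `lemma_A_paraphrase` is swept into the closure even though it is not directly linked to `T`, while the permitted axioms stay outside it. A real pipeline would still need an actual unlearning procedure and an external checker, neither of which is modeled here.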
