REFACTOR: Learning to Extract Theorems from Proofs
Jin Peng Zhou, Yuhuai Wu, Qiyang Li, Roger Grosse
TL;DR
This work introduces REFACTOR, a neural method that learns to extract reusable theorems from formal proofs by expanding human proofs and training a graph-based model to identify the embedded theorems. On unseen proofs, it recovers $19.6\%$ of the theorems that humans would use; applied to the existing Metamath library, it extracts 16 new theorems, which in turn allow the library's proofs to be refactored and shortened. A neural prover trained on the refactored dataset proves more test theorems (e.g., $+75$) and draws on a diverse set of newly extracted theorems (average usage of $31.0\%$ across proved theorems). Overall, REFACTOR demonstrates that data-driven theorem extraction can improve both library quality and automated proving performance, with potential for extension to other formal systems and to program synthesis.
Abstract
Human mathematicians are often good at recognizing modular and reusable theorems that bring complex mathematical results within reach. In this paper, we propose a novel method called theoREm-from-prooF extrACTOR (REFACTOR) for training neural networks to mimic this ability in formal mathematical theorem proving. We show that, on a set of unseen proofs, REFACTOR is able to extract 19.6% of the theorems that humans would use to write the proofs. When applied to the existing Metamath library, REFACTOR extracted 16 new theorems. With the newly extracted theorems, we show that the existing proofs in the Metamath database can be refactored. The new theorems are used very frequently after refactoring, with an average usage of 733.5 times, and help shorten the proofs. Lastly, we demonstrate that a prover trained on the new-theorem refactored dataset proves more test theorems and outperforms state-of-the-art baselines by frequently leveraging a diverse set of newly extracted theorems. Code can be found at https://github.com/jinpz/refactor.
