A Survey on Deep Learning for Theorem Proving
Zhaoyu Li, Jialiang Sun, Logan Murphy, Qidong Su, Zenan Li, Xian Zhang, Kaiyu Yang, Xujie Si
TL;DR
This survey critically maps the landscape of deep learning for theorem proving, covering five DL-enabled tasks, diverse datasets, and evaluation paradigms across informal and formal proving settings. It highlights how autoformalization, premise selection, proofstep generation, and proof search integrate with ATP/ITP systems, aided by modern LLMs, GNNs, and structured prompting. Key contributions include a comprehensive catalog of datasets and generation strategies, state-of-the-art performance summaries, and a candid discussion of challenges such as data scarcity and evaluation fragmentation, along with forward-looking directions like conjecturing and verified code generation. The work serves as a foundational reference for researchers aiming to advance AI-assisted formal reasoning and its practical deployment in mathematics and software verification.
Abstract
Theorem proving is a fundamental aspect of mathematics, spanning from informal reasoning in natural language to rigorous derivations in formal systems. In recent years, the advancement of deep learning, especially the emergence of large language models, has sparked a notable surge of research exploring these techniques to enhance the process of theorem proving. This paper presents a comprehensive survey of deep learning for theorem proving by offering (i) a thorough review of existing approaches across various tasks such as autoformalization, premise selection, proofstep generation, and proof search; (ii) an extensive summary of curated datasets and strategies for synthetic data generation; (iii) a detailed analysis of evaluation metrics and the performance of state-of-the-art methods; and (iv) a critical discussion on the persistent challenges and the promising avenues for future exploration. Our survey aims to serve as a foundational reference for deep learning approaches in theorem proving, inspiring and catalyzing further research endeavors in this rapidly growing field. A curated list of papers is available at https://github.com/zhaoyu-li/DL4TP.
