Accelerating mathematical research with language models: A case study of an interaction with GPT-5-Pro on a convex analysis problem
Adil Salim
TL;DR
This work addresses a convex-analysis problem arising in optimal transport by proving a local, first-order expansion for the biconjugate under a smooth perturbation: for a strictly convex $\phi$ with $\nabla^2\phi(x)\succ0$ and a compactly supported $h$, there exists a pointwise threshold $t_x>0$ such that, for all $|t|<t_x$, $(\phi+th)^{**}(x)=(\phi+th)(x)$ and $\nabla(\phi+th)^{**}(x)=\nabla\phi(x)+t\nabla h(x)$. The author documents the collaboration with GPT-5-pro, highlighting both the accelerated progress and the need for careful supervision, and discusses the limitations of current language-model collaborators in performing rigorous mathematics, within the scope of this single case study. The work serves as a qualitative case study of AI-assisted mathematical reasoning, illustrating how local convex-analytic arguments can be stabilized and validated with human oversight. Overall, it demonstrates the potential of LLMs to contribute to mathematical research while underscoring the need for robust evaluation frameworks and transparent propagation of errors. The results have implications for future AI-assisted proofs in convex analysis and optimal transport, particularly for velocity field computations in pushforward schemes.
Abstract
Recent progress in large language models has made them increasingly capable research assistants in mathematics. Yet, as their reasoning abilities improve, evaluating their mathematical competence becomes increasingly challenging. The problems used for assessment must be neither too easy nor too difficult, model performance can no longer be summarized by a single numerical score, and meaningful evaluation requires expert oversight. In this work, we study an interaction between the author and a large language model in proving a lemma from convex optimization. Specifically, we establish a Taylor expansion for the gradient of the biconjugation operator (that is, the operator obtained by applying the Fenchel transform twice) around a strictly convex function, with assistance from GPT-5-pro, OpenAI's latest model. Beyond the mathematical result itself, whose novelty we do not claim with certainty, our main contribution lies in documenting the collaborative reasoning process. GPT-5-pro accelerated our progress by suggesting relevant research directions and by proving some intermediate results. However, its reasoning still required careful supervision, particularly to correct subtle mistakes. While limited to a single mathematical problem and a single language model, this experiment illustrates both the promise and the current limitations of large language models as mathematical collaborators.
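To make the statement concrete, the following is a minimal one-dimensional numerical sketch, not the proof technique used in this work. It assumes the illustrative choices $\phi(x)=x^2/2$ and a standard smooth bump $h$ supported on $[-1,1]$, and it approximates the biconjugate by the lower convex hull of grid samples on a bounded interval. The function names (`lower_hull`, `envelope`, `check`), the grid, and the interval are all hypothetical choices made for this illustration.

```python
import math

def phi(x):
    # Strictly convex base function with phi''(x) = 1 > 0 (illustrative choice).
    return 0.5 * x * x

def h(x):
    # Smooth bump supported on [-1, 1] (illustrative choice of perturbation).
    return math.exp(-1.0 / (1.0 - x * x)) if abs(x) < 1.0 else 0.0

def dh(x):
    # Derivative of the bump; it vanishes outside the support.
    return h(x) * (-2.0 * x) / (1.0 - x * x) ** 2 if abs(x) < 1.0 else 0.0

def lower_hull(xs, ys):
    # Lower part of Andrew's monotone chain: indices of the hull vertices.
    hull = []
    for i in range(len(xs)):
        while len(hull) >= 2:
            a, b = hull[-2], hull[-1]
            cross = (xs[b] - xs[a]) * (ys[i] - ys[a]) - (ys[b] - ys[a]) * (xs[i] - xs[a])
            if cross <= 0.0:
                hull.pop()  # b lies on or above the segment from a to i
            else:
                break
        hull.append(i)
    return hull

def envelope(xs, ys):
    # Convex envelope of the sampled function, evaluated at every grid point.
    hull = lower_hull(xs, ys)
    env = list(ys)
    for a, b in zip(hull, hull[1:]):
        for i in range(a + 1, b):
            w = (xs[i] - xs[a]) / (xs[b] - xs[a])
            env[i] = (1.0 - w) * ys[a] + w * ys[b]
    return env

def check(t, x0, n=1601, lo=-4.0, hi=4.0):
    # Compare the grid envelope of phi + t*h with phi + t*h itself near x0.
    xs = [lo + (hi - lo) * i / (n - 1) for i in range(n)]
    ys = [phi(x) + t * h(x) for x in xs]
    env = envelope(xs, ys)
    i = min(range(n), key=lambda k: abs(xs[k] - x0))  # nearest grid point
    dx = xs[1] - xs[0]
    gap = ys[i] - env[i]                              # (phi+th)(x) - envelope(x)
    grad_env = (env[i + 1] - env[i - 1]) / (2.0 * dx) # central difference
    grad_pred = xs[i] + t * dh(xs[i])                 # phi'(x) + t h'(x)
    return gap, grad_env, grad_pred

if __name__ == "__main__":
    for t, x0 in [(0.1, 0.5), (3.0, 0.0), (3.0, 3.0)]:
        gap, g_env, g_pred = check(t, x0)
        print(f"t={t}, x={x0}: gap={gap:.3e}, "
              f"envelope grad={g_env:.4f}, predicted grad={g_pred:.4f}")
```

Under these assumptions, for small $t$ the gap should vanish and the two gradients should agree up to discretization error. For a large perturbation such as $t=3$, the perturbed function is locally concave at $x=0$, so the envelope falls strictly below it there, while at a point far from the support of $h$ (for example $x=3$) the identity still holds. This is consistent with the threshold $t_x$ being pointwise rather than uniform.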
