Early Discoveries of Algorithmist I: Promise of Provable Algorithm Synthesis at Scale

Janardhan Kulkarni

Abstract

Designing algorithms with provable guarantees that also work well in practice remains difficult, requiring both mathematical reasoning and careful implementation. Existing approaches that bridge worst-case theory and empirical performance, such as beyond-worst-case analysis and data-driven algorithm selection, typically assume prior distributional knowledge or restrict attention to a fixed pool of algorithms. Recent progress in LLMs suggests a new possibility: provable algorithm synthesis on the fly. To study this, we built Algorithmist, an autonomous researcher agent on top of GitHub Copilot that runs a multi-agent research-and-review loop, with separate stages for idea generation, algorithm and proof development, proof-guided implementation, and review of proofs, code, and their alignment. We evaluate Algorithmist on research-level tasks in private data analysis and clustering. When asked to design practical methods that jointly satisfy privacy, approximation, and interpretability requirements, it produced provably sound and empirically effective algorithms, together with research-style writeups and audited implementations. It also found improved algorithms in some settings, explained principled barriers in others, and uncovered a subtle proof bug in prior published work. More broadly, our results suggest a new paradigm in which LLM systems generate research-paper-quality algorithmic artifacts tailored to each dataset and deployment setting. They also point to a proof-first code-synthesis paradigm, in which code is developed alongside a structured natural-language proof intermediate representation and kept aligned with it throughout synthesis.

Paper Structure (179 sections, 47 theorems, 122 equations, 3 figures, 8 tables, 1 algorithm)

Introduction
Our Results
LLM-Generated Proofs and Self-Improvement
Framework Overview
Researcher Agents
Reviewer Agents
Meta Reviewer
Aha Catalyst
Orchestrator
Dynamic Role Generation
Experimental Setup
Case Studies
Privacy Algorithm Design
Differential Privacy (DP)
Set Union and n-gram extraction problems
...and 164 more sections

Key Result

Proposition B.3

If $f$ has $\ell_2$-sensitivity $\Delta_2$, the mechanism $M(D) = f(D)+Z$, $Z\sim\mathcal{N}(0,\sigma^2 I)$, is $(\varepsilon,\delta)$-DP if and only if $\Phi\bigl(\frac{\Delta_2}{2\sigma}-\frac{\varepsilon\sigma}{\Delta_2}\bigr)-e^\varepsilon\Phi\bigl(-\frac{\Delta_2}{2\sigma}-\frac{\varepsilon\sigma}{\Delta_2}\bigr)\le\delta$, where $\Phi$ is the standard normal CDF.
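
A condition of this kind can be checked numerically. The Python sketch below is illustrative only and is not taken from the paper: it assumes the condition is the standard analytic Gaussian mechanism bound, $\Phi(\Delta_2/(2\sigma) - \varepsilon\sigma/\Delta_2) - e^\varepsilon\Phi(-\Delta_2/(2\sigma) - \varepsilon\sigma/\Delta_2) \le \delta$, with $\Phi$ the standard normal CDF. The function names (`std_normal_cdf`, `gaussian_delta`, `calibrate_sigma`) and the bisection tolerance are hypothetical choices made for illustration.

```python
import math


def std_normal_cdf(x: float) -> float:
    """Standard normal CDF, written with erfc for better accuracy in the tails."""
    return 0.5 * math.erfc(-x / math.sqrt(2.0))


def gaussian_delta(sigma: float, eps: float, sens: float) -> float:
    """Left-hand side of the assumed condition: the smallest delta for which
    Gaussian noise of scale sigma gives (eps, delta)-DP at l2-sensitivity sens."""
    a = sens / (2.0 * sigma)
    b = eps * sigma / sens
    return std_normal_cdf(a - b) - math.exp(eps) * std_normal_cdf(-a - b)


def calibrate_sigma(eps: float, delta: float, sens: float, rel_tol: float = 1e-9) -> float:
    """Approximate the smallest sigma meeting the condition by bisection.

    Relies on gaussian_delta being decreasing in sigma (assumption stated
    in the lead-in, not derived here).
    """
    lo, hi = 0.0, sens
    # Grow the upper bracket until it satisfies the condition.
    while gaussian_delta(hi, eps, sens) > delta:
        hi *= 2.0
    # Invariant: hi satisfies the condition, lo does not (or lo == 0).
    while hi - lo > rel_tol * hi:
        mid = 0.5 * (lo + hi)
        if gaussian_delta(mid, eps, sens) <= delta:
            hi = mid
        else:
            lo = mid
    return hi


if __name__ == "__main__":
    eps, delta, sens = 1.0, 1e-5, 1.0
    sigma = calibrate_sigma(eps, delta, sens)
    # Classical calibration sqrt(2 ln(1.25/delta)) * sens / eps, for comparison.
    classical = math.sqrt(2.0 * math.log(1.25 / delta)) * sens / eps
    print(f"analytic sigma = {sigma:.4f}, classical sigma = {classical:.4f}")
```

Bisection is used here only because it is simple to verify; any one-dimensional root finder would serve equally well under the same monotonicity assumption.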

Figures (3)

Figure 1: Proof-first algorithm synthesis and coding by autonomous research agents.
Figure 2: Algorithmist's architecture: multi-agent workflow for structured natural-language proofs and audited code.
Figure 3: Proof or mathematical analysis helps prune costly downstream experiments.

Theorems & Definitions (124)

Definition 5.1: Differential Privacy [dwork2014dp]
Remark 5.4
Remark 5.5
Claim A.1
Definition B.1: Differential Privacy [DMNS06, DR14]
Definition B.2: DPSU Problem
Proposition B.3: Gaussian Mechanism [BalleW18]
Theorem B.4: Proposition 5.1 of [gopi2020dpsu] is false
proof
Proposition B.5
...and 114 more
