Amplifying human performance in combinatorial competitive programming
Petar Veličković, Alex Vitvitskyi, Larisa Markeeva, Borja Ibarz, Lars Buesing, Matej Balog, Alexander Novikov
TL;DR
The paper investigates how to boost human performance in combinatorial competitive programming by pairing human-written heuristic backbones with scoring functions evolved by FunSearch. It validates the approach on Hash Code tasks and a held-out AtCoder contest, showing that evolved scoring functions substantially improve on the backbone's baseline score and outperform the top human teams in several rounds. The findings suggest that human-AI collaboration is a practical path to elite results on NP-hard optimization problems drawn from real contests. The approach remains effective under limited compute (two-hour evolution) and generalizes to substantially different problem settings, highlighting its potential for broader use in algorithmic optimization.
Abstract
Recent years have seen a significant surge in complex AI systems for competitive programming, capable of performing at admirable levels against human competitors. While steady progress has been made, the highest percentiles remain out of reach for these methods on standard competition platforms such as Codeforces. Here we instead focus on combinatorial competitive programming, where the target is to find as-good-as-possible solutions to otherwise computationally intractable problems, over specific given inputs. We hypothesise that this scenario offers a unique testbed for human-AI synergy, as human programmers can write a backbone of a heuristic solution, after which AI can be used to optimise the scoring function used by the heuristic. We deploy our approach on previous iterations of Hash Code, a global team programming competition inspired by NP-hard software engineering problems at Google, and we leverage FunSearch to evolve our scoring functions. Our evolved solutions significantly improve the attained scores from their baseline, successfully breaking into the top percentile on all previous Hash Code online qualification rounds, and outperforming the top human teams on several. Our method is also performant on an optimisation problem that featured in a recent held-out AtCoder contest.
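The split between a human-written backbone and a searchable scoring function can be illustrated with a toy sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the problem (a small knapsack instance), the names `greedy_backbone`, `make_score` and `evolve`, and the parametric score family. In particular, the random perturbation of two exponents is only a stand-in for the search step; it is not FunSearch, which evolves programs with a language model.

```python
import random
from typing import Callable, List, Tuple

Item = Tuple[int, int]  # hypothetical (value, weight) pair

# Toy instance, invented for illustration only.
ITEMS: List[Item] = [(60, 10), (100, 20), (120, 30), (30, 3), (45, 9)]
CAPACITY = 50


def greedy_backbone(items: List[Item], capacity: int,
                    score: Callable[[int, int], float]) -> int:
    """Fixed backbone: take items in descending score order while they fit."""
    total_value = 0
    remaining = capacity
    ranked = sorted(items, key=lambda it: score(it[0], it[1]), reverse=True)
    for value, weight in ranked:
        if weight <= remaining:
            remaining -= weight
            total_value += value
    return total_value


def make_score(a: float, b: float) -> Callable[[int, int], float]:
    """Assumed parametric scoring family: value**a / weight**b."""
    return lambda value, weight: (value ** a) / (weight ** b)


def evolve(items: List[Item], capacity: int, steps: int = 200, seed: int = 0):
    """Stand-in search: perturb the exponents, keep only strict improvements."""
    rng = random.Random(seed)
    best_params = (1.0, 0.0)  # baseline scoring: rank by value alone
    best_total = greedy_backbone(items, capacity, make_score(*best_params))
    for _ in range(steps):
        candidate = (best_params[0] + rng.gauss(0, 0.3),
                     best_params[1] + rng.gauss(0, 0.3))
        total = greedy_backbone(items, capacity, make_score(*candidate))
        if total > best_total:
            best_params, best_total = candidate, total
    return best_params, best_total


if __name__ == "__main__":
    baseline = greedy_backbone(ITEMS, CAPACITY, make_score(1.0, 0.0))
    params, evolved = evolve(ITEMS, CAPACITY)
    print("baseline:", baseline, "evolved:", evolved, "params:", params)
```

The backbone never changes during the search; only the function that ranks candidate moves does. Because a candidate is accepted only when it strictly improves the total, the evolved score can never fall below the baseline on the instance it was tuned on.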
