Automated Repair of AI Code with Large Language Models and Formal Verification
Yiannis Charalambous, Edoardo Manino, Lucas C. Cordeiro
TL;DR
This work tackles safety guarantees for AI software by combining formal verification with automated repair. It expands NeuroCodeBench to about $81k$ C-based neural network implementations using Mull mutations, and uses ESBMC to label memory-safety vulnerabilities, treating it as an oracle for repair. Large language models are then employed to repair the AI code, with a comprehensive study of prompts, contexts, and iterative strategies (Forward History, LSO, etc.), revealing that carefully designed prompts can achieve meaningful repair rates, with iterative APR yielding around a $7\%$ improvement over non-iterative approaches. The findings highlight the potential and current limits of integrating formal verification and LLM-based program repair for AI code, and point to future work on broader LLMs and cost-conscious iterative strategies. The approach provides a practical pathway toward safer, verifiable AI software by tightly coupling vulnerability detection with automated, model-guided repair under formal guarantees.
Abstract
The next generation of AI systems requires strong safety guarantees. This report looks at the software implementation of neural networks and related memory safety properties, including NULL pointer deference, out-of-bound access, double-free, and memory leaks. Our goal is to detect these vulnerabilities, and automatically repair them with the help of large language models. To this end, we first expand the size of NeuroCodeBench, an existing dataset of neural network code, to about 81k programs via an automated process of program mutation. Then, we verify the memory safety of the mutated neural network implementations with ESBMC, a state-of-the-art software verifier. Whenever ESBMC spots a vulnerability, we invoke a large language model to repair the source code. For the latest task, we compare the performance of various state-of-the-art prompt engineering techniques, and an iterative approach that repeatedly calls the large language model.
