UnitTenX: Generating Tests for Legacy Packages with AI Agents Powered by Formal Verification
Yiannis Charalambous, Claudionor N. Coelho, Luis Lamb, Lucas C. Cordeiro
TL;DR
The paper addresses the challenge of testing and maintaining legacy software by introducing UnitTenX, an AI-agent framework that integrates formal verification with LLM-driven test generation to automatically produce unit tests for undocumented C interfaces. By combining ESBMC-based symbolic analysis with transformer-based code generation, UnitTenX identifies sensitization conditions and edge cases to maximize code coverage and reveal crash states while generating regression-oriented tests. Experimental evaluation on a real-world DNS server demonstrates substantial coverage improvements, robust error handling, and a productive reflection loop that incrementally enhances test quality. The approach has practical impact for software maintenance and modernization, offering a path to higher reliability and clearer documentation for legacy codebases, though broader validation across diverse systems is needed.
Abstract
This paper introduces UnitTenX, a state-of-the-art open-source AI multi-agent system designed to generate unit tests for legacy code, enhancing test coverage and critical value testing. UnitTenX leverages a combination of AI agents, formal methods, and Large Language Models (LLMs) to automate test generation, addressing the challenges posed by complex and legacy codebases. Despite the limitations of LLMs in bug detection, UnitTenX offers a robust framework for improving software reliability and maintainability. Our results demonstrate the effectiveness of this approach in generating high-quality tests and identifying potential issues. Additionally, our approach enhances the readability and documentation of legacy code.
