InfCode: Adversarial Iterative Refinement of Tests and Patches for Reliable Software Issue Resolution

KeFan Li; Mengfei Wang; Hengzhi Zhang; Zhichao Li; Yuan Yuan; Mu Li; Xiang Gao; Hailong Sun; Chunming Hu; Weifeng Lv

TL;DR

InfCode tackles the challenge of reliably resolving repository-level software issues with LLMs by coupling two adversarial agents that iteratively strengthen tests and code patches. A Test Patch Generator and a Code Patch Generator engage in adversarial refinement, guided by a Selector that picks the most reliable patch within a containerized, tool-rich environment. Experiments on SWE-bench Lite and SWE-bench Verified show state-of-the-art performance, reaching 79.4% on SWE-bench Verified with Claude 4.5 Sonnet and demonstrating robustness under adversarial test strengthening. The work demonstrates a principled approach to repository-level repair with practical implications for automated debugging in real-world codebases.

Abstract

Large language models have advanced software engineering automation, yet resolving real-world software issues remains difficult because it requires repository-level reasoning, accurate diagnostics, and strong verification signals. Existing agent-based and pipeline-based methods often rely on insufficient tests, which can lead to patches that satisfy verification but fail to fix the underlying defect. We present InfCode, an adversarial multi-agent framework for automated repository-level issue resolution. InfCode iteratively refines both tests and patches through adversarial interaction between a Test Patch Generator and a Code Patch Generator, while a Selector agent identifies the most reliable fix. The framework runs inside a containerized environment that supports realistic repository inspection, modification, and validation. Experiments on SWE-bench Lite and SWE-bench Verified using models such as DeepSeek-V3 and Claude 4.5 Sonnet show that InfCode consistently outperforms strong baselines. It achieves 79.4% on SWE-bench Verified, establishing a new state of the art. We have released InfCode as an open-source project at https://github.com/Tokfinity/InfCode.
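The interaction described above can be illustrated with a toy sketch. This is not the InfCode implementation: tests are modeled as (input, expected) pairs, patches as plain functions, and the names `strengthen_tests`, `refine_patch`, `select`, and `adversarial_loop` are hypothetical stand-ins for the Test Patch Generator, Code Patch Generator, and Selector, which in the paper are LLM agents working on a real repository.

```python
from typing import Callable, List, Optional, Tuple

Test = Tuple[int, int]        # (input, expected output); a toy stand-in for a test patch
Patch = Callable[[int], int]  # a toy stand-in for a code patch


def passes(patch: Patch, tests: List[Test]) -> bool:
    """Return True if the patch satisfies every test."""
    return all(patch(x) == want for x, want in tests)


def strengthen_tests(patch: Patch, tests: List[Test], pool: List[Test]) -> List[Test]:
    """Hypothetical test generator: add one pooled test that the current patch fails."""
    for t in pool:
        if t not in tests and patch(t[0]) != t[1]:
            return tests + [t]
    return tests  # nothing in the pool breaks this patch


def refine_patch(tests: List[Test], candidates: List[Patch]) -> Optional[Patch]:
    """Hypothetical patch generator: first candidate that passes all current tests."""
    for c in candidates:
        if passes(c, tests):
            return c
    return None


def select(patches: List[Patch], tests: List[Test]) -> Patch:
    """Hypothetical selector: the patch that passes the most tests."""
    return max(patches, key=lambda p: sum(p(x) == want for x, want in tests))


def adversarial_loop(candidates: List[Patch], pool: List[Test], max_rounds: int = 5):
    tests = [pool[0]]          # start from a single weak test
    history: List[Patch] = []  # every patch proposed so far
    for _ in range(max_rounds):
        patch = refine_patch(tests, candidates)
        if patch is None:
            break
        history.append(patch)
        stronger = strengthen_tests(patch, tests, pool)
        if stronger == tests:  # the test side can no longer break the patch
            break
        tests = stronger
    best = select(history, tests) if history else None
    return best, tests


# Toy issue: the function should return the absolute value of its input.
def identity(x: int) -> int:
    return x


def negate(x: int) -> int:
    return -x


best, final_tests = adversarial_loop(
    candidates=[identity, negate, abs],
    pool=[(2, 2), (-3, 3), (0, 0)],
)
```

In this toy run, the weak initial test accepts `identity`, the test side then adds a negative-input case that breaks it, and the patch side has to move to `abs`. The loop stops once no pooled test can break the current patch, which mirrors the idea that stronger tests force more reliable patches.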
