Autonomous Legacy Web Application Upgrades Using a Multi-Agent System
Valtteri Ala-Salmi, Zeeshan Rasheed, Abdul Malik Sami, Zheying Zhang, Kai-Kristian Kemell, Jussi Rasku, Shahbaz Siddeeq, Mika Saari, Pekka Abrahamsson
TL;DR
Legacy web applications often depend on deprecated components that threaten security and reliability. The paper introduces a CodePori-inspired multi-agent pipeline that uses LLMs to autonomously upgrade legacy CakePHP applications. The pipeline divides work among manager, executor, verifier, and finalizer agents, and is evaluated with Zero-Shot Learning (ZSL) and One-Shot Learning (OSL) prompts on five view files. Results show that the system maintains task context and improves outcomes in some cases, but on complex updates it often only matches or falls behind standalone ZSL/OSL prompting, which underscores that performance is task-dependent and that further refinement is needed. The work provides a practical foundation for AI-assisted legacy code updates, with publicly available evaluation data and a GitHub implementation to support future improvements and broader benchmarking.
Abstract
The use of Large Language Models (LLMs) for autonomous code generation is gaining attention in emerging technologies. As LLM capabilities expand, they offer new possibilities such as code refactoring, security enhancements, and legacy application upgrades. Many outdated web applications pose security and reliability challenges, yet companies continue using them due to the complexity and cost of upgrades. To address this, we propose an LLM-based multi-agent system that autonomously upgrades legacy web applications to the latest versions. The system distributes tasks across multiple phases, updating all relevant files. To evaluate its effectiveness, we employed Zero-Shot Learning (ZSL) and One-Shot Learning (OSL) prompts, applying identical instructions in both cases. The evaluation involved updating view files and measuring the number and types of errors in the output. For complex tasks, we counted the successfully met requirements. The experiments compared the proposed system with standalone LLM execution, and each experiment was repeated multiple times to account for stochastic behavior. Results indicate that our system maintains context across tasks and agents, improving solution quality over the base model in some cases. This study provides a foundation for future model implementations in legacy code updates. Additionally, the findings highlight LLMs' ability to update small outdated files with high precision, even with basic prompts. The source code is publicly available on GitHub: https://github.com/alasalm1/Multi-agent-pipeline.
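To make the division of work among manager, executor, verifier, and finalizer roles more concrete, the following is a minimal Python sketch of such a phased pipeline. It is an illustration under stated assumptions, not the authors' implementation: the function names, the `Task` structure, the retry loop, the token-based verifier, and the `old_helper(` / `new_helper(` strings are all hypothetical, and the LLM is replaced by a deterministic stub so the example runs offline.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

# Hypothetical stand-in for an LLM call: prompt text in, generated text out.
LLM = Callable[[str], str]


@dataclass
class Task:
    file_name: str
    instruction: str
    source: str
    result: str = ""
    approved: bool = False
    history: List[str] = field(default_factory=list)


def manager(files: Dict[str, str], instruction: str) -> List[Task]:
    """Split the upgrade job into one task per file."""
    return [Task(name, instruction, src) for name, src in files.items()]


def executor(task: Task, llm: LLM) -> Task:
    """Ask the (stubbed) LLM to rewrite the file according to the instruction."""
    prompt = f"{task.instruction}\n\n# File: {task.file_name}\n{task.source}"
    task.result = llm(prompt)
    task.history.append("executor")
    return task


def verifier(task: Task, deprecated: List[str]) -> Task:
    """Assumed check: approve only if no deprecated token survives."""
    task.approved = not any(tok in task.result for tok in deprecated)
    task.history.append("verifier")
    return task


def finalizer(tasks: List[Task]) -> Dict[str, str]:
    """Collect approved outputs into the final set of updated files."""
    return {t.file_name: t.result for t in tasks if t.approved}


def run_pipeline(files, instruction, llm, deprecated, max_retries=2):
    tasks = manager(files, instruction)
    for task in tasks:
        # Retry the execute/verify phases until approved or retries run out.
        for _ in range(max_retries + 1):
            verifier(executor(task, llm), deprecated)
            if task.approved:
                break
    return finalizer(tasks)


def fake_llm(prompt: str) -> str:
    """Deterministic stub: extracts the file body and swaps a made-up helper."""
    body = prompt.split("\n", 3)[3]
    return body.replace("old_helper(", "new_helper(")


files = {"index.ctp": "<?= old_helper('x') ?>"}
updated = run_pipeline(files, "Replace deprecated helpers.", fake_llm, ["old_helper("])
print(updated)
```

In this sketch the task object carries the instruction, the source, and a history of phases, which is one simple way to keep context across agents. A real system would replace `fake_llm` with an actual model call and would likely use a richer verification step than a token search.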
