Using Copilot Agent Mode to Automate Library Migration: A Quantitative Assessment

Aylton Almeida; Laerte Xavier; Marco Tulio Valente

TL;DR

This work evaluates an autonomous AI-driven approach to library migration by applying GitHub Copilot Agent Mode to upgrade SQLAlchemy from v1 to v2 across ten real-world Python projects. The study introduces Migration Coverage to quantify API usage transformations and assesses multiple dimensions, including test results, compilation success, and code quality. Results show strong migration-coverage performance (median 100%) but relatively low test-pass rates (median 39.75%), with substantial variance driven by runtime behavior and library compatibility issues. The findings suggest promising potential for agent-based automated migrations while highlighting the need for improved runtime understanding and possible human-in-the-loop interventions to preserve functional correctness.

Abstract

Keeping software systems up to date is essential to avoid technical debt, security vulnerabilities, and the rigidity typical of legacy systems. However, updating libraries and frameworks remains a time-consuming and error-prone process. Recent advances in Large Language Models (LLMs) and agentic coding systems offer new opportunities for automating such maintenance tasks. In this paper, we evaluate the update of a well-known Python library, SQLAlchemy, across a dataset of ten client applications. For this task, we use GitHub Copilot Agent Mode, an autonomous AI system capable of planning and executing multi-step migration workflows. To assess the effectiveness of the automated migration, we also introduce Migration Coverage, a metric that quantifies the proportion of API usage points correctly migrated. The results of our study show that the LLM agent was capable of migrating functionalities and API usages between SQLAlchemy versions (migration coverage: 100%, median), but failed to maintain the application functionality, leading to a low test-pass rate (39.75%, median).
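As an illustration of how a metric like Migration Coverage could be computed and summarized by its median, the following is a minimal sketch. It assumes the metric is simply the number of correctly migrated API usage points divided by the total number of usage points, expressed as a percentage; the abstract does not give the exact formula, so this is an assumption. The `ProjectResult` class, the project names, and all counts are hypothetical and are not data from the study.

```python
from dataclasses import dataclass
from statistics import median


@dataclass
class ProjectResult:
    name: str             # hypothetical project label
    usage_points: int     # API usage points found before migration
    migrated_points: int  # usage points judged correctly migrated


def migration_coverage(result: ProjectResult) -> float:
    """Percentage of API usage points correctly migrated (assumed definition)."""
    if result.usage_points == 0:
        raise ValueError("project has no API usage points to migrate")
    if not 0 <= result.migrated_points <= result.usage_points:
        raise ValueError("migrated_points must lie between 0 and usage_points")
    return 100.0 * result.migrated_points / result.usage_points


# Made-up counts for three hypothetical projects.
results = [
    ProjectResult("project_a", usage_points=40, migrated_points=40),
    ProjectResult("project_b", usage_points=25, migrated_points=20),
    ProjectResult("project_c", usage_points=10, migrated_points=10),
]

coverages = [migration_coverage(r) for r in results]
median_coverage = median(coverages)

for r, c in zip(results, coverages):
    print(f"{r.name}: {c:.2f}%")
print(f"median coverage: {median_coverage:.2f}%")
```

Reporting the median rather than the mean keeps a single poorly migrated project from dominating the summary, which matches how the abstract reports both migration coverage and test-pass rate.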
