Bridging the Prototype-Production Gap: A Multi-Agent System for Notebooks Transformation
Hanya Elhashemy, Youssef Lotfy, Yongjian Tang
TL;DR
The paper tackles the notebook-to-production gap in data science by introducing Codelevate, a multi-agent system (MAS) in which Architect, Developer, and Structure agents share a dependency tree to transform notebooks into modular, production-ready Python repositories. Central to the approach are Architecture Design Records (ADRs), which capture design decisions and guide refactoring under the DRY and SOLID principles. The pipeline runs through preprocessing, dependency analysis, ADR generation, refactoring, and modularization. The authors report that the autonomous transformation preserves semantic equivalence while improving code quality metrics, yielding scalable, maintainable, and CI-friendly notebook-based workflows. In practice, this enables automated, architecture-aware notebook migration for collaborative data science and production pipelines.
Abstract
The increasing adoption of Jupyter notebooks in data science and machine learning workflows has created a gap between exploratory code development and production-ready software systems. While notebooks excel at iterative development and visualization, they often do not follow sound software engineering principles, which makes their transition to production environments challenging. This paper presents Codelevate, a novel multi-agent system that automatically transforms Jupyter notebooks into well-structured, maintainable Python code repositories. Our system employs three specialized agents (Architect, Developer, and Structure) working in concert through a shared dependency tree to ensure architectural coherence and code quality. Our experimental results validate Codelevate's capability to bridge the prototype-to-production gap through autonomous code transformation, yielding quantifiable improvements in code quality metrics while preserving computational semantics.
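To make the idea of dependency analysis over notebook cells concrete, the following is a minimal sketch and not Codelevate's actual implementation, which the abstract does not describe. It assumes each cell is plain Python (no `%` magics) and treats a cell as depending on the most recent earlier cell that defined a name it uses. The function names `names_defined_and_used` and `build_cell_dependencies` are hypothetical, and the analysis is a deliberate over-approximation: it ignores local scopes, so function parameters can produce spurious edges.

```python
import ast


def names_defined_and_used(source: str) -> tuple[set[str], set[str]]:
    """Return (defined, used) names for one cell; scopes are ignored (assumption)."""
    defined: set[str] = set()
    used: set[str] = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Name):
            # Store context means the name is assigned; anything else is a read.
            if isinstance(node.ctx, ast.Store):
                defined.add(node.id)
            else:
                used.add(node.id)
        elif isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)):
            defined.add(node.name)
        elif isinstance(node, (ast.Import, ast.ImportFrom)):
            for alias in node.names:
                # "import a.b" binds "a"; "import a.b as c" binds "c".
                defined.add((alias.asname or alias.name).split(".")[0])
    return defined, used


def build_cell_dependencies(cells: list[str]) -> dict[int, set[int]]:
    """Map each cell index to the earlier cell indices it depends on."""
    last_definer: dict[str, int] = {}
    deps: dict[int, set[int]] = {}
    for index, source in enumerate(cells):
        defined, used = names_defined_and_used(source)
        # Only names bound by an earlier cell create an edge; builtins are skipped
        # implicitly because no cell defined them.
        deps[index] = {last_definer[name] for name in used if name in last_definer}
        for name in defined:
            last_definer[name] = index
    return deps


# Hypothetical four-cell notebook used only for illustration.
cells = [
    "import math",
    "radius = 2.0",
    "area = math.pi * radius ** 2",
    "print(area)",
]
deps = build_cell_dependencies(cells)
```

In this toy notebook the third cell depends on the first two and the fourth depends on the third, which is the kind of structure a refactoring step could use to decide which cells belong in the same module.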
