Code Semantic Zooming
Jinsheng Ba, Sverrir Thorgeirsson, Zhendong Su
TL;DR
The paper tackles the limited control and validation available in LLM-assisted code generation driven by natural language prompts. It proposes Code Semantic Zooming (CodeZoom), a high-level abstraction that uses a formal pseudocode grammar to enable multi-layer semantic zooming and iterative refinement of code. A VS Code extension implements bidirectional translation between pseudocode and source code, with an emphasis on controllable edits, structural constraints, and repeatable iterations; two real-world case studies demonstrate the approach. The work offers a path toward more interpretable, reproducible, and human-in-the-loop software development in the era of LLM-enabled programming.
Abstract
Recent advances in Large Language Models (LLMs) have introduced a new paradigm for software development, where source code is generated directly from natural language prompts. While this paradigm significantly boosts development productivity, building complex, real-world software systems remains challenging because natural language offers limited control over the generated code. Inspired by the historical evolution of programming languages toward higher levels of abstraction, we advocate for a high-level abstraction language that gives developers greater control over LLM-assisted code writing. To this end, we propose Code Semantic Zooming, a novel approach based on pseudocode that allows developers to iteratively explore, understand, and refine code across multiple layers of semantic abstraction. We implemented Code Semantic Zooming as a VS Code extension and demonstrated its effectiveness through two real-world case studies.
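To make the idea of multiple layers of semantic abstraction concrete, the following is a minimal illustrative sketch, not the authors' implementation. It assumes pseudocode can be modeled as a tree in which each node holds a one-line summary and its children hold the finer-grained steps. The names `PseudoNode` and `render`, and the example outline, are hypothetical and do not come from the paper or the extension.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class PseudoNode:
    """One pseudocode step; children refine it at the next zoom level (assumed model)."""
    summary: str
    children: List["PseudoNode"] = field(default_factory=list)


def render(node: PseudoNode, depth: int, indent: int = 0) -> List[str]:
    """Return the pseudocode lines visible when zoomed in `depth` levels below `node`."""
    pad = "  " * indent
    # At the zoom limit, or at a leaf, show only the summary line.
    if depth == 0 or not node.children:
        return [pad + node.summary]
    # Otherwise expand the step and render its children one level deeper.
    lines = [pad + node.summary + ":"]
    for child in node.children:
        lines.extend(render(child, depth - 1, indent + 1))
    return lines


# Hypothetical outline used only for illustration.
outline = PseudoNode(
    "sort the records",
    [
        PseudoNode("parse each line into a record"),
        PseudoNode(
            "order records by key",
            [
                PseudoNode("compare keys pairwise"),
                PseudoNode("swap out-of-order neighbors"),
            ],
        ),
    ],
)

if __name__ == "__main__":
    for level in range(3):
        print(f"zoom level {level}")
        print("\n".join(render(outline, level)))
```

In this sketch, zooming in only reveals more of the same tree, so a developer could in principle edit a step at whichever level expresses the intended change. How the actual tool maps such edits back to source code is not specified here.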
