Natural Language Outlines for Code: Literate Programming in the LLM Era

Kensen Shi, Deniz Altınbüken, Saswat Anand, Mihai Christodorescu, Katja Grünwedel, Alexa Koenings, Sai Naidu, Anurag Pathak, Marc Rasi, Fredde Ribeiro, Brandon Ruffin, Siddhant Sanyam, Maxim Tabachnyk, Sara Toth, Roy Tu, Tobias Welp, Pengcheng Yin, Manzil Zaheer, Satish Chandra, Charles Sutton

TL;DR

The paper introduces natural language outlines (NL outlines) as a literate-programming-inspired modality for AI-assisted software development, where prose statements aligned to code partition and summarize implementation while preserving the code as the source of truth. It details three outline-generation strategies (Interleaved Generation, Constrained Generation, Line Number Infilling), discusses bidirectional synchronization between code and NL, and explores a wide range of use cases from understanding to maintenance and code review. Through human studies and two case studies (Android security and code review), the authors demonstrate that modern LLMs can generate high-quality NL outlines, with substantial user-perceived usefulness and concrete accuracy metrics. The work also addresses practical deployment considerations, such as verification, storage formats, and UX design, and outlines future directions for scaling NL outlines in real-world tooling.

Abstract

We propose using natural language outlines as a novel modality and interaction surface for providing AI assistance to developers throughout the software development process. An NL outline for a code function comprises multiple statements written in concise prose, which partition the code and summarize its main ideas in the style of literate programming. Crucially, we find that modern LLMs can generate accurate and high-quality NL outlines in practice. Moreover, NL outlines enable a bidirectional sync between code and NL, where a developer can change either code or NL and have the LLM automatically update the other. We discuss many use cases for NL outlines: they can accelerate understanding and navigation of code and diffs, simplify code maintenance, augment code search, steer code generation, and more. We then propose and compare multiple LLM prompting techniques for generating outlines and ask professional developers to judge outline quality. Finally, we present two case studies applying NL outlines toward code review and malware detection.
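
To make the idea concrete, here is a minimal sketch of what an interleaved NL outline could look like on a small Python function. The function `top_words`, the helper `outline_statements`, and the `# *` comment marker are all hypothetical choices made for this illustration; they are not taken from the paper, which may render and store outlines differently.

```python
from collections import Counter


def top_words(text: str, k: int = 3) -> list[tuple[str, int]]:
    """Return the k most frequent words in text (hypothetical example function)."""
    # * Normalize the text to lowercase and split it into words.
    words = text.lower().split()
    # * Strip surrounding punctuation and drop empty tokens.
    cleaned = [w.strip(".,!?;:") for w in words]
    cleaned = [w for w in cleaned if w]
    # * Count occurrences and return the k most frequent words.
    return Counter(cleaned).most_common(k)


def outline_statements(source: str, marker: str = "# *") -> list[str]:
    """Collect outline statements from source text.

    Assumes outline statements are stored as comments starting with `marker`;
    this storage convention is an assumption of this sketch.
    """
    statements = []
    for line in source.splitlines():
        stripped = line.strip()
        if stripped.startswith(marker):
            statements.append(stripped[len(marker):].strip())
    return statements
```

Each outline statement sits directly above the code it summarizes, so the statements partition the function body while the code remains the source of truth. Passing the function's source text to `outline_statements` yields the outline on its own, roughly corresponding to viewing an outline without the code.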

Paper Structure (22 sections, 19 figures, 1 table)

This paper contains 22 sections, 19 figures, 1 table.

Figures (19)

  • Figure 1: NL outlines can enable a huge variety of AI-based developer assistance features. Code Understanding: An LLM can generate an NL outline for a code function, providing a high-level overview (as in Figure 2c) that is aligned with the code and more interpretable than a paragraph summary. Code Maintenance: After the developer begins to edit the code or the outline, the LLM can complete the edit to keep everything in sync (as in Figure 4), offering both automatic documentation and programming via NL. Developer Experience: NL outlines can accelerate code browsing and navigation, enable code search from natural language and contextualize the results, allow users to preview and steer code generation, and summarize changes during code review.
  • Figure 2: An example NL outline for a Python function (a), either displayed without code (b) or interleaved with code (c).
  • Figure 3: A mockup of how NL outlines could be used in an IDE, with real outlines predicted by Gemini 1.5 Flash. (1) NL outlines for each function can be displayed in the list of symbols. Clicking an outline statement can move the cursor to the corresponding code location. (2) In the main editor, NL outlines can be shown interleaved with the code, either as non-code annotations or as code comments. (3) NL outlines can offer intuitive code folding. (4) NL outlines can help navigation, e.g., by emphasizing the current outline statement or via shortcuts to jump to the previous or next outline statement.
  • Figure 4: Example usage of a Finish Changes prototype feature using Gemini 1.5 Flash. The user can concisely specify the key idea of a change (by editing the code as in Figure 4a or the outline as in Figure 4b) and let the LLM finish the job by propagating the changes. The appendix contains another, more complex example.
  • Figure 5: Quality survey results, showing the frequency of answers to survey questions for outlines produced by a given LLM and generation technique.
  • ...and 14 more figures