Code Documentation and Analysis to Secure Software Development
Paul Attie, Anas Obeidat, Nathaniel Oh, Ian Yelle
TL;DR
CoDAT introduces an IntelliJ IDEA plugin that maintains alignment between code, code sketches, and documentation by linking comments to code blocks and flagging out-of-date or semantically inconsistent documentation. It proposes a hierarchical, multi-level documentation model (functional specs, code sketches, inline comments) and uses a large language model to verify consistency across levels, aiming to support stepwise code refinement. The architecture relies on modular layers and data structures such as CommentEntity and CommentNode to manage hierarchical documentation, with change-flagging and IDE-integrated navigation to streamline code reviews. A worked example with a document search engine demonstrates how CoDAT tracks updates across levels and surfaces documentation issues, while future work targets stronger LLM integration, advanced analysis, and deeper source-control integration to enhance maintainability and developer productivity.
Abstract
We present the Code Documentation and Analysis Tool (CoDAT). CoDAT is designed to maintain consistency between the various levels of code documentation: for example, if a line in a code sketch is changed, the comment that documents the corresponding code is also changed. That is, comments are linked and updated so as to remain internally consistent and also consistent with the code. By flagging "out of date" comments, CoDAT alerts the developer to keep the documentation up to date. We use a large language model to check the semantic consistency between a fragment of code and the comments that describe it, so we flag semantic inconsistencies as well as out-of-date comments. This helps programmers write code that correctly implements a code sketch, and so provides machine support for a stepwise refinement approach, starting with a code sketch and proceeding down to code through one or more refinement iterations. CoDAT is implemented in the IntelliJ IDEA IDE, where we use the Code Insight daemon package alongside a custom regular expression algorithm to mark tagged comments whose corresponding code blocks have changed. CoDAT's backend is structurally decentralized to allow a distributed ledger framework for code consistency and architectural compilation tracking.
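To make the change-flagging idea concrete, the following is a minimal sketch in plain Java and not CoDAT's actual implementation. It assumes a hypothetical tag syntax, `// @doc(<id>) <text>`, treats the non-blank lines up to the next tagged comment as that comment's code block, and reports a tag as stale when its block changed between two versions of a file while the comment text did not. The class and method names are illustrative, and the sketch does not use the IntelliJ Code Insight daemon.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class StaleCommentSketch {
    // Hypothetical tag syntax (an assumption, not CoDAT's format):
    //   // @doc(<id>) <comment text>
    private static final Pattern TAG =
            Pattern.compile("^\\s*//\\s*@doc\\((\\w+)\\)\\s*(.*)$");

    /** Maps each tag id to a pair {comment text, whitespace-trimmed code block}. */
    static Map<String, String[]> index(String source) {
        Map<String, String[]> result = new LinkedHashMap<>();
        String currentId = null;
        String currentText = null;
        StringBuilder block = new StringBuilder();
        for (String line : source.split("\n")) {
            Matcher m = TAG.matcher(line);
            if (m.matches()) {
                // A new tagged comment closes the previous block, if any.
                if (currentId != null) {
                    result.put(currentId, new String[] {currentText, block.toString()});
                }
                currentId = m.group(1);
                currentText = m.group(2).trim();
                block.setLength(0);
            } else if (currentId != null && !line.trim().isEmpty()) {
                // Trim each line so pure indentation changes are ignored.
                block.append(line.trim()).append('\n');
            }
        }
        if (currentId != null) {
            result.put(currentId, new String[] {currentText, block.toString()});
        }
        return result;
    }

    /** Returns ids whose code block changed while the comment text stayed the same. */
    static List<String> staleTags(String oldSource, String newSource) {
        Map<String, String[]> before = index(oldSource);
        Map<String, String[]> after = index(newSource);
        List<String> stale = new ArrayList<>();
        for (Map.Entry<String, String[]> entry : after.entrySet()) {
            String[] old = before.get(entry.getKey());
            if (old == null) {
                continue; // Tag is new in this version; nothing to compare against.
            }
            boolean commentUnchanged = old[0].equals(entry.getValue()[0]);
            boolean codeChanged = !old[1].equals(entry.getValue()[1]);
            if (commentUnchanged && codeChanged) {
                stale.add(entry.getKey());
            }
        }
        return stale;
    }
}
```

Calling `StaleCommentSketch.staleTags(oldText, newText)` on two revisions of a file yields the tag ids a reviewer would want to revisit. A purely textual comparison like this can only detect that code changed under an unchanged comment; deciding whether the comment still describes the code is the semantic check that the abstract assigns to the large language model.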
