LLM-based Automated Architecture View Generation: Where Are We Now?

Miryala Sathvika; Rudra Dhar; Karthik Vaidhyanathan

Abstract

Architecture views are essential for software architecture documentation, yet creating them manually is labor-intensive and often leads to outdated artifacts. As systems grow in complexity, automatically generating views from source code becomes increasingly valuable. Goal: We empirically evaluate the ability of LLMs and agentic approaches to generate architecture views from source code. Method: We analyze 340 open-source repositories across 13 experimental configurations using 3 LLMs with 3 prompting techniques and 2 agentic approaches, yielding 4,137 generated views. We evaluate the generated views against the ground truth using automated metrics complemented by human evaluation. Results: Prompting strategies offer marginal improvements: few-shot prompting reduces clarity failures by 9.2% compared to zero-shot baselines. The custom agentic approach consistently outperforms the general-purpose agent, achieving the best clarity (22.6% failure rate) and level-of-detail success (50%). Conclusions: LLM and agentic approaches can generate syntactically valid architecture views. However, they consistently exhibit granularity mismatches, operating at the code level rather than at the level of architectural abstractions. This suggests that human expertise is still needed, positioning LLMs and agents as assistive tools rather than autonomous architects.

Paper Structure (20 sections, 6 figures, 4 tables)

This paper contains 20 sections, 6 figures, 4 tables.

Introduction
Background
Prompting Techniques
LLMs and Agents
Study Design
Goal
Research Questions
Pilot Study
Key Findings from the Pilot Study
Experiment Workflow
Results
RQ3: How does performance vary across architectural concerns and quality attributes?
Discussion
Key Findings
Implications for Researchers
...and 5 more sections

Figures (6)

Figure 1: Study Diagram
Figure 2: Overall strategy comparison across SSIM, LLM Composite, Clarity, and Completeness metrics.
Figure 3: LLM Quality across different notations
Figure 4: Granularity trends for AV, FS, GPA. LLM Quality is the average of all LLM-evaluated metrics.
Figure 5: Comparison of model performance across quality attributes and architectural concerns.
...and 1 more figures
