Leveraging LLMs to support co-evolution between definitions and instances of textual DSLs
Weixing Zhang, Regina Hebig, Daniel Strüber
TL;DR
The paper investigates using LLMs to co-evolve grammar definitions and textual DSL instances, focusing on preserving auxiliary information such as comments and formatting. It implements two LLM-based pipelines (Claude-3.5 and GPT-4o) and evaluates them on seven Xtext-based DSL case studies, finding promising results for small instances and notable scalability challenges for larger grammars and more substantial changes. The findings show the potential of LLMs for textual DSL evolution while highlighting issues such as output truncation, non-evolution copying, and variable performance across models. The work points to future directions in prompt design, scalable migration approaches, and broader DSL coverage.
Abstract
Software languages evolve over time for various reasons, such as the addition of new features. When a language's grammar definition evolves, textual instances that originally conformed to the grammar become outdated. For DSLs in a model-driven engineering context, a plethora of techniques exists to co-evolve models with the evolving metamodel. However, these techniques are not geared to support DSLs with a textual syntax: applying them to textual language definitions and instances may lead to the loss of information from the original instances, such as comments and layout information, which are valuable for software comprehension and maintenance. This study explores the potential of Large Language Model (LLM)-based solutions for grammar and instance co-evolution, with attention to their ability to preserve auxiliary information when directly processing textual instances. We applied two advanced language models, Claude-3.5 and GPT-4o, in experiments across seven case languages to evaluate the feasibility and limitations of this approach. Our results indicate that the considered LLMs migrate textual instances well in small-scale cases with limited instance size, which are representative of a subset of cases encountered in practice. In addition, we observe significant challenges in scaling LLM-based solutions to larger instances, leading to insights that can inform future research.
