Large Language Models are Pattern Matchers: Editing Semi-Structured and Structured Documents with ChatGPT
Irene Weber
TL;DR
The paper investigates whether large language models can edit structured and semi-structured documents with minimal effort. It reports two qualitative case studies with ChatGPT: restructuring a LaTeX table and converting RIS records to OPUS XML. The results suggest that explicit structuring in prompts may improve an LLM’s ability to understand a task, and they show notable pattern-matching skills in cross-format transformations, while also revealing that outputs vary across prompts. The findings point to practical ways of integrating LLMs into document-processing workflows, potentially reducing development costs, but broader experiments are needed to generalize beyond the tasks and models studied. Overall, the work contributes insights into prompt design for structured editing and raises questions about the stability and robustness of LLM-driven document processing.
Abstract
Large Language Models (LLMs) offer numerous applications, the full extent of which is not yet understood. This paper investigates whether LLMs can be used to edit structured and semi-structured documents with minimal effort. Using a qualitative research approach, we conduct two case studies with ChatGPT and thoroughly analyze the results. Our experiments indicate that LLMs can effectively edit structured and semi-structured documents when provided with basic, straightforward prompts. ChatGPT demonstrates a strong ability to recognize and process the structure of annotated documents. This suggests that explicitly structuring tasks and data in prompts might enhance an LLM's ability to understand and solve tasks. The experiments also reveal impressive pattern matching skills in ChatGPT. This observation deserves further investigation, as it may contribute to understanding the processes that lead to hallucinations in LLMs.
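To make the idea of "explicitly structuring tasks and data in prompts" concrete, the following is a minimal sketch of how such a prompt might be assembled. It is an illustration only, not the prompts used in the case studies: the function name `build_structured_prompt`, the section labels, the sample RIS record, and the target XML are all hypothetical, and the XML shown is a placeholder rather than the actual OPUS schema.

```python
def build_structured_prompt(task: str, example_input: str,
                            example_output: str, new_input: str) -> str:
    """Assemble a prompt whose parts are wrapped in explicit, labelled delimiters.

    The section labels are an assumption made for this sketch; any consistent
    annotation scheme would serve the same purpose.
    """
    sections = [
        ("task", task),
        ("example_input", example_input),
        ("example_output", example_output),
        ("input", new_input),
    ]
    # Each section becomes an opening label, the stripped body, and a closing label.
    return "\n".join(
        f"<{name}>\n{body.strip()}\n</{name}>" for name, body in sections
    )


# Hypothetical RIS record used as the worked example in the prompt.
ris_example = """
TY  - JOUR
AU  - Doe, Jane
TI  - An Example Title
PY  - 2020
ER  -
"""

# Placeholder target format; NOT the real OPUS XML schema.
xml_example = """
<record type="article">
  <author>Doe, Jane</author>
  <title>An Example Title</title>
  <year>2020</year>
</record>
"""

# Hypothetical new record that the model is asked to convert.
ris_new = """
TY  - JOUR
AU  - Roe, Richard
TI  - Another Example Title
PY  - 2021
ER  -
"""

prompt = build_structured_prompt(
    task="Convert the RIS record in <input> to XML, following the example.",
    example_input=ris_example,
    example_output=xml_example,
    new_input=ris_new,
)
print(prompt)
```

The resulting string can be sent to a chat model as a single user message. Keeping the task description, the worked example, and the new record in separately labelled blocks gives the model an explicit input-output pattern to match.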
