LLM Based Multi-Agent Generation of Semi-structured Documents from Semantic Templates in the Public Administration Domain

Emanuele Musumeci, Michele Brienza, Vincenzo Suriani, Daniele Nardi, Domenico Daniele Bloisi

TL;DR

The paper addresses the challenge of generating semi-structured public administration documents by proposing a modular, multi-agent framework in which three specialized LLM-based agents (Semantics Identification, Information Retrieval, and Content Generation) collaboratively process a template-derived structure. A prompt-engineering-centric workflow accumulates a task prompt and drives section-by-section content generation with minimal human supervision, aided by template pre-processing and post-processing steps. Experimental evaluation using GPT-3.5 Turbo demonstrates that engineered prompts and agent coordination improve semantic fidelity and reduce hallucinations by enabling data retrieval from the accumulated prompt and guiding content assembly. The approach aligns with AI-as-a-Service trends, offering a scalable, extensible method for automating PA document generation while preserving structure and semantics across diverse document types.

Abstract

With the digitalization process of recent years, the creation and management of documents in various domains, particularly in Public Administration (PA), have become increasingly complex and diverse. This complexity arises from the need to handle a wide range of document types, often characterized by semi-structured forms. Semi-structured documents present a fixed set of data without a fixed format. As a consequence, a template-based solution cannot be used, as understanding a document requires the extraction of the data structure. The recent introduction of Large Language Models (LLMs) has enabled the creation of customized text output satisfying user requests. In this work, we propose a novel approach that combines LLMs with prompt engineering and multi-agent systems for generating new documents compliant with a desired structure. The main contribution of this work concerns replacing the commonly used manual prompting with a task description generated by semantic retrieval from an LLM. The potential of this approach is demonstrated through a series of experiments and case studies, showcasing its effectiveness in real-world PA scenarios.

Paper Structure (13 sections, 5 figures, 6 tables)

This paper contains 13 sections, 5 figures, 6 tables.

Figures (5)

  • Figure 1: The presented multi-agent architecture, in which LLMs are used with prompt engineering and in a multi-agent fashion to generate new documents.
  • Figure 2: Representation of our multi-agent architecture. The workflow for the generic generation step is highlighted by the bold black arrows.
  • Figure 3: Representation of a generation step instance. Notice how the accumulated prompt is enriched with the missing data provided by the user.
  • Figure 4: Agent responses throughout a single generation step, starting from the template text: "Your name". The figure represents a single generation step for a section of the document, showing an example of successful generation using the engineered prompt. In this case all information is retrieved from the original user prompt, so the user is not asked for any information.
  • Figure 5: Agent responses throughout a single generation step, starting from the template text: "Dear Mr./Ms.(Lastname)". In this case, user intervention is required to add missing information.
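
The generation step described in Figures 3 to 5 can be illustrated with a minimal sketch. It is not the authors' implementation: the three agents are replaced by hypothetical rule-based stand-ins (no LLM is called), and every name here (`AccumulatedPrompt`, `identify_semantics`, `retrieve_information`, `generate_content`, `generation_step`, the field names) is an assumption made for illustration. It only shows the control flow: identify what a template section needs, look it up in the accumulated prompt, ask the user when data is missing, then generate the section.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class AccumulatedPrompt:
    """Hypothetical container for the task prompt that grows over time."""
    facts: Dict[str, str] = field(default_factory=dict)


def identify_semantics(template_text: str) -> List[str]:
    """Stand-in for the Semantics Identification agent (rule-based, no LLM)."""
    text = template_text.lower()
    if "lastname" in text:
        return ["recipient_lastname"]
    if "name" in text:
        return ["sender_name"]
    return []


def retrieve_information(
    fields: List[str],
    prompt: AccumulatedPrompt,
    ask_user: Callable[[str], str],
) -> Dict[str, str]:
    """Stand-in for the Information Retrieval agent.

    Reads each field from the accumulated prompt; if a field is missing,
    asks the user and enriches the accumulated prompt with the answer.
    """
    values: Dict[str, str] = {}
    for name in fields:
        if name not in prompt.facts:
            prompt.facts[name] = ask_user(name)
        values[name] = prompt.facts[name]
    return values


def generate_content(template_text: str, values: Dict[str, str]) -> str:
    """Stand-in for the Content Generation agent (simple substitution)."""
    if "recipient_lastname" in values:
        return template_text.replace(
            "(Lastname)", " " + values["recipient_lastname"]
        )
    if "sender_name" in values:
        return values["sender_name"]
    return template_text


def generation_step(
    template_text: str,
    prompt: AccumulatedPrompt,
    ask_user: Callable[[str], str],
) -> str:
    """Run one generation step for a single template section."""
    fields = identify_semantics(template_text)
    values = retrieve_information(fields, prompt, ask_user)
    return generate_content(template_text, values)


if __name__ == "__main__":
    # Illustrative data only; the values below are made up.
    prompt = AccumulatedPrompt(facts={"sender_name": "Ada Example"})
    print(generation_step("Your name", prompt, lambda f: "Rossi"))
    print(generation_step("Dear Mr./Ms.(Lastname)", prompt, lambda f: "Rossi"))
```

In this sketch the first call mirrors the Figure 4 case, where everything needed is already in the prompt, and the second mirrors the Figure 5 case, where the missing last name is requested from the user and stored for later sections.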