Table of Contents
Fetching ...

ATLAS: Artifact Generation Through Layered Constraints and LLM x MDE Synergy

Tong Ma, Hui Lai, Hui Wang, Zhenhu Tian, Jizhou Wang, Haichao Wu, Yongfan Gao, Chaochao Li, Fengjie Xu, Ling Fang

TL;DR

ATLAS produces structurally valid, auditable artifacts that integrate with existing tooling and substantially reduce manual remediation effort, validating a graduated automation paradigm that automates routine construction while empowering experts to resolve complex semantic ambiguities through machine-checkable evidence.

Abstract

ATLAS unifies Large Language Models with Model-Driven Engineering to generate regulator-ready artifacts and machine-checkable evidence for safety- and compliance-critical domains. ATLAS integrates three pillars: a Unified Meta-Model (UMM) reconciles heterogeneous schemas and regulatory text into a single semantic space; an Integrated Constraint Model (ICM) extends our prior Dual-Stage(S2D2) extraction logic to compile layered requirements into deterministic generation-time automata (Layer~1) and post-generation validators (Layer~2); and Constraint-Guided Verifiable Generation (CVG) applies these through two-layer enforcement -- Layer~1 structural constraints drive prefix-safe decoding while Layer~2 semantic/logical validation produces machine-checkable certificates. When violations occur, ATLAS performs audit-guided repair and records generation traces for compliance review. We evaluate ATLAS in automotive software engineering (AUTOSAR) and cross-border legal jurisdiction (Brussels~I~bis). ATLAS produces structurally valid, auditable artifacts that integrate with existing tooling and substantially reduce manual remediation effort, validating a graduated automation paradigm that automates routine construction while empowering experts to resolve complex semantic ambiguities through machine-checkable evidence.

ATLAS: Artifact Generation Through Layered Constraints and LLM x MDE Synergy

TL;DR

ATLAS produces structurally valid, auditable artifacts that integrate with existing tooling and substantially reduce manual remediation effort, validating a graduated automation paradigm that automates routine construction while empowering experts to resolve complex semantic ambiguities through machine-checkable evidence.

Abstract

ATLAS unifies Large Language Models with Model-Driven Engineering to generate regulator-ready artifacts and machine-checkable evidence for safety- and compliance-critical domains. ATLAS integrates three pillars: a Unified Meta-Model (UMM) reconciles heterogeneous schemas and regulatory text into a single semantic space; an Integrated Constraint Model (ICM) extends our prior Dual-Stage(S2D2) extraction logic to compile layered requirements into deterministic generation-time automata (Layer~1) and post-generation validators (Layer~2); and Constraint-Guided Verifiable Generation (CVG) applies these through two-layer enforcement -- Layer~1 structural constraints drive prefix-safe decoding while Layer~2 semantic/logical validation produces machine-checkable certificates. When violations occur, ATLAS performs audit-guided repair and records generation traces for compliance review. We evaluate ATLAS in automotive software engineering (AUTOSAR) and cross-border legal jurisdiction (Brussels~I~bis). ATLAS produces structurally valid, auditable artifacts that integrate with existing tooling and substantially reduce manual remediation effort, validating a graduated automation paradigm that automates routine construction while empowering experts to resolve complex semantic ambiguities through machine-checkable evidence.

Paper Structure

This paper contains 116 sections, 6 theorems, 22 equations, 10 figures, 7 tables, 4 algorithms.

Key Result

Theorem 1

For any valid specification corpus $D$ and its corresponding meta-model $M$, the extracted constraint sets satisfy and every constraint $c \in \mathrm{Extract}^{\mathrm{LLM}}(D)$ obeys where $\alpha$ is the partial alignment map (Definition def:partial-align), $U$ is the set of UMM sorts, and $\mathcal{C}_{\mathrm{abs}}$ is the quarantined abstract layer for rules that have no direct UMM anchor.

Figures (10)

  • Figure 1: ATLAS links domain requirements to verifiable generation. Natural-language requirements and existing meta-models are consolidated into a Unified Meta-Model (UMM) and an Integrated Constraint Model (ICM). Generation is guided by structural constraints (Layer-1, enforced during decoding) and then checked by semantic / logical validators (Layer-2) after decoding. The validation signals, together with the decode-time audit record, form evidence that can be used to automatically repair the artifact in a closed loop.
  • Figure 2: Two ways to build the Unified Meta-Model (UMM). Path S1 handles structured sources: when a machine-readable meta-model (for example AUTOSAR, AADL, OPC UA) already exists, ATLAS ingests and normalizes it. Path S2 handles unstructured sources: when only natural-language requirements are available, ATLAS uses an LLM to induce domain entities and relations directly from text. Deployments typically follow one of these two paths, depending on which input is available.
  • Figure 3:
  • Figure 4: ICM as a multi-sorted algebra layered over the UMM. White nodes denote UMM entities; gray nodes encode semantic and logical constraints extracted from Channel 1 (deductive) and Channel 2 (LLM-assisted). Morphisms relate abstract constraints to class-/attribute-level constraints, enabling cross-layer reasoning and repair prioritization.
  • Figure 5: Stratified compilation with LLM fallback and repository policy. Left: Channel 1 programmatic mapping compiles UMM/ICM constraints to L1 validators (JSON Schema, DFA, GBNF) and L2 validators (SHACL, SMT), preserving semantics. Right: when programmatic mapping is incomplete, an LLM-based pipeline maSpecGenAutomatedGeneration2025 synthesizes missing constraints with solver-backed validation; only verified outputs are persisted in the ICM storage.
  • ...and 5 more figures

Theorems & Definitions (16)

  • Definition 3.1: Meta-Model as Typed Graph
  • Definition 3.2: Constraint Space
  • Definition 3.3: Partial Alignment Mapping
  • Definition 3.4: Integrated Constraint Model
  • Definition 3.5: Semantic Compatibility
  • Theorem 1: Extraction Consistency with Partial Alignment
  • Theorem 2: Conditional Soundness and Completeness
  • Definition 3.6: Constraint Enforcement Hierarchy
  • Theorem 3: Maximum Expressiveness of Unconstrained Strategy
  • Theorem 4: Prefix Safety and Bounded Structural Repair
  • ...and 6 more