From Natural Language to Certified H-infinity Controllers: Integrating LLM Agents with LMI-Based Synthesis

Shihao Li, Jiachen Li, Jiamin Xu, Dongmei Chen

TL;DR

The paper presents S2C, a multi-agent framework that translates natural-language specifications into certified $\mathcal{H}_\infty$ state-feedback controllers using LMI-based synthesis. It decomposes the workflow into specialized agents (SpecIntAgent, SolvAgent, TesterAgent, AdaptAgent, CodeGen) with a memory-driven iterative loop and gamma-floor guardrails to enforce decay-rate and robustness constraints. Empirical results on 14 COMPleib benchmarks show 100% synthesis and 100% convergence within six iterations, with stronger decay-rate satisfaction and improved disturbance rejection compared to baselines. The study demonstrates that LLM-driven design can yield rapid, end-to-end prototyping while preserving formal guarantees, albeit within the scope of state-feedback and nominal LTI systems. The authors also discuss limitations and propose extensions to enable output-feedback, mixed synthesis, robust uncertainty handling, and native discrete-time formulations to broaden applicability and guarantee coverage of transient performance.

Abstract

We present \textsc{S2C} (Specification-to-Certified-Controller), a multi-agent framework that maps natural-language requirements to certified $\mathcal{H}_\infty$ state-feedback controllers via LMI synthesis. \textsc{S2C} coordinates five roles -- \textit{SpecInt} (spec extraction), \textit{Solv} (bounded-real lemma (BRL) LMI), \textit{Tester} (Monte Carlo and frequency-domain checks), \textit{Adapt} (spec refinement), and \textit{CodeGen} (deployable code). The loop is stabilized by a severity- and iteration-aware $\gamma$-floor guardrail and a decay-rate region constraint enforcing $\operatorname{Re}\,\lambda(A{+}BK)<-\alpha$ with $\alpha=3.9/T_s$ derived from settling-time targets. For state feedback, verification reports disturbance rejection $\big\|C\,(sI-(A{+}BK))^{-1}E\big\|_\infty$ alongside time-domain statistics; discrete benchmarks are converted to continuous time via a Tustin (bilinear) transform when needed. On 14 COMPleib problems, \textsc{S2C} attains \textbf{100\%} synthesis success and \textbf{100\%} convergence within six iterations, with strong decay-rate satisfaction and near-target certified $\mathcal{H}_\infty$ levels; it improves robustness metrics relative to single-shot BRL and BRL+$\alpha$ baselines. An ablation over LLM backbones (GPT-5, GPT-5 mini, DeepSeek-V3, Qwen-2.5-72B, Llama-4 Maverick) shows the pipeline is robust across models, while stronger models yield the highest effectiveness. These results indicate that LLM agents can integrate certificate-bearing control synthesis from high-level intent, enabling rapid end-to-end prototyping without sacrificing formal guarantees.
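
The two quantities the abstract names, the decay-rate bound $\alpha=3.9/T_s$ and the disturbance-rejection norm $\|C\,(sI-(A{+}BK))^{-1}E\|_\infty$, can be checked numerically for a given gain. The sketch below is an illustration only and is not the paper's Tester agent: the function names and the double-integrator example are hypothetical, and the frequency-grid sweep yields only a lower bound on the true $\mathcal{H}_\infty$ norm, not a certificate.

```python
import numpy as np

def decay_rate_from_settling_time(Ts):
    """alpha = 3.9 / Ts, the rule quoted in the abstract."""
    if Ts <= 0:
        raise ValueError("settling time must be positive")
    return 3.9 / Ts

def satisfies_decay_rate(A, B, K, alpha):
    """True if every eigenvalue of A + B K has real part below -alpha."""
    eigs = np.linalg.eigvals(A + B @ K)
    return bool(np.max(eigs.real) < -alpha)

def hinf_norm_grid(A, B, K, C, E, omegas):
    """Lower bound on ||C (sI - (A+BK))^{-1} E||_inf from a frequency grid."""
    Acl = A + B @ K
    n = Acl.shape[0]
    peak = 0.0
    for w in omegas:
        # Frequency response at s = j*w, computed with a linear solve
        H = C @ np.linalg.solve(1j * w * np.eye(n) - Acl, E)
        peak = max(peak, np.linalg.norm(H, 2))  # largest singular value
    return peak

# Hypothetical example (not from the paper): a double integrator
A = np.array([[0.0, 1.0], [0.0, 0.0]])
B = np.array([[0.0], [1.0]])
E = np.array([[0.0], [1.0]])     # disturbance enters like the input
C = np.array([[1.0, 0.0]])       # performance output is the position
K = np.array([[-25.0, -10.0]])   # places both closed-loop poles at s = -5

alpha = decay_rate_from_settling_time(1.0)   # Ts = 1 s gives alpha = 3.9
decay_ok = satisfies_decay_rate(A, B, K, alpha)
gain = hinf_norm_grid(A, B, K, C, E, np.logspace(-3, 3, 400))
```

For this example the closed-loop transfer function is $1/(s+5)^2$, so the poles clear the $-3.9$ bound and the gain peaks near $1/25$ at low frequency. A certified bound would come from the LMI solution itself rather than from a grid.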

Paper Structure

This paper contains 85 sections, 13 equations, 6 figures, 3 tables, 3 algorithms.

Figures (6)

  • Figure 1: S2C multi-agent architecture.
  • Figure 2: S2C multi-agent architecture.
  • Figure 3: Aggregated experimental results across 14 COMPleib problems. Panels (a–e) report synthesis success, disturbance rejection, normalized certified $\gamma$, decay‑rate satisfaction, and convergence within 6 iterations.
  • Figure 4: LLM backbone ablation across five metrics (higher is better for success, $\le 6$ iterations, and decay-rate satisfaction; lower is better for normalized $\gamma$ and $\|H_{\mathrm{cl}}\|_\infty$). Bars show medians across 14 problems.
  • Figure 5: Evolution of S2C design for NN1 benchmark (GPT-5 baseline). Left: Measured overshoot decreases from 93.9% to 34.2% (target: 10%, shown as green dashed line). Center: Settling time error reduces from +16% to -0.6% (specification met). Right: Decay rate satisfaction improves from 86% to 127% of target $\alpha$.
  • ...and 1 more figure