Automata-Based Steering of Large Language Models for Diverse Structured Generation

Xiaokun Luan; Zeming Wei; Yihao Zhang; Meng Sun

TL;DR

This work tackles the limited diversity observed in automaton-constrained generation by LLMs. It introduces an automaton-history-guided approach that tracks traversal through the guiding DFA and adaptively adjusts token logits using a global transition counter, local counters, and reward/penalty terms, with the strength of the adjustment controlled by an adaptive parameter $\gamma$. Across four grammars, the method substantially boosts structural and content diversity (higher $\text{StateCov}$, $\text{TransCov}$, $\text{PathCov}$, and Vendi scores) at a modest efficiency cost: throughput is about $88.8\%$ of the baseline's tokens per second. Ablation studies confirm that each component is necessary, and a case study on test-case generation demonstrates improved branch coverage for open-source libraries. The approach generalizes to CFGs and potentially to broader grammar formalisms, offering a practical path to richer, yet still valid, structured outputs from LLMs.
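To make the idea of history-guided logit adjustment concrete, here is a minimal sketch, not the paper's actual formulation. It assumes a toy DFA stored as a dictionary, a single global counter over transitions, and a logarithmic penalty scaled by a fixed `gamma`. The function name `steer_logits`, the penalty form, and the data layout are all hypothetical, and the local counters, reward terms, and adaptive update of $\gamma$ mentioned above are omitted.

```python
import math
from collections import Counter


def steer_logits(logits, state, transitions, global_counts, gamma=1.0):
    """Mask invalid tokens and penalise frequently used DFA transitions.

    logits:        dict mapping token -> raw logit from the model
    state:         current DFA state
    transitions:   dict mapping (state, token) -> next state (the DFA)
    global_counts: Counter over (state, next_state) pairs seen in earlier
                   generations (hypothetical bookkeeping)
    gamma:         penalty strength (fixed here; an assumption)
    """
    steered = {}
    for token, logit in logits.items():
        next_state = transitions.get((state, token))
        if next_state is None:
            # Token is not allowed by the automaton in this state.
            continue
        # Assumed penalty: grows with the log of how often this
        # transition was taken before, so novel transitions are favoured.
        penalty = gamma * math.log1p(global_counts[(state, next_state)])
        steered[token] = logit - penalty
    return steered


# Toy DFA: from state 0, "a" leads to state 1 and "b" leads to state 2.
toy_dfa = {(0, "a"): 1, (0, "b"): 2}
raw = {"a": 2.0, "b": 1.0, "c": 5.0}  # "c" is invalid in state 0

fresh = steer_logits(raw, 0, toy_dfa, Counter())
seen = steer_logits(raw, 0, toy_dfa, Counter({(0, 1): 3}))
```

With an empty history the model's own preference for `"a"` is kept; once the transition from state 0 to state 1 has been taken three times, the penalty pushes `"b"` ahead, while the invalid token `"c"` is masked in both cases.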

Abstract

Large language models (LLMs) are increasingly tasked with generating structured outputs. While structured generation methods ensure validity, they often lack output diversity, a critical limitation that we confirm in our preliminary study. We propose a novel method to enhance diversity in automaton-based structured generation. Our approach utilizes automata traversal history to steer LLMs towards novel structural patterns. Evaluations show our method significantly improves structural and content diversity while maintaining comparable generation efficiency. Furthermore, we conduct a case study showcasing the effectiveness of our method in generating diverse test cases for testing open-source libraries.

Paper Structure

Table of Contents