Enforcing Monotonic Progress in Legal Cross-Examination: Preventing Long-Horizon Stagnation in LLM-Based Inquiry

Hsien-Jyh Liao

TL;DR

This paper addresses long-horizon legal inquiry and the risk of procedural stagnation in LLM-based cross-examinations. It introduces Soft-FSM, a neuro-symbolic architecture that externalizes procedural state control to enforce monotonic information gain over a DAG of KIUs, ensuring completion even when language models favor local coherence. Empirical results on three real-world Taiwanese homicide cases show that Soft-FSM achieves >97% completeness with minimal redundancy, while baseline methods collapse as task depth increases due to the Complexity Cliff. The work demonstrates that reliable long-horizon task completion in procedurally constrained domains requires explicit external state enforcement rather than relying on emergent LLM behavior, with implications for designing robust legal AI systems.

Abstract

Large language models (LLMs) exhibit impressive linguistic fluency but struggle to reliably complete long-horizon tasks under explicit procedural constraints. In legal cross-examination, purely probabilistic generation often maintains behavioral coherence while failing to ensure procedural advancement. We characterize this failure as procedural stagnation and propose Soft-FSM, a neuro-symbolic architecture that enforces monotonic progress over accumulated Key Information Units (KIUs) via an external deterministic state controller. Experiments on three real-world Taiwanese criminal homicide cases show that baseline methods collapse below 40% completeness, while Soft-FSM consistently achieves over 97% with near-zero redundancy. These results suggest that, in such domains, reliable task completion cannot be guaranteed by emergent LLM behavior alone, and can be reliably enforced through explicit and verifiable external state control.

Paper Structure (25 sections, 1 equation, 2 figures, 1 table)

Figures (2)

  • Figure 1: Structural divergence between progressive and stagnating state transitions. State advancement is defined by information gain in $K$, independent of conversational order. In the absence of external control, per-turn procedural misjudgments cause inquiry trajectories to deviate from the required Directed Acyclic Graph (DAG) of information states and collapse into a Stagnation Region—a cyclic attractor where behavioral consistency is preserved but informational progress stalls (i.e., $|K_{t+1}| = |K_t|$). In contrast, Soft-FSM enforces traversal along a monotonic Progress Path by permitting state transitions only when the information set strictly increases (i.e., $|K_{t+1}| > |K_t|$), ensuring monotonic advancement toward the terminal state $s_{K_{\max}}$.
  • Figure 2: The Complexity Cliff. Soft-FSM (Ours) maintains near-perfect completeness ($>97\%$) regardless of task complexity, while baseline methods degrade sharply as procedural depth increases. V3 maintains near-zero variance across all cases, demonstrating structural robustness. Error bars denote standard deviation ($N=5$).
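
The transition rule described in the Figure 1 caption (accept a turn only when $|K_{t+1}| > |K_t|$) can be illustrated with a minimal sketch. This is not the paper's implementation: the class `MonotonicController`, its field names, and the example KIU labels are hypothetical, and the sketch assumes that KIUs have already been extracted from each turn as plain strings. It also ignores any ordering constraints the DAG might impose and tracks only the size of the collected set.

```python
from dataclasses import dataclass, field


@dataclass
class MonotonicController:
    """Hypothetical external state controller (illustrative only)."""

    required: frozenset              # every KIU needed for the terminal state
    collected: set = field(default_factory=set)  # K_t, the KIUs gathered so far
    rejected_turns: int = 0          # turns that added no new information

    def step(self, extracted: set) -> bool:
        """Accept a turn only if it strictly grows K, i.e. |K_{t+1}| > |K_t|."""
        new_units = (extracted & self.required) - self.collected
        if not new_units:
            # No information gain: the state does not advance.
            self.rejected_turns += 1
            return False
        self.collected |= new_units
        return True

    @property
    def done(self) -> bool:
        """True once every required KIU has been collected."""
        return self.collected == set(self.required)

    def completeness(self) -> float:
        """Fraction of required KIUs collected (assumed metric definition)."""
        if not self.required:
            return 1.0
        return len(self.collected) / len(self.required)


# Usage with made-up KIU labels: the second turn repeats known information.
ctrl = MonotonicController(required=frozenset({"time", "location", "weapon"}))
turns = [{"time"}, {"time"}, {"location", "weapon"}]
accepted = [ctrl.step(t) for t in turns]
print(accepted, ctrl.done, ctrl.completeness())
```

Because the acceptance check lives outside the language model, a repeated or off-topic turn cannot advance the state. In a full system the rejected turn would presumably trigger a regeneration or a redirect, which this sketch leaves out.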