Structurally Valid Log Generation using FSM-GFlowNets

Riya Samanta

TL;DR

This work tackles synthetic log generation for UI interactions under privacy constraints. It introduces FSM-GFlowNets, combining an FSM that constrains valid transitions with a Generative Flow Network that samples trajectories with probability proportional to a learned reward, $P_\theta(\tau) \propto R(\tau)$, under a dynamic action mask, and enforces flow conservation via forward and backward flows $F_\theta(s \rightarrow s')$ and $B_\theta(s' \rightarrow s)$. The FSM is reverse-engineered from expert traces produced by GPT-4o and refined with UIC HCI data; optimization uses a flow-matching objective and a hybrid reward balancing FSM compliance with realism. Across distributional metrics and a downstream intent-classification task, FSM-GFlowNet logs achieve closer alignment to real user behavior than unconstrained LLM baselines and demonstrate competitive generalization to real data.
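
Sampling in proportion to reward is the kind of guarantee that a flow-conservation condition provides. As a sketch in the standard GFlowNet flow-matching form, which may differ in detail from the paper's own objective, let $m(s, s') \in \{0, 1\}$ be a hypothetical indicator (notation assumed here, not taken from the paper) of whether the FSM allows the transition $s \rightarrow s'$. For every non-initial state $s'$:

$$
\sum_{s \,:\, m(s, s') = 1} F_\theta(s \rightarrow s') \;=\; R(s') + \sum_{s'' \,:\, m(s', s'') = 1} F_\theta(s' \rightarrow s''),
\qquad
P_\theta(s' \mid s) \;=\; \frac{m(s, s')\, F_\theta(s \rightarrow s')}{\sum_{s''} m(s, s'')\, F_\theta(s \rightarrow s'')}
$$

Here $R(s')$ is assumed to be zero for non-terminal states, so inflow equals outflow everywhere except at terminal states, where the inflow equals the reward. Any transition with $m(s, s') = 0$ receives zero probability under the masked policy.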

Abstract

Generating structurally valid and behaviorally diverse synthetic event logs for interaction-aware models is a challenging yet crucial problem, particularly in settings with limited or privacy-constrained user data. Existing methods such as heuristic simulations and LLM-based generators often lack structural coherence or controllability, producing synthetic data that fails to accurately represent real-world system interactions. This paper presents a framework that integrates Finite State Machines (FSMs) with Generative Flow Networks (GFlowNets) to generate structured, semantically valid, and diverse synthetic event logs. Our FSM-constrained GFlowNet ensures syntactic validity and behavioral variation through dynamic action masking and guided sampling. The FSM, derived from expert traces, encodes domain-specific rules, while the GFlowNet is trained using a flow-matching objective with a hybrid reward balancing FSM compliance and statistical fidelity. We instantiate the framework in the context of UI interaction logs using the UIC HCI dataset, but the approach generalizes to any symbolic sequence domain. Experimental results based on distributional metrics show that our FSM-GFlowNet produces realistic, structurally consistent logs. For instance, with real user logs as the baseline, it achieves a KL divergence of 0.2769 and a chi-squared distance of 0.3522, significantly outperforming GPT-4o (2.5294 and 13.8020) and Gemini (3.7233 and 63.0355), alongside a leading bigram overlap of 0.1214 versus 0.0028 for GPT-4o and 0.0007 for Gemini. A downstream use case, intent classification, demonstrates that classifiers trained solely on synthetic logs from FSM-GFlowNet achieve accuracy competitive with classifiers trained on real data.
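
To make the distributional metrics in the abstract concrete, the following is a minimal sketch of how KL divergence, chi-squared distance, and bigram overlap could be computed between two sets of event logs. The abstract does not give the paper's exact definitions, so everything here is an assumption: the smoothing constant, the direction of the KL divergence, the use of event-frequency distributions, and the Jaccard form of bigram overlap. The function names are hypothetical.

```python
import math
from collections import Counter

def event_distribution(logs, vocab, eps=1e-6):
    """Smoothed relative frequency of each event type (eps is an assumed smoothing constant)."""
    counts = Counter(event for log in logs for event in log)
    total = sum(counts.values()) + eps * len(vocab)
    return {event: (counts[event] + eps) / total for event in vocab}

def kl_divergence(p, q):
    """KL(p || q) over a shared vocabulary; the direction is an assumption."""
    return sum(p[e] * math.log(p[e] / q[e]) for e in p)

def chi_squared(p, q):
    """Chi-squared distance with q as the reference distribution (assumed form)."""
    return sum((p[e] - q[e]) ** 2 / q[e] for e in p)

def bigram_overlap(logs_a, logs_b):
    """Jaccard overlap between the sets of adjacent event pairs (assumed definition)."""
    def bigrams(logs):
        return {(a, b) for log in logs for a, b in zip(log, log[1:])}
    set_a, set_b = bigrams(logs_a), bigrams(logs_b)
    union = set_a | set_b
    return len(set_a & set_b) / len(union) if union else 0.0

# Toy logs with illustrative event names, not taken from the UIC HCI dataset.
real = [["click", "type", "submit"], ["click", "submit"]]
synthetic = [["click", "type", "submit"], ["type", "click"]]
vocab = ["click", "type", "submit"]

p_real = event_distribution(real, vocab)
p_synth = event_distribution(synthetic, vocab)
print(kl_divergence(p_real, p_synth), chi_squared(p_real, p_synth))
print(bigram_overlap(real, synthetic))
```

Under these definitions, lower KL divergence and chi-squared distance and higher bigram overlap indicate closer agreement with the reference logs, which matches the direction of the comparison reported in the abstract.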

Paper Structure

This paper contains 20 sections, 1 theorem, 11 equations, 5 figures, 5 tables, 1 algorithm.

Key Result

Lemma 1

Every trajectory $\tau$ generated by the FSM-constrained GFlowNet strictly adheres to the valid transitions defined by the FSM $\mathcal{M}$.
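
The intuition behind Lemma 1 can be illustrated with a small sketch of masked sampling: if every step samples only from events the FSM permits in the current state, no invalid transition can appear in a trajectory. The FSM below, its state and event names, and the uniform stand-in for the learned policy are all hypothetical; this is not the paper's FSM or its trained GFlowNet.

```python
import random

# Hypothetical FSM: state -> {event: next_state}. Names are illustrative only.
FSM = {
    "home": {"open_search": "search", "open_settings": "settings"},
    "search": {"submit_query": "results", "go_back": "home"},
    "results": {"select_item": "detail", "go_back": "search"},
    "settings": {"go_back": "home"},
    "detail": {},  # terminal state: no outgoing transitions
}
EVENTS = sorted({event for edges in FSM.values() for event in edges})

def action_mask(state):
    """True for each event the FSM allows in `state`, False otherwise."""
    return [event in FSM[state] for event in EVENTS]

def sample_trajectory(policy_scores, start="home", max_steps=20, rng=random):
    """Sample (state, event) steps, zeroing the weight of every masked event."""
    state, trajectory = start, []
    for _ in range(max_steps):
        mask = action_mask(state)
        weights = [w if ok else 0.0 for w, ok in zip(policy_scores(state), mask)]
        if sum(weights) == 0.0:  # no allowed event: terminal state reached
            break
        event = rng.choices(EVENTS, weights=weights, k=1)[0]
        trajectory.append((state, event))
        state = FSM[state][event]
    return trajectory, state

def is_valid(trajectory):
    """Check that every step is a transition defined by the FSM."""
    return all(event in FSM[state] for state, event in trajectory)

def uniform_scores(state):
    """Stand-in for a learned forward policy: equal score for every event."""
    return [1.0] * len(EVENTS)

rng = random.Random(0)
samples = [sample_trajectory(uniform_scores, rng=rng)[0] for _ in range(200)]
print(all(is_valid(t) for t in samples))
```

Because the mask is applied before sampling rather than as a post-hoc filter, validity holds for any choice of policy scores, which is why the guarantee does not depend on how well the model is trained.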

Figures (5)

  • Figure 1: Prompt design for generating optimal user interaction logs to establish the benchmark ground truth for the Task.
  • Figure 2: Formalized FSM derived from the benchmark ground truth log. States correspond to contextual application views; transitions are governed by semantically valid interaction events.
  • Figure 3: Prompt used for LLM-based generation of baseline synthetic UI logs. Designed to simulate a non-expert user with realistic variability.
  • Figure 4: Distribution of evaluation metrics across methods using real user (UIC HCI) logs as baseline.
  • Figure 5: Distribution of evaluation metrics across methods using ground truth (GT) logs as baseline.

Theorems & Definitions (4)

  • Definition 1
  • Definition 2
  • Lemma 1
  • proof