Flow of Spans: Generalizing Language Models to Dynamic Span-Vocabulary via GFlowNets

Bo Xue; Yunchong Song; Fanghao Shao; Xuekai Zhu; Lin Chen; Luoyi Fu; Xinbing Wang; Zhouhan Lin

TL;DR

This paper addresses the limitations of fixed-vocabulary autoregressive generation by proposing FoSS, a span-based language model that casts generation as span selection over a DAG-structured state space, optimized with Generative Flow Networks (GFlowNets). FoSS builds a dynamic span vocabulary through DAG-inducing span segmentation and uses a span language model as the forward policy, trained with a subtrajectory balance objective and a hybrid online-offline strategy. The reward combines a language model fluency signal with a learned preference model to steer generation toward human-like, diverse continuations. Empirically, FoSS improves MAUVE and diversity on in-domain, out-of-domain, and knowledge-intensive tasks and scales better than the baselines; ablations confirm the value of the DAG representation and of the dual reward signals.

Abstract

Standard autoregressive language models generate text token-by-token from a fixed vocabulary, inducing a tree-structured state space when token sampling is viewed as an action, which limits flexibility and expressiveness. Recent work introduces a dynamic vocabulary by sampling retrieved text spans but overlooks that the same sentence can be composed of spans of varying lengths, and so lacks explicit modeling of the directed acyclic graph (DAG) state space. This restricts exploration of compositional paths and biases the model toward the chosen path. Generative Flow Networks (GFlowNets) are powerful for efficiently exploring and generalizing over state spaces, particularly those with a DAG structure. However, prior GFlowNets-based language models operate at the token level and remain confined to tree-structured spaces, limiting their potential. In this work, we propose Flow of SpanS (FoSS), a principled GFlowNets framework for span generation. FoSS constructs a dynamic span vocabulary by flexibly segmenting the retrieved text, ensuring a DAG-structured state space, which allows GFlowNets to explore diverse compositional paths and improve generalization. With specialized reward models, FoSS generates diverse, high-quality text. Empirically, FoSS improves MAUVE scores by up to 12.5% over Transformer on text generation and achieves 3.5% gains on knowledge-intensive tasks, consistently outperforming state-of-the-art methods. Scaling experiments further demonstrate that FoSS benefits from larger models, more data, and richer retrieval corpora, retaining its advantage over strong baselines.
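
To make the tree-versus-DAG distinction concrete, the following minimal sketch counts how many distinct span sequences compose the same sentence. It is an illustration only, not the paper's segmentation procedure: the function name `count_segmentations`, the toy sentence, and both vocabularies are assumptions introduced here. With a token-only vocabulary there is exactly one path to the sentence (a tree), while adding multi-token spans lets several paths reach the same text (a DAG).

```python
from functools import lru_cache


def count_segmentations(tokens, span_vocab):
    """Count the distinct ways to compose `tokens` from spans in `span_vocab`.

    Hypothetical helper for illustration. A state is a prefix length i;
    an action appends a span tokens[i:j] that exists in the vocabulary.
    """
    n = len(tokens)

    @lru_cache(maxsize=None)
    def paths_from(i):
        # Number of span sequences that complete the sentence from prefix i.
        if i == n:
            return 1
        total = 0
        for j in range(i + 1, n + 1):
            if tuple(tokens[i:j]) in span_vocab:
                total += paths_from(j)
        return total

    return paths_from(0)


# Toy data (assumed, not from the paper).
tokens = ["the", "cat", "sat", "down"]
token_vocab = {("the",), ("cat",), ("sat",), ("down",)}
span_vocab = token_vocab | {("the", "cat"), ("cat", "sat"), ("sat", "down")}

tree_paths = count_segmentations(tokens, token_vocab)  # single path: tree
dag_paths = count_segmentations(tokens, span_vocab)    # several paths: DAG
print(tree_paths, dag_paths)
```

In this toy case the token-only vocabulary yields one path and the span vocabulary yields five, so a model trained on a single chosen segmentation sees only one of five equally valid compositions of the same sentence.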

Paper Structure (31 sections, 4 equations, 7 figures, 5 tables, 1 algorithm)

Introduction
Preliminaries
Span-Generation with GFlowNets
Markov Decision Process Formulation for GFlowNets Learning
Learning Objective and Training Policy
Policy Network
Reward Function
Experiments
In Domain Evaluation
Out of Domain Evaluation
Scaling Evaluation
Downstream Evaluation
Ablation Study
Conclusion
Related Work
...and 16 more sections

Figures (7)

Figure 1: Given the prefix, a standard language model generates token-by-token and forms a tree state space, trained with next-token prediction (NTP) loss. In FoSS, because a sentence can be composed of spans in multiple ways, we construct a DAG state space and optimize it with GFlowNets.
Figure 2: Generation quality of FoSS with different span index sizes.
Figure 3: Generation quality of FoSS with different amounts of offline training data.
Figure 4: Generation quality of FoSS with different model sizes.
Figure 5: Case Study. The blue part represents directly sampling a phrase.
...and 2 more figures
