Exploring Multiple High-Scoring Subspaces in Generative Flow Networks
Xuan Yu, Xu Wang, Rui Zhu, Yudong Zhang, Yang Wang
TL;DR
This work tackles inefficient exploration in Generative Flow Networks (GFlowNets) by reframing exploration as subspace-level optimization. It introduces CMAB-GFN, a framework that uses combinatorial multi-armed bandits (CMAB) to prune the action space into compact, high-reward subspaces and trains GFlowNets within those subspaces, while periodically evaluating across all subspaces to preserve diversity. The method combines a two-phase sampling protocol, CUCB-based subspace selection with a co-occurrence-aware scoring mechanism, and architectural adjustments to mitigate issues that arise with deep networks. Across bit-sequence, molecule-design, and RNA-binding tasks, CMAB-GFN yields higher-reward candidates, discovers more high-reward modes, and maintains diversity better than strong baselines, and ablation analyses indicate robustness to hyperparameter choices. The approach improves the efficiency and robustness of GFlowNets in structured generative domains and offers scalable, subspace-aware exploration for complex combinatorial design problems.
Abstract
As a probabilistic sampling framework, Generative Flow Networks (GFlowNets) show strong potential for constructing complex combinatorial objects through the sequential composition of elementary components. However, existing GFlowNets often suffer from excessive exploration over vast state spaces, leading to over-sampling of low-reward regions and convergence to suboptimal distributions. Effectively biasing GFlowNets toward high-reward solutions remains a non-trivial challenge. In this paper, we propose CMAB-GFN, which integrates a combinatorial multi-armed bandit (CMAB) framework with GFlowNet policies. The CMAB component prunes low-quality actions, yielding compact high-scoring subspaces for exploration. Restricting GFlowNets to these subspaces accelerates the discovery of high-value candidates, while exploring different subspaces ensures that diversity is not sacrificed. Experimental results on multiple tasks demonstrate that CMAB-GFN generates higher-reward candidates than existing approaches.
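
To make the action-pruning idea concrete, the following is a minimal sketch of a generic CUCB selector that keeps the k actions with the highest upper confidence bounds as the current subspace. It is not the paper's implementation: the class name `CUCBSelector`, the function `run_demo`, the Bernoulli per-action rewards, and the top-k oracle are all assumptions made for illustration, and the co-occurrence-aware scoring and the GFlowNet training loop are omitted.

```python
import math
import random


class CUCBSelector:
    """Generic CUCB over base actions with a top-k oracle (illustrative only)."""

    def __init__(self, n_actions, subspace_size):
        self.n = n_actions
        self.k = subspace_size
        self.counts = [0] * n_actions    # times each action was played
        self.means = [0.0] * n_actions   # running mean reward per action
        self.t = 0                       # number of selection rounds so far

    def select(self):
        """Return the k actions with the largest optimistic score."""
        self.t += 1
        scores = []
        for i in range(self.n):
            if self.counts[i] == 0:
                # Unplayed actions get priority so every action is tried once.
                scores.append(math.inf)
            else:
                # Standard CUCB bonus: sqrt(3 ln t / (2 T_i)).
                bonus = math.sqrt(3.0 * math.log(self.t) / (2.0 * self.counts[i]))
                scores.append(self.means[i] + bonus)
        ranked = sorted(range(self.n), key=lambda i: scores[i], reverse=True)
        return sorted(ranked[: self.k])

    def update(self, action, reward):
        """Incrementally update the mean reward of a played action."""
        self.counts[action] += 1
        self.means[action] += (reward - self.means[action]) / self.counts[action]


def run_demo(rounds=2000, seed=0):
    """Toy environment (assumption): each action pays a Bernoulli reward."""
    rng = random.Random(seed)
    true_means = [0.9, 0.8, 0.7, 0.2, 0.1, 0.1, 0.05, 0.05]
    selector = CUCBSelector(n_actions=len(true_means), subspace_size=3)
    for _ in range(rounds):
        subspace = selector.select()
        # In a full system a GFlowNet would sample within `subspace`;
        # here each selected action simply reports its own reward.
        for action in subspace:
            reward = 1.0 if rng.random() < true_means[action] else 0.0
            selector.update(action, reward)
    return selector
```

In this toy setting, inspecting `run_demo().counts` shows the plays concentrating on the three actions with the highest true means, while the low-reward actions are tried only often enough to shrink their confidence bonus. Other subspaces could be obtained in the same spirit by varying the oracle, but how CMAB-GFN does so is not specified in this section.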
