Improving GFlowNets with Monte Carlo Tree Search
Nikita Morozov, Daniil Tiapkin, Sergey Samsonov, Alexey Naumov, Dmitry Vetrov
TL;DR
The paper aims to improve planning in Generative Flow Networks (GFlowNets) by integrating Monte Carlo Tree Search (MCTS), using the MENTS algorithm to estimate entropy-regularized Q-values. Applying MENTS on top of SoftDQN enables look-ahead planning during both training and inference, aligning forward policies with the trajectory balance framework. Empirically, MENTS improves sample efficiency and generation fidelity on the Hypergrid and Bit Sequence tasks; the gains grow with the number of MCTS rounds and are larger when MENTS is used for both training and inference. The approach leverages the DAG structure and soft RL formulation of GFlowNets to provide a principled planning mechanism that could be extended to other GFlowNet variants and domains.
Abstract
Generative Flow Networks (GFlowNets) treat sampling from distributions over compositional discrete spaces as a sequential decision-making problem, training a stochastic policy to construct objects step by step. Recent studies have revealed strong connections between GFlowNets and entropy-regularized reinforcement learning. Building on these insights, we propose to enhance the planning capabilities of GFlowNets by applying Monte Carlo Tree Search (MCTS). Specifically, we show how the MENTS algorithm (Xiao et al., 2019) can be adapted for GFlowNets and used during both training and inference. Our experiments demonstrate that this approach improves the sample efficiency of GFlowNet training and the generation fidelity of pre-trained GFlowNet models.
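
To make the notion of entropy-regularized Q-values more concrete, the following is a minimal sketch of the generic soft (log-sum-exp) backup used in entropy-regularized RL, together with the softmax policy it induces. It is not the authors' MENTS implementation: the function names `soft_value` and `softmax_policy`, the temperature parameter, and the toy Q-values are all assumptions made for illustration.

```python
import math


def soft_value(q_values, temperature=1.0):
    """Soft value V = tau * log(sum_a exp(Q(a) / tau)).

    Hypothetical helper: subtracts the max for numerical stability.
    """
    m = max(q_values)
    return m + temperature * math.log(
        sum(math.exp((q - m) / temperature) for q in q_values)
    )


def softmax_policy(q_values, temperature=1.0):
    """Policy pi(a) = exp((Q(a) - V) / tau) induced by the soft value."""
    v = soft_value(q_values, temperature)
    return [math.exp((q - v) / temperature) for q in q_values]


# Toy node with three child actions. Treating each Q-value as a log
# edge flow is an assumption of this example, not a claim about the paper.
q = [math.log(1.0), math.log(2.0), math.log(3.0)]
v = soft_value(q)       # equals log(1 + 2 + 3) at temperature 1
pi = softmax_policy(q)  # probabilities proportional to 1 : 2 : 3
```

In a tree search of this kind, such a backup would be applied from the leaves toward the root, so that each node's soft value summarizes the look-ahead below it. At temperature 1 the resulting policy picks each child in proportion to the exponentiated Q-value, which is why the toy example recovers probabilities proportional to the assumed flows.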
