Streaming Bayes GFlowNets

Tiago da Silva, Daniel Augusto de Souza, Diego Mesquita

TL;DR

This work proposes streaming Bayes GFlowNets (SB-GFlowNets), which build on GFlowNets, a powerful class of amortized samplers for discrete compositional objects, and shows that SB-GFlowNets sample effectively from an unnormalized posterior in a streaming setting.

Abstract

Bayes' rule naturally allows for inference refinement in a streaming fashion, without the need to recompute posteriors from scratch whenever new data arrives. In principle, Bayesian streaming is straightforward: we update our prior with the available data and use the resulting posterior as a prior when processing the next data chunk. In practice, however, this recipe entails i) approximating an intractable posterior at each time step; and ii) encapsulating results appropriately to allow for posterior propagation. For continuous state spaces, variational inference (VI) is particularly convenient due to its scalability and the tractability of variational posteriors. For discrete state spaces, however, state-of-the-art VI results in analytically intractable approximations that are ill-suited for streaming settings. To enable streaming Bayesian inference over discrete parameter spaces, we propose streaming Bayes GFlowNets (abbreviated as SB-GFlowNets), which build on the recently proposed GFlowNets, a powerful class of amortized samplers for discrete compositional objects. Notably, an SB-GFlowNet approximates the initial posterior using a standard GFlowNet and subsequently updates it using a tailored procedure that requires only the newly observed data. Our case studies in linear preference learning and phylogenetic inference showcase the effectiveness of SB-GFlowNets in sampling from an unnormalized posterior in a streaming setting. As expected, we also observe that SB-GFlowNets are significantly faster than repeatedly training a GFlowNet from scratch to sample from the full posterior.
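
The streaming recipe described in the abstract (use the posterior from one data chunk as the prior for the next) can be illustrated on a toy problem where the posterior is small enough to enumerate exactly. The sketch below is not the SB-GFlowNet procedure: the coin-bias model, the three candidate hypotheses, and the names `bayes_update` and `likelihood` are assumptions chosen only for illustration. It checks that two chunk-wise updates give the same posterior as a single batch update.

```python
from fractions import Fraction


def bayes_update(prior, likelihood, chunk):
    """Exact posterior over a finite hypothesis set after observing `chunk`."""
    unnormalized = {}
    for hypothesis, prior_mass in prior.items():
        weight = prior_mass
        for observation in chunk:
            weight *= likelihood(hypothesis, observation)
        unnormalized[hypothesis] = weight
    evidence = sum(unnormalized.values())
    return {h: w / evidence for h, w in unnormalized.items()}


def likelihood(theta, x):
    """Hypothetical Bernoulli model: theta is the probability of observing 1."""
    return theta if x == 1 else 1 - theta


# Hypothetical discrete parameter space and data stream (illustration only).
hypotheses = [Fraction(1, 4), Fraction(1, 2), Fraction(3, 4)]
prior = {h: Fraction(1, 3) for h in hypotheses}
chunk_1 = [1, 1, 0]
chunk_2 = [1, 0, 1, 1]

# Streaming: the posterior after chunk_1 serves as the prior for chunk_2.
posterior_1 = bayes_update(prior, likelihood, chunk_1)
streaming = bayes_update(posterior_1, likelihood, chunk_2)

# Batch: process all data at once, starting from the original prior.
batch = bayes_update(prior, likelihood, chunk_1 + chunk_2)

print(streaming == batch)
```

Exact rational arithmetic makes the equality check strict. In the setting the abstract targets, the hypothesis space is too large to enumerate, so each update has to be approximated by a sampler rather than computed exactly.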

Paper Structure

This paper contains 34 sections, 9 theorems, 25 equations, 8 figures.

Key Result

Theorem 3.1

Let $(\mathcal{G}, F, \tilde{\pi})$ be an imbalanced flow network defined on a DAG $\mathcal{G}$ and $\tilde{\pi}$ be a uniform distribution supported on a space with $n$ objects. Assume that, except for an edge $(u, v)$ in $\mathcal{G}$, i.e., $\mathbf{A}_{uv} = 1$, for which with $\delta \ge 0$, the network is balanced. Let $p_{\intercal}^{(\delta)}$ be the corresponding marginal distribution o

Figures (8)

  • Figure 1: Imbalanced flows in a regular tree with width $g = 2$ and depth $h = 2$. The extra flow within the root's left child breaks the expected uniform distribution over the $g^{h}$ leaves.
  • Figure 2: Weighted DB accelerates training convergence relative to standard DB. Weighting each transition in proportion to its closeness to the state graph's root noticeably improves upon the DB loss. This empirically supports our theoretical claims regarding the elevated importance of ensuring balance in states that lead to a large number of terminal states.
  • Figure 3: State graph for a GNN-based GFlowNet generating 3-node graphs. The GNN's permutation invariance ensures each state is uniquely represented and keeps the resulting state graph small.
  • Figure 4: State graph for an MLP-based GFlowNet generating 3-node graphs. Without an inductive bias for permutation invariance, the MLP treats the column-wise isomorphic graphs differently. Compare to \ref{fig:gnns} and note the reduction in size due to using an inductively biased neural network. We omitted some edges from the state graph to avoid cluttering.
  • Figure 5: GNN-based GFlowNets are less expressive than their MLP-based counterparts. GNN-based GFlowNets can learn to sample some (right), but not all (left), distributions represented by the state graph of \ref{fig:gnns_state}. An MLP-based GFlowNet, in contrast, is not subject to such constraints and can sample from any target.
  • ...and 3 more figures
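
The setting of Figure 1 can be sketched numerically. The code below rests on assumptions not stated in the captions: the balanced network routes one unit of flow to each leaf, the extra flow $\delta$ sits on the edge from the root to its first child, and the forward policy at each node is proportional to the outgoing edge flows. The names `leaf_distribution` and `total_variation` are hypothetical helpers, and the result is an illustration rather than a reproduction of the bound in Theorem 3.1.

```python
from fractions import Fraction


def leaf_distribution(delta, g=2, h=2):
    """Leaf probabilities in a regular tree with extra flow on one root edge.

    Assumes h >= 1, unit flow per leaf in the balanced network, and an
    extra flow `delta` on the edge from the root to its first child.
    """
    delta = Fraction(delta)
    leaves_per_subtree = g ** (h - 1)
    # Flow on each root edge: balanced amount, plus delta on the first edge.
    root_edge_flows = [
        Fraction(leaves_per_subtree) + (delta if child == 0 else 0)
        for child in range(g)
    ]
    total_flow = sum(root_edge_flows)
    probabilities = []
    for flow in root_edge_flows:
        # Below the root the flows stay balanced, so each subtree splits
        # its probability mass uniformly across its leaves.
        mass_per_leaf = flow / total_flow / leaves_per_subtree
        probabilities.extend([mass_per_leaf] * leaves_per_subtree)
    return probabilities


def total_variation(p, q):
    """Total variation distance between two distributions on the same support."""
    return sum(abs(a - b) for a, b in zip(p, q)) / 2


uniform = [Fraction(1, 4)] * 4
perturbed = leaf_distribution(delta=1)  # g = 2, h = 2 as in Figure 1
print(perturbed, total_variation(perturbed, uniform))
```

Under these assumptions and with $\delta = 1$, the two leaves below the perturbed child each receive probability $3/10$ instead of $1/4$, so the sampling distribution is no longer uniform over the $g^{h} = 4$ leaves.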

Theorems & Definitions (13)

  • Theorem 3.1: Deterministic flow-based bounds for the TV in general graphs
  • Theorem 3.2: Expected TV under Dirichlet-distributed extra flows
  • Corollary 3.3
  • Example 3.4: Total variation of the sampling distribution for trees
  • Theorem 4.1: Distributional limits for GNN-based GFlowNets
  • Proposition 5.1: Equivalence between TV and FCS
  • Proposition 5.2: PAC bound for FCS
  • Theorem 2: Total variation of the sampling distribution
  • proof
  • Theorem 3: Total variation of the sampling distribution
  • ...and 3 more