Geometric Flow Models over Neural Network Weights

Ege Erdogan

TL;DR

This thesis addresses the challenge of learning generative models over neural network weights by explicitly incorporating the geometry and symmetries of weight space. It introduces three flow designs (Euclidean, Normalized, and Geometric) built on flow matching and powered by weight-space graph neural networks to transport priors to posteriors while respecting permutation and scaling symmetries. Empirical results across toy, small, and MNIST-scale tasks show that geometry-aware flows can generate high-quality weight samples with far fewer parameters and can transfer or be guided by task gradients, enabling effective Bayesian inference and learned initialization. The work argues that explicit geometric modeling yields more data-efficient and transferable weight-space representations, with clear paths for scaling, broader architectures, and deeper exploration of symmetry-driven priors and flows.

Abstract

Deep generative models such as flow and diffusion models have proven effective at modeling complex, high-dimensional data such as videos and proteins, which has motivated their use in other data modalities, including neural network weights. A generative model of neural network weights would be useful for a diverse set of applications, such as Bayesian deep learning, learned optimization, and transfer learning. However, existing work on weight-space generative models often ignores the symmetries of neural network weights, or accounts for only a subset of them. Modeling those symmetries, such as permutation symmetries between subsequent layers in an MLP or between the filters in a convolutional network, and scaling symmetries arising with the use of non-linear activations, has the potential to make weight-space generative modeling more efficient by effectively reducing the dimensionality of the problem. In this light, we aim to design generative models in weight space that more comprehensively respect the symmetries of neural network weights. We build on recent work on generative modeling with flow matching and on weight-space graph neural networks to design three different weight-space flows. Each flow takes a different approach to modeling the geometry of neural network weights, which allows us to explore the design space of weight-space flows in a principled way. Our results confirm that modeling the geometry of neural networks more faithfully leads to more effective flow models that can generalize to different tasks and architectures. We also show that, while our flows obtain competitive performance with orders of magnitude fewer parameters than previous work, they can be further improved by scaling them up. We conclude by listing potential directions for future work on weight-space generative models.
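
The two symmetries named in the abstract can be checked numerically on a small example. The sketch below is illustrative only and is not taken from the thesis: it assumes a two-layer MLP with a ReLU hidden activation (for which positive per-neuron rescaling leaves the function unchanged), and the names `mlp`, `perm`, and `scale` are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

def mlp(x, W1, b1, W2, b2):
    """Two-layer MLP with a ReLU hidden activation (assumed for this sketch)."""
    h = np.maximum(x @ W1 + b1, 0.0)
    return h @ W2 + b2

d_in, d_hidden, d_out = 4, 8, 3
W1 = rng.normal(size=(d_in, d_hidden))
b1 = rng.normal(size=d_hidden)
W2 = rng.normal(size=(d_hidden, d_out))
b2 = rng.normal(size=d_out)
x = rng.normal(size=(5, d_in))

y = mlp(x, W1, b1, W2, b2)

# Permutation symmetry: reorder the hidden neurons consistently, i.e. permute
# the columns of W1, the entries of b1, and the rows of W2 with the same index.
perm = rng.permutation(d_hidden)
y_perm = mlp(x, W1[:, perm], b1[perm], W2[perm, :], b2)

# Scaling symmetry (ReLU): multiply each hidden neuron's incoming weights and
# bias by a positive factor and divide its outgoing weights by the same factor.
scale = rng.uniform(0.5, 2.0, size=d_hidden)
y_scaled = mlp(x, W1 * scale, b1 * scale, W2 / scale[:, None], b2)

permutation_ok = np.allclose(y, y_perm)
scaling_ok = np.allclose(y, y_scaled)
```

Both transformed weight sets differ from the originals yet compute the same function, so a generative model that treats them as distinct points has to learn redundant copies of the same network; this is the dimensionality argument the abstract makes.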
