Walking the Weight Manifold: a Topological Approach to Conditioning Inspired by Neuromodulation
Ari S. Benjamin, Kyle Daruwalla, Christian Pehle, Abdul-Malik Zekri, Anthony M. Zador
TL;DR
This work introduces weight manifolds as a neuromodulation-inspired mechanism for conditioning neural networks, where task context selects a point on a low-dimensional manifold in weight space rather than a single weight vector. It formalizes the optimization of an entire weight manifold via a variational loss $L[\mathcal{M}] = \int_0^1 \ell(\mathcal{M}(s,\mathbf{P})) ds$ under a bounded volumetric movement constraint, deriving a practical update $\Delta \mathbf{P} = -\frac{1}{2\lambda} \left[\int_0^1 \mathbf{M}(s) ds\right]^{-1} \int_0^1 \mathbf{g}(s) ds$ where $\mathbf{M}(s)$ and $\mathbf{g}(s)$ are the local metric and gradient. The framework supports analytic closed-form inverses for common manifolds (e.g., straight lines, ellipses), enabling efficient updates, and uses a basis-point decomposition to process batches without instantiating full weight matrices for every conditional input. Empirically, simple topologies like lines and ellipses implemented as weight manifolds can outperform traditional conditioning by input concatenation, and can generalize to unseen conditioning values (e.g., rotations of CIFAR-10) better than baselines; regularization experiments reveal when manifold conditioning helps and when mis-specification can limit benefits. Overall, the paper provides a principled, topology-aligned alternative to standard conditioning, with clear theoretical and practical pathways to richer topologies and conditioning inference in future work.
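To make the update rule concrete, the following is a minimal sketch for the simplest case, a straight line in weight space with endpoints $\mathbf{P} = (\mathbf{P}_0, \mathbf{P}_1)$. It rests on assumptions that the summary above does not confirm: the local metric is taken to be $\mathbf{M}(s) = \mathbf{J}(s)^\top \mathbf{J}(s)$ with $\mathbf{J}(s) = \partial \mathcal{M}(s,\mathbf{P}) / \partial \mathbf{P}$, the gradient is $\mathbf{g}(s) = \mathbf{J}(s)^\top \nabla_w \ell$, and the quadratic loss and all function names are hypothetical. Under these assumptions the integrated metric of a line has a closed-form inverse, so no matrix solve is needed.

```python
# Toy sketch of one manifold update for a straight line w(s) = (1 - s) * P0 + s * P1.
# ASSUMPTION (not stated in the text above): M(s) = J(s)^T J(s) and g(s) = J(s)^T grad_w,
# where J(s) = dw(s)/dP = [(1 - s) I, s I]. The quadratic toy loss is hypothetical.

def line_point(p0, p1, s):
    """Point on the line manifold at coordinate s in [0, 1]."""
    return [(1.0 - s) * a + s * b for a, b in zip(p0, p1)]

def integrate(p0, p1, t0, t1, n=200):
    """Midpoint-rule estimates of the integrated loss and of int_0^1 g(s) ds.

    Toy loss at s: 0.5 * ||w(s) - t(s)||^2, with t(s) on a hypothetical target line.
    Returns (loss, g0, g1), where g0 and g1 are the P0 and P1 blocks of the gradient.
    """
    loss, g0, g1 = 0.0, [0.0] * len(p0), [0.0] * len(p0)
    for k in range(n):
        s = (k + 0.5) / n
        w, t = line_point(p0, p1, s), line_point(t0, t1, s)
        grad_w = [wi - ti for wi, ti in zip(w, t)]
        loss += 0.5 * sum(gi * gi for gi in grad_w) / n
        for i, gi in enumerate(grad_w):
            g0[i] += (1.0 - s) * gi / n   # J(s)^T pulls grad_w back onto P0 ...
            g1[i] += s * gi / n           # ... and onto P1
    return loss, g0, g1

def manifold_step(p0, p1, g0, g1, lam):
    """Delta P = -(1 / (2 lam)) [int M ds]^{-1} int g ds for the line manifold.

    Under the assumed metric, int M ds = [[1/3, 1/6], [1/6, 1/3]] (x) I,
    whose closed-form inverse is [[4, -2], [-2, 4]] (x) I.
    """
    scale = 1.0 / (2.0 * lam)
    new_p0 = [a - scale * (4.0 * x - 2.0 * y) for a, x, y in zip(p0, g0, g1)]
    new_p1 = [b - scale * (4.0 * y - 2.0 * x) for b, x, y in zip(p1, g0, g1)]
    return new_p0, new_p1

p0, p1 = [0.0, 0.0], [0.0, 0.0]      # initial endpoints of the weight line
t0, t1 = [1.0, -1.0], [3.0, 2.0]     # hypothetical optimal endpoints
loss_before, g0, g1 = integrate(p0, p1, t0, t1)
new_p0, new_p1 = manifold_step(p0, p1, g0, g1, lam=1.0)
loss_after, _, _ = integrate(new_p0, new_p1, t0, t1)
print(loss_before, loss_after, new_p0, new_p1)
```

For this particular quadratic loss, the inverse integrated metric exactly undoes the coupling between the two endpoints, so with `lam=1.0` each endpoint moves roughly halfway toward its target (up to quadrature error). A real network loss would not have this property; the sketch only illustrates the shape of the computation.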
Abstract
One frequently wishes to learn a range of similar tasks as efficiently as possible, re-using knowledge across tasks. In artificial neural networks, this is typically accomplished by conditioning a network on task context, injecting the context as an additional input. Brains use a different strategy: the parameters themselves are modulated as a function of various neuromodulators such as serotonin. Here, we take inspiration from neuromodulation and propose to learn weights that are smoothly parameterized functions of task context variables. Rather than optimizing a weight vector, i.e., a single point in weight space, we optimize a smooth manifold in weight space with a predefined topology. To accomplish this, we derive a formal treatment of manifold optimization as the minimization of a loss functional subject to a constraint on volumetric movement, analogous to gradient descent. During inference, conditioning selects a single point on this manifold, which serves as the effective weight matrix for a particular sub-task. This strategy for conditioning has two main advantages. First, the topology of the manifold (whether a line, circle, or torus) is a convenient lever for inductive biases about the relationship between tasks. Second, learning in one state smoothly affects the entire manifold, encouraging generalization across states. To verify this, we train manifolds with several topologies, including straight lines in weight space (for conditioning on, e.g., the noise level in input data) and ellipses (for rotated images). Despite their simplicity, these parameterizations outperform conditioning identical networks by input concatenation and generalize better to out-of-distribution samples. These results suggest that modulating weights over low-dimensional manifolds offers a principled and effective alternative to traditional conditioning.
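As an illustration of how conditioning can select a point on the manifold at inference time, the sketch below applies an elliptical weight manifold to a batch in which every sample carries its own conditioning angle. The parameterization $W(\theta) = C + \cos(\theta) A + \sin(\theta) B$ is an assumption made for this example, as are all names and values; the paper's exact form may differ. Because this $W(\theta)$ is linear in the basis matrices $C$, $A$, and $B$, the layer output can be formed by mixing three shared basis outputs, without building a separate weight matrix per sample.

```python
import math

# ASSUMPTION: an ellipse in weight space written as W(theta) = C + cos(theta) A + sin(theta) B.
# C, A, B and every function name here are hypothetical, chosen only for illustration.

def matvec(w, x):
    """Plain matrix-vector product for a list-of-rows matrix."""
    return [sum(wij * xj for wij, xj in zip(row, x)) for row in w]

def ellipse_weights(center, axis_a, axis_b, theta):
    """Explicitly instantiate the effective weight matrix W(theta)."""
    c, s = math.cos(theta), math.sin(theta)
    return [[cij + c * aij + s * bij for cij, aij, bij in zip(cr, ar, br)]
            for cr, ar, br in zip(center, axis_a, axis_b)]

def ellipse_forward_batch(center, axis_a, axis_b, xs, thetas):
    """Compute W(theta_n) x_n for each sample without forming any W(theta_n).

    The three basis products are shared across the batch (in an array library
    they would be three batched matmuls); only the scalar mixing coefficients
    cos(theta_n) and sin(theta_n) differ per sample.
    """
    yc = [matvec(center, x) for x in xs]
    ya = [matvec(axis_a, x) for x in xs]
    yb = [matvec(axis_b, x) for x in xs]
    outs = []
    for u, v, w, theta in zip(yc, ya, yb, thetas):
        c, s = math.cos(theta), math.sin(theta)
        outs.append([ui + c * vi + s * wi for ui, vi, wi in zip(u, v, w)])
    return outs

center = [[1.0, 0.0], [0.0, 1.0]]
axis_a = [[0.5, 0.0], [0.0, -0.5]]
axis_b = [[0.0, 0.2], [0.2, 0.0]]
xs = [[1.0, 2.0], [-1.0, 0.5], [0.0, 3.0]]
thetas = [0.0, math.pi / 2, 1.0]      # one conditioning angle per sample

fast = ellipse_forward_batch(center, axis_a, axis_b, xs, thetas)
slow = [matvec(ellipse_weights(center, axis_a, axis_b, th), x)
        for x, th in zip(xs, thetas)]
print(fast)
```

The two code paths agree up to floating-point rounding; the mixed version simply avoids the per-sample weight instantiation that the explicit version performs.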
