Sign-In to the Lottery: Reparameterizing Sparse Training From Scratch
Advait Gadhikar, Tom Jacobs, Chao Zhou, Rebekka Burkholz
TL;DR
The paper addresses the challenge of training sparse neural networks from scratch, i.e., pruning at initialization (PaI), and identifies correct parameter signs as the key piece that PaI lacks relative to dense-to-sparse methods. It introduces Sign-In, a reparameterization $\theta \mapsto m \odot w$ with an inner scaling that provably induces sign flips and promotes sign alignment, yielding improved PaI performance across masks and architectures. The authors provide both theoretical (Riemannian gradient flow) and empirical support, including an impossibility result showing that the reparameterization cannot replace overparameterization, and strong results on sign recovery in simple settings. While Sign-In improves PaI and is orthogonal to dense-to-sparse training, it does not fully close the gap to it, which underscores the ongoing challenge of training sparse networks from scratch. Overall, sign alignment emerges as a sufficient condition for sparse trainability, and Sign-In offers a practical mechanism to promote it with modest overhead and broad applicability.
Abstract
The performance gap between training sparse neural networks from scratch (PaI) and dense-to-sparse training presents a major roadblock for efficient deep learning. According to the Lottery Ticket Hypothesis, PaI hinges on finding a problem-specific parameter initialization. As we show, determining the correct parameter signs is sufficient to this end. Yet these signs remain elusive to PaI. To address this issue, we propose Sign-In, which employs a dynamic reparameterization that provably induces sign flips. Such sign flips are complementary to the ones that dense-to-sparse training can accomplish, making Sign-In an orthogonal method. While our experiments and theory suggest performance improvements of PaI, they also carve out the main open challenge: closing the gap between PaI and dense-to-sparse training.
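
To make the role of a product reparameterization concrete, here is a minimal toy sketch in Python. It is not the authors' Sign-In method: the text above does not spell out the inner scaling, so the sketch omits it and uses only a plain scalar product $\theta = m \cdot w$. The function name `train_product`, the quadratic loss, and all numeric values are assumptions chosen for illustration. The sketch relies on a general property of gradient flow on a product, namely that $m^2 - w^2$ is conserved, so whether $\theta$ can change sign depends on how the two factors are initialized.

```python
def train_product(m, w, target, lr=0.01, steps=5000):
    """Gradient descent on L = 0.5 * (m*w - target)**2 over the factors (m, w).

    Hypothetical toy setup: theta = m * w is a single scalar weight.
    """
    for _ in range(steps):
        residual = m * w - target   # dL/dtheta
        grad_m = residual * w       # chain rule: dtheta/dm = w
        grad_w = residual * m       # chain rule: dtheta/dw = m
        m, w = m - lr * grad_m, w - lr * grad_w
    return m, w


# Both runs start at theta = +1, while the target weight is -1 (wrong initial sign).
m_b, w_b = train_product(1.0, 1.0, target=-1.0)   # balanced factors: m == w
m_u, w_u = train_product(2.0, 0.5, target=-1.0)   # unbalanced factors

theta_balanced = m_b * w_b      # m stays equal to w, so theta = m**2 cannot go negative
theta_unbalanced = m_u * w_u    # w crosses zero while m stays positive, so theta flips sign

print("balanced init   -> theta =", theta_balanced)
print("unbalanced init -> theta =", theta_unbalanced)
```

In this toy problem the balanced run stalls near zero with the wrong sign, while the unbalanced run reaches the negative target. This only illustrates why the way factors are scaled matters for sign flips; how Sign-In's inner scaling achieves this in practice is described in the paper itself.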
