Budgeted Broadcast: An Activity-Dependent Pruning Rule for Neural Network Efficiency
Yaron Meirovitch, Fuming Yang, Jeff Lichtman, Nir Shavit
TL;DR
Budgeted Broadcast (BB) prunes neural networks by giving each unit a local traffic budget $t_i=a_i k_i$, the product of its long-term on-rate $a_i$ and its fan-out $k_i$, with the aim of maximizing coding entropy under a global traffic constraint. A constrained-entropy analysis yields a selectivity–audience balance, $\log\frac{1-a_i}{a_i}=\beta k_i$, which BB enforces with SP-in/SP-out masks and EMA-based activity tracking. Across controlled didactic tasks and four real-domain benchmarks (ASR, face identification, change detection, and EM synapse segmentation), BB improves tail/rare-event metrics and decorrelation while matching or exceeding dense baselines at the same sparsity. The approach offers a biologically grounded, easy-to-integrate pruning mechanism that may foster more diverse and efficient representations in large-scale models.
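As a hedged sketch of where the balance condition can come from, assume (this is our simplification, not a statement of the paper's full analysis) that units behave as independent Bernoulli variables with on-rate $a_i$, so that coding entropy is a sum of binary entropies, and that the global traffic budget is a fixed total $T$ (a symbol introduced here for illustration):

$$
\max_{a}\ \sum_i h(a_i)\quad\text{subject to}\quad \sum_i a_i k_i = T,\qquad h(a) = -a\log a-(1-a)\log(1-a).
$$

Introducing a Lagrange multiplier $\beta$ for the constraint and setting the derivative with respect to $a_i$ to zero gives

$$
\frac{\partial}{\partial a_i}\Big[\sum_j h(a_j)-\beta\Big(\sum_j a_j k_j - T\Big)\Big] = \log\frac{1-a_i}{a_i}-\beta k_i = 0,
$$

which is the selectivity–audience balance. Equivalently, $a_i = \frac{1}{1+e^{\beta k_i}}$, so for $\beta>0$ a unit with a larger fan-out must be more selective (lower on-rate).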
Abstract
Most pruning methods remove parameters ranked by impact on loss (e.g., magnitude or gradient). We propose Budgeted Broadcast (BB), which gives each unit a local traffic budget (the product of its long-term on-rate $a_i$ and fan-out $k_i$). A constrained-entropy analysis shows that maximizing coding entropy under a global traffic budget yields a selectivity–audience balance, $\log\frac{1-a_i}{a_i}=\beta k_i$. BB enforces this balance with simple local actuators that prune either fan-in (to lower activity) or fan-out (to reduce broadcast). In practice, BB increases coding entropy and decorrelation and improves accuracy at matched sparsity across Transformers for ASR, ResNets for face identification, and 3D U-Nets for synapse prediction, sometimes exceeding dense baselines. On electron microscopy images, it attains state-of-the-art F1 and PR-AUC under our evaluation protocol. BB is easy to integrate and suggests a path toward learning more diverse and efficient representations.
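To make the balance condition concrete, the following is a minimal Python sketch of the quantities a local actuator would need. It is illustrative only and not the authors' implementation: the function names, the EMA decay, the value of `beta`, and the example units are all assumptions made here. It tracks an on-rate with an exponential moving average, measures how far a unit sits from $\log\frac{1-a_i}{a_i}=\beta k_i$, and reports the on-rate or fan-out that would restore balance, without deciding which of the two actuators to apply.

```python
import math

def ema_update(a, active, decay=0.99):
    """Exponential moving average of a unit's on-rate (decay is an assumed value)."""
    return decay * a + (1.0 - decay) * active

def balance_residual(a, k, beta, eps=1e-6):
    """log((1 - a) / a) - beta * k; negative when the unit exceeds the balance."""
    a = min(max(a, eps), 1.0 - eps)  # keep the logarithm finite
    return math.log((1.0 - a) / a) - beta * k

def target_on_rate(k, beta):
    """On-rate that satisfies the balance for a fixed fan-out k."""
    return 1.0 / (1.0 + math.exp(beta * k))

def target_fan_out(a, beta, eps=1e-6):
    """Fan-out that satisfies the balance for a fixed on-rate a (floored at 0)."""
    a = min(max(a, eps), 1.0 - eps)
    return max(0.0, math.log((1.0 - a) / a) / beta)

# Hypothetical units given as (on-rate, fan-out); beta is an assumed multiplier.
beta = 0.1
units = [(0.5, 20.0), (0.2, 10.0), (0.05, 30.0)]

report = []
for a, k in units:
    r = balance_residual(a, k, beta)
    report.append({
        "over_budget": r < 0.0,                    # too active for its audience
        "target_on_rate": target_on_rate(k, beta), # reachable by pruning fan-in
        "target_fan_out": target_fan_out(a, beta), # reachable by pruning fan-out
    })

for (a, k), row in zip(units, report):
    print(f"a={a:.2f} k={k:.0f} -> {row}")
```

A unit flagged as over budget can be brought back toward balance either by lowering its activity (fan-in pruning) or by shrinking its audience (fan-out pruning); how BB chooses between the two, and how often masks are updated, is left to the method description rather than assumed here.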
