Activation-Descent Regularization for Input Optimization of ReLU Networks
Hongzhan Yu, Sicun Gao
TL;DR
This work tackles input optimization in ReLU networks, where standard gradients fail to account for changes in activation patterns across linear regions. It introduces Activation-Descent Regularization, which couples optimization in input space with optimization in activation-pattern space through a differentiable sigmoid surrogate and a Lagrangian objective, guiding descent along activation-aware directions. The approach yields improved local descent and outperforms traditional gradient-based methods in adversarial attacks, image reconstruction with generative models, and action refinement in deep reinforcement learning, with ablations validating each component. The activation-aware framework offers practical benefits for robust optimization in piecewise-linear networks and points to open directions in convergence theory and scalability to larger architectures.
Abstract
We present a new approach for input optimization of ReLU networks that explicitly takes into account the effect of changes in activation patterns. We analyze local optimization steps in both the input space and the space of activation patterns to propose methods with superior local descent properties. To accomplish this, we convert the discrete space of activation patterns into differentiable representations and propose regularization terms that improve each descent step. Our experiments demonstrate the effectiveness of the proposed input-optimization methods in improving the state of the art in various areas, such as adversarial learning, generative modeling, and reinforcement learning.
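To make the notion of an activation pattern concrete, the following is a minimal NumPy sketch, not the authors' formulation. It assumes a hypothetical one-hidden-layer ReLU network with random weights (`W1`, `b1`, `w2`), and the sigmoid temperature `k` is an illustrative choice. It shows that the input gradient is exact only while the activation pattern stays fixed, and that a sigmoid can serve as a differentiable stand-in for the discrete pattern. The regularization terms and the Lagrangian objective of the paper are not reproduced here.

```python
import numpy as np

# Hypothetical toy network: 3 inputs, 8 hidden ReLU units, scalar output.
rng = np.random.default_rng(0)
W1 = rng.normal(size=(8, 3))
b1 = rng.normal(size=8)
w2 = rng.normal(size=8)

def preact(x):
    """Hidden pre-activations z = W1 x + b1."""
    return W1 @ x + b1

def f(x):
    """Scalar network output w2 . relu(z)."""
    return float(w2 @ np.maximum(preact(x), 0.0))

def hard_pattern(z):
    """Discrete activation pattern: 1 where a unit is active, else 0."""
    return (z > 0).astype(float)

def soft_pattern(z, k=10.0):
    """Sigmoid surrogate of the pattern; k is an assumed temperature."""
    return 1.0 / (1.0 + np.exp(-k * z))

def input_gradient(x):
    """Gradient of f at x, valid only inside the current linear region."""
    return W1.T @ (w2 * hard_pattern(preact(x)))

def distance_to_boundary(x, d):
    """Smallest step t > 0 along d at which some unit's pre-activation hits zero."""
    z, slopes = preact(x), W1 @ d
    toward_zero = z * slopes < 0
    if not toward_zero.any():
        return np.inf
    return float(np.min(-z[toward_zero] / slopes[toward_zero]))

x = rng.normal(size=3)
g = input_gradient(x)
d = -g  # plain gradient-descent direction

# Stay strictly inside the current region: half the distance to the boundary.
t = min(distance_to_boundary(x, d), 1.0)
inside = x + 0.5 * t * d

print("predicted change:", 0.5 * t * (g @ d))
print("actual change:   ", f(inside) - f(x))
print("hard pattern:", hard_pattern(preact(x)))
print("soft pattern:", np.round(soft_pattern(preact(x)), 3))
```

Inside the region the linear prediction matches the true change in the output. Once a step exceeds `distance_to_boundary`, the pattern flips and the prediction is no longer guaranteed, which is the situation an activation-aware method is meant to handle.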
