Entropy Regularizing Activation: Boosting Continuous Control, Large Language Models, and Image Classification with Activation as Entropy Constraints
Zilin Kang, Chonghua Liao, Tingqiang Xu, Huazhe Xu
TL;DR
Entropy Regularizing Activation (ERA) is a universal, activation-based mechanism that enforces a target entropy on model outputs without altering the primary objective. By transforming final outputs through task-specific activations, ERA achieves provable entropy guarantees across continuous control, discrete classification, and large language model reinforcement learning, with a computational overhead of less than 7%. Empirically, ERA yields substantial gains in continuous control (SAC, PPO, TD-MPC2, FastSAC), image classification (ImageNet, CIFAR-10), and LLM reasoning benchmarks (AIME, AMC), and it improves out-of-distribution generalization. The approach highlights output activations as a powerful, non-invasive tool for entropy control, offering a scalable path to more robust and generalizable learning systems.
Abstract
We propose ERA, a new paradigm that constrains the sampling entropy above given thresholds by applying specially designed activations to the outputs of models. Our approach demonstrates broad effectiveness across different domains: 1) for large language models (LLMs), boosting the AIME 2025 score for Qwen2.5-Math-7B by 37.4%; 2) for continuous control reinforcement learning agents, improving performance by more than 30% over strong baselines such as SAC on the challenging HumanoidBench; 3) for image classification, enhancing ImageNet top-1 accuracy by 0.69% for ResNet-50. These gains are achieved with a computational overhead of less than 7%. Our work validates output activation as a powerful tool for entropy control, opening a new direction for designing simpler and more robust algorithms.
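To make the idea of an output transform with an entropy floor concrete, here is a minimal sketch for the discrete case. It is not the ERA activation itself, whose form this section does not specify. As a stand-in, it raises the softmax temperature until the entropy reaches a chosen threshold. The helper name `entropy_floor_softmax` and its parameters are hypothetical and chosen only for illustration.

```python
import math


def softmax(logits, temperature=1.0):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp((x - m) / temperature) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def entropy(probs):
    """Shannon entropy in nats."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def entropy_floor_softmax(logits, min_entropy, iters=60):
    """Hypothetical helper (an assumption, not the paper's activation).

    Returns softmax probabilities whose entropy is at least `min_entropy`
    by raising the temperature only when the floor is violated.
    """
    if min_entropy >= math.log(len(logits)):
        raise ValueError("min_entropy must be below log(num_classes)")

    probs = softmax(logits)
    if entropy(probs) >= min_entropy:
        return probs  # floor already satisfied, leave the output untouched

    # Entropy of softmax(logits / T) is nondecreasing in T, so we can
    # bracket and bisect for the smallest temperature that meets the floor.
    lo, hi = 1.0, 2.0
    while entropy(softmax(logits, hi)) < min_entropy:
        hi *= 2.0
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if entropy(softmax(logits, mid)) < min_entropy:
            lo = mid
        else:
            hi = mid
    return softmax(logits, hi)  # `hi` always satisfies the floor


if __name__ == "__main__":
    peaked_logits = [5.0, 1.0, 0.0, -2.0]
    constrained = entropy_floor_softmax(peaked_logits, min_entropy=1.0)
    print("entropy before:", entropy(softmax(peaked_logits)))
    print("entropy after: ", entropy(constrained))
```

The transform acts only on the model's final outputs and leaves the training loss untouched, which is the property the abstract emphasizes. A practical version would also need to be differentiable and inexpensive, which the bisection loop in this sketch is not.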
