Continuous Specialization Transition in the Soft Committee Machine with ReLU Activation
Assem Afanah, Bernd Rosenow
Abstract
We analyze the soft committee machine with Rectified Linear Unit (ReLU) activation by means of the replica method. In a realizable teacher-student setting, we compute the quenched free energy within a replica-symmetric ansatz and obtain the typical generalization behavior from the saddle-point equations for the macroscopic order parameters. The system exhibits a transition from an unspecialized symmetric phase to a specialized phase in which the permutation symmetry among hidden units is broken. We determine the critical training-set size as a function of the inverse training temperature and derive analytic expressions both near the transition and in the asymptotic large-sample regime. Unlike the corresponding model with sigmoidal activations, which undergoes a first-order transition, the ReLU soft committee machine shows a continuous specialization transition. These results show that the activation function plays a decisive role in the phase structure and generalization behavior of multilayer networks.
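
For orientation, the setting can be sketched numerically. The block below is only an illustration under assumed conventions that the abstract does not state: the network output is taken as the sum of K ReLU hidden units scaled by 1/sqrt(K), the pre-activations are scaled by 1/sqrt(N), the order parameters are the student-teacher overlaps R and student-student overlaps Q, and the generalization error is half the mean squared output difference over Gaussian inputs. All function names are hypothetical, and the Monte Carlo estimate stands in for the analytic replica calculation.

```python
import numpy as np


def relu(x):
    """Rectified Linear Unit, applied elementwise."""
    return np.maximum(x, 0.0)


def scm_output(W, X):
    """Soft committee machine output (assumed normalization).

    W: (K, N) hidden-unit weight vectors; X: (P, N) inputs.
    Hidden-to-output weights are fixed to 1/sqrt(K).
    """
    K, N = W.shape
    fields = X @ W.T / np.sqrt(N)  # (P, K) pre-activations
    return relu(fields).sum(axis=1) / np.sqrt(K)


def order_parameters(J, B):
    """Overlaps R_ij = J_i . B_j / N and Q_ij = J_i . J_j / N."""
    N = J.shape[1]
    return J @ B.T / N, J @ J.T / N


def generalization_error(J, B, n_samples=20000, seed=0):
    """Monte Carlo estimate of 0.5 * <(student - teacher)^2> over Gaussian inputs."""
    rng = np.random.default_rng(seed)
    X = rng.standard_normal((n_samples, J.shape[1]))
    return 0.5 * np.mean((scm_output(J, X) - scm_output(B, X)) ** 2)


# Illustrative teacher with K = 3 hidden units and N = 200 inputs.
rng = np.random.default_rng(1)
B = rng.standard_normal((3, 200))

# Unspecialized student: every hidden unit is the same average of the teacher vectors,
# so each row of R has identical entries.
J_sym = np.tile(B.mean(axis=0), (3, 1))

# Specialized student: each hidden unit aligns with one teacher unit,
# here up to a permutation of the hidden units.
J_spec = B[::-1].copy()

R_sym, Q_sym = order_parameters(J_sym, B)
R_spec, Q_spec = order_parameters(J_spec, B)
```

In this sketch the permuted student reproduces the teacher output, which reflects the permutation symmetry among hidden units that the specialization transition breaks, while the unspecialized student has rows of R whose entries do not single out any teacher unit.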
