Parareal Neural Networks Emulating a Parallel-in-time Algorithm
Chang-Ock Lee, Youngkyu Lee, Jongho Park
TL;DR
This work introduces parareal neural networks, a parallel-in-time-inspired framework that splits deep networks into parallel subnetworks connected by a coarse corrective network. By emulating the parareal algorithm, the method enables multi-GPU training with reduced inter-GPU communication while preserving or enhancing accuracy. Consistency is shown in a linear setting, and empirical results on VGG-16 and ResNet-1001 across datasets including ImageNet and CIFAR demonstrate competitive performance and favorable training-time characteristics. The approach offers a new avenue for scalable, memory-efficient parallelism in very deep networks by leveraging coarse corrections to propagate residuals across interface layers.
Abstract
As deep neural networks (DNNs) become deeper, their training time increases. For this reason, multi-GPU parallel computing has become a key tool for accelerating the training of DNNs. In this paper, we introduce a novel methodology for constructing, from a given DNN, a parallel neural network that can utilize multiple GPUs simultaneously. We observe that the layers of a DNN can be interpreted as the time steps of a time-dependent problem and can be parallelized by emulating a parallel-in-time algorithm called parareal. The parareal algorithm consists of fine structures, which can be implemented in parallel, and a coarse structure, which gives suitable approximations to the fine structures. By emulating it, the layers of the DNN are split into a parallel structure whose parts are connected by a suitable coarse network. We report results of the proposed methodology applied to VGG-16 and ResNet-1001 on several datasets, showing accelerated training with preserved accuracy.
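
For readers unfamiliar with parareal, the following is a minimal sketch of the classical algorithm on a scalar test equation u' = lam * u. It is not the network construction proposed in this paper. The test equation, the forward-Euler propagators, and all function names (`fine`, `coarse`, `parareal`, `serial_fine`) are assumptions chosen for illustration. The fine propagator plays the role of the parallel fine structures, and the coarse propagator plays the role of the coarse structure that corrects values at the interfaces between time intervals.

```python
def fine(u, t0, t1, lam=-1.0, steps=100):
    """Accurate but expensive propagator: many small forward-Euler steps."""
    dt = (t1 - t0) / steps
    for _ in range(steps):
        u = u + dt * lam * u
    return u


def coarse(u, t0, t1, lam=-1.0):
    """Cheap propagator: a single forward-Euler step over the interval."""
    return u + (t1 - t0) * lam * u


def parareal(u0, t_end, n_intervals, n_iters):
    """Return interface values u[0..n_intervals] after n_iters parareal sweeps."""
    ts = [t_end * i / n_intervals for i in range(n_intervals + 1)]

    # Initial guess: one serial sweep with the coarse propagator only.
    u = [u0]
    for n in range(n_intervals):
        u.append(coarse(u[n], ts[n], ts[n + 1]))

    for _ in range(n_iters):
        # These evaluations use only the previous iterate, so each interval
        # is independent and could run on its own worker (serial loop here).
        f_old = [fine(u[n], ts[n], ts[n + 1]) for n in range(n_intervals)]
        g_old = [coarse(u[n], ts[n], ts[n + 1]) for n in range(n_intervals)]

        # Serial coarse sweep that propagates the correction f_old - g_old.
        new = [u0]
        for n in range(n_intervals):
            new.append(coarse(new[n], ts[n], ts[n + 1]) + f_old[n] - g_old[n])
        u = new
    return u


def serial_fine(u0, t_end, n_intervals):
    """Reference solution: the fine propagator applied interval by interval."""
    ts = [t_end * i / n_intervals for i in range(n_intervals + 1)]
    u = [u0]
    for n in range(n_intervals):
        u.append(fine(u[n], ts[n], ts[n + 1]))
    return u


if __name__ == "__main__":
    reference = serial_fine(1.0, 2.0, 4)
    for k in range(5):
        approx = parareal(1.0, 2.0, 4, k)
        print(k, abs(approx[-1] - reference[-1]))
```

Running the script prints the error at the final time against the serial fine solution for an increasing number of sweeps. A known property of parareal is that, with as many sweeps as intervals, the iterate reproduces the serial fine solution up to rounding error. The practical interest lies in stopping after far fewer sweeps.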
