Hybrid Dual-Batch and Cyclic Progressive Learning for Efficient Distributed Training

Kuan-Wei Lu; Ding-Yong Hong; Pangfeng Liu; Jan-Jan Wu

TL;DR

The paper tackles the efficiency-generalization gap in distributed deep learning by proposing dual-batch learning, which trains with two batch sizes, $B_L$ and $B_S$, on a parameter server, and cyclic progressive learning, which repeatedly schedules training with progressively increasing image resolutions. The hybrid scheme combines both approaches, balancing throughput and gradient diversity while dynamically adapting batch sizes, resolutions, and learning rates across training stages. A model-update factor based on data distribution between large- and small-batch workers and a memory-based method to auto-determine $B_{max}$ enable scalable, hardware-aware training. Experiments on CIFAR-100 and ImageNet using ResNet-18 show up to 34.8% training-time reduction with comparable or improved accuracy, highlighting practical gains for large-scale CNNs.

Abstract

Distributed machine learning is critical for training deep learning models on large datasets with numerous parameters. Current research primarily focuses on leveraging additional hardware resources and powerful computing units to accelerate the training process. As a result, larger batch sizes are often employed to speed up training. However, training with large batch sizes can lead to lower accuracy due to poor generalization. To address this issue, we propose the dual-batch learning scheme, a distributed training method built on the parameter server framework. This approach maximizes training efficiency by utilizing the largest batch size that the hardware can support while incorporating a smaller batch size to enhance model generalization. By using two different batch sizes simultaneously, this method improves accuracy with minimal additional training time. Additionally, to mitigate the time overhead caused by dual-batch learning, we propose the cyclic progressive learning scheme. This technique repeatedly and gradually increases image resolution from low to high during training, thereby reducing training time. By combining cyclic progressive learning with dual-batch learning, our hybrid approach improves both model generalization and training efficiency. Experimental results with ResNet-18 demonstrate that, compared to conventional training methods, our approach improves accuracy by 3.3% while reducing training time by 10.1% on CIFAR-100, and further achieves a 34.8% reduction in training time on ImageNet.
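To make the cyclic progressive idea concrete, the sketch below builds a per-epoch resolution schedule that steps from low to high resolution and then restarts. It is only an illustration under stated assumptions: the function name `cyclic_progressive_schedule`, the resolution list, the number of cycles, and the equal-length stages are all hypothetical choices and are not taken from the paper, which also adapts batch sizes and learning rates per stage.

```python
def cyclic_progressive_schedule(total_epochs, resolutions, num_cycles):
    """Return one image resolution per epoch (illustrative sketch).

    Each cycle walks through `resolutions` in the given order (low to high),
    spending roughly equal time on each; the cycle then restarts.
    All parameters are assumptions made for this example.
    """
    if total_epochs <= 0 or num_cycles <= 0 or not resolutions:
        raise ValueError("total_epochs, num_cycles must be positive and "
                         "resolutions must be non-empty")
    cycle_len = total_epochs // num_cycles
    if cycle_len == 0:
        raise ValueError("num_cycles must not exceed total_epochs")

    schedule = []
    for epoch in range(total_epochs):
        # Position of this epoch inside its cycle.
        epoch_in_cycle = epoch % cycle_len
        # Map that position to a stage index in [0, len(resolutions) - 1].
        stage = epoch_in_cycle * len(resolutions) // cycle_len
        schedule.append(resolutions[stage])
    return schedule


if __name__ == "__main__":
    # Hypothetical settings: 12 epochs, 2 cycles, three resolutions.
    for epoch, res in enumerate(cyclic_progressive_schedule(12, [16, 24, 32], 2)):
        print(f"epoch {epoch}: train at {res}x{res}")
```

In a training loop, the resolution for the current epoch would typically drive the resize step of the input pipeline; lower-resolution epochs process images faster, which is where the time saving described in the abstract comes from.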
