BanglaNet: Bangla Handwritten Character Recognition using Ensembling of Convolutional Neural Network
Chandrika Saha, Md Mostafijur Rahman
TL;DR
This work tackles isolated Bangla handwritten character recognition (BHCR) by proposing BanglaNet, an ensemble of three CNN architectures inspired by Inception, ResNet, and DenseNet. The models are trained with and without data augmentation, and their outputs are averaged to improve accuracy on three major datasets: CMATERdb, BanglaLekha-Isolated, and Ekush. BanglaNet achieves top-1 and top-3 accuracies competitive with the state of the art on these datasets, with its highest scores on CMATERdb and BanglaLekha-Isolated, while handling basic characters, numerals, modifiers, and compound characters in a unified framework. The results indicate that architectural diversity and ensembling are effective for comprehensive BHCR and move toward a practical Bangla OCR system.
Abstract
Handwritten character recognition is an important task because of its many applications. Recognizing Bangla handwritten characters is especially challenging because of the cursive nature of the script and the presence of compound characters that can be written in more than one way. In this paper, BanglaNet, a classification model based on an ensemble of several Convolutional Neural Networks (CNNs), is proposed to classify Bangla basic characters, compound characters, numerals, and modifiers. Three different models based on the ideas of state-of-the-art CNN architectures (Inception, ResNet, and DenseNet) are trained with both augmented and non-augmented inputs. Finally, the outputs of all these models are averaged (ensembled) to obtain the final model. Rigorous experimentation on three benchmark Bangla handwritten character datasets, namely CMATERdb, BanglaLekha-Isolated, and Ekush, yields recognition accuracies that compare favorably with recent CNN-based research. The top-1 recognition accuracies obtained are 98.40%, 97.65%, and 97.32%, and the top-3 accuracies are 99.79%, 99.74%, and 99.56% for the CMATERdb, BanglaLekha-Isolated, and Ekush datasets, respectively.
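The averaging step and the top-1/top-3 metrics reported above can be illustrated with a minimal sketch. It assumes that ensembling means an unweighted mean of each model's per-class probabilities, which the abstract does not spell out. The function names `average_ensemble` and `top_k_accuracy` are hypothetical, and the toy probabilities are invented for illustration and are not taken from the paper.

```python
from typing import List, Sequence


def average_ensemble(
    model_probs: Sequence[Sequence[Sequence[float]]],
) -> List[List[float]]:
    """Unweighted mean of class probabilities across models (an assumption).

    model_probs[m][i][c] is model m's probability for sample i and class c.
    """
    n_models = len(model_probs)
    n_samples = len(model_probs[0])
    n_classes = len(model_probs[0][0])
    return [
        [
            sum(model_probs[m][i][c] for m in range(n_models)) / n_models
            for c in range(n_classes)
        ]
        for i in range(n_samples)
    ]


def top_k_accuracy(
    probs: Sequence[Sequence[float]], labels: Sequence[int], k: int = 1
) -> float:
    """Fraction of samples whose true label is among the k highest-scoring classes."""
    hits = 0
    for row, label in zip(probs, labels):
        # Class indices sorted by descending probability, truncated to k.
        top_k = sorted(range(len(row)), key=lambda c: row[c], reverse=True)[:k]
        hits += label in top_k
    return hits / len(labels)


# Toy outputs of three hypothetical models for 2 samples and 3 classes.
inception_like = [[0.6, 0.3, 0.1], [0.2, 0.5, 0.3]]
resnet_like = [[0.2, 0.5, 0.3], [0.1, 0.3, 0.6]]
densenet_like = [[0.5, 0.4, 0.1], [0.3, 0.3, 0.4]]
labels = [0, 1]

ensemble = average_ensemble([inception_like, resnet_like, densenet_like])
top1 = top_k_accuracy(ensemble, labels, k=1)
top2 = top_k_accuracy(ensemble, labels, k=2)
```

In this toy case the ensemble misclassifies the second sample at top-1 but recovers it at top-2, which is the kind of gap that separates the top-1 and top-3 figures quoted in the abstract.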
