BanglaNet: Bangla Handwritten Character Recognition using Ensembling of Convolutional Neural Network

Chandrika Saha, Md Mostafijur Rahman

TL;DR

This work tackles recognition of isolated Bangla handwritten characters by proposing BanglaNet, an ensemble of three CNN architectures inspired by Inception, ResNet, and DenseNet. The models are trained with and without data augmentation, and their outputs are averaged to improve accuracy on three major datasets: CMATERdb, BanglaLekha-Isolated, and Ekush. BanglaNet reports top-1 and top-3 accuracies competitive with the state of the art on these datasets, performing best on CMATERdb and BanglaLekha-Isolated, while also handling numerals, modifiers, and compound characters in a single framework. The results indicate that architectural diversity and ensembling are effective for Bangla handwritten character recognition (BHCR) and are a step toward a practical Bangla OCR system.

Abstract

Handwritten character recognition is a crucial task because of its many applications. Recognizing Bangla handwritten characters is especially challenging because of the cursive nature of Bangla characters and the presence of compound characters that can be written in more than one way. In this paper, a classification model based on an ensemble of several Convolutional Neural Networks (CNNs), named BanglaNet, is proposed to classify Bangla basic characters, compound characters, numerals, and modifiers. Three different models based on the ideas of state-of-the-art CNN models (Inception, ResNet, and DenseNet) have been trained with both augmented and non-augmented inputs. Finally, the outputs of all these models are averaged to obtain the final ensemble model. Rigorous experimentation on three benchmark Bangla handwritten character datasets, namely CMATERdb, BanglaLekha-Isolated, and Ekush, yields recognition accuracies that compare favorably with some recent CNN-based research. The top-1 recognition accuracies obtained are 98.40%, 97.65%, and 97.32%, and the top-3 accuracies are 99.79%, 99.74%, and 99.56% for the CMATERdb, BanglaLekha-Isolated, and Ekush datasets, respectively.
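
The averaging step and the top-1/top-3 metrics mentioned in the abstract can be illustrated with a minimal sketch. It assumes each model emits a per-class probability vector for every sample and that the ensemble is an unweighted mean of those vectors. The function names (`average_ensemble`, `top_k_accuracy`) and the toy numbers are illustrative only and are not taken from the paper.

```python
def average_ensemble(prob_sets):
    """Unweighted mean of per-class probabilities across models.

    prob_sets: one entry per model; each entry is a list of samples,
    and each sample is a list of class probabilities.
    (Assumption: the paper's averaging is a plain mean of model outputs.)
    """
    n_models = len(prob_sets)
    n_samples = len(prob_sets[0])
    averaged = []
    for i in range(n_samples):
        rows = [model[i] for model in prob_sets]
        # zip(*rows) groups the probabilities of one class across all models
        averaged.append([sum(col) / n_models for col in zip(*rows)])
    return averaged


def top_k_accuracy(probs, labels, k):
    """Fraction of samples whose true label is among the k highest-scored classes."""
    hits = 0
    for row, label in zip(probs, labels):
        ranked = sorted(range(len(row)), key=lambda c: row[c], reverse=True)
        if label in ranked[:k]:
            hits += 1
    return hits / len(labels)


# Toy example: 3 hypothetical models, 4 samples, 4 classes (made-up numbers).
labels = [0, 1, 2, 3]
model_a = [[0.6, 0.2, 0.1, 0.1], [0.5, 0.3, 0.1, 0.1],
           [0.1, 0.2, 0.6, 0.1], [0.4, 0.3, 0.1, 0.2]]
model_b = [[0.3, 0.5, 0.1, 0.1], [0.1, 0.7, 0.1, 0.1],
           [0.2, 0.1, 0.6, 0.1], [0.1, 0.5, 0.1, 0.3]]
model_c = [[0.5, 0.1, 0.3, 0.1], [0.2, 0.6, 0.1, 0.1],
           [0.4, 0.2, 0.3, 0.1], [0.1, 0.1, 0.5, 0.3]]

ensemble = average_ensemble([model_a, model_b, model_c])

single_top1 = [top_k_accuracy(m, labels, 1) for m in (model_a, model_b, model_c)]
ensemble_top1 = top_k_accuracy(ensemble, labels, 1)
ensemble_top3 = top_k_accuracy(ensemble, labels, 3)

print(single_top1)     # [0.5, 0.5, 0.5]
print(ensemble_top1)   # 0.75
print(ensemble_top3)   # 1.0
```

In this toy case each model is right on a different half of the samples, so averaging recovers labels that no single model gets consistently. The numbers only demonstrate the mechanics and say nothing about the accuracies reported for BanglaNet.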

Paper Structure (17 sections, 3 equations, 6 figures, 4 tables)

Figures (6)

  • Figure 1: Compound characters with multiple representations (each column shows the same character)
  • Figure 2: (a) Inception block (b) Residual block (c) DenseNet block
  • Figure 3: Block diagram of BanglaNet (‘Filters’ and ‘Dropout’ are the number of filters and the dropout rate used in each consecutive layer)
  • Figure 4: Samples from Datasets
  • Figure 5: Learning curves for training and validation data
  • ...and 1 more figure