Classification of Non-native Handwritten Characters Using Convolutional Neural Network

F. A. Mamun; S. A. H. Chowdhury; J. E. Giti; H. Sarker

TL;DR

This work tackles non-native handwriting recognition by building a tailored CNN trained on the Handwritten Isolated English Character (HIEC) dataset, which comprises 16,496 images from 260 writers. Through an ablation study, the authors identify a five-convolutional-layer architecture with a single hidden layer and dropout as the best configuration, achieving 97.04% test accuracy and outperforming several state-of-the-art models by up to 4.38%. The study highlights the impact of non-native handwriting diversity on handwritten character recognition (HCR) and demonstrates robust performance via targeted data augmentation and architecture design. The findings suggest practical potential for reliable non-native HCR and motivate future work on multilingual and semi-supervised extensions.

Abstract

Convolutional neural networks (CNNs) have accelerated progress in handwritten character classification and recognition. Handwritten character recognition (HCR) has found applications in various domains, such as traffic signal detection, language translation, and document information extraction. However, existing HCR technology has yet to see widespread use because its recognition accuracy is not reliable enough. One reason is that existing HCR methods do not account for the handwriting styles of non-native writers. Hence, further improvement is needed before character recognition technologies can be deployed reliably and extensively for critical tasks. In this work, we propose a custom-tailored CNN model to classify English characters written by non-native users. We train this CNN on a new dataset, the handwritten isolated English character (HIEC) dataset, which consists of 16,496 images collected from 260 persons. We also present an ablation study that adjusts the hyperparameters of our CNN to identify the best model for the HIEC dataset. The proposed model, with five convolutional layers and one hidden layer, outperforms state-of-the-art models in character recognition accuracy, achieving 97.04%. Compared with the second-best model, the relative improvement in classification accuracy is 4.38%.

Paper Structure (15 sections, 6 figures, 2 tables)

Introduction
Related Work
Dataset Collection and Description
Dataset Collection
Dataset Description
Methodology
Data Preprocessing
Data Augmentation
Proposed Model
Performance Metrics
Implementation Details
Results & Analysis
Ablation Study
Performance Comparison with Related Works
Conclusion and Future Work

Figures (6)

Figure 1: An image of a handwritten character is fed to a CNN-based handwritten character recognition (HCR) system. The system processes the input image and outputs a predicted label. For the recognition to be correct, the predicted label must match the ground truth label of the character.
Figure 2: Some samples of the HIEC dataset.
Figure 3: The overall pipeline of this work begins with a pre-processing step where the collected images of handwritten characters are cropped, then resized to a standard size, and normalized. The next step is data augmentation to balance and increase the size of the HIEC dataset. After splitting the augmented dataset into training, validation, and test sets, we design and train our model using training and validation data. The best-trained model is then saved and used to obtain the classification result on the test set.
Figure 4: Examples of images after applying various augmentation techniques. Each row corresponds to a different augmentation technique. The top row shows the original data. The second, third, fourth, and fifth rows from the top show the augmented images after applying brightness adjustment, contrast adjustment, rotation, and sharpness adjustment, respectively.
Figure 5: In the proposed CNN model, every convolutional layer uses the same kernel size ($3\times3$) and every max-pooling layer uses the same pooling size ($2\times2$). The handwritten character images are fed to the first convolutional layer as input. The output size of each layer is shown in brackets at the bottom of the illustration, below that layer. For example, the output size of the second convolutional layer is $109\times109\times64$, whereas the output size after the last max-pooling layer is $12\times12\times64$. Flattening the output of the last max-pooling layer yields a vector of size $9216\times1$, which serves as the input to the subsequent dense layers. The first dense (hidden) layer contains $64$ neurons, whereas the last dense layer, serving as the output layer, has $26$ neurons, equal to the number of classes in the dataset. The softmax activation function is chosen for the output layer, transforming the raw output scores into a probability distribution over the classes.
...and 1 more figure
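
The layer sizes quoted in the Figure 5 caption can be checked with a little arithmetic. The following is a minimal sketch, not the authors' implementation. It assumes unpadded ("valid") convolutions with stride 1, pooling with stride 2, and a $224\times224$ input. The input size is not stated in the text above; it is inferred here only because it reproduces the quoted $109\times109$ output of the second convolutional layer. The per-layer padding of the later layers cannot be determined from the caption, so they are not traced. The helper names `conv_out`, `pool_out`, and `softmax` are hypothetical.

```python
import math


def conv_out(size, kernel=3, stride=1, padding=0):
    """Spatial output size of a square convolution."""
    return (size + 2 * padding - kernel) // stride + 1


def pool_out(size, window=2, stride=2):
    """Spatial output size of a square max-pooling layer without padding."""
    return (size - window) // stride + 1


def softmax(scores):
    """Numerically stable softmax: raw scores -> probability distribution."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


# ASSUMPTION: 224x224 input (inferred, not stated in the caption).
INPUT_SIZE = 224
conv1 = conv_out(INPUT_SIZE)  # first 3x3 convolution
pool1 = pool_out(conv1)       # first 2x2 max-pooling
conv2 = conv_out(pool1)       # second 3x3 convolution, caption says 109

# Values taken directly from the Figure 5 caption.
FLATTENED = 12 * 12 * 64      # output of the last max-pooling layer, flattened
HIDDEN_UNITS = 64             # first dense (hidden) layer
NUM_CLASSES = 26              # output layer, one neuron per class

# Trainable parameters of a dense layer: weights plus one bias per neuron.
hidden_params = FLATTENED * HIDDEN_UNITS + HIDDEN_UNITS
output_params = HIDDEN_UNITS * NUM_CLASSES + NUM_CLASSES

# Softmax over 26 equal scores gives a uniform distribution that sums to 1.
probs = softmax([0.0] * NUM_CLASSES)

if __name__ == "__main__":
    print("conv1, pool1, conv2:", conv1, pool1, conv2)
    print("flattened size:", FLATTENED)
    print("dense parameters:", hidden_params, output_params)
    print("sum of softmax outputs:", sum(probs))
```

Under these assumptions the first dense layer dominates the dense part of the model: it holds $9216\times64+64$ parameters, against $64\times26+26$ for the output layer.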
