Base and Exponent Prediction in Mathematical Expressions using Multi-Output CNN
Md Laraib Salam, Akash S Balsaraf, Gaurav Gupta, Ashish Rajeshwar Kulkarni
TL;DR
This paper addresses the prediction of the base and exponent from images of mathematical expressions using a multi-output CNN. The model is trained on a synthetic dataset of 10,900 images with randomized noise, font sizes, and blur to simulate real-world variability. It achieves high accuracy and robustness for base/exponent prediction, outperforming traditional approaches such as Histogram of Oriented Gradients (HOG) features in both accuracy and speed. The approach has practical relevance for math OCR and can be extended with transfer learning, more diverse data, and real-time processing.
Abstract
Neural networks and deep learning techniques have significantly advanced image processing, enabling highly accurate recognition. However, high recognition rates often require complex network models that are difficult to train and demand substantial computational resources. This research presents a simplified yet effective approach to predicting both the base and the exponent from images of mathematical expressions using a multi-output Convolutional Neural Network (CNN). The model is trained on 10,900 synthetically generated images of exponent expressions, with random noise, varying font sizes, and varying blur intensity to simulate real-world conditions. The proposed CNN model performs robustly and trains efficiently. The experimental results indicate that the model predicts the base and exponent values with high accuracy, demonstrating the efficacy of this approach on noisy and varied input images.
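The synthetic dataset described in the abstract can be pictured as a list of per-image generation settings, each pairing the two labels (base and exponent) with randomized rendering parameters. The following is a minimal sketch of that idea only, not the authors' generator: the names `SampleSpec` and `make_specs`, the single-digit label range, and all numeric ranges for font size, noise, and blur are hypothetical assumptions, and the actual rendering of images is left out.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class SampleSpec:
    """Settings for one synthetic image (all ranges below are assumed)."""
    base: int           # label for the first output head
    exponent: int       # label for the second output head
    font_size: int      # font size in points
    noise_std: float    # standard deviation of additive pixel noise
    blur_radius: float  # blur radius in pixels


def make_specs(n=10_900, seed=0):
    """Sample n image specifications with a reproducible random seed."""
    rng = random.Random(seed)
    return [
        SampleSpec(
            base=rng.randint(0, 9),              # assumed single-digit base
            exponent=rng.randint(0, 9),          # assumed single-digit exponent
            font_size=rng.randint(20, 40),       # assumed font-size range
            noise_std=rng.uniform(0.0, 0.1),     # assumed noise range
            blur_radius=rng.uniform(0.0, 2.0),   # assumed blur range
        )
        for _ in range(n)
    ]


specs = make_specs()
first = specs[0]
print(f"{len(specs)} specs; first label pair: ({first.base}, {first.exponent})")
```

Each specification carries two targets, which is what makes the task multi-output: a renderer would draw the base with the exponent as a superscript at `font_size`, then apply the noise and blur, and the network would be trained to recover both `base` and `exponent` from the resulting image.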
