Base and Exponent Prediction in Mathematical Expressions using Multi-Output CNN

Md Laraib Salam, Akash S Balsaraf, Gaurav Gupta, Ashish Rajeshwar Kulkarni

TL;DR

This paper predicts the base and exponent of a mathematical expression from its image using a multi-output CNN. The model is trained on a synthetic dataset of 10,900 images with randomized noise, font size, and blur to simulate real-world variability. It is reported to achieve high accuracy and robustness on base/exponent prediction, outperforming traditional approaches such as HOG in both accuracy and speed. The approach is relevant to math OCR and could be extended with transfer learning, more diverse data, and real-time processing.

Abstract

The use of neural networks and deep learning techniques in image processing has significantly advanced the field, enabling highly accurate recognition results. However, achieving high recognition rates often necessitates complex network models, which can be challenging to train and require substantial computational resources. This research presents a simplified yet effective approach to predicting both the base and exponent from images of mathematical expressions using a multi-output Convolutional Neural Network (CNN). The model is trained on 10,900 synthetically generated images containing exponent expressions, incorporating random noise, font size variations, and blur intensity to simulate real-world conditions. The proposed CNN model demonstrates robust performance with efficient training time. The experimental results indicate that the model achieves high accuracy in predicting the base and exponent values, proving the efficacy of this approach in handling noisy and varied input images.
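
The kind of randomization described in the abstract (noise, font size, blur) can be illustrated with a small standard-library sketch. It is not the authors' generator: the digit ranges for base and exponent, the font-size range, the noise and blur ranges, and the names `SampleSpec`, `sample_specs`, `add_noise`, and `box_blur` are all assumptions made for illustration, and rendering the expression glyphs themselves is left out.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class SampleSpec:
    """Labels and augmentation settings for one synthetic image (hypothetical)."""
    base: int          # target for the first output head
    exponent: int      # target for the second output head
    font_size: int     # assumed range, not stated in the abstract
    noise_std: float   # std of additive Gaussian pixel noise (assumed range)
    blur_radius: int   # box-blur radius in pixels (assumed range)


def sample_specs(n, seed=0):
    """Draw n reproducible sample specifications."""
    rng = random.Random(seed)
    return [
        SampleSpec(
            base=rng.randint(0, 9),        # assumption: single-digit base
            exponent=rng.randint(0, 9),    # assumption: single-digit exponent
            font_size=rng.randint(20, 40),
            noise_std=rng.uniform(0.0, 0.2),
            blur_radius=rng.randint(0, 2),
        )
        for _ in range(n)
    ]


def add_noise(image, std, rng):
    """Add Gaussian noise to a grayscale image in [0, 1] and clip the result."""
    return [
        [min(1.0, max(0.0, p + rng.gauss(0.0, std))) for p in row]
        for row in image
    ]


def box_blur(image, radius):
    """Average each pixel with its neighbours inside a (2*radius+1) window."""
    if radius == 0:
        return [row[:] for row in image]
    h, w = len(image), len(image[0])
    out = []
    for y in range(h):
        row = []
        for x in range(w):
            ys = range(max(0, y - radius), min(h, y + radius + 1))
            xs = range(max(0, x - radius), min(w, x + radius + 1))
            vals = [image[j][i] for j in ys for i in xs]
            row.append(sum(vals) / len(vals))
        out.append(row)
    return out


if __name__ == "__main__":
    specs = sample_specs(10900, seed=42)  # dataset size taken from the abstract
    rng = random.Random(0)
    blank = [[1.0] * 8 for _ in range(8)]  # stand-in for a rendered expression
    degraded = box_blur(add_noise(blank, specs[0].noise_std, rng),
                        specs[0].blur_radius)
    print(specs[0], len(degraded), len(degraded[0]))
```

Each `SampleSpec` carries two labels, which is what a multi-output network would consume: one target per output head, with the augmentation settings applied to the rendered image before training.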

Paper Structure (10 sections, 4 equations, 9 figures)

Figures (9)

  • Figure 1: CNN architecture for mathematical expression recognition.
  • Figure 2: Detailed CNN architecture of the model.
  • Figure 3: Forward propagation through the CNN.
  • Figure 4: Sample images from the test dataset with varying levels of noise and blur.
  • Figure 5: Distribution of base values in the test dataset.
  • ...and 4 more figures