Theoretical Analysis of Power-law Transformation on Images for Text Polarity Detection

Narendra Singh Yadav, Pavan Kumar Perepu

TL;DR

The paper addresses text polarity detection for image binarization by theoretically analyzing how power-law transformations affect the Otsu between-class variance at the optimal threshold, $\sigma_B^2(t^*)$, which is used to determine the binarization threshold. It formalizes the relationship between the transformed histogram and the maximum between-class variance ($MBCV$), showing that for $(1/\gamma)<1$ darker intensities can be stretched, which changes $MBCV$ in a polarity-dependent way. Two cases are analyzed: bright text on dark background and dark text on bright background, with explicit conditions on the class weights $w_1(t^*)$ and $w_2(t^*)$, the threshold location $t^*$, and the mean difference $|\mu_1(t^*)-\mu_2(t^*)|$. These results offer a principled view of when power-law preprocessing reveals polarity information to guide binarization, while acknowledging counterexamples and pointing to future work on mixed-polarity images.
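
For reference, the quantities named above fit the standard two-class Otsu formulation. The block below is that textbook form, written with the symbols from this summary; it is an assumption that the paper's single equation uses the same notation.

$$
\sigma_B^2(t) = w_1(t)\,w_2(t)\,\bigl[\mu_1(t)-\mu_2(t)\bigr]^2,
\qquad
t^* = \arg\max_t \sigma_B^2(t),
\qquad
MBCV = \sigma_B^2(t^*)
$$

Here $w_1(t)$ and $w_2(t)$ are the fractions of pixels at or below and above the threshold $t$ (so $w_1(t)+w_2(t)=1$), and $\mu_1(t)$, $\mu_2(t)$ are the mean intensities of the two classes. In this form $MBCV$ depends only on how evenly the pixels split between the classes and on how far apart the class means are, which are the quantities the TL;DR lists.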

Abstract

In several computer vision applications, such as vehicle license plate recognition, captcha recognition, and printed or handwritten character recognition from images, text polarity detection and binarization are important preprocessing tasks. To analyze an image, it is first converted to a simple binary image. This binarization process requires knowledge of the polarity of the text in the image. Text polarity is defined as the contrast of the text with respect to the background: the text is either darker than the background (dark text on bright background) or vice versa. The binarization process uses this polarity information to convert the original colour or grayscale image into a binary image. The literature contains an intuitive approach based on applying a power-law transformation to the original images. Its authors illustrated an interesting phenomenon in the histogram statistics of the transformed images: treating text and background as two classes, they observed that the maximum between-class variance increases (decreases) for dark (bright) text on bright (dark) background, and they presented the corresponding empirical results. In this paper, we present a theoretical analysis of this phenomenon.
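
The measurement described in the abstract can be sketched in a few lines of NumPy. This is a minimal illustration, not the authors' implementation: the function names, the 8-bit intensity range, and the normalisation of intensities to $[0, 1]$ before applying the power are all assumptions made for the example.

```python
import numpy as np

def max_between_class_variance(img):
    """Return (MBCV, t*) for an 8-bit grayscale array using Otsu's criterion."""
    hist = np.bincount(img.ravel(), minlength=256).astype(float)
    p = hist / hist.sum()              # normalised histogram
    levels = np.arange(256)
    w1 = np.cumsum(p)                  # weight of the class with intensity <= t
    w2 = 1.0 - w1                      # weight of the class with intensity > t
    m = np.cumsum(p * levels)          # cumulative first moment up to t
    mu_total = m[-1]                   # global mean intensity
    # Skip thresholds that leave one class (numerically) empty.
    valid = (w1 > 1e-12) & (w2 > 1e-12)
    sigma_b2 = np.zeros(256)
    # Equivalent to w1 * w2 * (mu1 - mu2)**2, written with cumulative sums.
    sigma_b2[valid] = (mu_total * w1[valid] - m[valid]) ** 2 / (w1[valid] * w2[valid])
    t_star = int(np.argmax(sigma_b2))  # first threshold attaining the maximum
    return float(sigma_b2[t_star]), t_star

def power_law(img, power):
    """Apply o = i**power to intensities scaled to [0, 1] (assumed), then rescale."""
    norm = img.astype(float) / 255.0
    return np.round(255.0 * norm ** power).astype(np.uint8)

def mbcv_curve(img, powers):
    """MBCV of the transformed image for each exponent in `powers`."""
    return [max_between_class_variance(power_law(img, p))[0] for p in powers]

# Synthetic "dark text on bright background": 20% of pixels at 50, 80% at 200.
img = np.full((10, 10), 200, dtype=np.uint8)
img[:2, :] = 50
mbcv, t_star = max_between_class_variance(img)
curve = mbcv_curve(img, [0.5, 1.0, 2.0])
```

Evaluating `mbcv_curve` over a range of exponents for a real image gives the kind of $MBCV$-versus-power curve the abstract refers to. Whether the exponent passed in should be read as $\gamma$ or $1/\gamma$ depends on the paper's convention, which this sketch does not fix.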

Paper Structure

This paper contains 7 sections, 1 equation, 5 figures.

Figures (5)

  • Figure 1: Power-law transformation, $o=i^\gamma$, where $i$ and $o$ are the input and output intensity values respectively, for different values of $\gamma$ [gonzalez].
  • Figure 2: MBCV vs Power ($1/\gamma$) curve for two types of text polarity: Bright text on dark background (left) and Dark text on bright background (right).
  • Figure 3: Binarized images for the original images shown in Fig. \ref{plcurve}. In the left image, white pixels form the text and black pixels the background; in the right image the roles are reversed.
  • Figure 4: (a) Original Image (bright text on dark background) (b) $MBCV$ vs $\gamma$ curve (c) Histogram
  • Figure 5: (a) Original Image (dark text on bright background) (b) $MBCV$ vs $\gamma$ curve (c) Histogram