Evaluating OCR performance on food packaging labels in South Africa

Mayimunah Nagayi, Alice Khan, Tamryn Frank, Rina Swart, Clement Nyirenda

TL;DR

This study addresses the challenge of extracting ingredient lists and nutrition facts from real-world food packaging images in South Africa using four open-source OCR systems: Tesseract, EasyOCR, PaddleOCR, and TrOCR. It benchmarks these engines on a dataset of 231 products (1,628 images) with a ground-truth subset of 113 images, evaluating CER, WER, BLEU, ROUGE-L, F1, coverage, and processing time. Tesseract achieves the lowest mean CER (0.912) and WER (6.262) and the highest semantic scores (BLEU 0.245, ROUGE-L 0.391). PaddleOCR offers the widest coverage but is slow because it ran on CPU only, and TrOCR delivers full coverage but limited semantic accuracy. EasyOCR provides a balanced trade-off between speed, multilingual support, and accuracy. The results establish a domain-specific benchmark for packaging OCR and suggest future work on layout-aware methods and region-level text localization to improve robustness in multilingual, cluttered packaging contexts.

Abstract

This study evaluates four open-source Optical Character Recognition (OCR) systems (Tesseract, EasyOCR, PaddleOCR, and TrOCR) on real-world food packaging images. The aim is to assess their ability to extract ingredient lists and nutrition facts panels. Accurate OCR for packaging is important for compliance and nutrition monitoring but is challenging due to multilingual text, dense layouts, varied fonts, glare, and curved surfaces. A dataset of 231 products (1,628 images) was processed by all four models to assess speed and coverage, and a ground-truth subset of 113 images (60 products) was created for accuracy evaluation. Metrics include Character Error Rate (CER), Word Error Rate (WER), BLEU, ROUGE-L, F1, coverage, and execution time. On the ground-truth subset, Tesseract achieved the lowest CER (0.912) and the highest BLEU (0.245). EasyOCR provided a good balance between accuracy and multilingual support. PaddleOCR achieved near-complete coverage but was slower because it ran on CPU only due to GPU incompatibility, and TrOCR produced the weakest results despite GPU acceleration. These results provide a packaging-specific benchmark, establish a baseline, and highlight directions for layout-aware methods and text localization.
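To make the two error-rate metrics concrete, the following is a minimal sketch of CER and WER under their common definition: Levenshtein edit distance divided by the length of the reference, counted in characters for CER and in whitespace-separated words for WER. This is an assumption about the definition, not the authors' evaluation code; the function names `edit_distance`, `cer`, and `wer` are illustrative, and the paper may apply text normalization that is not shown here.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (strings or token lists)."""
    prev = list(range(len(hyp) + 1))  # distances from an empty reference prefix
    for i, r in enumerate(ref, start=1):
        curr = [i]  # turning i reference items into an empty hypothesis
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,           # deletion
                            curr[j - 1] + 1,       # insertion
                            prev[j - 1] + cost))   # substitution or match
        prev = curr
    return prev[-1]


def cer(reference, hypothesis):
    """Character Error Rate: character edits / reference length (assumed definition)."""
    if not reference:
        raise ValueError("reference text must not be empty")
    return edit_distance(reference, hypothesis) / len(reference)


def wer(reference, hypothesis):
    """Word Error Rate: word edits / reference word count (assumed definition)."""
    ref_words = reference.split()
    if not ref_words:
        raise ValueError("reference text must not be empty")
    return edit_distance(ref_words, hypothesis.split()) / len(ref_words)
```

Because the denominator is the reference length, both rates can exceed 1 when the OCR output contains many extra tokens. For example, under this sketch `wer("salt", "salt and pepper extra")` is 3.0, since three inserted words are divided by a one-word reference. This is one way a mean WER above 1 can arise on cluttered packaging where the engine reads text outside the annotated region.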

Paper Structure

This paper contains 26 sections, 5 equations, 5 figures, 3 tables.

Figures (5)

  • Figure 1: Sample food packaging images from the SA NFP 2023 dataset.
  • Figure 2: OCR evaluation workflow from image input to evaluation.
  • Figure 3: CER and WER with box plots for distributions and bars for mean values.
  • Figure 4: BLEU and ROUGE-L comparison across OCR models.
  • Figure 5: F1 score distribution across OCR models.