Leveraging Deep Learning with Multi-Head Attention for Accurate Extraction of Medicine from Handwritten Prescriptions

Usman Ali; Sahil Ranmbail; Muhammad Nadeem; Hamid Ishfaq; Muhammad Umer Ramzan; Waqas Ali

TL;DR

This study addresses the extraction of medicine names from handwritten prescriptions, a task hindered by diverse handwriting styles and prescription formats. It proposes a two-stage hybrid approach: Mask R-CNN segments the medicine regions, and TrOCR, a Transformer-based OCR model, transcribes the text in them. The transcription is then matched against a medicines database using Levenshtein distance and fuzzy matching. The model is fine-tuned on a novel dataset of approximately 1,000 prescriptions collected from 50 doctors in Pakistan and augmented to 9,920 samples to cover handwriting variability. The reported character error rate (CER) is $1.4\%$ on standard benchmarks, indicating robust recognition and the potential to automate medicine-name extraction in clinical workflows.
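
The database-matching step can be illustrated with a minimal sketch. It assumes the OCR output is a single candidate string and picks the database entry with the smallest Levenshtein distance. The function names, the `MEDICINES` sample list, and the `max_ratio` acceptance threshold are hypothetical choices made for illustration; they are not taken from the paper, which does not specify its matching parameters here.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance between a and b (insert, delete, substitute each cost 1)."""
    if len(a) < len(b):
        a, b = b, a  # keep the shorter string in the inner loop
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def match_medicine(text, database, max_ratio=0.34):
    """Return the closest database name, or None if nothing is close enough.

    max_ratio is an assumed cutoff on distance / longer-string length.
    """
    query = text.strip().lower()
    best_name, best_dist = None, None
    for name in database:
        dist = levenshtein(query, name.lower())
        if best_dist is None or dist < best_dist:
            best_name, best_dist = name, dist
    if best_name is None:
        return None
    normalized = best_dist / max(len(query), len(best_name), 1)
    return best_name if normalized <= max_ratio else None


# Hypothetical sample database, for illustration only.
MEDICINES = ["Panadol", "Augmentin", "Brufen", "Flagyl"]

# A one-character OCR slip still resolves to the intended entry.
print(match_medicine("Panadal", MEDICINES))  # Panadol
```

Normalizing the distance by string length keeps the threshold comparable across short and long names; a production system would likely also need to handle multi-word entries and dosage suffixes.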

Abstract

Extracting medication names from handwritten doctor prescriptions is challenging due to the wide variability in handwriting styles and prescription formats. This paper presents a robust method for extracting medicine names using a combination of Mask R-CNN and Transformer-based Optical Character Recognition (TrOCR) with Multi-Head Attention and Positional Embeddings. A novel dataset, featuring diverse handwritten prescriptions from various regions of Pakistan, was utilized to fine-tune the model on different handwriting styles. The Mask R-CNN model segments the prescription images to focus on the medicinal sections, while the TrOCR model, enhanced by Multi-Head Attention and Positional Embeddings, transcribes the isolated text. The transcribed text is then matched against a pre-existing database for accurate identification. The proposed approach achieved a character error rate (CER) of 1.4% on standard benchmarks, highlighting its potential as a reliable and efficient tool for automating medicine name extraction.
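
For reference, CER is conventionally defined from the character-level edit operations needed to turn the recognized text into the ground truth. The abstract does not spell out the formula, so the following is the standard definition, assumed here to be the one used:

$$
\mathrm{CER} = \frac{S + D + I}{N}
$$

where $S$, $D$, and $I$ are the numbers of substituted, deleted, and inserted characters, and $N$ is the number of characters in the reference text. As an illustrative example (not drawn from the paper's data), recognizing the 7-character reference "Panadol" as "Panadal" involves one substitution, so $\mathrm{CER} = 1/7 \approx 14.3\%$. Under this definition, a CER of $1.4\%$ corresponds to roughly 14 character errors per 1,000 reference characters.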
