Page Classification for Print Imaging Pipeline
Shaoyuan Xu, Cheng Lu, Mark Shaw, Peter Bauer, Jan P. Allebach
TL;DR
This work extends an SVM-based image classifier from three classes to five so that a print imaging pipeline can apply processing suited to each image type. It introduces four new chroma-based features (Chroma Histogram Flatness, Chroma Around Text, Color Block Ratio, White Block Ratio) alongside the existing features and uses a DAG-SVM to make the multi-class decision. Feature selection reduces the feature set to seven features. Evaluation on 500 scanned images shows improved discrimination among the five image types: text, picture, mixed, receipt, and highlight. The approach improves print quality by enabling pipelines tailored to each of these types, using chroma-aware analysis across color spaces.
Abstract
Digital copiers and printers are now in widespread use, and copy and print quality is one of the things users care about most. Modern copiers and printers are equipped with processing pipelines designed specifically for different kinds of images, so to improve quality we previously proposed an SVM-based method that classifies an image as containing only text, only pictures, or a mixture of both. Some applications, however, need to distinguish more than three classes. In this paper, we develop a more advanced SVM-based classification method that uses four new features to classify five types of images: text, picture, mixed, receipt, and highlight.
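To illustrate how a five-class decision can be built from binary classifiers, the following is a minimal sketch of the DAG-SVM evaluation scheme named in the summary above. It is not the authors' implementation. The function names (`dag_classify`, `make_toy_decider`) and the toy pairwise deciders are hypothetical stand-ins; in a real system each decider would be a trained binary SVM operating on the selected features.

```python
from typing import Callable, Dict, List, Sequence, Tuple

# The five page classes named in the abstract.
CLASSES: List[str] = ["text", "picture", "mixed", "receipt", "highlight"]

# A pairwise decider returns True if the first class of its pair wins.
PairwiseDecider = Callable[[Sequence[float]], bool]


def dag_classify(
    features: Sequence[float],
    classes: Sequence[str],
    deciders: Dict[Tuple[str, str], PairwiseDecider],
) -> str:
    """Walk a decision DAG: each node tests the first against the last
    remaining candidate and eliminates the loser, so K classes need
    exactly K - 1 binary evaluations."""
    candidates = list(classes)
    while len(candidates) > 1:
        first, last = candidates[0], candidates[-1]
        if deciders[(first, last)](features):
            candidates.pop()       # last candidate is eliminated
        else:
            candidates.pop(0)      # first candidate is eliminated
    return candidates[0]


def make_toy_decider(i: int, j: int) -> PairwiseDecider:
    """Hypothetical stand-in for a trained binary SVM (assumption):
    treats features[k] as a score for class k and picks the larger."""
    return lambda features: features[i] >= features[j]


# One decider per unordered class pair: 5 * 4 / 2 = 10 in total.
toy_deciders: Dict[Tuple[str, str], PairwiseDecider] = {
    (CLASSES[i], CLASSES[j]): make_toy_decider(i, j)
    for i in range(len(CLASSES))
    for j in range(i + 1, len(CLASSES))
}

scores = [0.1, 0.2, 0.05, 0.9, 0.3]  # made-up per-class scores
label = dag_classify(scores, CLASSES, toy_deciders)
```

With the made-up scores above, the walk eliminates text, picture, mixed, and highlight in turn, leaving receipt. Only four of the ten pairwise deciders are evaluated for any one image, which is the main practical appeal of the DAG arrangement over evaluating every pair and voting.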
