Karpagam JCS ISSN: 2582 – 8525 (Print), 2583 – 3669 (Online)

Global+Local(Global) Features based Script Identification System for Indian Multi-Script Documents

Abstract
Determining the script of the text in multi-script documents is an important precursor to Optical Character Recognition (OCR). This paper reports word-level script identification in bilingual and multilingual documents based on global and local features. Initially, the script of each word is identified using morphological filters (global features) and regional descriptors (local features) in a bi-script scenario. The problem is then extended from tri-script to five-script scenarios. The words of different scripts are classified using the k-nearest neighbour algorithm with five-fold cross-validation on a large dataset of 27,500 word images. The proposed algorithm achieves an average accuracy of more than 94.78% and is robust to noise and to variations in word length, font style, and font size.
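The classification step named in the abstract, k-nearest neighbour evaluated with five-fold cross-validation, can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the two-dimensional feature vectors are synthetic stand-ins for the morphological and regional descriptors, the labels `script_a` and `script_b` are placeholders, and the choice of k = 3, the Euclidean distance, and all function names are hypothetical.

```python
import math
import random
from collections import Counter


def knn_predict(train_x, train_y, query, k=3):
    """Return the majority label among the k training vectors nearest to query."""
    # Rank training samples by Euclidean distance to the query vector.
    neighbours = sorted(
        zip(train_x, train_y),
        key=lambda pair: math.dist(pair[0], query),
    )[:k]
    votes = Counter(label for _, label in neighbours)
    return votes.most_common(1)[0][0]


def five_fold_accuracy(features, labels, k=3, folds=5, seed=0):
    """Average k-NN accuracy over disjoint held-out folds."""
    indices = list(range(len(features)))
    random.Random(seed).shuffle(indices)
    accuracies = []
    for f in range(folds):
        # Each fold takes every folds-th shuffled index, so folds are disjoint.
        test_idx = indices[f::folds]
        test_set = set(test_idx)
        train_idx = [i for i in indices if i not in test_set]
        train_x = [features[i] for i in train_idx]
        train_y = [labels[i] for i in train_idx]
        correct = sum(
            knn_predict(train_x, train_y, features[i], k) == labels[i]
            for i in test_idx
        )
        accuracies.append(correct / len(test_idx))
    return sum(accuracies) / folds


def make_toy_data(per_class=50, seed=1):
    """Synthetic 2-D points standing in for word-image features (assumption)."""
    rng = random.Random(seed)
    centres = {"script_a": (0.0, 0.0), "script_b": (5.0, 5.0)}
    features, labels = [], []
    for name, (cx, cy) in centres.items():
        for _ in range(per_class):
            features.append((rng.gauss(cx, 1.0), rng.gauss(cy, 1.0)))
            labels.append(name)
    return features, labels


features, labels = make_toy_data()
accuracy = five_fold_accuracy(features, labels)
print(f"mean five-fold accuracy: {accuracy:.3f}")
```

In a real pipeline the toy vectors would be replaced by features extracted from each word image, and extending from the bi-script to the five-script case only requires more class labels, since the majority vote is not tied to two classes.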
