Affiliations 

  • 1 Universiti Kebangsaan Malaysia

Abstract

Detection and identification of text in natural scene images pose major challenges: image quality varies because scenes are captured under different conditions (lighting, angle, and resolution), and the text they contain can appear in any form (size, style, and orientation). In this paper, a robust approach is proposed to localize, extract, and recognize scene text of different sizes, fonts, and orientations from images of varying quality. The proposed method consists of the following steps: (i) preprocessing and enhancement of the input image using National Television System Committee (NTSC) color mapping and contrast enhancement via mean histogram stretching; (ii) detection of candidate text regions using hybrid adaptive segmentation and fuzzy c-means clustering; (iii) two-stage text extraction that filters out false text regions, comprising local character filtering with a rule-based approach using shape and statistical features, and text-region filtering via the stroke width transform (SWT); and (iv) text recognition using the Tesseract OCR engine. The proposed method was evaluated on two benchmark datasets, ICDAR2013 and KAIST. It effectively handled complex scene images containing text of various font sizes, colors, and orientations, and outperformed state-of-the-art methods, achieving over 80% in both precision and recall.
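The preprocessing stage can be illustrated with a minimal sketch. The NTSC (ITU-R BT.601) luminance weights for RGB-to-gray conversion are standard; the `mean_stretch` function below is an assumption, a simple piecewise-linear stretch that maps the image mean to mid-gray, and the paper's exact mean histogram stretching formulation may differ.

```python
import numpy as np

def ntsc_gray(rgb):
    """Map an H x W x 3 RGB array to luminance using NTSC weights."""
    # Standard NTSC / BT.601 luma coefficients
    return rgb[..., 0] * 0.299 + rgb[..., 1] * 0.587 + rgb[..., 2] * 0.114

def mean_stretch(gray):
    """Illustrative mean-based contrast stretch (assumption: piecewise-linear
    stretch that maps [min, mean] to [0, 127.5] and [mean, max] to [127.5, 255];
    the paper's formulation may differ)."""
    m, lo, hi = gray.mean(), gray.min(), gray.max()
    return np.where(
        gray <= m,
        127.5 * (gray - lo) / max(m - lo, 1e-9),
        127.5 + 127.5 * (gray - m) / max(hi - m, 1e-9),
    )
```

A dark, low-contrast input would thus be spread over the full intensity range before segmentation, which makes the subsequent clustering of candidate text regions less sensitive to capture conditions.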