Abstract
This paper addresses the problem of binarizing multicolored character strings in scene images subject to heavy image degradations and complex backgrounds. The proposed method consists of four steps. The first step generates tentatively binarized images via every dichotomization of K clusters obtained by K-means clustering of the constituent pixels of a given image in the HSI color space. The total number of tentatively binarized images equals 2^K - 2. The second step divides each binarized image into a sequence of "single-character-like" images using the average aspect ratio of a character. The third step uses a support vector machine (SVM) to determine whether each "single-character-like" image represents a character or a non-character. We feed the SVM with the mesh feature, and it outputs the degree of "character-likeness." The fourth step selects the single binarized image with the maximum average "character-likeness" as the optimal binarization result. Experiments using a total of 1000 character strings extracted from the ICDAR 2003 robust word recognition dataset show that the proposed method achieves a correct binarization rate of 80.8%.
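The count 2^K - 2 in the first step follows from assigning each of the K clusters to either foreground or background (2^K assignments) and discarding the two trivial assignments that put every cluster on the same side. The following minimal sketch illustrates that enumeration and the final selection by maximum average score. It is not the authors' implementation: the function names `tentative_binarizations` and `select_best`, the toy label map, and the `score_segments` callback (a stand-in for the segmentation and SVM steps) are all assumptions made for illustration.

```python
from itertools import product


def tentative_binarizations(labels, k):
    """Enumerate every non-trivial split of k clusters into two groups.

    labels: 2D list of cluster indices in range(k), one per pixel. It is
            assumed to come from a prior K-means step, which is not shown.
    Yields (assignment, binary_image), where assignment[c] is 1 if cluster c
    is treated as foreground and 0 if it is treated as background.
    """
    for assignment in product((0, 1), repeat=k):
        # Skip the two trivial cases: all background or all foreground.
        if sum(assignment) in (0, k):
            continue
        binary = [[assignment[c] for c in row] for row in labels]
        yield assignment, binary


def select_best(candidates, score_segments):
    """Pick the candidate with the highest mean score.

    score_segments: hypothetical callback that maps a binary image to a list
    of per-segment "character-likeness" scores (stand-in for steps 2 and 3).
    """
    def mean_score(item):
        scores = score_segments(item[1])
        return sum(scores) / len(scores) if scores else float("-inf")

    return max(candidates, key=mean_score)


# Toy 2x3 label map with K = 3 clusters (hypothetical data).
labels = [[0, 1, 2],
          [2, 1, 0]]
candidates = list(tentative_binarizations(labels, 3))
print(len(candidates))  # 2**3 - 2 = 6
```

In a real pipeline, `score_segments` would split the binary image into single-character-like pieces and return the SVM output for each piece; here it is left as a parameter so that the enumeration and selection logic can be read on their own.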