2014 11th IAPR International Workshop on Document Analysis Systems (DAS)

Abstract

Training a system on a small number of instances while still obtaining accurate recognition or classification is a crucial need in the document classification domain. One-class classification is chosen because only positive samples are available for training. In this paper, a new one-class classification method based on symbolic representation is proposed. Initially, a set of features is extracted from the training set. A vector of interval-valued symbolic features is then used to represent the class. Each interval value (symbolic data) is computed from the mean and standard deviation of the corresponding feature values. To evaluate the proposed method, a dataset of 544 document images was used. Experimental results reveal that the proposed method works well even when the number of training samples is small (≤10). Moreover, the proposed method is suitable for document classification and provides better results than the one-class k-nearest neighbor (k-NN) classifier.
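The interval-valued representation described in the abstract can be illustrated with a minimal Python sketch. The abstract states only that each interval is computed from the mean and standard deviation of a feature; the specific form [mean − alpha·std, mean + alpha·std], the width parameter `alpha`, the fraction-of-features acceptance rule, the `threshold` parameter, and all function names below are assumptions made for illustration and are not taken from the paper.

```python
from statistics import mean, pstdev


def fit_intervals(samples, alpha=1.0):
    """Build one interval per feature from positive training samples.

    ASSUMPTION: the interval is [mean - alpha*std, mean + alpha*std];
    the abstract only says mean and standard deviation are used.
    """
    columns = zip(*samples)  # transpose: one tuple of values per feature
    intervals = []
    for col in columns:
        mu = mean(col)
        sigma = pstdev(col)  # population standard deviation
        intervals.append((mu - alpha * sigma, mu + alpha * sigma))
    return intervals


def acceptance_score(intervals, x):
    """Fraction of features of x that fall inside their class interval."""
    inside = sum(1 for (lo, hi), v in zip(intervals, x) if lo <= v <= hi)
    return inside / len(intervals)


def is_member(intervals, x, threshold=0.5):
    """ASSUMPTION: accept x when enough features lie inside the intervals."""
    return acceptance_score(intervals, x) >= threshold


# Three hypothetical training documents, two features each.
train = [[1.0, 10.0], [2.0, 12.0], [3.0, 14.0]]
model = fit_intervals(train, alpha=2.0)

print(is_member(model, [2.5, 11.0]))   # close to the training data
print(is_member(model, [10.0, 50.0]))  # far from the training data
```

Because the class model is just one pair of bounds per feature, it can be fitted from a handful of positive samples, which matches the small-training-set setting the abstract targets. In practice `alpha` and `threshold` would need to be tuned on held-out data.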