Proceedings of Sixth International Conference on Document Analysis and Recognition

Abstract

This paper describes an off-line script recognition system that uses a language model consisting of backoff character n-grams. The performance of this open-vocabulary recognition is compared with recognition using closed dictionaries. The system is based on Hidden Markov Models (HMMs) and uses a hybrid modeling technique that depends on a neural vector quantizer. Recognition results are presented for the SEDAL database of degraded English documents, such as photocopies or faxes, and for a writer-dependent database of cursive German handwriting samples. With language models, the resulting character recognition system yields significantly better recognition results for an unlimited vocabulary.
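
To illustrate what a backoff character n-gram model computes, here is a minimal sketch of a character bigram that backs off to a unigram distribution. The abstract does not say which backoff or discounting scheme the system uses, so the absolute discounting, the add-one unigram smoothing, the class name `BackoffCharBigram`, and the `discount` parameter are all assumptions made for illustration, not the authors' method.

```python
from collections import Counter, defaultdict


class BackoffCharBigram:
    """Hypothetical character bigram with absolute-discounting backoff."""

    def __init__(self, text, discount=0.5):
        self.discount = discount              # assumed discount value
        self.alphabet = sorted(set(text))
        self.unigrams = Counter(text)
        self.total = len(text)
        self.bigrams = defaultdict(Counter)   # bigrams[prev][cur] = count
        for prev, cur in zip(text, text[1:]):
            self.bigrams[prev][cur] += 1

    def unigram_prob(self, c):
        # Add-one smoothing over the characters seen in training.
        return (self.unigrams[c] + 1) / (self.total + len(self.alphabet))

    def prob(self, prev, c):
        followers = self.bigrams.get(prev)
        if not followers:
            # Context never seen with a successor: use the unigram estimate.
            return self.unigram_prob(c)
        context_count = sum(followers.values())
        if followers[c] > 0:
            # Seen bigram: discounted relative frequency.
            return (followers[c] - self.discount) / context_count
        # Unseen bigram: share the mass freed by discounting among the
        # unseen characters, in proportion to their unigram probabilities.
        reserved = self.discount * len(followers) / context_count
        unseen_mass = sum(
            self.unigram_prob(x) for x in self.alphabet if x not in followers
        )
        if unseen_mass == 0:
            return 0.0
        return reserved * self.unigram_prob(c) / unseen_mass


model = BackoffCharBigram("the cat sat on the mat")
print(model.prob("t", "h"))  # bigram seen in the training text
print(model.prob("t", "c"))  # unseen bigram, estimated by backing off
```

Because unseen character pairs still receive a nonzero probability, a model of this kind can score character sequences that form words outside any fixed dictionary, which is what makes open-vocabulary recognition possible. In an HMM-based recognizer, such probabilities would typically be combined with the character model scores during decoding.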