Abstract
This paper describes an off-line script recognition system that uses a language model consisting of backoff character n-grams. The performance of this open-vocabulary recognition is compared with recognition using closed dictionaries. The system is based on Hidden Markov Models (HMMs) and uses a hybrid modeling technique that relies on a neural vector quantizer. Recognition results are presented for the SEDAL database of degraded English documents, such as photocopies and faxes, and for a writer-dependent database of handwritten cursive German script samples. For an unlimited vocabulary, the resulting character recognition system yields significantly better recognition results when language models are used.