Acoustics, Speech, and Signal Processing, IEEE International Conference on

Abstract

We present a trainable speech synthesis system that uses the trended hidden Markov model to generate the trajectories of spectral features of synthesis units. The synthesis units are trained from a transcribed continuous speech corpus, which makes the speech more natural than that produced by conventional diphone synthesisers. Such synthesisers are generally trained from a highly articulated speech database and require a large investment of time and effort to train a new voice. The overall system has been incorporated into a PSOLA synthesiser to produce speech that is natural sounding and preserves the identity of the source speaker.
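As an illustration of how a trended hidden Markov model can generate a spectral feature trajectory, the sketch below assumes the common formulation in which each state's mean is a polynomial in the time elapsed since the state was entered. The function name `trended_trajectory`, the coefficient layout, and the example values are hypothetical and are not taken from the paper; the residual noise term is omitted, so only the deterministic trend is produced.

```python
import numpy as np

def trended_trajectory(coeffs, durations):
    """Generate a feature trajectory from per-state polynomial trends.

    coeffs    : list of arrays, one per state, each of shape (P + 1, D);
                row p holds the coefficient of (t - t_entry)**p for each
                of the D feature dimensions (hypothetical layout).
    durations : list of ints, number of frames spent in each state.
    Returns an array of shape (sum(durations), D).
    """
    frames = []
    for B, dur in zip(coeffs, durations):
        B = np.asarray(B, dtype=float)
        # Time index restarts at 0 on entry to each state.
        t = np.arange(dur, dtype=float)
        # Polynomial basis: column p is t**p, shape (dur, P + 1).
        basis = t[:, None] ** np.arange(B.shape[0])[None, :]
        # Mean trajectory for this state, shape (dur, D).
        frames.append(basis @ B)
    return np.vstack(frames)

# Two states with linear trends over a single (made-up) feature dimension.
example_coeffs = [
    np.array([[1.0], [0.5]]),   # state 1: 1.0 + 0.5 * t
    np.array([[3.0], [-1.0]]),  # state 2: 3.0 - 1.0 * t
]
example_traj = trended_trajectory(example_coeffs, [3, 2])
```

With a polynomial order of zero this reduces to the piecewise-constant means of a standard HMM, which is why the trend terms are what allow smooth within-unit spectral movement.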