2017 IEEE International Conference on Multimedia and Expo (ICME)

Abstract

Given a pre-registered 3D mesh sequence and accompanying phoneme-labeled audio, our system creates an animatable face model and a mapping procedure that produces realistic speech animations for arbitrary speech input. The mapping from speech features to model parameters is learned with random forests for regression. We propose a new speech feature that combines phonemic labels with acoustic features; it yields more expressive facial animation and robustly handles temporal labeling errors. Furthermore, by employing a sliding-window approach to feature extraction, the system is easy to train and allows low-delay synthesis. We show that our novel combination of speech features improves visual speech synthesis, and our findings are confirmed by a subjective user study.
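The pipeline described above can be sketched in a few lines: per-frame acoustic features are concatenated with one-hot phoneme labels, stacked over a sliding window of frames, and regressed onto face-model parameters with a random forest. All dimensions below (window size, feature counts, number of model parameters) are illustrative assumptions, not values from the paper, and the data is synthetic.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

# Hypothetical dimensions (assumptions, not from the paper)
WINDOW = 5        # frames per sliding window
N_ACOUSTIC = 13   # acoustic features per frame (e.g. MFCC-like)
N_PHONEMES = 40   # phoneme inventory size for one-hot labels
N_PARAMS = 30     # face-model parameters per frame

rng = np.random.default_rng(0)

# Toy training data: per-frame speech features and target model parameters
n_frames = 200
acoustic = rng.normal(size=(n_frames, N_ACOUSTIC))
phoneme_ids = rng.integers(0, N_PHONEMES, size=n_frames)
phoneme_onehot = np.eye(N_PHONEMES)[phoneme_ids]
frame_feats = np.hstack([acoustic, phoneme_onehot])  # combined speech feature
targets = rng.normal(size=(n_frames, N_PARAMS))

def sliding_windows(feats, window):
    """Stack `window` consecutive frames into one feature vector per window."""
    n, d = feats.shape
    out = np.empty((n - window + 1, window * d))
    for i in range(n - window + 1):
        out[i] = feats[i:i + window].ravel()
    return out

X = sliding_windows(frame_feats, WINDOW)
# Align each window's target to its center frame
y = targets[WINDOW // 2 : n_frames - WINDOW // 2]

forest = RandomForestRegressor(n_estimators=50, random_state=0).fit(X, y)
pred = forest.predict(X[:1])  # one frame of predicted face-model parameters
print(pred.shape)
```

Because each prediction depends only on a short window of past and current frames rather than the whole utterance, this formulation naturally supports the low-delay synthesis the abstract mentions.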
