Pattern Recognition, International Conference on

Abstract

This paper describes a methodology by which audio and visual data can be fused in a meaningful manner in order to locate a speaker in a scene. The fusion is implemented within a particle filter so that a single speaker can be identified in the presence of multiple visual observations. The advantage of this fusion is that weak sensory data from either modality can be reinforced and the effect of noise reduced.
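The multiplicative fusion idea described in the abstract can be illustrated with a small sketch (this is not the paper's implementation; the 1-D state, Gaussian likelihoods, and all parameters below are assumptions chosen for illustration): each particle's weight is the product of an audio likelihood and a visual likelihood, so particles survive resampling only where both modalities agree, letting a coarse audio cue disambiguate between several visual detections.

```python
import numpy as np

def gaussian_lik(x, obs, sigma):
    """Unnormalised Gaussian likelihood of particle positions x given one observation."""
    return np.exp(-0.5 * ((x - obs) / sigma) ** 2)

def fused_step(particles, audio_obs, visual_obs, rng,
               motion_sigma=2.0, audio_sigma=15.0, visual_sigma=5.0):
    """One particle-filter iteration fusing an audio cue with multiple visual detections.
    All state and noise models here are illustrative assumptions, not the paper's."""
    # Predict: random-walk motion model.
    particles = particles + rng.normal(0.0, motion_sigma, size=particles.shape)
    # Visual likelihood: best match over all candidate detections (several faces visible).
    vis = np.max(np.stack([gaussian_lik(particles, v, visual_sigma) for v in visual_obs]), axis=0)
    # Fuse modalities multiplicatively: a weak cue from one modality is reinforced by the other.
    w = gaussian_lik(particles, audio_obs, audio_sigma) * vis
    w /= w.sum()
    # Resample so particles concentrate where both modalities agree.
    return particles[rng.choice(len(particles), size=len(particles), p=w)]

rng = np.random.default_rng(0)
particles = rng.uniform(0.0, 100.0, size=500)   # 1-D image-column state
true_speaker, silent_face = 50.0, 80.0          # two visible faces, only one speaking
for _ in range(30):
    audio = true_speaker + rng.normal(0.0, 10.0)        # coarse, noisy audio bearing
    visual = [true_speaker + rng.normal(0.0, 2.0),      # detection of the speaker
              silent_face + rng.normal(0.0, 2.0)]       # distractor detection
    particles = fused_step(particles, audio, visual, rng)

estimate = particles.mean()  # clusters near the speaking face, not the silent one
```

Although the visual likelihood alone is bimodal (two equally plausible faces), the audio term down-weights particles near the silent face at every step, so resampling drives the particle cloud to the true speaker.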
