2004 IEEE International Conference on Acoustics, Speech, and Signal Processing

Abstract

Audio is a rich source of information in digital video and can provide useful descriptors for indexing video databases. In this paper, we model the shape of the distribution of wavelet coefficients of the embedded audio with a Laplacian mixture. These distributions are sharply peaked, so their shape can be modeled with only two mixture components at low computational cost. The parameters of the mixture model form a low-dimensional feature vector that represents the global similarity of the audio content of video clips. An interactive approach based on a feature-vector updating scheme adapts the retrieval system to the user's needs. This relevance feedback (RF) substantially increases the retrieval ratio. A comprehensive experimental evaluation has been performed on the CNN news database.
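
As an illustration of the modeling step described in the abstract, the following is a minimal sketch of fitting a two-component, zero-mean Laplacian mixture to a set of wavelet coefficients with expectation-maximization (EM). The abstract does not state how the mixture is estimated, so the use of EM, the initialization, and the layout of the feature vector are assumptions made here for illustration. The function names `fit_laplacian_mixture` and `feature_vector` are hypothetical, and the coefficients are assumed to have been computed beforehand by some wavelet decomposition of the audio track.

```python
import numpy as np


def fit_laplacian_mixture(coeffs, n_iter=200, eps=1e-12):
    """Fit a 2-component zero-mean Laplacian mixture by EM (illustrative sketch).

    Each component has density p(x | b) = exp(-|x| / b) / (2 b).
    Returns (weights, scales), ordered so that scales[0] <= scales[1].
    """
    # Only |x| enters the zero-mean Laplacian likelihood.
    x = np.abs(np.asarray(coeffs, dtype=float).ravel())

    # Assumed initialization: one narrow and one wide component
    # placed around the mean absolute coefficient value.
    m = x.mean() + eps
    scales = np.array([0.5 * m, 2.0 * m])
    weights = np.array([0.5, 0.5])

    for _ in range(n_iter):
        # E-step: responsibilities, computed in the log domain for stability.
        log_p = np.log(weights) - np.log(2.0 * scales) - x[:, None] / scales
        log_p -= log_p.max(axis=1, keepdims=True)
        resp = np.exp(log_p)
        resp /= resp.sum(axis=1, keepdims=True)

        # M-step: weighted maximum-likelihood updates.
        nk = resp.sum(axis=0) + eps
        weights = nk / nk.sum()
        scales = (resp * x[:, None]).sum(axis=0) / nk + eps

    order = np.argsort(scales)
    return weights[order], scales[order]


def feature_vector(coeffs):
    """Pack the mixture parameters into a 3-element vector (assumed layout).

    The second mixing weight is omitted because the two weights sum to one.
    """
    weights, scales = fit_laplacian_mixture(coeffs)
    return np.array([weights[0], scales[0], scales[1]])
```

With only two components, each EM iteration costs a small constant number of passes over the coefficients, which is in keeping with the low computational complexity noted in the abstract. In a retrieval setting, one such vector per clip (or per wavelet subband) could then be compared with a distance measure of choice; that comparison step is not shown here.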