2022 Fifth International Conference on Computational Intelligence and Communication Technologies (CCICT)

Abstract

The exponential growth of video content from multiple sources has driven the development of video summarization mechanisms, which generate short summaries from comparatively longer videos. Video summarization plays a significant role in a diverse range of applications, including secured surveillance, biometrics, education, sports, news, and movies. Several criteria used for video summarization are based on particular events or objects present in the input video. Audio, however, is also a key cue: a short summary can be obtained from the portions of a video in which some sound or voice is detected. This article presents an analysis of audio-based video summarization (ABVS) via data-driven techniques such as machine learning and deep learning. We also present a novel framework for ABVS using handcrafted or automatic feature engineering. An evaluation methodology, including performance metrics and datasets for assessing the effectiveness of various ABVS approaches, is also described. The study reveals several key issues related to ABVS that need further investigation by researchers in this field. Critical issues include the search for effective audio features and the scarcity of benchmark datasets for audio-based summary generation.
