2020 International Conference on Culture-oriented Science & Technology (ICCST)

Abstract

This paper proposes an efficient supervised video summarization algorithm built on a self-attention-based encoder-decoder network. Given an input video, a Bi-GRU encoder with a self-attention mechanism captures the contextual information of the video frames, and a GRU decoder, together with a regression network, predicts an importance score for every frame. Experiments and analysis on the public benchmark datasets TvSum and SumMe validate the superiority of the algorithm.
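
To make the frame-scoring idea concrete, the following is a minimal sketch and not the authors' implementation. It assumes scaled dot-product self-attention with identity projections (the abstract does not specify the attention form), omits the Bi-GRU encoder and GRU decoder entirely, and uses a single linear layer with a sigmoid as a stand-in for the regression network. The names `self_attention`, `importance_scores`, `w`, and `b` are hypothetical, and the frame features and weights are random placeholders rather than learned values.

```python
import math
import random


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(frames):
    """Scaled dot-product self-attention over frame feature vectors.

    Assumption: queries, keys, and values are the raw features
    (no learned projections), purely for illustration.
    """
    d = len(frames[0])
    scale = math.sqrt(d)
    contexts = []
    for q in frames:
        # Attention weights of this frame over every frame in the video.
        weights = softmax(
            [sum(a * b for a, b in zip(q, k)) / scale for k in frames]
        )
        # Context vector: attention-weighted sum of all frame features.
        ctx = [sum(wt * v[j] for wt, v in zip(weights, frames)) for j in range(d)]
        contexts.append(ctx)
    return contexts


def importance_scores(frames, w, b):
    """Map each context vector to a score in (0, 1) with a linear layer
    and a sigmoid (a hypothetical stand-in for the regression network)."""
    scores = []
    for ctx in self_attention(frames):
        z = sum(wi * ci for wi, ci in zip(w, ctx)) + b
        scores.append(1.0 / (1.0 + math.exp(-z)))
    return scores


# Placeholder data: 5 frames with 8-dimensional features.
random.seed(0)
frames = [[random.gauss(0, 1) for _ in range(8)] for _ in range(5)]
w = [random.gauss(0, 0.1) for _ in range(8)]
scores = importance_scores(frames, w, 0.0)
print(scores)  # one importance score per frame
```

In a supervised setting such scores would be trained against human-annotated frame importance, and a summary would then be assembled from the highest-scoring segments; neither step is shown here.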
