2019 IEEE International Conference on Big Data and Smart Computing (BigComp)

Abstract

In today's society, where an enormous number of documents are produced, the demand for automatic text summarization continues to grow. From a practical standpoint, query-oriented text summarization, which generates summaries focused on a given query, is expected to be more important than generic summarization, which simply summarizes an entire document. In deep-neural summarization models, two components are essential: Long Short-Term Memory (LSTM), which enables long-term information storage, and the attention mechanism, which emphasizes specific time steps in the vectors encoded by a Recurrent Neural Network (RNN). However, Koehn et al. [1] report that encoding accuracy degrades when texts longer than 60 tokens are fed into an LSTM. In the summarization task as well, failures in encoding long sentences, or the loss of relations between sentences, are considered major factors that degrade summary quality. In this paper, we propose a method that generates summaries by introducing sentence-unit vectors in addition to the word-unit vectors of the source document. By learning both a word-level attention mechanism and a sentence-level attention mechanism, we aim to generate summaries that reflect the importance of each sentence and the relations between sentences. In our experiments, we add the sentence-unit vector mechanism to a state-of-the-art query-focused summarization model and clarify how the presence or absence of the sentence vectors affects the generated summaries.
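To make the two-level attention idea concrete, the sketch below shows one plausible way word-level and sentence-level attention weights could be combined when computing a decoder context vector. This is an illustrative PyTorch sketch under our own assumptions, not the paper's implementation; the class name WordSentenceAttention, the bilinear scoring layers, and all tensor shapes are hypothetical.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class WordSentenceAttention(nn.Module):
    """Combines word-level and sentence-level attention over an
    RNN-encoded document (illustrative sketch, not the paper's code)."""

    def __init__(self, hidden_dim: int, query_dim: int):
        super().__init__()
        # Bilinear scoring against the query for word-level attention
        self.word_score = nn.Linear(query_dim, hidden_dim, bias=False)
        # Bilinear scoring against the query for sentence-level attention
        self.sent_score = nn.Linear(query_dim, hidden_dim, bias=False)

    def forward(self, word_states, sent_vectors, sent_index, query):
        # word_states:  (num_words, hidden_dim) per-token encoder states
        # sent_vectors: (num_sents, hidden_dim) one vector per sentence
        # sent_index:   (num_words,)            sentence id of each token
        # query:        (query_dim,)            encoded query vector

        # Word-level attention weights over all tokens
        w_scores = word_states @ self.word_score(query)   # (num_words,)
        w_alpha = F.softmax(w_scores, dim=0)

        # Sentence-level attention weights over sentence vectors
        s_scores = sent_vectors @ self.sent_score(query)  # (num_sents,)
        s_alpha = F.softmax(s_scores, dim=0)

        # Rescale each token's weight by its sentence's weight,
        # then renormalize so the combined weights sum to one
        combined = w_alpha * s_alpha[sent_index]
        combined = combined / combined.sum()

        # Context vector for the decoder at this time step
        context = (combined.unsqueeze(1) * word_states).sum(dim=0)
        return context, combined

# Usage with random tensors standing in for encoder outputs
attn = WordSentenceAttention(hidden_dim=256, query_dim=128)
words = torch.randn(40, 256)                       # 40 encoded tokens
sents = torch.randn(3, 256)                        # 3 sentence vectors
idx = torch.tensor([0] * 15 + [1] * 15 + [2] * 10) # token-to-sentence map
q = torch.randn(128)
ctx, weights = attn(words, sents, idx, q)
```

Multiplying the two weight distributions means a token receives high attention only when both its own state and its containing sentence are relevant to the query, which is one way to encode sentence importance and inter-sentence relations into the decoder's context.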