Abstract
Topic evolution analysis automatically tracks a set of concepts within a given dataset over time, helping researchers survey various research domains. Network-based topic evolution is a recent approach that replaces traditional text-based models with relational models, which allows topic correlation events to be detected. Topics are represented by co-occurrence relationships instead of word vectors, and they are connected over time through their positions in a network instead of their semantic similarities. This paper shows that topics and their network representations share meaningfully similar semantics. The existence of such contextual relationships allows topics to be labeled even when they have too few direct textual appearances in the document collection. Forty fields-of-study keywords, each with between 64,215 and 5.8 million related articles, were selected from the Microsoft Academic Graph dataset, which contains more than 200 million publications. The semantics of the topics within the forty topic networks were obtained from three sets of word embeddings trained on a collection of 10.5 million Medline abstracts published from 2000 to 2016. The word embeddings of the topics were then compared against their network-based representations, that is, their neighborhoods in the previous timeslot. Cosine similarities between topics and their neighbors consistently showed moderate correlations from 2001 to 2015, with higher values for topics already present in the topic networks than for newly emerging topics. The results suggest that topics can be labeled from the network structure alone, without semantic analysis. This capability is necessary for predicting topic evolution events such as topic emergence when future documents are unavailable.
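The comparison between a topic and its network-based representation can be illustrated with a minimal sketch. It assumes that a topic's neighborhood is summarized by the unweighted mean of its neighbors' embeddings in the previous timeslot; the abstract does not state how neighbors are aggregated, so this averaging step, the function names, and the toy vectors below are all hypothetical and are not taken from the paper.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def neighborhood_vector(topic, prev_network, embeddings):
    """Mean embedding of the topic's neighbors in the previous timeslot.

    Assumption: unweighted averaging; neighbors without an embedding are skipped.
    Returns None when no neighbor has an embedding.
    """
    neighbors = [n for n in prev_network.get(topic, ()) if n in embeddings]
    if not neighbors:
        return None
    dim = len(embeddings[neighbors[0]])
    return [
        sum(embeddings[n][i] for n in neighbors) / len(neighbors)
        for i in range(dim)
    ]


def topic_neighborhood_similarity(topic, prev_network, embeddings):
    """Cosine similarity between a topic's embedding and its neighborhood vector."""
    if topic not in embeddings:
        return None
    neighborhood = neighborhood_vector(topic, prev_network, embeddings)
    if neighborhood is None:
        return None
    return cosine_similarity(embeddings[topic], neighborhood)


# Hypothetical toy data: 2-dimensional "embeddings" and one previous-timeslot network.
toy_embeddings = {
    "asthma": [1.0, 0.0],
    "inhaler": [1.0, 1.0],
    "allergy": [1.0, -1.0],
}
toy_prev_network = {"asthma": {"inhaler", "allergy"}}

similarity = topic_neighborhood_similarity("asthma", toy_prev_network, toy_embeddings)
```

In this toy case the two neighbor vectors average to the topic's own direction, so the similarity is at its maximum. With real embeddings, a newly emerging topic would be scored against the neighborhood it acquires in the network rather than against its own, possibly sparse, textual occurrences.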