2016 IEEE 28th International Conference on Tools with Artificial Intelligence (ICTAI)

Abstract

The use of thesauri and taxonomies for science and technology information in scientometrics has been attracting attention. However, manual construction and maintenance of thesauri is expensive and time-consuming, so methods for semi-automatic construction and maintenance are being actively studied. We propose a method to expand an existing thesaurus using the abstracts of articles from state-of-the-art technological domains with limited structured information. Specifically, we consider a method for properly allocating new terms to the hierarchical structure of an existing thesaurus using word embedding, a rapidly evolving technique. In an experiment, 500-dimensional word vectors are constructed from 567,000 biomedical articles and are clustered after dimension reduction using principal component analysis. Then, semantic relations are estimated based on the spatial relations between the new term and any of the terms in the thesaurus. We then compared the results with those obtained from three experts. In the future, we will develop a recommendation system for new terms related to the existing terms to support semi-automatic thesaurus maintenance.
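
The allocation step described above, estimating where a new term belongs from its position relative to existing thesaurus terms in the embedding space, can be illustrated with a minimal sketch. The abstract does not state which spatial measure is used, so cosine similarity is an assumption here. The term names, the 3-dimensional toy vectors, and the helper `nearest_terms` are all hypothetical, and the principal component analysis and clustering steps are omitted.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def nearest_terms(new_vec, thesaurus_vecs, k=2):
    """Return the k thesaurus terms closest to new_vec (hypothetical helper).

    Cosine similarity is an assumed measure, not one named in the abstract.
    """
    scored = [(term, cosine(new_vec, vec)) for term, vec in thesaurus_vecs.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Toy 3-dimensional vectors standing in for real embeddings (illustrative only).
thesaurus_vecs = {
    "neoplasms": [0.9, 0.1, 0.0],
    "carcinoma": [0.8, 0.3, 0.1],
    "antibiotics": [0.0, 0.2, 0.9],
}
new_term_vec = [0.85, 0.25, 0.05]

# Candidate positions in the thesaurus for the new term, best match first.
candidates = nearest_terms(new_term_vec, thesaurus_vecs)
for term, score in candidates:
    print(f"{term}: {score:.3f}")
```

In a setting like the one described, the top-ranked existing terms would serve as candidate attachment points that an expert could then confirm or reject.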