Abstract: To mine the relationships among topics, authors, and time in large-scale scientific literature corpora, this paper proposes the Author-Topic over Time (AToT) model, based on the intra-features and inter-features of scientific literature. In AToT, a document is represented as a mixture of probabilistic topics, and each topic corresponds to a multinomial distribution over words and a beta distribution over time. The word-topic distribution is therefore influenced not only by word co-occurrence but also by document timestamps. Each author also corresponds to a multinomial distribution over topics. The word-topic and author-topic distributions describe, respectively, the evolution of topics and the changes in authors' research interests over time. The parameters of AToT are learned from the documents by Gibbs sampling. Experiments on a collection of 1700 NIPS conference papers show that AToT can characterize the evolution of latent topics, dynamically discover authors' research interests, and predict the authors related to a given topic. AToT also achieves lower perplexity than the author-topic model.
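The model structure described in the abstract can be illustrated with a minimal generative sketch. It is not the authors' implementation. It assumes that each word token picks one of the document's authors uniformly at random (as in the author-topic model) and that each token draws its own normalized timestamp in [0, 1] from the topic's beta distribution (as in Topics over Time). The function names, the sizes `K`, `V`, `A`, and all hyperparameter values are hypothetical and chosen only for illustration.

```python
import random


def sample_dirichlet(alpha, size, rng):
    """Draw from a symmetric Dirichlet by normalizing Gamma(alpha, 1) draws."""
    draws = [rng.gammavariate(alpha, 1.0) for _ in range(size)]
    total = sum(draws)
    return [d / total for d in draws]


def generate_document(doc_authors, n_words, theta, phi, psi, rng):
    """Generate (author, topic, word, timestamp) tuples for one document.

    theta[x]: author x's multinomial distribution over topics
    phi[z]:   topic z's multinomial distribution over words
    psi[z]:   (a, b) parameters of topic z's beta distribution over time
    """
    tokens = []
    for _ in range(n_words):
        # Assumption: the responsible author is chosen uniformly.
        x = rng.choice(doc_authors)
        # Topic drawn from that author's topic distribution.
        z = rng.choices(range(len(phi)), weights=theta[x])[0]
        # Word drawn from the topic's word distribution.
        w = rng.choices(range(len(phi[z])), weights=phi[z])[0]
        # Assumption: one normalized timestamp per token, drawn from the topic's beta.
        a, b = psi[z]
        t = rng.betavariate(a, b)
        tokens.append((x, z, w, t))
    return tokens


rng = random.Random(0)
K, V, A = 3, 8, 2  # hypothetical numbers of topics, vocabulary words, authors
theta = [sample_dirichlet(0.5, K, rng) for _ in range(A)]
phi = [sample_dirichlet(0.5, V, rng) for _ in range(K)]
psi = [(2.0, 5.0), (5.0, 2.0), (3.0, 3.0)]  # illustrative beta parameters per topic
doc = generate_document([0, 1], 10, theta, phi, psi, rng)
```

Inference runs in the opposite direction: given the observed words, authors, and timestamps, Gibbs sampling estimates `theta`, `phi`, and `psi`. That step is not shown here.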
SHI Qingwei, LI Yanni, GUO Pengliang. Dynamic finding of authors' research interests in scientific literature[J]. Journal of Computer Applications (计算机应用), 2013, 33(11): 3080-3083.