Multi-source text topic mining model based on Dirichlet multinomial allocation model
XU Liyang, HUANG Ruizhang, CHEN Yanping, QIAN Zhisen, LI Wanying
Journal of Computer Applications    2018, 38 (11): 3094-3099.   DOI: 10.11772/j.issn.1001-9081.2018041359
With the rapid growth in the number of text data sources, topic mining for multi-source text data has become a research focus in text mining. Because traditional topic models are mainly designed for a single source, applying them directly to multi-source data has many limitations. Therefore, a multi-source topic model based on the Dirichlet Multinomial Allocation (DMA) model was proposed, namely MSDMA (Multi-Source Dirichlet Multinomial Allocation), which accounts for the differences between sources in the word distribution of each topic and exploits the nonparametric clustering property of DMA. The main contributions of the proposed model are as follows: 1) it takes the characteristics of each source into account when modeling topics, and can learn source-specific word distributions for topic k; 2) it can improve topic discovery on sources with high noise and low information content through knowledge sharing across sources; 3) it can automatically learn the number of topics within each source without requiring that number to be specified in advance. Experimental results on a simulated dataset and two real datasets indicate that the proposed model extracts topic information more effectively and efficiently than state-of-the-art topic models.
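The third contribution, learning the number of topics without fixing it in advance, can be illustrated with a minimal sketch. This is not the authors' MSDMA model: it only simulates the Chinese restaurant process, the clustering prior that DMA approaches as its upper bound on the number of clusters grows large. The function name `sample_cluster_sizes` and the parameters `n_docs`, `alpha`, and `seed` are assumptions introduced for illustration.

```python
import random


def sample_cluster_sizes(n_docs, alpha, seed=0):
    """Assign documents one by one under a Chinese restaurant process prior.

    Hypothetical helper for illustration only; it ignores document content
    and shows just how the number of clusters emerges from the prior.
    """
    rng = random.Random(seed)
    counts = []  # counts[k] = number of documents currently in cluster k
    for _ in range(n_docs):
        # An existing cluster k is chosen with weight counts[k];
        # a brand-new cluster is chosen with weight alpha.
        weights = counts + [alpha]
        k = rng.choices(range(len(weights)), weights=weights)[0]
        if k == len(counts):
            counts.append(1)  # open a new cluster
        else:
            counts[k] += 1    # join an existing cluster
    return counts


if __name__ == "__main__":
    sizes = sample_cluster_sizes(n_docs=1000, alpha=2.0)
    print("clusters used:", len(sizes))
```

The number of clusters is never passed in; it grows with the data and with the concentration parameter `alpha`. A full model would additionally weight each choice by how well the document's words fit each cluster.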