[1] GHOSH R, ASUR S. Mining information from heterogeneous sources:a topic modeling approach[EB/OL].[2018-03-20]. http://www.hpl.hp.com/techreports/2013/HPL-2013-83.pdf. [2] HUANG R, YU G, WANG Z, et al. Dirichlet process mixture model for document clustering with feature partition[J]. IEEE Transactions on Knowledge and Data Engineering, 2013, 25(8):1748-1759. [3] GREEN P J, RICHARDSON S. Modelling heterogeneity with and without the Dirichlet process[J]. Scandinavian Journal of Statistics, 2001, 28(2):355-375. [4] 周建英, 王飞跃, 曾大军. 分层Dirichlet过程及其应用综述[J]. 自动化学报, 2011, 37(4):389-407.(ZHOU J Y, WANG F Y, ZENG D J. Hierarchical Dirichlet processes and their applications:a survey[J]. Acta Automatica Sinica, 2011, 37(4):389-407.) [5] ANTONIAK C E. Mixtures of Dirichlet processes with applications to Bayesian nonparametric problems[J]. The Annals of Statistics, 1974, 2(6):1152-1174. [6] 高悦, 王文贤, 杨淑贤. 一种基于狄利克雷过程混合模型的文本聚类算法[J]. 信息网络安全, 2015(11):60-65.(GAO Y, WANG W X, YANG S X. A document clustering algorithm based on Diriehlet process mixture model[J]. Netinfo Security,2015(11):60-65.) [7] JENSEN C S. Blocking Gibbs sampling for inference in large and complex Bayesian networks with applications in genetics[EB/OL].[2018-03-20]. http://vbn.aau.dk/ws/files/104290/csjensen.pdf. [8] YAN Y, HUANG R, MA C, et al. Improving document clustering for short texts by long documents via a Dirichlet multinomial allocation model[C]//Proceedings of the 1st International Joint Conference on Asia-Pacific Web (APWeb) and Web-Age Information Management (WAIM) Joint Conference on Web and Big Data. Berlin:Springer, 2017:626-641. [9] BLEI D M, NG A Y, JORDAN M I. Latent Dirichlet allocation[J]. Journal of Machine Learning Research, 2003, 3:993-1022. [10] PHAN X H, NGUYEN C T, LE D T, et al. A hidden topic-based framework toward building applications with short Web documents[J]. IEEE Transactions on Knowledge and Data Engineering, 2011, 23(7):961-976. [11] JIN O, LIU N N, ZHAO K, et al. Transferring topical knowledge from auxiliary long texts for short text clustering[C]//Proceedings of the 20th ACM International Conference on Information and Knowledge Management. New York:ACM, 2011:775-784. [12] HONG L, DOM B, GURUMURTHY S, et al. A time-dependent topic model for multiple text streams[C]//Proceedings of the 17th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. New York:ACM, 2011:832-840. [13] FOSTER A, LI H, MAIERHOFER G, et al. An extension of standard latent Dirichlet allocation to multiple corpora[EB/OL].[2018-03-20].http://evoq-eval.siam.org/Portals/0/Publications/SIURO/Vol9/AN_EXTENSION_STANDARD_LATENT_DIRICHLET_ALLOCATION.pdf?ver=2018-04-06-152049-177. [14] SALOMATIN K, YANG Y, LAD A. Multi-field correlated topic modeling[EB/OL].[2018-03-20].http://www.cs.cmu.edu/afs/cs.cmu.edu/Web/People/alad/papers/salomatin-sdm09.pdf. [15] TEH Y W, JORDAN M I, BEAL M J, et al. Sharing clusters among related groups:hierarchical Dirichlet processes[EB/OL].[2018-03-21].http://papers.nips.cc/paper/2698-sharing-clusters-among-related-groups-hierarchical-dirichlet-processes.pdf. [16] SMYTH P. Model selection for probabilistic clustering using cross-validated likelihood[J]. Statistics and Computing, 2000, 10(1):63-72. [17] CHEESEMAN P, KELLY J, SELF M, et al. Autoclass:A Bayesian classification system[M]//Proceedings of the Fifth International Conference on Machine Learning. San Francisco, CA:Morgan Kaufmann Publishers, 1988:54-64. [18] ZHONG S. Semi-supervised model-based document clustering:a comparative study[J]. Machine Learning, 2006, 65(1):3-29. [19] BELA A, FRIGYIK A, GUPTA M. Introduction to the Dirichlet distribution and related processes[EB/OL]. [2018-03-22].https://www2.ee.washington.edu/techsite/papers/documents/UWEETR-2010-0006.pdf. [20] TANG J, ZHANG J, YAO L, et al. ArnetMiner: extraction and mining of academic social networks[C]// Proceedings of the 14th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. New York: ACM, 2008: 990-998. [21] DHILLON I S, MODHA D S. Concept decompositions for large sparse text data using clustering[J]. Machine Learning, 2001, 42(1/2): 143-175. [22] YIN J, WANG J. A Dirichlet multinomial mixture model-based approach for short text clustering[C]// Proceedings of the 20th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. New York: ACM, 2014: 233-242. |