Journal of Computer Applications, 2017, Vol. 37, Issue (4): 1075-1082. DOI: 10.11772/j.issn.1001-9081.2017.04.1075

Construction method of mobile application similarity matrix based on latent Dirichlet allocation topic model

CHU Zheng1, YU Jiong1,2, WANG Jiayu1, WANG Yuefei1,2   

  1. School of Software, Xinjiang University, Urumqi Xinjiang 830008, China;
  2. School of Information Science and Engineering, Xinjiang University, Urumqi Xinjiang 830046, China
  • Received: 2016-09-26  Revised: 2016-12-25  Online: 2017-04-10  Published: 2017-04-19
  • Supported by:
    This work is partially supported by the National Natural Science Foundation of China (61462079, 61262088, 61562086, 61363083).

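Method for constructing mobile application similarity based on the LDA topic model

CHU Zheng1, YU Jiong1,2, WANG Jiayu1, WANG Yuefei1,2

  1. School of Software, Xinjiang University, Urumqi 830008;
  2. School of Information Science and Engineering, Xinjiang University, Urumqi 830046
  • Corresponding author: YU Jiong
  • About the authors: CHU Zheng (born 1991), male, from Nanyang, Henan, M.S. candidate, CCF member; research interests: data mining, in-memory computing, green computing. YU Jiong (born 1964), male, from Urumqi, Xinjiang, professor, Ph.D., CCF member; research interests: network security, grid computing, distributed computing. WANG Jiayu (born 1991), female, from Harbin, Heilongjiang, M.S. candidate, CCF member; research interests: data mining, mobile computing. WANG Yuefei (born 1991), male, from Urumqi, Xinjiang, Ph.D. candidate; research interests: data mining, in-memory computing, green computing.
  • Funding:
    National Natural Science Foundation of China (61462079, 61262088, 61562086, 61363083).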

Abstract: With the rapid development of the mobile Internet, extracting effective description information from a large number of mobile applications, and then providing effective and accurate recommendation strategies for mobile users, has become an urgent problem. At present, recommendation strategies are relatively traditional and mostly recommend applications according to a single attribute, such as download count, application name or application classification. To resolve the problems that the granularity of recommended applications is too coarse and that recommendations are inaccurate, a mobile application similarity matrix construction method based on Latent Dirichlet Allocation (LDA) was proposed. Starting from the application labels, a topic model distribution matrix of mobile applications was constructed and then used to build the mobile application similarity matrix. A method for converting the mobile application similarity matrix into a viable storage structure was also proposed. Extensive experiments demonstrate the feasibility of the proposed method: the application similarity achieved by the proposed method is 130 percent higher than that of the applications recommended by the existing 360 application market. The proposed method addresses the problem of overly coarse recommendation granularity in the mobile application recommendation process, so that recommendation results are more accurate.
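
The pipeline described in the abstract (topic distribution matrix, then similarity matrix, then a storage structure) can be illustrated with a minimal sketch. The abstract does not state which similarity measure or storage structure the paper uses, so the cosine similarity, the top-k neighbor lists, the application names and the topic values below are all assumptions made for illustration only; the topic distributions stand in for the output of an LDA model trained on application labels.

```python
import math

# Hypothetical topic distributions (one row per app, one column per topic).
# In practice these rows would come from an LDA model fitted on app labels.
topic_dist = {
    "app_a": [0.70, 0.20, 0.10],
    "app_b": [0.65, 0.25, 0.10],
    "app_c": [0.05, 0.15, 0.80],
}


def cosine(u, v):
    """Cosine similarity between two topic vectors (assumed measure)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def similarity_matrix(dist):
    """Pairwise similarity between all apps, stored as nested dicts."""
    names = sorted(dist)
    return {
        a: {b: cosine(dist[a], dist[b]) for b in names if b != a}
        for a in names
    }


def top_k_neighbors(sim, k):
    """Compact storage (assumed): keep only the k most similar apps per app."""
    return {
        a: sorted(row.items(), key=lambda kv: kv[1], reverse=True)[:k]
        for a, row in sim.items()
    }


sim = similarity_matrix(topic_dist)
neighbors = top_k_neighbors(sim, k=1)
```

Keeping only the k nearest neighbors per application is one way to avoid storing a full n-by-n matrix; it is offered here as a plausible reading of "viable storage structure", not as the structure the paper actually defines.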

Key words: similarity matrix, topic model, latent information, application recommendation, label

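Abstract: With the rapid development of the mobile Internet, extracting effective descriptive information from a large number of mobile applications, and thereby providing mobile users with effective and accurate recommendation strategies, has become particularly urgent. At present, the recommendation strategies of mobile application markets are relatively traditional; most recommend according to a single attribute of an application, such as download count, application name or application category. To address the problems of overly coarse recommendation granularity and inaccurate recommendation, a method for constructing mobile application similarity based on the Latent Dirichlet Allocation (LDA) topic model is proposed. Starting from application labels, the method constructs the topic model distribution matrix of the applications and uses this matrix to build the mobile application similarity matrix; a method for converting the similarity matrix into a feasible storage structure is also proposed. Experimental results show that the method is effective: compared with the applications recommended by the existing 360 application market, similarity is improved by 130%. The method solves the problem of overly coarse recommendation granularity in the mobile application recommendation process and makes recommendation results more accurate.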

Key words: similarity matrix, topic model, latent information, application recommendation, label

CLC Number: