Journal of Computer Applications (计算机应用) ›› 2018, Vol. 38 ›› Issue (9): 2515-2522. DOI: 10.11772/j.issn.1001-9081.2018020296

• Data Science and Technology •

Preference feature extraction based on Nyström method

YANG Meijiao (杨美姣), LIU Jinglei (刘惊雷)

  1. College of Computer and Control Engineering, Yantai University, Yantai Shandong 264005, China
  • Received: 2018-01-30 Revised: 2018-04-28 Online: 2018-09-10 Published: 2018-09-06
  • Contact: LIU Jinglei
  • About the authors: YANG Meijiao (born 1992), female, from Qixia, Shandong, master's student; main research interest: matrix factorization. LIU Jinglei (born 1970), male, from Linyi, Shanxi, professor, Ph.D., CCF member; main research interests: artificial intelligence and theoretical computer science.
  • Supported by:
    This work is partially supported by the National Natural Science Foundation of China (61572419, 61773331, 61703360) and the Shandong Province University Science and Technology Program (J17KA091).

Abstract: To address the low efficiency of feature extraction from movie ratings, a Nyström method combined with QR decomposition is proposed. First, landmark points are sampled adaptively; QR decomposition is then applied to the internal matrix, and the resulting factors are recombined with the internal matrix before eigendecomposition. The quality of a Nyström approximation depends closely on both the number of landmark points and the way they are selected: a set of representative landmarks is chosen to preserve the approximation after sampling, and adaptive sampling safeguards its accuracy. QR decomposition keeps the matrix computation stable and improves the accuracy of preference feature extraction. More accurate preference features in turn make the recommender system more stable and its recommendations more accurate. Finally, feature extraction experiments were carried out on a real dataset of audience movie ratings containing 480189 users and 17770 movies. The results show that, for the same number of landmark points, both the accuracy and the efficiency of the proposed algorithm improve to a certain degree: compared with no sampling, the time complexity falls from O(n³) to O(nc²) (c ≪ n), and relative to the standard Nyström method the error is kept below 25%.

Key words: adaptive Nyström method, feature extraction, kernel method, low-rank approximation, QR decomposition
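
The pipeline outlined in the abstract (adaptive landmark sampling, a QR step, then eigendecomposition of a small c × c matrix) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation. It assumes an RBF kernel on synthetic data, a simple residual-based adaptive sampling rule, and the common variant that takes the thin QR factorization of the sampled column matrix C; the function names `rbf_kernel`, `nystrom_qr` and `adaptive_landmarks` and all parameter values are hypothetical.

```python
import numpy as np

def rbf_kernel(X, gamma=0.5):
    # Dense RBF kernel matrix (assumed kernel, for illustration only).
    sq = np.sum(X ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * X @ X.T
    return np.exp(-gamma * np.maximum(d2, 0.0))

def nystrom_qr(K, idx, tol=1e-10):
    # Approximate eigenpairs of K from the landmark columns listed in idx.
    C = K[:, idx]                      # n x c sampled columns
    W = K[np.ix_(idx, idx)]            # c x c landmark (internal) block
    Q, R = np.linalg.qr(C)             # thin QR, costs O(n c^2)
    # Truncated pseudo-inverse of W via its own eigendecomposition.
    w, Vw = np.linalg.eigh((W + W.T) / 2.0)
    keep = w > tol * w.max()
    W_pinv = (Vw[:, keep] / w[keep]) @ Vw[:, keep].T
    M = R @ W_pinv @ R.T               # small c x c matrix
    evals, V = np.linalg.eigh((M + M.T) / 2.0)
    order = np.argsort(evals)[::-1]    # largest eigenvalues first
    return evals[order], Q @ V[:, order]

def adaptive_landmarks(K, c, rounds=4, seed=0):
    # Assumed rule: add landmarks in batches, with probability proportional
    # to the squared residual column norms of the current approximation.
    # The residual uses the full K here, which a real system would avoid.
    rng = np.random.default_rng(seed)
    n = K.shape[0]
    batch = max(1, c // rounds)
    idx = rng.choice(n, size=batch, replace=False).tolist()
    while len(idx) < c:
        evals, U = nystrom_qr(K, idx)
        resid = K - (U * evals) @ U.T
        p = np.sum(resid ** 2, axis=0) + 1e-12
        p[idx] = 0.0                   # never pick a landmark twice
        k = min(batch, c - len(idx))
        idx.extend(rng.choice(n, size=k, replace=False, p=p / p.sum()).tolist())
    return np.array(idx)

# Usage on synthetic data (not the movie rating dataset from the paper).
X = np.random.default_rng(1).normal(size=(200, 2))
K = rbf_kernel(X)
idx = adaptive_landmarks(K, c=40)
evals, U = nystrom_qr(K, idx)
rel_err = np.linalg.norm(K - (U * evals) @ U.T) / np.linalg.norm(K)
```

Because Q has orthonormal columns and V is orthogonal, the returned eigenvectors U = QV are orthonormal, and the only eigendecompositions performed are on c × c matrices.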

CLC Number: