Preference feature extraction based on Nyström method

doi:10.11772/j.issn.1001-9081.2018020296

Journal of Computer Applications, 2018, Vol. 38, Issue (9): 2515-2522. DOI: 10.11772/j.issn.1001-9081.2018020296

YANG Meijiao, LIU Jinglei

College of Computer and Control Engineering, Yantai University, Yantai Shandong 264005, China

Received: 2018-01-30 Revised: 2018-04-28 Online: 2018-09-06 Published: 2018-09-10
Contact: LIU Jinglei
About the authors: YANG Meijiao (born 1992), female, from Qixia, Shandong, M.S. candidate; main research interest: matrix factorization. LIU Jinglei (born 1970), male, from Linyi, Shanxi, professor, Ph.D., CCF member; main research interests: artificial intelligence and theoretical computer science.
Supported by:
This work is partially supported by the National Natural Science Foundation of China (61572419, 61773331, 61703360) and the Shandong Province University Science and Technology Program (J17KA091).

Abstract

Abstract: To address the low efficiency of feature extraction from movie ratings, a Nyström method combined with QR decomposition was proposed. First, landmark points were sampled with an adaptive method; then QR decomposition was applied to the inner matrix, and the resulting factors were recombined with the inner matrix before eigendecomposition. The quality of a Nyström approximation depends closely on how many landmark points are selected and on how they are selected, so a set of representative landmarks was chosen to preserve the approximation after sampling, and the adaptive sampling method ensures its accuracy. QR decomposition ensures the stability of the matrix and improves the accuracy of preference feature extraction. The more accurate the extracted preference features, the more stable the recommender system and the more accurate its recommendations. Finally, feature extraction experiments were carried out on a real dataset of viewers' movie ratings containing 480189 users and 17770 movies. The results show that, for the same number of landmarks, the proposed algorithm improves both accuracy and efficiency: relative to working without sampling, the time complexity drops from O(n³) to O(nc²) (c<<n), and compared with the standard Nyström method the error is kept below 25%.

Key words: adaptive Nyström method, feature extraction, kernel method, low rank approximation, QR decomposition
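
The pipeline outlined in the abstract (adaptive landmark sampling, a QR step, then eigendecomposition of a small matrix) can be illustrated with a minimal NumPy sketch. It is not the authors' algorithm. The abstract applies QR to the inner matrix, whereas this sketch swaps in a common variant that takes the thin QR of the sampled column block C, so that K ≈ C W⁺ Cᵀ = Q (R W⁺ Rᵀ) Qᵀ and only a c×c matrix is eigendecomposed. The function names `adaptive_landmarks` and `nystrom_qr_eig`, the batch size, the tolerance `tol` and the residual-weighted sampling rule are all assumptions made for illustration. The sampler also reads the full matrix K, which a large-scale implementation would avoid.

```python
import numpy as np

def adaptive_landmarks(K, c, batch=5, seed=0):
    """Pick c distinct column indices in batches (hypothetical helper).

    Each new batch is drawn with probability proportional to a column's
    squared residual after projection onto the columns chosen so far.
    """
    rng = np.random.default_rng(seed)
    n = K.shape[0]
    idx = list(rng.choice(n, size=min(batch, c), replace=False))
    while len(idx) < c:
        Q, _ = np.linalg.qr(K[:, idx])        # orthonormal basis of chosen columns
        resid = K - Q @ (Q.T @ K)             # part of K not yet captured
        p = np.sum(resid ** 2, axis=0)
        p[idx] = 0.0                          # never re-pick a landmark
        if p.sum() <= 0.0:                    # residual vanished: uniform fallback
            p = np.ones(n)
            p[idx] = 0.0
        take = min(batch, c - len(idx), int(np.count_nonzero(p)))
        new = rng.choice(n, size=take, replace=False, p=p / p.sum())
        idx.extend(new.tolist())
    return np.array(idx)

def nystrom_qr_eig(K, idx, k, tol=1e-10):
    """Approximate the top-k eigenpairs of a symmetric PSD matrix K
    from the landmark columns idx (QR-on-C variant, an assumption)."""
    C = K[:, idx]                             # n x c sampled columns
    W = C[idx, :]                             # c x c inner matrix
    # Pseudo-inverse of W via its eigendecomposition, dropping tiny eigenvalues.
    w, V = np.linalg.eigh((W + W.T) / 2.0)
    keep = w > tol * w.max()
    W_pinv = (V[:, keep] / w[keep]) @ V[:, keep].T
    Q, R = np.linalg.qr(C)                    # thin QR, costs O(n c^2)
    M = R @ W_pinv @ R.T                      # c x c core: K ~= Q M Q^T
    vals, vecs = np.linalg.eigh((M + M.T) / 2.0)
    order = np.argsort(vals)[::-1][:k]        # largest eigenvalues first
    return vals[order], Q @ vecs[:, order]    # eigenvectors are orthonormal

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.standard_normal((300, 8))         # toy data, not the movie-rating set
    K = X @ X.T                               # rank-8 linear kernel
    idx = adaptive_landmarks(K, c=24)
    vals, U = nystrom_qr_eig(K, idx, k=8)
    err = np.linalg.norm(K - (U * vals) @ U.T) / np.linalg.norm(K)
    print(f"relative Frobenius error: {err:.2e}")
```

In this sketch the thin QR of the n×c block costs O(nc²) and every eigendecomposition is on a c×c matrix, which is where the O(nc²) figure quoted in the abstract comes from. A full eigendecomposition of the n×n matrix would cost O(n³).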

CLC number: TP181

YANG Meijiao, LIU Jinglei. Preference feature extraction based on Nyström method[J]. Journal of Computer Applications, 2018, 38(9): 2515-2522.

