Data imputation using matrix factorization based on session-based temporal similarity

doi:10.11772/j.issn.1001-9081.2018010264

Journal of Computer Applications ›› 2018, Vol. 38 ›› Issue (8): 2236-2242. DOI: 10.11772/j.issn.1001-9081.2018010264

QIAO Yongwei¹, ZHANG Yuxiang², XIAO Chunjing²,³

1. Engineering and Technical Training Center, Civil Aviation University of China, Tianjin 300300, China;
2. College of Computer Science and Technology, Civil Aviation University of China, Tianjin 300300, China;
3. School of Electronics and Information Engineering, Hebei University of Technology, Tianjin 300401, China

Received: 2018-01-29 Revised: 2018-03-22 Online: 2018-08-10 Published: 2018-08-11
Corresponding author: QIAO Yongwei
About the authors: QIAO Yongwei (born 1976), male, from Qixian, Shanxi, is a lecturer with a master's degree; his research interests include machine learning and intelligent information processing for civil aviation. ZHANG Yuxiang (born 1975), male, from Datong, Shanxi, is an associate professor with a Ph.D.; his research interests include machine learning, data mining and artificial intelligence. XIAO Chunjing (born 1978), female, from Tangshan, Hebei, is a lecturer and Ph.D. candidate; her research interests include recommender systems and data mining.
Supported by:
This work is partially supported by the National Natural Science Foundation of China (U1533104), the Natural Science Foundation of Hebei Province (E2016202341), and the Fundamental Research Funds for the Central Universities (ZXH2012P009).

Abstract

Abstract: Existing data imputation methods consider only rating information and traditional similarity measures, so they cannot capture the actual relationships between users. To alleviate data sparsity and improve recommendation accuracy, a matrix factorization data imputation method based on session-based temporal similarity was proposed. Firstly, the defects of traditional similarity were analyzed, and a session-based temporal similarity measure was defined from temporal similarity and dissimilarity; it combines time context with rating information to better capture the actual relationships between users and thus identify the neighbors of the target user. Secondly, a key item set to be imputed, which carries the potential influence factors of both users and items, was extracted from the neighbors of the target user and the items they consumed, and this set was imputed by matrix factorization. Then, the probabilistic topic distribution of each user in each time period was obtained by Latent Dirichlet Allocation (LDA), and a dynamic user preference model was built with temporal penalty weights. Finally, items were recommended based on the correlation of probabilistic topic distributions between users and on user-based collaborative filtering. Experimental results show that, compared with other data imputation methods, the proposed method reduces the Mean Absolute Error (MAE) and improves recommendation performance under different sparsity levels.

Key words: data sparsity, data imputation, temporal context, matrix factorization, temporal weight
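The imputation step described in the abstract can be illustrated with a minimal sketch. The abstract does not give the factorization objective or how the key item set is selected, so everything below is an assumption made for illustration: an L2-regularized rank-2 factorization fitted by full-batch gradient descent on the observed entries, a hypothetical toy rating sub-matrix `R` in which 0 marks a missing rating, and the helper names `factorize`, `impute` and `mae`. It is not the authors' implementation.

```python
import numpy as np


def factorize(ratings, mask, rank=2, lr=0.01, reg=0.05, epochs=2000, seed=0):
    """Fit ratings ~ P @ Q.T using only observed cells (mask == True).

    Assumed objective: squared error on observed cells plus L2 penalty,
    minimized by full-batch gradient descent.
    """
    rng = np.random.default_rng(seed)
    n_users, n_items = ratings.shape
    P = 0.1 * rng.standard_normal((n_users, rank))  # user latent factors
    Q = 0.1 * rng.standard_normal((n_items, rank))  # item latent factors
    for _ in range(epochs):
        err = mask * (ratings - P @ Q.T)  # error is zero on missing cells
        grad_P = err @ Q - reg * P
        grad_Q = err.T @ P - reg * Q
        P += lr * grad_P
        Q += lr * grad_Q
    return P, Q


def impute(ratings, mask, **kwargs):
    """Keep observed ratings and fill missing cells with clipped predictions."""
    P, Q = factorize(ratings, mask, **kwargs)
    predicted = np.clip(P @ Q.T, 1.0, 5.0)  # assumed 1-5 rating scale
    return np.where(mask, ratings, predicted)


def mae(predicted, actual):
    """Mean Absolute Error between two equally shaped arrays."""
    predicted = np.asarray(predicted, dtype=float)
    actual = np.asarray(actual, dtype=float)
    return float(np.mean(np.abs(predicted - actual)))


# Hypothetical rating sub-matrix for a key item set (rows: users, columns: items).
R = np.array([
    [5.0, 3.0, 0.0, 1.0, 4.0],
    [4.0, 0.0, 0.0, 1.0, 5.0],
    [1.0, 1.0, 5.0, 0.0, 2.0],
    [0.0, 1.0, 4.0, 5.0, 1.0],
])
mask = R > 0  # True where a rating was observed

filled = impute(R, mask)
print(np.round(filled, 2))
print("MAE on observed cells:", mae((filled * 0 + np.clip(
    np.where(mask, filled, 0), 0, 5))[mask], R[mask]))
```

In this sketch the observed ratings are left untouched and only the missing cells receive predicted values, which matches the idea of imputing a selected sub-matrix rather than replacing known ratings. The `mae` helper is the standard definition of the metric named in the abstract; evaluating it meaningfully requires held-out ratings, which this toy example does not include.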

CLC number:

TP391

QIAO Yongwei, ZHANG Yuxiang, XIAO Chunjing. Data imputation using matrix factorization based on session-based temporal similarity[J]. Journal of Computer Applications, 2018, 38(8): 2236-2242.

