Commodity recommendation model based on improved deep Q network structure

doi:10.11772/j.issn.1001-9081.2019112002

Abstract

Abstract: Traditional recommendation methods suffer from data sparsity and poor feature recognition. To address these problems, positive and negative feedback datasets with time-series properties were constructed from implicit feedback. Because both the feedback datasets and commodity purchases exhibit strong time-series features, a Long Short-Term Memory (LSTM) network was introduced as a component of the model. Considering that the user's own characteristics and the returns of action selection are determined by different input data, the deep Q network based on the competitive (dueling) architecture was improved: by integrating users' positive and negative feedback with the time-series features of commodity purchases, a commodity recommendation model based on the improved deep Q network structure was designed. In the model, the positive and negative feedback data were trained in a differentiated manner, and the time-series features of commodity purchases were extracted. On the Retailrocket dataset, compared with the best-performing of the Factorization Machine (FM), Wide & Deep learning (W&D), and Collaborative Filtering (CF) models, the proposed model improved precision, recall, Mean Average Precision (MAP), and Normalized Discounted Cumulative Gain (NDCG) by 158.42%, 89.81%, 95.00%, and 65.67%, respectively. In addition, Dueling Bandit Gradient Descent (DBGD) was used as the exploration method to alleviate the low diversity of recommended commodities.

Key words: deep reinforcement learning, positive and negative feedback dataset, competitive network architecture, Long Short-Term Memory (LSTM) network, commodity recommendation
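
The abstract describes the competitive architecture only in words: one part of the network reflects the user's own characteristics and another reflects the return of each action selection. As a rough illustration, the sketch below uses the standard dueling aggregation, Q(s, a) = V(s) + A(s, a) - mean(A), to combine a state value with per-commodity advantages and then rank the candidates. It is not the authors' improved structure, whose LSTM inputs and layers are not given here. The function names and all numbers are hypothetical.

```python
from typing import List


def dueling_q_values(state_value: float, advantages: List[float]) -> List[float]:
    """Standard dueling aggregation: Q(s, a) = V(s) + A(s, a) - mean_a' A(s, a').

    state_value stands in for the output of the stream that models the user's
    own characteristics; advantages stands in for the stream that scores each
    candidate commodity. Both are plain numbers here (an assumption), not
    outputs of a trained network.
    """
    if not advantages:
        raise ValueError("need at least one candidate commodity")
    mean_advantage = sum(advantages) / len(advantages)
    return [state_value + a - mean_advantage for a in advantages]


def recommend_top_k(q_values: List[float], k: int) -> List[int]:
    """Return the indices of the k candidates with the highest Q values."""
    ranked = sorted(range(len(q_values)), key=lambda i: q_values[i], reverse=True)
    return ranked[:k]


# Hypothetical values for one user state and four candidate commodities.
state_value = 0.5
advantages = [0.2, -0.1, 0.5, 0.0]

q = dueling_q_values(state_value, advantages)
top2 = recommend_top_k(q, 2)
print(q, top2)
```

Subtracting the mean advantage keeps the two streams identifiable: the average of the resulting Q values equals the state value, so the advantage stream only expresses how much better or worse each commodity is relative to the others.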

CLC Number: TP181

FU Kui, LIANG Shaoqing, LI Bing. Commodity recommendation model based on improved deep Q network structure[J]. Journal of Computer Applications, 2020, 40(9): 2613-2621.

