Journal of Computer Applications, 2022, Vol. 42, Issue (2): 426-432. DOI: 10.11772/j.issn.1001-9081.2021050907
Special Issue: Artificial Intelligence
Yuxi LIU, Yuqi LIU, Zonglin ZHANG, Zhihua WEI, Ran MIAO
Received: 2021-06-01
Revised: 2021-07-16
Accepted: 2021-07-19
Online: 2022-02-11
Published: 2022-02-10
About the author: LIU Yuxi, born in 2001 in Nanping, Fujian. Her research interests include machine learning and deep learning.
Yuxi LIU, Yuqi LIU, Zonglin ZHANG, Zhihua WEI, Ran MIAO. News recommendation model with deep feature fusion injecting attention mechanism[J]. Journal of Computer Applications, 2022, 42(2): 426-432.
URL: https://www.joca.cn/EN/10.11772/j.issn.1001-9081.2021050907
Symbol | Meaning |
---|---|
… | Length of a news text (number of words) |
… | Vector formed by the ordered word vectors of one news item |
… | Vector formed by the ordered word vectors of one news item after the convolution operation |
… | Number of convolution kernels |
… | Number of news items browsed by one user |
… | Query matrix of the word-level additive attention mechanism |
… | The …-th … |
… | The merged … after encoded representation |
… | The … output by the GRU module |
… | Dimension of the user feature vector |
… | The … output by multi-head self-attention |
… | Query matrix of the news-level additive attention mechanism |
… | The …-th … |
… | Feature vector of one user |
Tab. 1 Definition of symbols
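The word-level and news-level additive-attention query matrices in Tab. 1 both implement the same pooling operation: project the inputs, score them against a learnable query, and take the softmax-weighted sum. A minimal NumPy sketch follows; the dimensions and names are illustrative, not the paper's.

```python
import numpy as np

def softmax(x):
    e = np.exp(x - np.max(x))  # subtract max for numerical stability
    return e / e.sum()

def additive_attention(H, W, q):
    """Pool a sequence of representations into a single vector.

    H: (n, d) word (or news) representations
    W: (d, a) projection into the attention space
    q: (a,)   learnable query vector
    """
    scores = np.tanh(H @ W) @ q   # (n,) unnormalized attention scores
    alpha = softmax(scores)       # (n,) attention weights, summing to 1
    return alpha @ H, alpha       # (d,) weighted sum of the inputs

rng = np.random.default_rng(0)
H = rng.normal(size=(5, 8))       # e.g. 5 words, 8-dimensional vectors
W = rng.normal(size=(8, 4))
q = rng.normal(size=4)
rep, alpha = additive_attention(H, W, q)
```

The same function pools word vectors into a news vector at the word level and news vectors into a user vector at the news level, with separate learned `W` and `q`.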
Category | Chinese dataset | English dataset |
---|---|---|
Number of users | 9 457 | 100 000 |
Number of news items | 100 197 | 120 962 |
Average news title length | 14.00 | 11.52 |
Average news body length | 584.00 | 585.05 |
Tab. 2 Statistics of datasets
Name | Meaning | Value |
---|---|---|
Embedding dim | Word embedding dimension | 256 |
Seq length | News sequence length | 300 |
Num classes | Number of news categories | 10 |
Num filters | Number of convolution kernels | 256 |
Kernel size | Convolution kernel size | 3×256 |
Vocab size | Vocabulary size | 5 000 |
Hidden dim | Number of neurons in the fully connected layer | 256 |
Dropout keep prob | Keep probability for dropout regularization | 0.5 |
Learning rate | Learning rate | 2×10^-4 |
Batch size | Training batch size | 64 |
Num epochs | Total number of training epochs | 10 |
Print per batch | Number of batches between printed results | 10 |
Save per batch | Number of batches between TensorBoard saves | 10 |
Attention size | Dimension of the news-level additive attention mechanism | 128 |
Query vector dim | Dimension of the attention query vector | 128 |
Candidate num | Number of candidate news items | 5 |
Click num | Number of news items already browsed by the user | 20 |
Num attention heads | Number of heads in multi-head attention | 16 |
Tab. 3 Hyperparameter values
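The values in Tab. 3 can be collected into a single configuration object; the key names below are hypothetical, while the values are copied from the table.

```python
# Hypothetical key names; values are taken from Tab. 3.
HPARAMS = {
    "embedding_dim": 256,       # word embedding dimension
    "seq_length": 300,          # news sequence length
    "num_classes": 10,          # number of news categories
    "num_filters": 256,         # number of convolution kernels
    "kernel_size": (3, 256),    # convolution kernel size
    "vocab_size": 5000,
    "hidden_dim": 256,          # fully connected layer width
    "dropout_keep_prob": 0.5,
    "learning_rate": 2e-4,
    "batch_size": 64,
    "num_epochs": 10,
    "print_per_batch": 10,      # batches between printed results
    "save_per_batch": 10,       # batches between TensorBoard saves
    "attention_size": 128,      # news-level additive attention dimension
    "query_vector_dim": 128,
    "candidate_num": 5,         # candidate news items per impression
    "click_num": 20,            # browsed news items per user
    "num_attention_heads": 16,
}

# Sanity check: the embedding dimension must split evenly across heads,
# giving 256 / 16 = 16 dimensions per attention head.
assert HPARAMS["embedding_dim"] % HPARAMS["num_attention_heads"] == 0
```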
Model | nDCG | MRR | Convergence time |
---|---|---|---|
Proposed model | 0.820 6 | 0.781 9 | 0:01:49 |
NRMS | 0.794 7 | 0.763 8 | 0:06:36 |
TANR | 0.801 4 | 0.756 9 | 0:02:14 |
LSTUR | 0.822 4 | 0.788 4 | 0:01:58 |
NAML | 0.782 2 | 0.755 6 | 0:02:52 |
Tab. 4 Experimental results of different models on Chinese dataset
Model | nDCG | MRR | Convergence time |
---|---|---|---|
Proposed model | 0.946 8 | 0.977 7 | 0:10:34 |
NRMS | 0.945 1 | 0.966 3 | 0:21:14 |
TANR | 0.940 9 | 0.973 6 | 0:09:41 |
LSTUR | 0.946 1 | 0.977 4 | 0:09:09 |
NAML | 0.938 1 | 0.971 0 | 0:11:39 |
DKN | 0.930 5 | 0.965 2 | 0:15:48 |
Tab. 5 Experimental results of different models on English dataset
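nDCG and MRR, the metrics reported in Tabs. 4 and 5, are standard ranking measures. A sketch under their usual definitions (binary click labels, base-2 logarithmic discount); the exact evaluation script used in the paper is not shown in this excerpt.

```python
import math

def ndcg_at_k(relevance, k):
    """Normalized discounted cumulative gain over the top-k ranked items."""
    dcg = sum(rel / math.log2(i + 2) for i, rel in enumerate(relevance[:k]))
    ideal = sorted(relevance, reverse=True)
    idcg = sum(rel / math.log2(i + 2) for i, rel in enumerate(ideal[:k]))
    return dcg / idcg if idcg > 0 else 0.0

def mrr(labels):
    """Reciprocal rank of the first clicked item (label 1) in ranked order."""
    for rank, label in enumerate(labels, start=1):
        if label:
            return 1.0 / rank
    return 0.0

# Five candidate news items (as in Tab. 3), clicked item ranked second:
print(mrr([0, 1, 0, 0, 0]))         # 0.5
print(ndcg_at_k([0, 1, 0, 0, 0], 5))  # 1/log2(3) ≈ 0.631
```

Both metrics are averaged over all impressions, so one poorly ranked impression lowers the score only proportionally.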
Attention configuration | nDCG | MRR |
---|---|---|
No attention | 0.743 8 | 0.662 8 |
Word-level attention only | 0.815 6 | 0.662 8 |
News-level attention only | 0.745 7 | 0.772 6 |
Both levels of attention | 0.820 6 | 0.781 9 |
Tab. 6 Performance comparison of different attention mechanisms
Dataset | Metric | With time-series prediction module | Without time-series prediction module | Improvement with the module /% |
---|---|---|---|---|
Chinese dataset | nDCG | 0.820 6 | 0.786 5 | 4.33 |
 | MRR | 0.781 9 | 0.756 9 | 3.30 |
English dataset | nDCG | 0.946 8 | 0.934 1 | 1.36 |
 | MRR | 0.977 7 | 0.962 6 | 1.57 |
Tab. 7 Results of the proposed model with and without the time-series prediction module
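The time-series prediction module isolated in Tab. 7 is GRU-based (Tab. 1 lists a GRU-module output, and references [12-13] cover GRUs for sequence modeling). A minimal NumPy sketch of one GRU step folding a user's browsed-news vectors into a hidden state, following the formulation of Chung et al. [13]; the weight shapes and the choice of the last hidden state as the user representation are illustrative assumptions, not the paper's exact design.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gru_step(h, x, p):
    """One GRU update: update gate z, reset gate r, candidate state."""
    z = sigmoid(x @ p["Wz"] + h @ p["Uz"])              # update gate
    r = sigmoid(x @ p["Wr"] + h @ p["Ur"])              # reset gate
    h_cand = np.tanh(x @ p["Wh"] + (r * h) @ p["Uh"])   # candidate state
    return (1 - z) * h + z * h_cand                     # interpolate old/new

def encode_history(news_vecs, p, hidden_dim):
    """Fold a user's browsed-news vectors, in order, into one hidden state."""
    h = np.zeros(hidden_dim)
    for x in news_vecs:
        h = gru_step(h, x, p)
    return h

rng = np.random.default_rng(0)
d = 8  # illustrative news-vector and hidden dimension
p = {k: rng.normal(scale=0.1, size=(d, d))
     for k in ("Wz", "Uz", "Wr", "Ur", "Wh", "Uh")}
history = rng.normal(size=(20, d))  # 20 browsed news items, as in Tab. 3
user_vec = encode_history(history, p, d)
```

Because the state is always a convex combination of the previous state and a tanh-bounded candidate, every component of the hidden state stays in (-1, 1).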
[1] OKURA S, TAGAMI Y, ONO S, et al. Embedding-based news recommendation for millions of users [C]// Proceedings of the 23rd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. New York: ACM, 2017: 1933-1942. 10.1145/3097983.3098108
[2] WANG H W, ZHANG F Z, XIE X, et al. DKN: deep knowledge-aware network for news recommendation [C]// Proceedings of the 2018 World Wide Web Conference. Republic and Canton of Geneva: International World Wide Web Conferences Steering Committee, 2018: 1835-1844. 10.1145/3178876.3186175
[3] WU C H, WU F Z, GE S Y, et al. Neural news recommendation with multi-head self-attention [C]// Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing. Stroudsburg, PA: Association for Computational Linguistics, 2019: 6389-6394. 10.18653/v1/d19-1671
[4] YIN H L. Research and implementation of recommendation system based on deep learning [D]. Chengdu: University of Electronic Science and Technology of China, 2019: 1-3.
[5] YANG W, TANG R, LU L. News recommendation method by fusion of content-based recommendation and collaborative filtering [J]. Journal of Computer Applications, 2016, 36(2): 414-418. 10.11772/j.issn.1001-9081.2016.02.0414
[6] TIAN X, DING Q, LIAO Z H, et al. Survey on deep learning based news recommendation algorithm [J]. Journal of Frontiers of Computer Science and Technology, 2021, 15(6): 971-998.
[7] WU C H, WU F Z, AN M X, et al. Neural news recommendation with topic-aware news representation [C]// Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics. Stroudsburg, PA: Association for Computational Linguistics, 2019: 1154-1159. 10.18653/v1/p19-1110
[8] AN M X, WU F Z, WU C H, et al. Neural news recommendation with long- and short-term user representations [C]// Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics. Stroudsburg, PA: Association for Computational Linguistics, 2019: 336-345. 10.18653/v1/p19-1033
[9] WU C H, WU F Z, AN M X, et al. Neural news recommendation with attentive multi-view learning [C]// Proceedings of the 28th International Joint Conference on Artificial Intelligence. [S.l.]: IJCAI Organization, 2019: 3863-3869. 10.24963/ijcai.2019/536
[10] GU J X, WANG Z H, KUEN J, et al. Recent advances in convolutional neural networks [J]. Pattern Recognition, 2018, 77: 354-377. 10.1016/j.patcog.2017.10.013
[11] VASWANI A, SHAZEER N, PARMAR N, et al. Attention is all you need [C]// Proceedings of the 31st International Conference on Neural Information Processing Systems. Red Hook, NY: Curran Associates Inc., 2017: 6000-6010.
[12] YAMAK P T, LI Y J, GADOSEY P K. A comparison between ARIMA, LSTM, and GRU for time series forecasting [C]// Proceedings of the 2nd International Conference on Algorithms, Computing and Artificial Intelligence. New York: ACM, 2019: 49-55. 10.1145/3377713.3377722
[13] CHUNG J, GULCEHRE C, CHO K, et al. Empirical evaluation of gated recurrent neural networks on sequence modeling [EB/OL]. (2014-12-11) [2021-03-10].
[14] WU F Z, QIAO Y, CHEN J H, et al. MIND: a large-scale dataset for news recommendation [C]// Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics. Stroudsburg, PA: Association for Computational Linguistics, 2020: 3597-3606. 10.18653/v1/2020.acl-main.331
[15] KINGMA D P, BA J L. Adam: a method for stochastic optimization [EB/OL]. (2017-01-30) [2021-03-10].