Journal of Computer Applications ›› 2022, Vol. 42 ›› Issue (2): 426-432.DOI: 10.11772/j.issn.1001-9081.2021050907
• Artificial intelligence •
Yuxi LIU, Yuqi LIU, Zonglin ZHANG, Zhihua WEI, Ran MIAO
Received: 2021-06-01
Revised: 2021-07-16
Accepted: 2021-07-19
Online: 2022-02-11
Published: 2022-02-10
About the author: LIU Yuxi, born in 2001 in Nanping, Fujian. Her research interests include machine learning and deep learning.
Yuxi LIU, Yuqi LIU, Zonglin ZHANG, Zhihua WEI, Ran MIAO. News recommendation model with deep feature fusion injecting attention mechanism[J]. Journal of Computer Applications, 2022, 42(2): 426-432.
URL: http://www.joca.cn/EN/10.11772/j.issn.1001-9081.2021050907
Symbol | Meaning |
---|---|
 | Length of a news text (number of words) |
 | Vector formed by all word vectors of one news item, in order |
 | Vector formed by all word vectors of one news text after convolution |
 | Number of convolution kernels |
 | Number of news items browsed by one user |
 | Query matrix of the word-level additive attention mechanism |
 | The …-th … |
 | … merged after encoding |
 | … output by the GRU module |
 | Dimension of the user feature vector |
 | … output by the multi-head self-attention |
 | Query matrix of the news-level additive attention mechanism |
 | The …-th … |
 | Feature vector of one user |
Tab. 1 Definition of symbols
Category | Chinese dataset | English dataset |
---|---|---|
Number of users | 9 457 | 100 000 |
Number of news items | 100 197 | 120 962 |
Average news title length | 14.00 | 11.52 |
Average news content length | 584.00 | 585.05 |
Tab. 2 Statistics of datasets
Name | Meaning | Value |
---|---|---|
Embedding dim | Word embedding dimension | 256 |
Seq length | News sequence length | 300 |
Num classes | Number of news categories | 10 |
Num filters | Number of convolution kernels | 256 |
Kernel size | Convolution kernel size | 3×256 |
Vocab size | Vocabulary size | 5 000 |
Hidden dim | Number of neurons in the fully connected layer | 256 |
Dropout keep prob | Keep probability of dropout regularization | 0.5 |
Learning rate | Learning rate | 2.00E-04 |
Batch size | Training batch size | 64 |
Num epochs | Total number of training epochs | 10 |
Print per batch | Interval (in batches) for printing results | 10 |
Save per batch | Interval (in batches) for saving to TensorBoard | 10 |
Attention size | Dimension of the additive news attention mechanism | 128 |
Query vector dim | Dimension of the attention query vector | 128 |
Candidate num | Number of candidate news items | 5 |
Click num | Number of news items already browsed by the user | 20 |
Num attention heads | Number of heads in the multi-head attention mechanism | 16 |
Tab. 3 Hyperparameter values
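The settings in Tab. 3 can be gathered into a single configuration object. The sketch below is a hypothetical transcription for illustration only; the field names are ours and are not taken from the authors' code.

```python
from dataclasses import dataclass

# Hypothetical transcription of the Tab. 3 hyperparameters.
# Field names are illustrative, not from the paper's implementation.
@dataclass(frozen=True)
class HParams:
    embedding_dim: int = 256        # word embedding dimension
    seq_length: int = 300           # news sequence length
    num_classes: int = 10           # number of news categories
    num_filters: int = 256          # number of convolution kernels
    kernel_size: tuple = (3, 256)   # convolution kernel size
    vocab_size: int = 5000          # vocabulary size
    hidden_dim: int = 256           # neurons in the fully connected layer
    dropout_keep_prob: float = 0.5  # dropout keep probability
    learning_rate: float = 2e-4
    batch_size: int = 64
    num_epochs: int = 10
    attention_size: int = 128       # additive news attention dimension
    query_vector_dim: int = 128     # attention query vector dimension
    candidate_num: int = 5          # candidate news per impression
    click_num: int = 20             # browsed news per user
    num_attention_heads: int = 16   # heads in multi-head self-attention

hp = HParams()
```

A frozen dataclass keeps the experiment configuration immutable, so a run cannot silently change its own hyperparameters.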
Model | nDCG | MRR | Convergence time |
---|---|---|---|
Proposed model | 0.8206 | 0.7819 | 0:01:49 |
NRMS | 0.7947 | 0.7638 | 0:06:36 |
TANR | 0.8014 | 0.7569 | 0:02:14 |
LSTUR | 0.8224 | 0.7884 | 0:01:58 |
NAML | 0.7822 | 0.7556 | 0:02:52 |
Tab. 4 Experimental results of different models on Chinese dataset
Model | nDCG | MRR | Convergence time |
---|---|---|---|
Proposed model | 0.9468 | 0.9777 | 0:10:34 |
NRMS | 0.9451 | 0.9663 | 0:21:14 |
TANR | 0.9409 | 0.9736 | 0:09:41 |
LSTUR | 0.9461 | 0.9774 | 0:09:09 |
NAML | 0.9381 | 0.9710 | 0:11:39 |
DKN | 0.9305 | 0.9652 | 0:15:48 |
Tab. 5 Experimental results of different models on English dataset
Attention mechanism variant | nDCG | MRR |
---|---|---|
no attention | 0.7438 | 0.6628 |
words attention | 0.8156 | 0.6628 |
news attention | 0.7457 | 0.7726 |
both attention | 0.8206 | 0.7819 |
Tab. 6 Performance comparison of different attention mechanisms
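The nDCG and MRR scores reported in Tabs. 4-6 are standard ranking metrics. A minimal sketch of how they are computed for one ranked candidate list, assuming binary relevance labels (our simplification, not the paper's evaluation code):

```python
import math

def dcg(rels):
    # Discounted cumulative gain: the item at 1-based position i
    # contributes rel_i / log2(i + 1).
    return sum(r / math.log2(i + 2) for i, r in enumerate(rels))

def ndcg(rels):
    # Normalize by the DCG of the ideal (relevance-descending) ordering.
    ideal = dcg(sorted(rels, reverse=True))
    return dcg(rels) / ideal if ideal > 0 else 0.0

def mrr(rels):
    # Reciprocal rank of the first relevant item; 0 if none is relevant.
    for i, r in enumerate(rels):
        if r > 0:
            return 1.0 / (i + 1)
    return 0.0
```

For example, `ndcg([1, 0, 1])` evaluates to 1.5 / (1 + 1/log2(3)) ≈ 0.9197, and `mrr([0, 0, 1])` to 1/3; dataset-level scores average these per-impression values.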
Dataset | Metric | With time series prediction module | Without time series prediction module | Improvement rate /% |
---|---|---|---|---|
Chinese dataset | nDCG | 0.8206 | 0.7865 | 4.33 |
Chinese dataset | MRR | 0.7819 | 0.7569 | 3.30 |
English dataset | nDCG | 0.9468 | 0.9341 | 1.36 |
English dataset | MRR | 0.9777 | 0.9626 | 1.57 |
Tab. 7 Experimental results of the proposed model with and without the time series prediction module
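The improvement rates in Tab. 7 are relative gains of the with-module score over the without-module score. A quick arithmetic check, using values copied from the table:

```python
def improvement_pct(with_module, without_module):
    # Relative improvement of the with-module score, in percent.
    return (with_module - without_module) / without_module * 100

# Tab. 7, Chinese dataset, MRR: (0.7819 - 0.7569) / 0.7569 * 100 ≈ 3.30
print(round(improvement_pct(0.7819, 0.7569), 2))  # prints 3.3
```

The same formula reproduces the other three table entries to within rounding (1.36% and 1.57% on the English dataset, about 4.33% for the Chinese nDCG).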
[1] OKURA S, TAGAMI Y, ONO S, et al. Embedding-based news recommendation for millions of users[C]// Proceedings of the 23rd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. New York: ACM, 2017: 1933-1942. 10.1145/3097983.3098108
[2] WANG H W, ZHANG F Z, XIE X, et al. DKN: deep knowledge-aware network for news recommendation[C]// Proceedings of the 2018 World Wide Web Conference. Republic and Canton of Geneva: International World Wide Web Conferences Steering Committee, 2018: 1835-1844. 10.1145/3178876.3186175
[3] WU C H, WU F Z, GE S Y, et al. Neural news recommendation with multi-head self-attention[C]// Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing. Stroudsburg, PA: Association for Computational Linguistics, 2019: 6389-6394. 10.18653/v1/d19-1671
[4] YIN H L. Research and implementation of recommendation system based on deep learning[D]. Chengdu: University of Electronic Science and Technology of China, 2019: 1-3.
[5] YANG W, TANG R, LU L. News recommendation method by fusion of content-based recommendation and collaborative filtering[J]. Journal of Computer Applications, 2016, 36(2): 414-418. 10.11772/j.issn.1001-9081.2016.02.0414
[6] TIAN X, DING Q, LIAO Z H, et al. Survey on deep learning based news recommendation algorithm[J]. Journal of Frontiers of Computer Science and Technology, 2021, 15(6): 971-998.
[7] WU C H, WU F Z, AN M X, et al. Neural news recommendation with topic-aware news representation[C]// Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics. Stroudsburg, PA: Association for Computational Linguistics, 2019: 1154-1159. 10.18653/v1/p19-1110
[8] AN M X, WU F Z, WU C H, et al. Neural news recommendation with long- and short-term user representations[C]// Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics. Stroudsburg, PA: Association for Computational Linguistics, 2019: 336-345. 10.18653/v1/p19-1033
[9] WU C H, WU F Z, AN M X, et al. Neural news recommendation with attentive multi-view learning[C]// Proceedings of the 28th International Joint Conference on Artificial Intelligence. [S.l.]: IJCAI Organization, 2019: 3863-3869. 10.24963/ijcai.2019/536
[10] GU J X, WANG Z H, KUEN J, et al. Recent advances in convolutional neural networks[J]. Pattern Recognition, 2018, 77: 354-377. 10.1016/j.patcog.2017.10.013
[11] VASWANI A, SHAZEER N, PARMAR N, et al. Attention is all you need[C]// Proceedings of the 31st International Conference on Neural Information Processing Systems. Red Hook, NY: Curran Associates Inc., 2017: 6000-6010.
[12] YAMAK P T, LI Y J, GADOSEY P K. A comparison between ARIMA, LSTM, and GRU for time series forecasting[C]// Proceedings of the 2nd International Conference on Algorithms, Computing and Artificial Intelligence. New York: ACM, 2019: 49-55. 10.1145/3377713.3377722
[13] CHUNG J, GULCEHRE C, CHO K, et al. Empirical evaluation of gated recurrent neural networks on sequence modeling[EB/OL]. (2014-12-11) [2021-03-10].
[14] WU F Z, QIAO Y, CHEN J H, et al. MIND: a large-scale dataset for news recommendation[C]// Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics. Stroudsburg, PA: Association for Computational Linguistics, 2020: 3597-3606. 10.18653/v1/2020.acl-main.331
[15] KINGMA D P, BA J L. Adam: a method for stochastic optimization[EB/OL]. (2017-01-30) [2021-03-10].