Journal of Computer Applications (《计算机应用》) ›› 2022, Vol. 42 ›› Issue (2): 426-432. DOI: 10.11772/j.issn.1001-9081.2021050907

• Artificial Intelligence •

News recommendation model with deep feature fusion injecting attention mechanism

LIU Yuxi, LIU Yuqi, ZHANG Zonglin, WEI Zhihua, MIAO Ran

  1. College of Electronic and Information Engineering, Tongji University, Shanghai 201804, China
  • Received: 2021-06-01 Revised: 2021-07-16 Accepted: 2021-07-19 Online: 2022-02-11 Published: 2022-02-10
  • About the authors: LIU Yuxi, born in 2001 in Nanping, Fujian. Her research interests include machine learning and deep learning.
    LIU Yuqi, born in 2001 in Yantai, Shandong, CCF member. His research interests include Web mining and deep learning.
    ZHANG Zonglin, born in 2001 in Dandong, Liaoning. His research interests include machine learning and deep learning.
    WEI Zhihua, born in 1979 in Jinzhong, Shanxi, Ph.D., professor, CCF member. Her research interests include machine learning, data mining, image content analysis and natural language processing.
    MIAO Ran, born in 2000 in Chizhou, Anhui. His research interests include natural language processing and data mining.
  • Supported by:
    National Natural Science Foundation of China (61976160); Shanghai Innovative Training Program for College Students (S202110247021)

News recommendation model with deep feature fusion injecting attention mechanism

Yuxi LIU, Yuqi LIU, Zonglin ZHANG, Zhihua WEI, Ran MIAO   

  1. College of Electronic and Information Engineering, Tongji University, Shanghai 201804, China
  • Received: 2021-06-01 Revised: 2021-07-16 Accepted: 2021-07-19 Online: 2022-02-11 Published: 2022-02-10
  • About author: LIU Yuxi, born in 2001. Her research interests include machine learning and deep learning.
    LIU Yuqi, born in 2001. His research interests include Web mining and deep learning.
    ZHANG Zonglin, born in 2001. His research interests include machine learning and deep learning.
    WEI Zhihua, born in 1979, Ph.D., professor. Her research interests include machine learning, data mining, image content analysis and natural language processing.
    MIAO Ran, born in 2000. His research interests include natural language processing and data mining.
  • Supported by:
    National Natural Science Foundation of China (61976160); Shanghai Innovative Training Program for College Students (S202110247021)

Abstract:

Existing news recommendation models, when mining news features and user features, often ignore the relationships among the browsed news items, their temporal order, and the different importance of individual news items to the user, and therefore lack comprehensiveness; at the same time, they fall short in mining finer-grained content features of news. Therefore, a news recommendation model with deep feature fusion injecting attention mechanism was constructed, which can characterize users comprehensively and without redundancy and can extract finer-grained segment features of news. Firstly, with a deep learning based approach, deep features were extracted from the news text feature matrix through a Convolutional Neural Network (CNN) injected with an attention mechanism. Then, the user's interest features were extracted by adding time series prediction to the news the user had browsed and injecting a multi-head self-attention mechanism. Finally, experiments were carried out on a real Chinese dataset and an English dataset, with convergence time, Mean Reciprocal Rank (MRR) and normalized Discounted Cumulative Gain (nDCG) as metrics. Compared with the Neural news Recommendation with Multi-head Self-attention (NRMS) model and other baselines, on the Chinese dataset the proposed model improves nDCG by -0.22% to 4.91% and MRR by -0.82% to 3.48%; compared with the only model with a negative improvement rate, its convergence time is reduced by 7.63%. On the English dataset, the improvement rates in nDCG and MRR are 0.07% to 1.75% and 0.03% to 1.30% respectively, and the model always converges quickly. The results of the ablation experiments show that adding the attention mechanism and the time series module is effective.
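
To make the pipeline described in the abstract concrete, the following is a minimal sketch of the two encoders it names: a news encoder built from a convolutional layer with an attention layer injected on top, and a user encoder that applies multi-head self-attention to the sequence of browsed news, with a recurrent layer standing in for the time-series component. It is written in PyTorch purely for illustration; all layer choices, dimensions and names are assumptions based on the abstract, not the authors' implementation.

import torch
import torch.nn as nn

class NewsEncoder(nn.Module):
    """CNN over word embeddings with a word-level attention layer injected on top."""
    def __init__(self, emb_dim=300, num_filters=256, window=3):
        super().__init__()
        self.cnn = nn.Conv1d(emb_dim, num_filters, kernel_size=window, padding=window // 2)
        self.att = nn.Linear(num_filters, 1)          # attention score per word

    def forward(self, word_emb):                       # word_emb: (batch, num_words, emb_dim)
        h = torch.relu(self.cnn(word_emb.transpose(1, 2))).transpose(1, 2)
        alpha = torch.softmax(self.att(h), dim=1)      # attention weights over words
        return (alpha * h).sum(dim=1)                  # news vector: (batch, num_filters)

class UserEncoder(nn.Module):
    """Multi-head self-attention over the browsed-news sequence; a GRU models the
    temporal order of the clicks (a stand-in for the time-series prediction part)."""
    def __init__(self, news_dim=256, num_heads=8):
        super().__init__()
        self.gru = nn.GRU(news_dim, news_dim, batch_first=True)
        self.self_att = nn.MultiheadAttention(news_dim, num_heads, batch_first=True)
        self.att = nn.Linear(news_dim, 1)              # attention score per browsed news item

    def forward(self, clicked_news):                   # clicked_news: (batch, history_len, news_dim)
        h, _ = self.gru(clicked_news)
        h, _ = self.self_att(h, h, h)
        alpha = torch.softmax(self.att(h), dim=1)
        return (alpha * h).sum(dim=1)                  # user vector: (batch, news_dim)

A candidate news item can then be scored, for example, by the dot product between its news vector and the user vector.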

Key words: news recommendation, natural language processing, attention mechanism, neural network, time series prediction

Abstract:

When mining news features and user features, the existing news recommendation models often lack comprehensiveness, since they fail to consider the relationship between the browsed news items, the change of the time series, and the importance of different news items to users. At the same time, the existing models also have shortcomings in mining more fine-grained content features. Therefore, a news recommendation model with deep feature fusion injecting attention mechanism was constructed, which can characterize users comprehensively and without redundancy and can extract the features of more fine-grained news fragments. Firstly, a deep learning based method was used to deeply extract the feature matrix of the news text through a Convolutional Neural Network (CNN) injected with an attention mechanism. Then, the interest features of users were extracted by adding time series prediction to the news that users had browsed and injecting a multi-head self-attention mechanism. Finally, a real Chinese dataset and an English dataset were used to carry out experiments with convergence time, Mean Reciprocal Rank (MRR) and normalized Discounted Cumulative Gain (nDCG) as indicators. Compared with Neural news Recommendation with Multi-head Self-attention (NRMS) and other models, the proposed model has improvement rates of -0.22% to 4.91% in nDCG and -0.82% to 3.48% in MRR on the Chinese dataset, and compared with the only model with a negative improvement rate, its convergence time is reduced by 7.63%. On the English dataset, the proposed model achieves improvement rates of 0.07% to 1.75% and 0.03% to 1.30% in nDCG and MRR respectively, while always maintaining a fast convergence speed. Results of the ablation experiments show that adding the attention mechanism and the time series prediction module is effective.
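
The ranking metrics reported above, MRR and nDCG, can be computed per impression from their standard definitions. The sketch below follows the usual formulation (reciprocal ranks of the clicked items averaged over the positives, and DCG normalized by the ideal DCG); it is illustrative only, and the function and variable names are not taken from the paper.

import numpy as np

def mrr(labels, scores):
    """Mean reciprocal rank of the clicked items within one impression."""
    order = np.argsort(scores)[::-1]                   # rank candidates by predicted score
    ranked = np.take(labels, order)
    rr = ranked / (np.arange(len(ranked)) + 1)
    return rr.sum() / max(ranked.sum(), 1)

def ndcg(labels, scores, k=10):
    """Normalized discounted cumulative gain at k for one impression."""
    order = np.argsort(scores)[::-1]
    ranked = np.take(labels, order)[:k]
    dcg = (ranked / np.log2(np.arange(2, len(ranked) + 2))).sum()
    ideal = np.sort(labels)[::-1][:k]
    idcg = (ideal / np.log2(np.arange(2, len(ideal) + 2))).sum()
    return dcg / idcg if idcg > 0 else 0.0

# Example: one impression with four candidate news items, one of which was clicked.
labels = np.array([0, 1, 0, 0])
scores = np.array([0.1, 0.8, 0.3, 0.2])
print(mrr(labels, scores), ndcg(labels, scores, k=4))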

Key words: news recommendation, natural language processing, attention mechanism, neural network, time series prediction

CLC number: