Journal of Computer Applications ›› 2021, Vol. 41 ›› Issue (4): 976-983. DOI: 10.11772/j.issn.1001-9081.2020081275

Special Issue: The 35th CCF National Conference of Computer Applications (CCF NCCA 2020)


Tag recommendation method combining network structure information and text content

CHE Bingqian, ZHOU Dong   

  1. School of Computer Science and Engineering, Hunan University of Science and Technology, Xiangtan Hunan 411201, China
  • Received: 2020-08-20 Revised: 2020-09-28 Online: 2021-04-10 Published: 2020-11-05
  • Supported by:
    This work is partially supported by the National Natural Science Foundation of China (61876062).

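  • Corresponding author: ZHOU Dong
  • About the authors: CHE Bingqian (born 1998), female, from Hulunbuir, Inner Mongolia; M.S. candidate and CCF member; research interests: information retrieval and natural language processing. ZHOU Dong (born 1979), male, from Changsha, Hunan; professor, Ph.D.; research interests: information retrieval and natural language processing.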

Abstract: Recommending appropriate tags for texts is an effective way to organize and use text content. Most current tag recommendation methods rely mainly on mining the text content. However, most data do not exist in isolation: for example, word co-occurrence between the texts in a corpus forms a complex network structure. Previous studies have shown that the network structure information between texts and the text content information summarize the semantics of the same text from two different perspectives, and that the information extracted from the two can complement and explain each other. On this basis, a tag recommendation method that jointly models the network structure information and the content information of texts was proposed. First, a Graph Convolutional neural Network (GCN) was used to extract the structure information of the network between texts; then, a Recurrent Neural Network (RNN) was used to extract the text content information; finally, an attention mechanism was used to combine the two kinds of information and recommend tags. Compared with baseline methods such as the GCN-based tag recommendation method and the tag recommendation method based on the Topical attention-based Long Short-Term Memory (TLSTM) neural network, the proposed method performs better. For example, on the Mathematics Stack Exchange dataset, the precision, recall, and F1 of the proposed method are 2.3%, 3.8%, and 7.0% higher, respectively, than those of the best baseline method.
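The three-stage pipeline described in the abstract can be illustrated with a minimal pure-Python sketch. It is not the authors' implementation: the abstract gives no layer sizes, graph construction details, or attention formula, so everything below is an assumption made for illustration. In particular, the edge weight (number of shared words), the single GCN layer with one-hot node features, the plain Elman-style RNN, the dot-product attention over the two views, and all names (`cooccurrence_adjacency`, `gcn_layer`, `rnn_encode`, `attention_fuse`, the toy texts and tags) are hypothetical. All weights are random and untrained, so the printed rankings carry no meaning; only the data flow is of interest.

```python
import math
import random

def matmul(a, b):
    """Multiply two dense matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def matvec(m, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(x * y for x, y in zip(row, v)) for row in m]

def rand_matrix(rows, cols, rng):
    """Random stand-in for trained weights (assumption: no training here)."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]

def cooccurrence_adjacency(texts):
    """Assumed graph: edge weight = number of distinct words two texts share."""
    words = [set(t.lower().split()) for t in texts]
    n = len(texts)
    return [[float(len(words[i] & words[j])) if i != j else 0.0
             for j in range(n)] for i in range(n)]

def normalize_adjacency(adj):
    """Symmetric normalization D^-1/2 (A + I) D^-1/2 of a standard GCN layer."""
    n = len(adj)
    a_hat = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)] for i in range(n)]
    deg = [sum(row) for row in a_hat]
    return [[a_hat[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)] for i in range(n)]

def gcn_layer(adj_norm, feats, weight):
    """One graph convolution: ReLU(A_norm X W)."""
    hidden = matmul(matmul(adj_norm, feats), weight)
    return [[max(0.0, x) for x in row] for row in hidden]

def rnn_encode(token_vecs, w_in, w_rec):
    """Elman RNN over token vectors; returns the final hidden state."""
    h = [0.0] * len(w_rec)
    for x in token_vecs:
        pre = [a + b for a, b in zip(matvec(w_in, x), matvec(w_rec, h))]
        h = [math.tanh(p) for p in pre]
    return h

def softmax(scores):
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attention_fuse(structure_vec, content_vec, query):
    """Weight the structure and content views by dot product with a query vector."""
    views = [structure_vec, content_vec]
    weights = softmax([sum(q * v for q, v in zip(query, view)) for view in views])
    fused = [sum(w * view[k] for w, view in zip(weights, views))
             for k in range(len(query))]
    return fused, weights

rng = random.Random(0)
texts = ["prove the limit of a sequence",
         "limit of a series converges",
         "eigenvalues of a symmetric matrix"]
tags = ["limits", "sequences-and-series", "linear-algebra"]  # hypothetical tag set
hidden, embed_dim = 4, 3

# Structure view: one GCN layer over the text co-occurrence graph.
adj_norm = normalize_adjacency(cooccurrence_adjacency(texts))
one_hot = [[1.0 if i == j else 0.0 for j in range(len(texts))] for i in range(len(texts))]
structure = gcn_layer(adj_norm, one_hot, rand_matrix(len(texts), hidden, rng))

# Content view: RNN over random word embeddings.
vocab = sorted({w for t in texts for w in t.split()})
embedding = {w: [rng.uniform(-0.5, 0.5) for _ in range(embed_dim)] for w in vocab}
w_in, w_rec = rand_matrix(hidden, embed_dim, rng), rand_matrix(hidden, hidden, rng)
query = [rng.uniform(-0.5, 0.5) for _ in range(hidden)]
w_tag = rand_matrix(len(tags), hidden, rng)

results = []
for i, text in enumerate(texts):
    content = rnn_encode([embedding[w] for w in text.split()], w_in, w_rec)
    fused, weights = attention_fuse(structure[i], content, query)
    scores = [1.0 / (1.0 + math.exp(-s)) for s in matvec(w_tag, fused)]  # per-tag sigmoid
    ranked = sorted(zip(tags, scores), key=lambda pair: -pair[1])
    results.append((fused, weights, scores))
    print(text, "->", [t for t, _ in ranked[:2]], "view weights:", [round(w, 3) for w in weights])
```

In this sketch the attention weights decide, per text, how much the graph-derived vector and the sequence-derived vector each contribute to the fused representation that is scored against the tags. A trained system would learn the query, the GCN and RNN weights, and the tag layer jointly from labeled data.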

Key words: tag recommendation, Recurrent Neural Network (RNN), Graph Convolutional neural Network (GCN), attention mechanism, network structure information, text content

