Journal of Computer Applications ›› 2020, Vol. 40 ›› Issue (7): 1884-1890. DOI: 10.11772/j.issn.1001-9081.2019112027

• Artificial intelligence •

Sequence generation model with dynamic routing for multi-label text classification

WANG Minrui, GAO Shu, YUAN Ziyong, YUAN Lei

  1. School of Computer Science and Technology, Wuhan University of Technology, Wuhan, Hubei 430063, China
  • Received: 2019-11-28 Revised: 2020-02-10 Online: 2020-07-10 Published: 2020-06-29
  • Corresponding author: GAO Shu
  • About the authors: WANG Minrui (born 1995), female, from Nanchang, Jiangxi, M.S. candidate, main research interest: natural language processing; GAO Shu (born 1967), female, from Wuhan, Hubei, professor, Ph.D., main research interests: intelligent computing and semantic recognition, data analysis and applications; YUAN Ziyong (born 1995), male, from Bozhou, Anhui, M.S. candidate, main research interest: natural language processing; YUAN Lei (born 1997), female, from Chuzhou, Anhui, M.S. candidate, main research interest: natural language processing.
  • Supported by:
    This work is partially supported by the National Natural Science Foundation of China (51679180).

Abstract: In real-world applications, multi-label text is more widely applicable than single-label text, but its huge output space makes the classification task more challenging. Multi-label text classification was treated as a label sequence generation problem, and the Sequence Generation Model (SGM) was applied to it. To address shortcomings of SGM such as the tendency of its sequential structure to accumulate errors, an SGM based on Dynamic Routing (DR-SGM) was constructed. The model follows the Encoder-Decoder pattern. In the Encoder layer, a Bi-directional Long Short-Term Memory (Bi-LSTM) neural network with Attention encodes the semantic information. In the Decoder layer, a dynamic-routing-based decoder structure was designed, which adds a dynamic routing aggregation layer after the hidden layer and uses globally shared routing parameters to weaken the influence of cumulative error. In addition, dynamic routing captures part-part and part-whole position information in the text, and the dynamic routing algorithm was optimized to further improve semantic aggregation. Experimental results show that DR-SGM effectively improves multi-label text classification performance on the RCV1-V2, AAPD and Slashdot datasets.

Key words: multi-label text classification, Sequence Generation Model (SGM), capsule network, Dynamic Routing (DR), Bi-directional Long Short-Term Memory (Bi-LSTM) neural network
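
For readers unfamiliar with dynamic routing, the following is a minimal sketch of the generic routing-by-agreement procedure used in capsule networks, written in plain Python. It is not the optimized routing algorithm or the decoder aggregation layer of DR-SGM, whose details are not given in the abstract. The function names (`squash`, `softmax`, `dynamic_routing`), the input layout `u_hat[i][j]` (prediction vector from input capsule i for output capsule j) and the default of three iterations are all assumptions made for illustration.

```python
import math


def squash(v):
    """Squashing non-linearity: keeps the direction, maps the length into [0, 1)."""
    norm_sq = sum(x * x for x in v)
    if norm_sq == 0.0:
        return [0.0 for _ in v]
    scale = norm_sq / (1.0 + norm_sq) / math.sqrt(norm_sq)
    return [scale * x for x in v]


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def dynamic_routing(u_hat, num_iters=3):
    """Generic routing-by-agreement (illustrative, not the paper's variant).

    u_hat[i][j] is the prediction vector from input capsule i for
    output capsule j. Returns one output vector per output capsule.
    """
    n_in, n_out, dim = len(u_hat), len(u_hat[0]), len(u_hat[0][0])
    b = [[0.0] * n_out for _ in range(n_in)]  # routing logits, start uniform
    v = []
    for _ in range(num_iters):
        # Each input capsule distributes its vote over the output capsules.
        c = [softmax(row) for row in b]
        v = []
        for j in range(n_out):
            # Weighted sum of predictions for output j, then squash.
            s = [sum(c[i][j] * u_hat[i][j][d] for i in range(n_in))
                 for d in range(dim)]
            v.append(squash(s))
        # Raise the logit where a prediction agrees with the output (dot product).
        for i in range(n_in):
            for j in range(n_out):
                b[i][j] += sum(u_hat[i][j][d] * v[j][d] for d in range(dim))
    return v


if __name__ == "__main__":
    # Three input capsules agree on output 0 and disagree on output 1.
    u_hat = [
        [[1.0, 0.0], [1.0, 0.0]],
        [[1.0, 0.0], [-1.0, 0.0]],
        [[1.0, 0.0], [0.0, 1.0]],
    ]
    for j, vec in enumerate(dynamic_routing(u_hat)):
        print(j, math.sqrt(sum(x * x for x in vec)))
```

In this toy input the three input capsules agree on output 0 and conflict on output 1, so routing concentrates weight on output 0 and its vector ends up longer. This agreement-driven aggregation is the general mechanism the abstract refers to when it says dynamic routing captures part-part and part-whole relations.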

CLC Number: