Journal of Computer Applications ›› 2022, Vol. 42 ›› Issue (1): 57-63.DOI: 10.11772/j.issn.1001-9081.2021020366

• Artificial intelligence •

Text multi-label classification method incorporating BERT and label semantic attention

Xueqiang LYU, Chen PENG, Le ZHANG, Zhi'an DONG, Xindong YOU

  1. Beijing Key Laboratory of Internet Culture and Digital Dissemination Research (Beijing Information Science and Technology University),Beijing 100101,China
  • Received: 2021-03-11  Revised: 2021-04-28  Accepted: 2021-04-29  Online: 2021-05-21  Published: 2022-01-10
  • Contact: Le ZHANG
  • About author: LYU Xueqiang, born in 1970 in Yutai, Shandong, Ph.D., professor, CCF member. His research interests include Chinese and multimedia information processing.
    PENG Chen, born in 1996, M. S. candidate. His research interests include natural language processing.
    ZHANG Le, born in 1988, Ph.D., associate professor. Her research interests include natural language processing and Web user behavior analysis.
    DONG Zhi’an, born in 1989, M. S., research fellow. His research interests include natural language processing.
    YOU Xindong, born in 1979, Ph.D., associate professor. Her research interests include natural language processing, text mining, and data classification.
  • Supported by:
    Natural Science Foundation of Beijing (4212020); Open Project Fund of the Tibetan Information Processing and Machine Translation Key Laboratory of Qinghai Province / Key Laboratory of Tibetan Information Processing, Ministry of Education (2019Z002)



Multi-Label Text Classification (MLTC) is an important subtask in the field of Natural Language Processing (NLP). To address the problem of complex correlations among multiple labels, an MLTC method named TLA-BERT was proposed, which incorporates Bidirectional Encoder Representations from Transformers (BERT) and label semantic attention. Firstly, the contextual vector representation of the input text was learned by fine-tuning the autoencoding pre-trained model. Secondly, each label was encoded individually by a Long Short-Term Memory (LSTM) neural network. Finally, an attention mechanism was used to explicitly highlight the contribution of the text to each label before predicting the multi-label sequence. Experimental results show that, compared with the Sequence Generation Model (SGM) algorithm, the proposed method improves the F value by 2.8 percentage points and 1.5 percentage points on the Arxiv Academic Paper Dataset (AAPD) and the Reuters Corpus Volume I (RCV1)-v2 public datasets, respectively.
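The pipeline summarized above — contextual token vectors from BERT, per-label encodings from an LSTM, and an attention step that weighs each token's contribution to each label before independent per-label prediction — can be sketched roughly as follows. This is a minimal NumPy illustration of the label-attention idea only; all array names, dimensions, and the scoring projection `W` are illustrative assumptions, not the paper's exact formulation:

```python
import numpy as np

def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def label_attention_scores(H, L, W):
    """Score k labels against a text of T tokens.

    H: (T, d) contextual token vectors (stand-in for BERT outputs)
    L: (k, d) label semantic vectors (stand-in for LSTM label encodings)
    W: (d, d) projection used when scoring (illustrative)
    Returns (k,) independent label probabilities (sigmoid, as in
    multi-label classification where labels are not mutually exclusive).
    """
    A = softmax(L @ H.T, axis=-1)          # (k, T): each label attends over tokens
    C = A @ H                              # (k, d): label-specific text summaries
    logits = np.sum(C * (L @ W), axis=-1)  # (k,): per-label scores
    return 1.0 / (1.0 + np.exp(-logits))   # sigmoid, one probability per label

rng = np.random.default_rng(0)
T, d, k = 6, 8, 4                          # tokens, hidden size, labels (toy sizes)
H = rng.normal(size=(T, d))
L = rng.normal(size=(k, d))
W = rng.normal(size=(d, d))
probs = label_attention_scores(H, L, W)    # one probability per label
```

Thresholding `probs` (commonly at 0.5) then yields the predicted label set; using a sigmoid per label, rather than a softmax over labels, is what allows several labels to be assigned to one text.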

Key words: multi-label classification, Bidirectional Encoder Representations from Transformers (BERT), label semantic information, Bidirectional Long Short-Term Memory (BiLSTM) neural network, attention mechanism



