Journal of Computer Applications ›› 2024, Vol. 44 ›› Issue (2): 385-392.DOI: 10.11772/j.issn.1001-9081.2023020179

• Artificial intelligence •

Chinese medical named entity recognition based on self-attention mechanism and lexicon enhancement

Xinran LUO, Tianrui LI(), Zhen JIA   

  1. School of Computing and Artificial Intelligence, Southwest Jiaotong University, Chengdu, Sichuan 611756, China
  • Received: 2023-02-27 Revised: 2023-04-11 Accepted: 2023-04-13 Online: 2024-02-22 Published: 2024-02-10
  • Contact: Tianrui LI
  • About author: LUO Xinran, born in 1997 in Deyang, Sichuan, M. S. candidate. Her research interests include natural language processing.
    JIA Zhen, born in 1975, Ph. D., lecturer. Her research interests include intelligent question answering and knowledge graphs.
  • Supported by:
    National Natural Science Foundation of China(62276218)



To address the difficulty of word boundary recognition caused by nested entities in Chinese medical texts, as well as the significant loss of semantic information in existing Lattice-LSTM structures that integrate lexical features, an adaptive lexical information enhancement model for Chinese Medical Named Entity Recognition (MNER) was proposed. First, a BiLSTM (Bi-directional Long Short-Term Memory) network was used to encode the contextual information of the character sequence and capture long-distance dependencies. Next, the potential word information of each character was modeled as character-word pairs, and a self-attention mechanism was used to realize interactions among the different words. Finally, a lexicon adapter based on a bilinear attention mechanism was used to integrate the lexical information into each character of the text sequence, effectively enhancing semantic information and fully exploiting the rich boundary information of words while suppressing weakly correlated words. Experimental results show that the average F1 score of the proposed model is 1.37 to 2.38 percentage points higher than those of character-based baseline models, and its performance improves further when combined with BERT.
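The lexicon-adapter fusion step described above can be sketched in a few lines. The following is a minimal pure-Python illustration, not the paper's implementation: the embedding dimension, the bilinear weight matrix `W`, and the additive fusion `h + z` are all assumptions chosen for clarity. It scores each candidate word matched to a character with a bilinear form, normalizes the scores with softmax (so weakly correlated words receive low weight), and adds the attention-weighted word vector back to the character vector.

```python
import math

def softmax(scores):
    # numerically stable softmax over a list of raw attention scores
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def bilinear_scores(h, words, W):
    # score_i = h^T W x_i : bilinear attention between character vector h
    # and each matched-word vector x_i
    hW = [sum(h[r] * W[r][c] for r in range(len(h))) for c in range(len(W[0]))]
    return [sum(hW[c] * x[c] for c in range(len(x))) for x in words]

def lexicon_fuse(h, words, W):
    """Fuse matched-word vectors into a character vector via bilinear attention."""
    if not words:
        return h[:]  # no lexicon match: character representation unchanged
    weights = softmax(bilinear_scores(h, words, W))
    # attention-weighted sum of word vectors, added back to the character vector
    z = [sum(w * x[c] for w, x in zip(weights, words)) for c in range(len(h))]
    return [hc + zc for hc, zc in zip(h, z)]

# Toy example: one character vector, two candidate words, 3-dim embeddings.
h = [0.2, -0.1, 0.5]
words = [[1.0, 0.0, 0.0], [0.0, 1.0, 1.0]]
W = [[1.0, 0.0, 0.0],
     [0.0, 1.0, 0.0],
     [0.0, 0.0, 1.0]]  # identity weight, for illustration only
fused = lexicon_fuse(h, words, W)
```

With the identity `W`, the second word scores higher (its dot product with `h` is larger), so it dominates the fused representation; in the actual model, `W` would be learned jointly with the rest of the network.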

Key words: Medical Named Entity Recognition (MNER), Chinese medical text, lexicon adapter, self-attention mechanism, Bi-directional Long Short-Term Memory (BiLSTM) network
