Journal of Computer Applications, 2023, Vol. 43, Issue (S2): 34-40. DOI: 10.11772/j.issn.1001-9081.2022121862

• Artificial Intelligence •

Depression prediction model based on multi-granularity self-attention mechanism

Pengliu TAN, Luyu ZHANG, Guangyong XU, Teng XU

  1. School of Software, Nanchang Hangkong University, Nanchang Jiangxi 330063, China
  • Received: 2022-12-15 Revised: 2023-03-01 Accepted: 2023-03-08 Online: 2024-01-09 Published: 2023-12-31
  • Contact: Pengliu TAN
  • About the authors: Pengliu TAN (born 1975), male, from Chongyang, Hubei; associate professor, Ph.D., CCF member. Research interests: smart healthcare, blockchain, cyber-physical systems.
    Luyu ZHANG (born 1997), female, from Ganzhou, Jiangxi; master's student. Research interests: smart healthcare, disease prediction.
    Guangyong XU (born 1997), male, from Nanchang, Jiangxi; master's student. Research interests: smart healthcare, disease prediction.
    Teng XU (born 1998), female, from Nanchang, Jiangxi; master's student. Research interests: smart healthcare, blockchain.
  • Supported by:
    National Natural Science Foundation of China (61961029); Key Research and Development Program of the Jiangxi Provincial Department of Science and Technology (20171ACE50025)

Abstract:

To address the insufficient feature extraction ability of depression prediction models based on sparse text, a model based on the Hierarchical Multi-Granularity Self-Attention Network (HMG-SAN) was proposed. First, word vectors were obtained with the Global Vectors (GloVe) model, providing vectorized representations of words and sentences. Second, a Bi-directional Gated Recurrent Unit (Bi-GRU) was used to capture the word order information and text features in the text structure, extracting context-dependent feature information. Third, a Multi-Granularity Self-Attention (MG-SA) mechanism was used to identify different features and capture phrase information at different granularities. Finally, the softmax function was used to obtain the classification results. The distinguishing feature of HMG-SAN is the integration of the MG-SA mechanism, which helps capture the important words in the text. In comparison experiments on the Distress Analysis Interview Corpus (DAIC) dataset against models based on the Hierarchical Attention Network (HAN) and the Hierarchical Self-Attention Network (HSAN), the proposed model significantly improved both accuracy and recall: accuracy increased by 2.74% and 1.35% respectively, and recall increased by 7.35% and 4.29% respectively. These results indicate that HMG-SAN captures the depression state of respondents more accurately and therefore predicts depression more efficiently.
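The attention and classification steps described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names are hypothetical, random vectors stand in for the Bi-GRU outputs over GloVe embeddings, learned query/key/value projections are omitted, and treating a "granularity" as a mean-pooled window of consecutive tokens is an assumption made only for illustration.

```python
import math
import random


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def mean_vectors(vectors):
    """Element-wise mean of a list of equal-length vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def phrase_memory(hidden, size):
    """Mean-pool every contiguous window of `size` tokens into one phrase vector.

    Assumption: this pooling is a stand-in for a phrase representation.
    """
    if size >= len(hidden):
        return [mean_vectors(hidden)]
    return [mean_vectors(hidden[i:i + size])
            for i in range(len(hidden) - size + 1)]


def attend(queries, memory):
    """Scaled dot-product attention of each query token over the phrase memory."""
    scale = math.sqrt(len(queries[0]))
    outputs = []
    for q in queries:
        weights = softmax([dot(q, k) / scale for k in memory])
        outputs.append([sum(w * v[d] for w, v in zip(weights, memory))
                        for d in range(len(q))])
    return outputs


def multi_granularity_attention(hidden, granularities=(1, 2, 3)):
    """Run one attention pass per granularity and concatenate per-token outputs."""
    per_granularity = [attend(hidden, phrase_memory(hidden, g))
                       for g in granularities]
    return [[x for outs in per_granularity for x in outs[t]]
            for t in range(len(hidden))]


def classify(hidden, weights, bias):
    """Average the attended token features, then apply a linear layer and softmax."""
    features = mean_vectors(multi_granularity_attention(hidden))
    logits = [dot(w, features) + b for w, b in zip(weights, bias)]
    return softmax(logits)


# Hypothetical inputs: 6 tokens with 4-dimensional hidden states, 2 classes.
random.seed(0)
hidden = [[random.uniform(-1, 1) for _ in range(4)] for _ in range(6)]
weights = [[random.uniform(-1, 1) for _ in range(12)] for _ in range(2)]
bias = [0.0, 0.0]
probs = classify(hidden, weights, bias)
print(probs)
```

With three granularities and 4-dimensional hidden states, each token ends up with a 12-dimensional feature vector, which is why the illustrative classifier weights have 12 columns. A trained model would learn these weights and the attention projections rather than sampling them at random.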

Key words: text classification, multi-granularity self-attention mechanism, Bi-directional Gated Recurrent Unit (Bi-GRU), deep neural network, depression prediction

CLC Number: