Journal of Computer Applications ›› 2024, Vol. 44 ›› Issue (6): 1796-1806. DOI: 10.11772/j.issn.1001-9081.2023060733
Special topic: Artificial Intelligence
Junfeng SHEN, Xingchen ZHOU, Can TANG
Received: 2023-06-09
Revised: 2023-08-13
Accepted: 2023-08-15
Online: 2023-09-14
Published: 2024-06-10
Contact: Junfeng SHEN
About author: ZHOU Xingchen, born in 1998 in Xiangyang, Hubei, M. S. candidate. His research interests include text classification and text sentiment analysis.
Abstract: To address the long template update cycles and poor generalization of previous prompt learning methods, a dual-channel sentiment analysis model based on an improved prompt learning method was proposed. First, the serialized prompt templates were fed into the attention structure together with the input word vectors, so that the templates were updated iteratively while the input word vectors were updated through the multi-layer attention mechanism. Second, in the other channel, an ALBERT (A Lite BERT (Bidirectional Encoder Representations from Transformers)) model was used to extract semantic information. Finally, the semantic features extracted by the two channels were combined in an ensemble manner to improve the generalization ability of the overall model. The proposed model was evaluated on the Laptop and Restaurants datasets of SemEval2014, the ACL (Association for Computational Linguistics) Twitter dataset, and the SST-2 dataset built by Stanford University, reaching classification accuracies of 80.88%, 91.78%, 76.78% and 95.53% respectively, improvements of 0.99%, 1.13%, 3.39% and 2.84% over the baseline model BERT_Large. Compared with P-tuning v2, the proposed model improved classification accuracy by 2.88%, 3.60% and 2.06% on the Restaurants, Twitter and SST-2 datasets respectively, and reached convergence earlier than the original method.
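To make the dual-channel design concrete, the following is a minimal PyTorch sketch. The class and module names are ours, the hidden sizes follow Tab. 3 (2 048 for the ALBERT channel, 1 024 for the prompt channel), and the fusion rule (averaging class logits) is an assumption: the abstract only states that the two channels' features are combined in an ensemble manner.

```python
# Illustrative sketch only: names, dimensions and the fusion rule are
# assumptions based on the abstract, not the authors' released code.
import torch
import torch.nn as nn

class DualChannelSentimentModel(nn.Module):
    def __init__(self, albert_encoder, prompt_encoder, num_classes=3,
                 albert_dim=2048, prompt_dim=1024):
        super().__init__()
        self.albert_encoder = albert_encoder  # left channel: ALBERT semantic features
        self.prompt_encoder = prompt_encoder  # right channel: attention stack with trainable prompts
        self.albert_head = nn.Linear(albert_dim, num_classes)
        self.prompt_head = nn.Linear(prompt_dim, num_classes)

    def forward(self, input_ids, attention_mask):
        # Each channel produces its own sentence-level representation.
        h_albert = self.albert_encoder(input_ids, attention_mask)  # (B, albert_dim)
        h_prompt = self.prompt_encoder(input_ids, attention_mask)  # (B, prompt_dim)
        # Ensemble output: average the two channels' class logits.
        return 0.5 * (self.albert_head(h_albert) + self.prompt_head(h_prompt))
```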
Junfeng SHEN, Xingchen ZHOU, Can TANG. Dual-channel sentiment analysis model based on improved prompt learning method[J]. Journal of Computer Applications, 2024, 44(6): 1796-1806.
| Input | Output |
| --- | --- |
| You are likely to find a overflow in a ___. | drain |
| Ravens can ___. | fly |
| Joke would make you want to ___. | laugh |
| Sometimes virus causes ___. | infection |
| Birds have ___. | feathers |

Tab. 1 Handcrafted template examples
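Templates like those in Tab. 1 are answered by querying a masked language model in cloze style: the blank becomes a mask token and the model's top fillers are read off as predictions. A minimal sketch using the Hugging Face fill-mask pipeline, with bert-base-uncased as an illustrative stand-in for the models used in the paper:

```python
# Minimal illustration of querying a masked LM with a handcrafted cloze
# template, as in Tab. 1. The model choice here is illustrative only.
from transformers import pipeline

fill_mask = pipeline("fill-mask", model="bert-base-uncased")
for prediction in fill_mask("Birds have [MASK]."):
    print(prediction["token_str"], round(prediction["score"], 3))
# A well-chosen template should rank the expected answer ("feathers") highly.
```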
| Polarity | SST-2 train | SST-2 test | Lap14 train | Lap14 test | Rest14 train | Rest14 test | Twitter train | Twitter test |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Total | 67 349 | 872 | 2 313 | 638 | 3 518 | 973 | 6 050 | 676 |
| Positive | 29 780 | 428 | 987 | 341 | 2 179 | 657 | 1 507 | 172 |
| Negative | 37 569 | 444 | 866 | 128 | 839 | 222 | 1 527 | 168 |
| Neutral | 0 | 0 | 460 | 169 | 500 | 94 | 3 016 | 336 |

Tab. 2 Statistics of polarity distribution of dataset samples
| Hyperparameter | Left channel | Right channel |
| --- | --- | --- |
| Batch size | 32 | 32 |
| Learning rate | | |
| Weight decay | 0.01 | |
| Prefix_length | — | 20 |
| Prefix_att_rate | — | 0.1 |
| GCN layers | — | 2 |
| Hidden size | 2 048 | 1 024 |
| Padding_length | 55 | 55 |
| Dropout | 0.5 | 0.5 |

Tab. 3 Model hyperparameters
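Read as a reproduction aid, Tab. 3 maps onto a configuration like the sketch below. The learning-rate values are not given in the table above, so they are left unset rather than guessed; the field names are ours, not from any released code.

```python
# Tab. 3 rendered as a configuration dict for reproduction sketches.
CONFIG = {
    "left_channel": {            # ALBERT channel
        "batch_size": 32,
        "learning_rate": None,   # value not given in Tab. 3
        "weight_decay": 0.01,
        "hidden_size": 2048,
        "padding_length": 55,
        "dropout": 0.5,
    },
    "right_channel": {           # prompt channel
        "batch_size": 32,
        "learning_rate": None,   # value not given in Tab. 3
        "prefix_length": 20,
        "prefix_att_rate": 0.1,
        "gcn_layers": 2,
        "hidden_size": 1024,
        "padding_length": 55,
        "dropout": 0.5,
    },
}
```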
| Model | SST-2 Accuracy | SST-2 F1 | Lap14 Accuracy | Lap14 F1 | Rest14 Accuracy | Rest14 F1 | Twitter Accuracy | Twitter F1 |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| GloVe_LSTM | 85.58 | 85.58 | 68.17 | 68.12 | 77.50 | 76.35 | 68.64 | 68.45 |
| GloVe_BiLSTM_TextCNN | 88.18 | 88.18 | 71.16 | 70.15 | 84.28 | 84.00 | 71.45 | 71.32 |
| ATAE_LSTM | — | — | 68.20 | — | 77.20 | — | — | — |
| AEN_BERT | — | — | 79.93 | 76.31 | 83.12 | 73.76 | 74.71 | 73.13 |
| BERT_base | 91.20 | 91.20 | 78.53 | 78.52 | 89.11 | 88.66 | 73.82 | 73.76 |
| BERT_Large | 92.89 | 92.89 | 80.09 | 79.93 | 90.75 | 90.75 | 74.26 | 74.10 |
| BERT_TextCNN | 93.23 | 93.23 | 80.56 | 80.65 | 89.93 | 89.47 | 74.70 | 74.58 |
| P-tuning v2(BERT_Large) | 93.60 | 93.60 | 81.97 | 82.05 | 89.21 | 88.91 | 74.11 | 73.99 |
| P-Tuning(our) | 94.61 | 94.61 | 81.03 | 81.26 | 90.03 | 89.90 | 75.15 | 74.97 |
| ALBERT_xlarge | 95.30 | 95.30 | 79.00 | 78.57 | 91.16 | 90.91 | 74.85 | 74.85 |
| Proposed model | 95.53 | 95.53 | 80.88 | 80.81 | 91.78 | 91.54 | 76.78 | 76.48 |

Tab. 4 Comparison experiment results (%)
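Each dataset in Tab. 4 gets two columns, accuracy and F1 (labeling the second column F1 is a reconstruction; AEN_BERT's published Lap14 scores of 79.93 accuracy and 76.31 macro-F1 match the row above). A minimal sketch of computing both with scikit-learn, assuming macro-averaged F1 since the averaging scheme is not stated on this page:

```python
# Toy computation of the two per-dataset metrics reported in Tab. 4.
from sklearn.metrics import accuracy_score, f1_score

y_true = [0, 1, 2, 1, 0]   # toy labels: negative / positive / neutral
y_pred = [0, 1, 1, 1, 0]
print(accuracy_score(y_true, y_pred))             # accuracy column
print(f1_score(y_true, y_pred, average="macro"))  # F1 column (macro assumed)
```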
| Ablation module | Model | SST-2 | Lap14 | Rest14 | Twitter |
| --- | --- | --- | --- | --- | --- |
| Serialization module | P-tuning v2 | 93.60 | 81.97 | 89.21 | 74.11 |
| | Prefix_BiLSTM | 93.68 | 80.09 | 89.62 | 75.00 |
| | Prefix_Att_BiLSTM | 94.61 | 81.03 | 90.03 | 75.15 |
| Sentiment weight module | Prefix_adj_Gcn | 94.69 | 80.88 | 90.34 | 76.04 |
| | Prefix_senti_Gcn | 94.72 | 81.19 | 90.44 | 76.92 |
| Adversarial training | GloVe_BiLSTM_TextCNN | 88.18 | 71.16 | 84.28 | 71.45 |
| | GloVe_BiLSTM_TextCNN(adv) | 88.30 | 72.57 | 84.89 | 72.63 |
| | BERT_base | 91.20 | 78.53 | 89.11 | 72.82 |
| | BERT_adv | 94.15 | 79.47 | 90.34 | 76.63 |
| | ALBERT_xlarge | 95.30 | 79.00 | 91.78 | 74.58 |
| | ALBERT_adv | 95.30 | 80.88 | 91.16 | 76.78 |

Tab. 5 Ablation experiment accuracy results (%)
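The serialization-module rows compare plain P-tuning v2 prefixes with prefixes re-encoded by a BiLSTM, with and without attention (Prefix_BiLSTM, Prefix_Att_BiLSTM). A sketch of one plausible reading of that design is below; the layer sizes and the placement of the attention step are assumptions, not the paper's code.

```python
# Sketch of the serialization idea behind "Prefix_Att_BiLSTM": trainable
# prefix embeddings are re-encoded by a BiLSTM and a self-attention layer
# before being used as prompt tokens. Illustrative only.
import torch
import torch.nn as nn

class SerializedPrefixEncoder(nn.Module):
    def __init__(self, prefix_length=20, hidden_size=1024):
        super().__init__()
        self.prefix = nn.Parameter(torch.randn(prefix_length, hidden_size))
        self.bilstm = nn.LSTM(hidden_size, hidden_size // 2,
                              bidirectional=True, batch_first=True)
        self.attention = nn.MultiheadAttention(hidden_size, num_heads=8,
                                               batch_first=True)

    def forward(self, batch_size):
        p = self.prefix.unsqueeze(0).repeat(batch_size, 1, 1)
        p, _ = self.bilstm(p)           # serialize the prefix tokens
        p, _ = self.attention(p, p, p)  # let prefix positions interact
        return p                        # (B, prefix_length, hidden_size)
```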
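The "(adv)" rows apply adversarial training in the spirit of Madry et al. [37]. One common formulation for text models perturbs the word embeddings along the loss gradient (FGM style); the sketch below shows that pattern under the assumption that the model exposes its embedding module as `embedding`. The authors' exact procedure may differ.

```python
# FGM-style embedding-space adversarial training step: a common way to
# produce "(adv)" variants, shown as an assumption, not the paper's method.
import torch

def adversarial_step(model, loss_fn, batch, labels, epsilon=1.0):
    loss = loss_fn(model(**batch), labels)
    loss.backward()                                # gradients for the clean loss

    emb = model.embedding.weight                   # assumed attribute name
    grad = emb.grad
    norm = torch.norm(grad)
    if norm != 0 and not torch.isnan(norm):
        delta = epsilon * grad / norm              # perturb along the gradient
        emb.data.add_(delta)
        adv_loss = loss_fn(model(**batch), labels) # loss on perturbed embeddings
        adv_loss.backward()                        # accumulate adversarial grads
        emb.data.sub_(delta)                       # restore the embeddings
```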
1 JIANG Y Y, JIN B, ZHANG B C. Research progress of natural language processing based on deep learning [J]. Computer Engineering and Applications, 2021, 57(22): 1-14.
2 SMITH S, PATWARY M, NORICK B, et al. Using DeepSpeed and Megatron to train Megatron-Turing NLG 530B, a large-scale generative language model [EB/OL]. (2022-02-04) [2023-05-30].
3 SUN Y, WANG S, FENG S, et al. ERNIE 3.0: large-scale knowledge enhanced pre-training for language understanding and generation [EB/OL]. (2021-07-05) [2023-05-30].
4 WU S, ZHAO X, YU T, et al. Yuan 1.0: large-scale pre-trained language model in zero-shot and few-shot learning [EB/OL]. (2021-10-12) [2023-05-30].
5 HOULSBY N, GIURGIU A, JASTRZEBSKI S, et al. Parameter-efficient transfer learning for NLP [EB/OL]. (2019-06-09) [2023-05-30].
6 WANG A, SINGH A, MICHAEL J, et al. GLUE: a multi-task benchmark and analysis platform for natural language understanding [EB/OL]. (2018-04-20) [2023-05-30].
7 LIU X, ZHENG Y, DU Z, et al. GPT understands, too [EB/OL]. (2021-03-18) [2023-05-30].
8 LI X L, LIANG P. Prefix-tuning: optimizing continuous prompts for generation [EB/OL]. (2021-01-01) [2023-05-30].
9 LIU X, JI K, FU Y, et al. P-tuning v2: prompt tuning can be comparable to fine-tuning universally across scales and tasks [EB/OL]. (2021-10-14) [2023-05-30].
10 LESTER B, AL-RFOU R, CONSTANT N. The power of scale for parameter-efficient prompt tuning [EB/OL]. (2021-04-18) [2023-05-30].
11 KALCHBRENNER N, GREFENSTETTE E, BLUNSOM P. A convolutional neural network for modelling sentences [EB/OL]. (2014-04-08) [2023-05-30].
12 KIM Y. Convolutional neural networks for sentence classification [EB/OL]. (2014-08-25) [2023-05-30].
13 SOCHER R, LIN C C-Y, NG A Y, et al. Parsing natural scenes and natural language with recursive neural networks [C]// Proceedings of the 28th International Conference on Machine Learning. Madison, WI: Omnipress, 2011: 129-136.
14 HOCHREITER S, SCHMIDHUBER J. Long short-term memory [J]. Neural Computation, 1997, 9(8): 1735-1780.
15 CHUNG J, GULCEHRE C, CHO K H, et al. Empirical evaluation of gated recurrent neural networks on sequence modeling [EB/OL]. (2014-12-11) [2023-05-30].
16 PAN S J, YANG Q. A survey on transfer learning [J]. IEEE Transactions on Knowledge and Data Engineering, 2010, 22(10): 1345-1359.
17 PAN S J, TSANG I W, KWOK J T, et al. Domain adaptation via transfer component analysis [J]. IEEE Transactions on Neural Networks, 2010, 22(2): 199-210.
18 LONG M, WANG J, DING G, et al. Transfer feature learning with joint distribution adaptation [C]// Proceedings of the 2013 IEEE International Conference on Computer Vision. Piscataway: IEEE, 2013: 2200-2207.
19 WEISS K, KHOSHGOFTAAR T M, WANG D D. A survey of transfer learning [J]. Journal of Big Data, 2016, 3: No. 9.
20 DAY O, KHOSHGOFTAAR T M. A survey on heterogeneous transfer learning [J]. Journal of Big Data, 2017, 4: No. 29.
21 PETERS M E, NEUMANN M, IYYER M, et al. Deep contextualized word representations [C]// Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 1: Long Papers). Stroudsburg: ACL, 2018: 2227-2237.
22 RADFORD A, NARASIMHAN K, SALIMANS T, et al. Improving language understanding by generative pre-training [EB/OL]. [2023-05-30].
23 VASWANI A, SHAZEER N, PARMAR N, et al. Attention is all you need [C]// Proceedings of the 31st International Conference on Neural Information Processing Systems. Red Hook, NY: Curran Associates Inc., 2017: 6000-6010.
24 DEVLIN J, CHANG M-W, LEE K, et al. BERT: pre-training of deep bidirectional transformers for language understanding [EB/OL]. (2018-10-11) [2023-05-30].
25 YU J, JIANG J. Adapting BERT for target-oriented multimodal sentiment classification [C]// Proceedings of the 28th International Joint Conference on Artificial Intelligence. California: ijcai.org, 2019: 5408-5414.
26 FANG C, LI B, HAN P, et al. Fine-grained emotion classification of Chinese microblog based on syntactic dependency graph [J]. Journal of Computer Applications, 2023, 43(4): 1056-1061.
27 YANG Z, DAI Z, YANG Y, et al. XLNet: generalized autoregressive pretraining for language understanding [EB/OL]. [2020-05-26].
28 PFEIFFER J, KAMATH A, RÜCKLÉ A, et al. AdapterFusion: non-destructive task composition for transfer learning [EB/OL]. (2020-05-01) [2023-05-30].
29 RÜCKLÉ A, GEIGLE G, GLOCKNER M, et al. AdapterDrop: on the efficiency of adapters in transformers [EB/OL]. (2020-10-22) [2023-05-30].
30 GAO T, FISCH A, CHEN D. Making pre-trained language models better few-shot learners [EB/OL]. (2020-12-31) [2023-05-30].
31 SHIN T, RAZEGHI Y, LOGAN IV R L, et al. AutoPrompt: eliciting knowledge from language models with automatically generated prompts [EB/OL]. (2020-10-29) [2023-05-30].
32 ZHANG X Y, LIU R, WEI C Y, et al. Aspect-based sentiment analysis method with integrating prompt knowledge [J]. Journal of Computer Applications, 2023, 43(9): 2753-2759.
33 MARGATINA K, BAZIOTIS C, POTAMIANOS A. Attention-based conditioning methods for external knowledge integration [EB/OL]. (2019-06-09) [2023-05-30].
34 BAO L, LAMBERT P, BADIA T. Attention and lexicon regularized LSTM for aspect-based sentiment analysis [C]// Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics: Student Research Workshop. Stroudsburg: ACL, 2019: 253-259.
35 CAMBRIA E, LI Y, XING F Z, et al. SenticNet 6: ensemble application of symbolic and subsymbolic AI for sentiment analysis [C]// Proceedings of the 29th ACM International Conference on Information & Knowledge Management. New York: ACM, 2020: 105-114.
36 PETRONI F, ROCKTÄSCHEL T, LEWIS P, et al. Language models as knowledge bases? [EB/OL]. (2019-09-03) [2023-05-30].
37 MADRY A, MAKELOV A, SCHMIDT L, et al. Towards deep learning models resistant to adversarial attacks [EB/OL]. (2017-06-09) [2023-05-30].
38 WANG Y, HUANG M, ZHU X, et al. Attention-based LSTM for aspect-level sentiment classification [C]// Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing. Stroudsburg: ACL, 2016: 606-615.
39 SONG Y, WANG J, JIANG T, et al. Targeted sentiment classification with attentional encoder network [C]// Proceedings of the 28th International Conference on Artificial Neural Networks. Cham: Springer, 2019: 93-103.