Journal of Computer Applications ›› 2021, Vol. 41 ›› Issue (1): 139-144. DOI: 10.11772/j.issn.1001-9081.2020061066

Special topic: The 8th China Conference on Data Mining (CCDM 2020)


Sentiment classification of incomplete data based on bidirectional encoder representations from transformers

LUO Jun1,2, CHEN Lifei1,2   

  1. College of Mathematics and Informatics, Fujian Normal University, Fuzhou, Fujian 350117, China;
  2. Digital Fujian Internet-of-Things Laboratory of Environmental Monitoring (Fujian Normal University), Fuzhou, Fujian 350117, China
  • Received: 2020-05-31  Revised: 2020-08-03  Online: 2021-01-10  Published: 2020-11-12
  • Corresponding author: LUO Jun
  • About the authors: LUO Jun (1995-), male, born in Nanchang, Jiangxi, M.S. candidate. His research interests include data mining and natural language processing. CHEN Lifei (1972-), male, born in Changle, Fujian, Ph.D., professor. His research interests include statistical machine learning, data mining, and pattern recognition.
  • Supported by:
    This work is partially supported by the Natural Science Foundation of Fujian Province (2015J01238) and the Innovation Team Project of Fujian Normal University (IRTL1704).


Abstract: Incomplete data, such as interactive information on social platforms and review contents in Internet movie databases, are widespread in real life. However, most existing sentiment classification models are built on complete datasets and do not consider the impact of incomplete data on classification performance. To address this problem, a stacked denoising neural network model based on BERT (Bidirectional Encoder Representations from Transformers) was proposed for sentiment classification of incomplete data. The model was composed of two components: a Stacked Denoising AutoEncoder (SDAE) and BERT. Firstly, the incomplete data processed by word embedding were fed into the SDAE for denoising training, extracting deep features to reconstruct the feature representations of missing and erroneous words. Then, the resulting output was passed into the pre-trained BERT model for refinement, further improving the feature vector representations of the words. Experimental results on two commonly used sentiment datasets show that the proposed method improves the F1 score and classification accuracy on incomplete data by about 6% and 5%, respectively, verifying the effectiveness of the proposed model.
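The SDAE component described above can be illustrated with a minimal single-layer denoising autoencoder in NumPy: random entries of the input embeddings are zeroed out to mimic missing or wrong words, and the network is trained to reconstruct the clean input. The dimensions, corruption rate, tied weights, and training schedule below are illustrative assumptions for a sketch, not the configuration used in the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

class DenoisingAutoencoder:
    """One tied-weight denoising autoencoder layer: corrupt the input,
    encode it, and learn to reconstruct the clean original."""

    def __init__(self, n_in, n_hidden, lr=0.5, corruption=0.3):
        self.W = rng.normal(0.0, 0.1, (n_in, n_hidden))  # shared encoder/decoder weights
        self.b = np.zeros(n_hidden)   # encoder bias
        self.c = np.zeros(n_in)       # decoder bias
        self.lr = lr
        self.corruption = corruption

    def corrupt(self, x):
        # Zero out random entries, mimicking missing or erroneous words.
        return x * (rng.random(x.shape) > self.corruption)

    def reconstruct(self, x_noisy):
        h = sigmoid(x_noisy @ self.W + self.b)       # encode
        return h, sigmoid(h @ self.W.T + self.c)     # decode with tied weights

    def train_step(self, x):
        x_noisy = self.corrupt(x)
        h, x_rec = self.reconstruct(x_noisy)
        err = x_rec - x                              # compare with the *clean* input
        # Backpropagate through the sigmoid decoder and encoder.
        d_rec = err * x_rec * (1.0 - x_rec)
        d_h = (d_rec @ self.W) * h * (1.0 - h)
        n = len(x)
        self.W -= self.lr * (x_noisy.T @ d_h + d_rec.T @ h) / n
        self.b -= self.lr * d_h.sum(axis=0) / n
        self.c -= self.lr * d_rec.sum(axis=0) / n
        return float(0.5 * (err ** 2).sum() / n)     # mean reconstruction loss

# Toy "word embeddings" with low-rank structure so denoising is learnable.
X = sigmoid(rng.normal(size=(200, 3)) @ rng.normal(size=(3, 20)))
dae = DenoisingAutoencoder(n_in=20, n_hidden=10)
losses = [dae.train_step(X) for _ in range(300)]
```

In the full model, the hidden representations produced by stacked layers of this kind would stand in for the reconstructed word features before being refined by BERT; here the reconstruction loss simply decreases over training on the toy data.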

Key words: incomplete data, sentiment classification, BERT (Bidirectional Encoder Representations from Transformers), Stacked Denoising AutoEncoder (SDAE), pre-trained model

CLC number: