Journal of Computer Applications ›› 2022, Vol. 42 ›› Issue (6): 1789-1795. DOI: 10.11772/j.issn.1001-9081.2021091638
• The 18th CCF China Conference on Information Systems and Applications •
Jing JIANG1, Yu CHEN2, Jieping SUN1, Shenggen JU1
Received:
2021-09-27
Revised:
2021-11-15
Accepted:
2021-11-17
Online:
2022-04-15
Published:
2022-06-10
Contact:
Shenggen JU
About author:
JIANG Jing, born in 1996, M. S. candidate. Her research interests include natural language processing and knowledge graphs.
Supported by:
Abstract:
Pre-trained language models for text representation achieve high accuracy on a variety of text classification tasks, but two problems remain. First, after computing the posterior probabilities of all classes, a pre-trained language model simply outputs the class with the largest posterior probability as its final prediction; in many scenarios, however, the quality of the posterior probabilities provides more reliable information than the classification result alone. Second, the classifier of a pre-trained language model suffers a performance drop when it has to assign different labels to semantically similar texts. To address these two problems, a model named PosCal-negative was proposed, which combines posterior probability calibration with negative supervision. During end-to-end training, the model dynamically penalizes the discrepancy between the predicted probabilities and the empirical posterior probabilities, and uses texts carrying different labels to apply negative supervision to the encoder, so that a distinct feature representation is generated for each class. Experimental results show that PosCal-negative reaches classification accuracies of 91.55% and 69.19% on the two Chinese maternal and infant care text classification datasets MATINF-C-AGE and MATINF-C-TOPIC, which are 1.13 and 2.53 percentage points higher than those of the ERNIE model, respectively.
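The training objective described in the abstract combines three terms: the usual cross-entropy loss, a penalty on the gap between predicted probabilities and empirical posteriors estimated over confidence bins, and a negative-supervision term that pushes apart the encoder representations of differently labelled texts. The following is a minimal pure-Python sketch of these loss terms, not the authors' implementation; the bin count, the batch-level estimate of the empirical posterior, the hinge on cosine similarity, and the weights `w1`/`w2` are all illustrative assumptions.

```python
import math

def softmax(logits):
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    s = sum(exps)
    return [v / s for v in exps]

def cross_entropy(probs, label):
    return -math.log(probs[label])

def poscal_penalty(probs, labels, n_bins=5):
    """Squared gap between each predicted class probability and the
    empirical posterior (fraction of true positives) of its confidence bin,
    with the empirical posterior estimated from the current batch."""
    n, n_classes = len(probs), len(probs[0])
    penalty = 0.0
    for k in range(n_classes):
        for b in range(n_bins):
            lo, hi = b / n_bins, (b + 1) / n_bins
            idx = [i for i in range(n) if lo < probs[i][k] <= hi]
            if idx:
                emp = sum(labels[i] == k for i in idx) / len(idx)
                penalty += sum((probs[i][k] - emp) ** 2 for i in idx) / n
    return penalty

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv)

def negative_supervision(encodings, labels):
    """Penalize positive cosine similarity between encoder vectors of
    texts that carry different labels, pushing their representations apart."""
    loss, pairs = 0.0, 0
    for i in range(len(encodings)):
        for j in range(i + 1, len(encodings)):
            if labels[i] != labels[j]:
                loss += max(0.0, cosine(encodings[i], encodings[j]))
                pairs += 1
    return loss / pairs if pairs else 0.0

# Toy batch: two samples, two classes; encodings stand in for BERT vectors.
logits = [[2.0, 0.0], [0.5, 1.5]]
labels = [0, 1]
encodings = [[0.9, 0.1], [0.2, 0.8]]
probs = [softmax(z) for z in logits]
w1, w2 = 0.7, 0.3  # illustrative weights, cf. Table 3
total = (sum(cross_entropy(p, y) for p, y in zip(probs, labels)) / len(labels)
         + w1 * poscal_penalty(probs, labels)
         + w2 * negative_supervision(encodings, labels))
```

In a real training loop all three terms would be differentiable tensor operations backpropagated through the encoder; this sketch only illustrates what each term measures.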
CLC Number:
Jing JIANG, Yu CHEN, Jieping SUN, Shenggen JU. Integrating posterior probability calibration training into text classification algorithm[J]. Journal of Computer Applications, 2022, 42(6): 1789-1795.
Tab. 1 Examples of text classification using BERT on MedWeb dataset

| Sentence | Label | BERT prediction |
|---|---|---|
| A cold is a legit disease. | — | Cold |
| Oh my god! I caught a cold! | Cold | Cold |
Tab. 2 Examples of MATINF-C dataset

| Text instance from the maternal and infant care dataset | Category |
|---|---|
| 宝宝为什么总是吐舌头啊? (Why does my baby keep sticking out her tongue?) | Question |
| 我家宝宝出生快满四个月了,这几天我突然发现宝宝总是吐舌头,而且口水也很多,那么这到底是咋回事啊? (My baby is almost four months old. These days I suddenly noticed that she keeps sticking out her tongue and also drools a lot. What is going on?) | Description |
Tab. 3 Hyperparameter setting

| Parameter | AGE | TOPIC |
|---|---|---|
| — | 0.7 | 0.5 |
| — | 0.3 | 0.5 |
| u | 5 | 5 |
| n | 4 | 4 |
Tab. 4 Comparison of accuracy of different models (unit: %)

| Model group | Model | MATINF-C-AGE | MATINF-C-TOPIC |
|---|---|---|---|
| CNN and its variants | TextCNN[24] | 90.95 | 64.41 |
| | DCNN[25] | 90.96 | 64.60 |
| | RCNN[27] | 90.81 | 63.56 |
| | fastText[28] | 87.76 | 61.81 |
| | DPCNN[29] | 91.02 | 65.92 |
| Pre-trained language models | BERT-base[30] | 90.33 | 66.95 |
| | BERT-of-Theseus[31] | 90.25 | 66.72 |
| | ERNIE[32] | 90.42 | 66.66 |
| Posterior probability calibration models | Temp[14] | 90.86 | 68.04 |
| | PosCal-negative | 91.55 | 69.19 |
Tab. 5 Accuracy of ablation experiments (unit: %)

| Model | MATINF-C-AGE | MATINF-C-TOPIC |
|---|---|---|
| BERT-base | 90.33 | 66.95 |
| BERT-base+PosCal | 91.25 | 68.77 |
| BERT-base+Negative | 90.87 | 68.04 |
| PosCal-negative | 91.55 | 69.19 |
Tab. 6 Comparison of ECE

| Model | MATINF-C-AGE | MATINF-C-TOPIC |
|---|---|---|
| BERT-base | 0.117 976 | 0.116 114 |
| Temp | 0.148 775 | 0.139 897 |
| PosCal-negative | 0.113 868 | 0.105 009 |
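The ECE values in Table 6 refer to expected calibration error: the bin-weighted average gap between mean confidence and accuracy over equal-width confidence bins (Naeini et al. [20]; Guo et al. [14]). A minimal sketch of the metric follows; the 10-bin setting and right-inclusive binning are conventional choices and are assumptions here, not necessarily the exact configuration used in the paper.

```python
def expected_calibration_error(confidences, correct, n_bins=10):
    """ECE = sum_b (|B_b|/N) * |accuracy(B_b) - mean confidence(B_b)|
    over equal-width, right-inclusive confidence bins B_b."""
    n = len(confidences)
    ece = 0.0
    for b in range(n_bins):
        lo, hi = b / n_bins, (b + 1) / n_bins
        idx = [i for i, c in enumerate(confidences) if lo < c <= hi]
        if idx:
            acc = sum(correct[i] for i in idx) / len(idx)
            conf = sum(confidences[i] for i in idx) / len(idx)
            ece += len(idx) / n * abs(acc - conf)
    return ece

# Perfectly calibrated toy case: 80% confidence, 8 of 10 correct.
print(round(expected_calibration_error([0.8] * 10, [1] * 8 + [0] * 2), 12))  # 0.0
```

A lower ECE means the model's confidence scores track its actual accuracy more closely, which is what Table 6 compares.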
Tab. 7 Accuracy comparison of negative supervision modules (unit: %)

| Model | MATINF-C-AGE | MATINF-C-TOPIC |
|---|---|---|
| PosCal-ACE | 90.12 | 66.77 |
| PosCal-AM | 90.56 | 67.48 |
| PosCal-negative | 91.55 | 69.19 |
1 | WANG S, MANNING C D. Baselines and bigrams: simple, good sentiment and topic classification[C]// Proceedings of the 50th Annual Meeting of the Association for Computational Linguistics. Stroudsburg, PA: ACL, 2012: 90-94. |
2 | WANG G Y, LI C Y, WANG W L, et al. Joint embedding of words and labels for text classification[C]// Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers). Stroudsburg, PA: ACL, 2018:2321-2331. 10.18653/v1/p18-1216 |
3 | ZHANG X, ZHAO J B, LECUN Y. Character-level convolutional networks for text classification[C]// Proceedings of the 28th International Conference on Neural Information Processing Systems. Cambridge: MIT Press, 2015: 649-657. 10.1109/icip.2015.7351229 |
4 | SHEN D H, ZHANG Y Z, HENAO R, et al. Deconvolutional latent-variable model for text sequence matching[C]// Proceedings of the 32nd AAAI Conference on Artificial Intelligence. Palo Alto, CA: AAAI Press, 2018: 5438-5445. |
5 | YANG P C, SUN X, LI W, et al. SGM: sequence generation model for multi-label classification[C]// Proceedings of the 27th International Conference on Computational Linguistics. Stroudsburg, PA: ACL, 2018:3915-3926. 10.18653/v1/p19-1518 |
6 | JIANG X Q, OSL M, KIM J, et al. Calibrating predictive model estimates to support personalized medicine[J]. Journal of the American Medical Informatics Association, 2012, 19(2): 263-274. 10.1136/amiajnl-2011-000291 |
7 | MURPHY A H. A new vector partition of the probability score[J]. Journal of Applied Meteorology and Climatology, 1973, 12(4): 595-600. 10.1175/1520-0450(1973)012<0595:anvpot>2.0.co;2 |
8 | MURPHY A H, WINKLER R L. Reliability of subjective probability forecasts of precipitation and temperature[J]. Journal of the Royal Statistical Society: Series C (Applied Statistics), 1977, 26(1): 41-47. 10.2307/2346866 |
9 | DEGROOT M H, FIENBERG S E. The comparison and evaluation of forecasters[J]. Journal of the Royal Statistical Society: Series D (The Statistician), 1983, 32(1/2): 12-22. 10.2307/2987588 |
10 | GNEITING T, RAFTERY A E. Weather forecasting with ensemble methods[J]. Science, 2005, 310(5746): 248-249. 10.1126/science.1115255 |
11 | BRÖCKER J. Reliability, sufficiency, and the decomposition of proper scores[J]. Quarterly Journal of the Royal Meteorological Society, 2009, 135(643): 1512-1519. 10.1002/qj.456 |
12 | NGUYEN K, O’CONNOR B. Posterior calibration and exploratory analysis for natural language processing models[C]// Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing. Stroudsburg, PA: ACL, 2015: 1587-1598. 10.18653/v1/d15-1182 |
13 | CARD D, SMITH N A. The importance of calibration for estimating proportions from annotations[C]// Proceedings of the 2018 Conference of the North American Chapter of the Association of the Computational Linguistics: Human Language Technologies, Volume 1 (Long Papers). Stroudsburg, PA: ACL, 2018: 1636-1646. 10.18653/v1/n18-1148 |
14 | GUO C, PLEISS G, SUN Y, et al. On calibration of modern neural networks[C]// Proceedings of the 34th International Conference on Machine Learning. New York: JMLR.org, 2017: 1321-1330. |
15 | KUMAR A, LIANG P, MA T Y. Verified uncertainty calibration[C/OL]// Proceedings of the 33rd International Conference on Neural Information Processing Systems. [2021-03-30]. |
16 | WAKAMIYA S, MORITA M, KANO Y, et al. Overview of the NTCIR-13: MedWeb task[C]// Proceedings of the 13th NTCIR Conference on Evaluation of Information Access Technologies. Tokyo: National Institute of Informatics, 2017: 40-49. |
17 | LIU T T, ZHU W D, LIU G Y. Advances in deep learning based text classification[J]. Electric Power Information and Communication Technology, 2018, 16(3): 1-7. 10.16543/j.2095-641x.electric.power.ict.2018.03.001 |
18 | HE L, ZHENG Z X, XIANG F T, et al. Research progress of text classification technology based on deep learning[J]. Computer Engineering, 2021, 47(2): 1-11. 10.19678/j.issn.1000-3428.0059099 |
19 | ZADROZNY B, ELKAN C. Obtaining calibrated probability estimates from decision trees and naive Bayesian classifiers[C]// Proceedings of the 18th International Conference on Machine Learning. San Francisco: Morgan Kaufmann Publishers Inc., 2001: 609-616. 10.1145/775047.775151 |
20 | NAEINI M P, COOPER G F, HAUSKRECHT M. Obtaining well calibrated probabilities using Bayesian binning[C]// Proceedings of the 29th AAAI Conference on Artificial Intelligence. Palo Alto: AAAI Press, 2015: 2901-2907. 10.1137/1.9781611974010.24 |
21 | PLATT J. Probabilistic outputs for support vector machines and comparisons to regularized likelihood methods[M]// SMOLA A J, BARTLETT P,SCHÖLKOPF B, et al. Advances in Large Margin Classifiers. Cambridge: MIT Press, 2000: 61-74. |
22 | OHASHI S, TAKAYAMA J, KAJIWARA T, et al. Text classification with negative supervision[C]// Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics. Stroudsburg, PA: ACL, 2020: 351-357. 10.18653/v1/2020.acl-main.33 |
23 | XU C W, PEI J X, WU H T, et al. MATINF: a jointly labeled large-scale dataset for classification, question answering and summarization[C]// Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics. Stroudsburg, PA: ACL, 2020: 3586-3596. 10.18653/v1/2020.acl-main.330 |
24 | KIM Y. Convolutional neural networks for sentence classification[C]// Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing. Stroudsburg, PA: ACL, 2014: 1746-1751. 10.3115/v1/d14-1181 |
25 | KALCHBRENNER N, GREFENSTETTE E, BLUNSOM P. A convolutional neural network for modelling sentences[C]// Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers). Stroudsburg, PA: ACL, 2014: 655-665. 10.3115/v1/p14-1062 |
26 | DU S J, YU H N, ZHANG H L. Survey of text classification methods based on deep learning[J]. Chinese Journal of Network and Information Security, 2020, 6(4): 1-13. 10.11959/j.issn.2096-109x.2020010 |
27 | LAI S W, XU L H, LIU K, et al. Recurrent convolutional neural networks for text classification[C]// Proceedings of the 29th AAAI Conference on Artificial Intelligence. Palo Alto, CA: AAAI Press, 2015: 2267-2273. 10.1609/aaai.v33i01.33017370 |
28 | JOULIN A, GRAVE E, BOJANOWSKI P, et al. Bag of tricks for efficient text classification[C]// Proceedings of the 15th Conference of the European Chapter of the Association for Computational Linguistics, Volume 2 (Short Papers). Stroudsburg, PA: ACL, 2017:427-431. 10.18653/v1/e17-2068 |
29 | JOHNSON R, ZHANG T. Deep pyramid convolutional neural networks for text categorization[C]// Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics, Volume 1 (Long Papers). Stroudsburg, PA: ACL, 2017: 562-570. 10.18653/v1/p17-1052 |
30 | DEVLIN J, CHANG M W, LEE K, et al. BERT: pre-training of deep bidirectional transformers for language understanding[C]// Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers). Stroudsburg, PA: ACL, 2019: 4171-4186. 10.18653/v1/n19-1423 |
31 | XU C W, ZHOU W C S, GE T, et al. BERT-of-Theseus: compressing BERT by progressive module replacing[C]// Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing. Stroudsburg, PA: ACL, 2020: 7859-7869. 10.18653/v1/2020.emnlp-main.633 |
32 | ZHANG Z Y, HAN X, LIU Z Y, et al. ERNIE: enhanced language representation with informative entities[C]// Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics. Stroudsburg, PA: ACL, 2019: 1441-1451. 10.18653/v1/p19-1139 |