Journal of Computer Applications, 2022, Vol. 42, Issue (6): 1789-1795. DOI: 10.11772/j.issn.1001-9081.2021091638
Special Issue: The 18th CCF Conference on Web Information Systems and Applications
Jing JIANG, Yu CHEN, Jieping SUN, Shenggen JU
Received: 2021-09-27
Revised: 2021-11-15
Accepted: 2021-11-17
Online: 2022-04-15
Published: 2022-06-10
Contact: Shenggen JU
About author: JIANG Jing, born in 1996 in Chongqing, M. S. candidate. Her research interests include natural language processing and knowledge graphs.
Jing JIANG, Yu CHEN, Jieping SUN, Shenggen JU. Integrating posterior probability calibration training into text classification algorithm[J]. Journal of Computer Applications, 2022, 42(6): 1789-1795.
URL: https://www.joca.cn/EN/10.11772/j.issn.1001-9081.2021091638
Sentence | Label | BERT prediction |
---|---|---|
A cold is a legit disease. | — | Cold |
Oh my god! I caught a cold! | Cold | Cold |
Tab. 1 Examples of text classification using BERT on MedWeb dataset
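Text example from the maternal and infant health care dataset | Category |
---|---|
Why does my baby keep sticking out its tongue? | Question |
My baby is almost four months old. Over the past few days I have suddenly noticed that the baby keeps sticking out its tongue and drools a lot. What is going on? | Description |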
Tab. 2 Examples of MATINF-C dataset
| Parameter | AGE | TOPIC | Parameter | AGE | TOPIC |
|---|---|---|---|---|---|
| | 0.7 | 0.5 | u | 5 | 5 |
| | 0.3 | 0.5 | n | 4 | 4 |
Tab. 3 Hyperparameter setting
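Model group | Model | MATINF-C-AGE | MATINF-C-TOPIC |
---|---|---|---|
CNN and its variants | Text CNN | 90.95 | 64.41 |
CNN and its variants | DCNN | 90.96 | 64.60 |
CNN and its variants | RCNN | 90.81 | 63.56 |
CNN and its variants | fastText | 87.76 | 61.81 |
CNN and its variants | DPCNN | 91.02 | 65.92 |
Pre-trained language models | BERT-base | 90.33 | 66.95 |
Pre-trained language models | BERT-of-Theseus | 90.25 | 66.72 |
Pre-trained language models | ERNIE | 90.42 | 66.66 |
Posterior probability calibration models | Temp | 90.86 | 68.04 |
Posterior probability calibration models | PosCal-negative | 91.55 | 69.19 |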
Tab. 4 Comparison of accuracy of different models
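The Temp baseline in Tab. 4 is not described on this page. Assuming it refers to temperature scaling in the sense of reference 14 (logits divided by a single scalar fitted on held-out data), the idea can be sketched as follows. The function names, the toy logits, and the use of a grid search in place of a gradient-based optimizer are all illustrative assumptions, not the authors' implementation.

```python
import math


def softmax(logits, temperature=1.0):
    """Softmax over one row of logits after dividing by a temperature."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(z - peak) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def mean_nll(logit_rows, labels, temperature):
    """Average negative log-likelihood of the true labels."""
    losses = [
        -math.log(softmax(row, temperature)[label])
        for row, label in zip(logit_rows, labels)
    ]
    return sum(losses) / len(losses)


def fit_temperature(logit_rows, labels, candidates=None):
    """Pick the temperature with the lowest held-out NLL (grid search)."""
    if candidates is None:
        # Hypothetical search grid: 0.50, 0.55, ..., 5.00
        candidates = [0.5 + 0.05 * i for i in range(91)]
    return min(candidates, key=lambda t: mean_nll(logit_rows, labels, t))


# Toy held-out set: the model is very confident in class 0,
# but only three of the four examples actually belong to class 0.
val_logits = [[4.0, 0.0]] * 4
val_labels = [0, 0, 0, 1]
best_t = fit_temperature(val_logits, val_labels)
print(best_t, softmax([4.0, 0.0], best_t))
```

Dividing by a positive temperature never changes the arg max, so this kind of post-hoc scaling alters confidence only, not the predicted class.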
Model | MATINF-C-AGE | MATINF-C-TOPIC |
---|---|---|
BERT-base | 90.33 | 66.95 |
BERT-base+PosCal | 91.25 | 68.77 |
BERT-base+Negative | 90.87 | 68.04 |
PosCal-negative | 91.55 | 69.19 |
Tab. 5 Accuracy of ablation experiment
Model | MATINF-C-AGE | MATINF-C-TOPIC |
---|---|---|
BERT-base | 0.117976 | 0.116114 |
Temp | 0.148775 | 0.139897 |
PosCal-negative | 0.113868 | 0.105009 |
Tab. 6 Comparison of ECE
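Tab. 6 reports expected calibration error (ECE), where lower values mean predicted confidence tracks actual accuracy more closely. As a rough illustration of how such a number is commonly computed, the sketch below groups predictions into equal-width confidence bins and takes the sample-weighted mean of the gap between accuracy and average confidence in each bin. The bin count, the function name, and the toy inputs are assumptions; this page does not state the binning used for the reported values.

```python
def expected_calibration_error(confidences, correct, n_bins=10):
    """Equal-width-bin ECE: weighted mean of |accuracy - confidence| per bin.

    confidences: top-label probabilities in [0, 1]
    correct:     booleans, True where the top label matched the gold label
    """
    if len(confidences) != len(correct):
        raise ValueError("confidences and correct must have the same length")
    total = len(confidences)
    if total == 0:
        raise ValueError("at least one prediction is required")

    conf_sum = [0.0] * n_bins  # summed confidence per bin
    hits = [0] * n_bins        # number of correct predictions per bin
    counts = [0] * n_bins      # number of predictions per bin

    for conf, hit in zip(confidences, correct):
        # A confidence of exactly 1.0 falls into the last bin.
        idx = min(int(conf * n_bins), n_bins - 1)
        conf_sum[idx] += conf
        hits[idx] += 1 if hit else 0
        counts[idx] += 1

    ece = 0.0
    for s, h, c in zip(conf_sum, hits, counts):
        if c == 0:
            continue  # empty bins contribute nothing
        accuracy = h / c
        avg_conf = s / c
        ece += (c / total) * abs(accuracy - avg_conf)
    return ece


# Toy example: two confident correct predictions and
# two less confident ones, of which one is wrong.
toy_conf = [0.9, 0.9, 0.6, 0.6]
toy_correct = [True, True, True, False]
print(expected_calibration_error(toy_conf, toy_correct))
```

With this definition a perfectly calibrated model scores 0.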
Model | MATINF-C-AGE | MATINF-C-TOPIC |
---|---|---|
PosCal-ACE | 90.12 | 66.77 |
PosCal-AM | 90.56 | 67.48 |
PosCal-negative | 91.55 | 69.19 |
Tab. 7 Accuracy comparison of negative supervision module
1 | WANG S, MANNING C D. Baselines and bigrams: simple, good sentiment and topic classification[C]// Proceedings of the 50th Annual Meeting of the Association for Computational Linguistics. Stroudsburg, PA: ACL, 2012: 90-94. |
2 | WANG G Y, LI C Y, WANG W L, et al. Joint embedding of words and labels for text classification[C]// Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers). Stroudsburg, PA: ACL, 2018: 2321-2331. 10.18653/v1/p18-1216 |
3 | ZHANG X, ZHAO J B, LeCUN Y. Character-level convolutional networks for text classification[C]// Proceedings of the 28th International Conference on Neural Information Processing Systems. Cambridge: MIT Press, 2015: 649-657. 10.1109/icip.2015.7351229 |
4 | SHEN D H, ZHANG Y Z, HENAO R, et al. Deconvolutional latent-variable model for text sequence matching[C]// Proceedings of the 32nd AAAI Conference on Artificial Intelligence. Palo Alto, CA: AAAI Press, 2018: 5438-5445. |
5 | YANG P C, SUN X, LI W, et al. SGM: sequence generation model for multi-label classification[C]// Proceedings of the 27th International Conference on Computational Linguistics. Stroudsburg, PA: ACL, 2018: 3915-3926. 10.18653/v1/p19-1518 |
6 | JIANG X Q, OSL M, KIM J, et al. Calibrating predictive model estimates to support personalized medicine[J]. Journal of the American Medical Informatics Association, 2012, 19(2): 263-274. 10.1136/amiajnl-2011-000291 |
7 | MURPHY A H. A new vector partition of the probability score[J]. Journal of Applied Meteorology and Climatology, 1973, 12(4): 595-600. 10.1175/1520-0450(1973)012<0595:anvpot>2.0.co;2 |
8 | MURPHY A H, WINKLER R L. Reliability of subjective probability forecasts of precipitation and temperature[J]. Journal of the Royal Statistical Society: Series C (Applied Statistics), 1977, 26(1): 41-47. 10.2307/2346866 |
9 | DEGROOT M H, FIENBERG S E. The comparison and evaluation of forecasters[J]. Journal of the Royal Statistical Society: Series D (The Statistician), 1983, 32(1/2): 12-22. 10.2307/2987588 |
10 | GNEITING T, RAFTERY A E. Weather forecasting with ensemble methods[J]. Science, 2005, 310(5746): 248-249. 10.1126/science.1115255 |
11 | BRÖCKER J. Reliability, sufficiency, and the decomposition of proper scores[J]. Quarterly Journal of the Royal Meteorological Society, 2009, 135(643): 1512-1519. 10.1002/qj.456 |
12 | NGUYEN K, O’CONNOR B. Posterior calibration and exploratory analysis for natural language processing models[C]// Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing. Stroudsburg, PA: ACL, 2015: 1587-1598. 10.18653/v1/d15-1182 |
13 | CARD D, SMITH N A. The importance of calibration for estimating proportions from annotations[C]// Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long Papers). Stroudsburg, PA: ACL, 2018: 1636-1646. 10.18653/v1/n18-1148 |
14 | GUO C, PLEISS G, SUN Y, et al. On calibration of modern neural networks[C]// Proceedings of the 34th International Conference on Machine Learning. New York: JMLR.org, 2017: 1321-1330. |
15 | KUMAR A, LIANG P, MA T Y. Verified uncertainty calibration[C/OL]// Proceedings of the 33rd International Conference on Neural Information Processing Systems. [2021-03-30]. |
16 | WAKAMIYA S, MORITA M, KANO Y, et al. Overview of the NTCIR-13: MedWeb task[C]// Proceedings of the 13th NTCIR Conference on Evaluation of Information Access Technologies. Tokyo: National Institute of Informatics, 2017: 40-49. |
17 | LIU T T, ZHU W D, LIU G Y. Advances in deep learning based text classification[J]. Electric Power Information and Communication Technology, 2018, 16(3): 1-7. 10.16543/j.2095-641x.electric.power.ict.2018.03.001 |
18 | HE L, ZHENG Z X, XIANG F T, et al. Research progress of text classification technology based on deep learning[J]. Computer Engineering, 2021, 47(2): 1-11. 10.19678/j.issn.1000-3428.0059099 |
19 | ZADROZNY B, ELKAN C. Obtaining calibrated probability estimates from decision trees and naive Bayesian classifiers[C]// Proceedings of the 18th International Conference on Machine Learning. San Francisco: Morgan Kaufmann Publishers Inc., 2001: 609-616. 10.1145/775047.775151 |
20 | NAEINI M P, COOPER G F, HAUSKRECHT M. Obtaining well calibrated probabilities using Bayesian binning[C]// Proceedings of the 29th AAAI Conference on Artificial Intelligence. Palo Alto, CA: AAAI Press, 2015: 2901-2907. 10.1137/1.9781611974010.24 |
21 | PLATT J. Probabilistic outputs for support vector machines and comparisons to regularized likelihood methods[M]// SMOLA A J, BARTLETT P, SCHÖLKOPF B, et al. Advances in Large Margin Classifiers. Cambridge: MIT Press, 2000: 61-74. |
22 | OHASHI S, TAKAYAMA J, KAJIWARA T, et al. Text classification with negative supervision[C]// Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics. Stroudsburg, PA: ACL, 2020: 351-357. 10.18653/v1/2020.acl-main.33 |
23 | XU C W, PEI J X, WU H T, et al. MATINF: a jointly labeled large-scale dataset for classification, question answering and summarization[C]// Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics. Stroudsburg, PA: ACL, 2020: 3586-3596. 10.18653/v1/2020.acl-main.330 |
24 | KIM Y. Convolutional neural networks for sentence classification[C]// Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing. Stroudsburg, PA: ACL, 2014: 1746-1751. 10.3115/v1/d14-1181 |
25 | KALCHBRENNER N, GREFENSTETTE E, BLUNSOM P. A convolutional neural network for modelling sentences[C]// Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers). Stroudsburg, PA: ACL, 2014: 655-665. 10.3115/v1/p14-1062 |
26 | DU S J, YU H N, ZHANG H L. Survey of text classification methods based on deep learning[J]. Chinese Journal of Network and Information Security, 2020, 6(4): 1-13. 10.11959/j.issn.2096-109x.2020010 |
27 | LAI S W, XU L H, LIU K, et al. Recurrent convolutional neural networks for text classification[C]// Proceedings of the 29th AAAI Conference on Artificial Intelligence. Palo Alto, CA: AAAI Press, 2015: 2267-2273. 10.1609/aaai.v33i01.33017370 |
28 | JOULIN A, GRAVE E, BOJANOWSKI P, et al. Bag of tricks for efficient text classification[C]// Proceedings of the 15th Conference of the European Chapter of the Association for Computational Linguistics, Volume 2 (Short Papers). Stroudsburg, PA: ACL, 2017: 427-431. 10.18653/v1/e17-2068 |
29 | JOHNSON R, ZHANG T. Deep pyramid convolutional neural networks for text categorization[C]// Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics, Volume 1 (Long Papers). Stroudsburg, PA: ACL, 2017: 562-570. 10.18653/v1/p17-1052 |
30 | DEVLIN J, CHANG M W, LEE K, et al. BERT: pre-training of deep bidirectional transformers for language understanding[C]// Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers). Stroudsburg, PA: ACL, 2019: 4171-4186. 10.18653/v1/n19-1423 |
31 | XU C W, ZHOU W C S, GE T, et al. BERT-of-Theseus: compressing BERT by progressive module replacing[C]// Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing. Stroudsburg, PA: ACL, 2020: 7859-7869. 10.18653/v1/2020.emnlp-main.633 |
32 | ZHANG Z Y, HAN X, LIU Z Y, et al. ERNIE: enhanced language representation with informative entities[C]// Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics. Stroudsburg, PA: ACL, 2019: 1441-1451. 10.18653/v1/p19-1139 |