Journal of Computer Applications, 2022, Vol. 42, Issue (6): 1789-1795. DOI: 10.11772/j.issn.1001-9081.2021091638
• The 18th CCF Conference on Web Information Systems and Applications •
Jing JIANG1, Yu CHEN2, Jieping SUN1, Shenggen JU1
Received: 2021-09-27
Revised: 2021-11-15
Accepted: 2021-11-17
Online: 2022-04-15
Published: 2022-06-10
Contact: Shenggen JU
About author: JIANG Jing, born in 1996 in Chongqing, M.S. candidate. Her research interests include natural language processing and knowledge graphs.
Supported by:
CLC Number:
Jing JIANG, Yu CHEN, Jieping SUN, Shenggen JU. Integrating posterior probability calibration training into text classification algorithm[J]. Journal of Computer Applications, 2022, 42(6): 1789-1795.
URL: http://www.joca.cn/EN/10.11772/j.issn.1001-9081.2021091638
Sentence | Label | BERT classification |
---|---|---|
A cold is a legit disease. | - | Cold |
Oh my god! I caught a cold! | Cold | Cold |
Tab. 1 Examples of text classification using BERT on MedWeb dataset
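Text example from the maternal and infant healthcare dataset | Category |
---|---|
Why does my baby keep sticking out the tongue? | Question |
My baby is almost four months old. Over the past few days I suddenly noticed that the baby keeps sticking out the tongue and also drools a lot. What is going on? | Description |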
Tab. 2 Examples of MATINF-C dataset
Parameter | AGE | TOPIC | Parameter | AGE | TOPIC |
---|---|---|---|---|---|
(symbol lost) | 0.7 | 0.5 | u | 5 | 5 |
(symbol lost) | 0.3 | 0.5 | n | 4 | 4 |
Tab. 3 Hyperparameter setting
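Model group | Model | MATINF-C-AGE | MATINF-C-TOPIC |
---|---|---|---|
CNN and its variants | Text CNN | 90.95 | 64.41 |
CNN and its variants | DCNN | 90.96 | 64.60 |
CNN and its variants | RCNN | 90.81 | 63.56 |
CNN and its variants | fastText | 87.76 | 61.81 |
CNN and its variants | DPCNN | 91.02 | 65.92 |
Pre-trained language models | BERT-base | 90.33 | 66.95 |
Pre-trained language models | BERT-of-Theseus | 90.25 | 66.72 |
Pre-trained language models | ERNIE | 90.42 | 66.66 |
Posterior probability calibration models | Temp | 90.86 | 68.04 |
Posterior probability calibration models | PosCal-negative | 91.55 | 69.19 |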
Tab. 4 Comparison of accuracy of different models
Model | MATINF-C-AGE | MATINF-C-TOPIC |
---|---|---|
BERT-base | 90.33 | 66.95 |
BERT-base+PosCal | 91.25 | 68.77 |
BERT-base+Negative | 90.87 | 68.04 |
PosCal-negative | 91.55 | 69.19 |
Tab. 5 Accuracy of ablation experiment
Model | MATINF-C-AGE | MATINF-C-TOPIC |
---|---|---|
BERT-base | 0.117 976 | 0.116 114 |
Temp | 0.148 775 | 0.139 897 |
PosCal-negative | 0.113 868 | 0.105 009 |
Tab. 6 Comparison of ECE
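Tab. 6 reports expected calibration error (ECE), where lower values indicate that predicted confidence tracks observed accuracy more closely. This page does not state how the authors binned their predictions, so the following is only a minimal sketch of the commonly used equal-width binned estimator. The function name, its parameters, and the default of 10 bins are assumptions made for illustration, not details taken from the paper.

```python
from typing import Sequence


def expected_calibration_error(
    confidences: Sequence[float],
    correct: Sequence[bool],
    n_bins: int = 10,  # assumed bin count; the source does not specify one
) -> float:
    """Binned ECE: sample-weighted mean of |accuracy - confidence| per bin.

    confidences: predicted probability of the chosen class, each in [0, 1].
    correct: whether the corresponding prediction matched the gold label.
    """
    if len(confidences) != len(correct):
        raise ValueError("confidences and correct must have the same length")
    if len(confidences) == 0:
        raise ValueError("at least one prediction is required")

    # Per-bin running totals: sample count, summed confidence, number correct.
    counts = [0] * n_bins
    conf_sums = [0.0] * n_bins
    hit_sums = [0] * n_bins

    for conf, hit in zip(confidences, correct):
        # Equal-width bins [k/n, (k+1)/n); a confidence of 1.0 goes in the last bin.
        idx = min(int(conf * n_bins), n_bins - 1)
        counts[idx] += 1
        conf_sums[idx] += conf
        hit_sums[idx] += int(hit)

    total = len(confidences)
    ece = 0.0
    for count, conf_sum, hits in zip(counts, conf_sums, hit_sums):
        if count == 0:
            continue  # empty bins contribute nothing
        avg_conf = conf_sum / count
        accuracy = hits / count
        # Weight each bin's calibration gap by its share of all samples.
        ece += (count / total) * abs(accuracy - avg_conf)
    return ece
```

Calling `expected_calibration_error(confidences, correct)` with the top-class probabilities of a classifier and a per-example correctness flag yields a value on the same 0 to 1 scale as the entries in Tab. 6, although the exact numbers depend on the binning choice.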
Model | MATINF-C-AGE | MATINF-C-TOPIC |
---|---|---|
PosCal-ACE | 90.12 | 66.77 |
PosCal-AM | 90.56 | 67.48 |
PosCal-negative | 91.55 | 69.19 |
Tab. 7 Accuracy comparison of negative supervision module
1 | WANG S, MANNING C D. Baselines and bigrams: simple, good sentiment and topic classification[C]// Proceedings of the 50th Annual Meeting of the Association for Computational Linguistics. Stroudsburg, PA: ACL, 2012: 90-94. |
2 | WANG G Y, LI C Y, WANG W L, et al. Joint embedding of words and labels for text classification[C]// Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers). Stroudsburg, PA: ACL, 2018: 2321-2331. 10.18653/v1/p18-1216 |
3 | ZHANG X, ZHAO J B, LeCUN Y. Character-level convolutional networks for text classification[C]// Proceedings of the 28th International Conference on Neural Information Processing Systems. Cambridge: MIT Press, 2015: 649-657. 10.1109/icip.2015.7351229 |
4 | SHEN D H, ZHANG Y Z, HENAO R, et al. Deconvolutional latent-variable model for text sequence matching[C]// Proceedings of the 32nd AAAI Conference on Artificial Intelligence. Palo Alto, CA: AAAI Press, 2018: 5438-5445. |
5 | YANG P C, SUN X, LI W, et al. SGM: sequence generation model for multi-label classification[C]// Proceedings of the 27th International Conference on Computational Linguistics. Stroudsburg, PA: ACL, 2018: 3915-3926. 10.18653/v1/p19-1518 |
6 | JIANG X Q, OSL M, KIM J, et al. Calibrating predictive model estimates to support personalized medicine[J]. Journal of the American Medical Informatics Association, 2012, 19(2): 263-274. 10.1136/amiajnl-2011-000291 |
7 | MURPHY A H. A new vector partition of the probability score[J]. Journal of Applied Meteorology and Climatology, 1973, 12(4): 595-600. 10.1175/1520-0450(1973)012<0595:anvpot>2.0.co;2 |
8 | MURPHY A H, WINKLER R L. Reliability of subjective probability forecasts of precipitation and temperature[J]. Journal of the Royal Statistical Society: Series C (Applied Statistics), 1977, 26(1): 41-47. 10.2307/2346866 |
9 | DEGROOT M H, FIENBERG S E. The comparison and evaluation of forecasters[J]. Journal of the Royal Statistical Society: Series D (The Statistician), 1983, 32(1/2): 12-22. 10.2307/2987588 |
10 | GNEITING T, RAFTERY A E. Weather forecasting with ensemble methods[J]. Science, 2005, 310(5746): 248-249. 10.1126/science.1115255 |
11 | BRÖCKER J. Reliability, sufficiency, and the decomposition of proper scores[J]. Quarterly Journal of the Royal Meteorological Society, 2009, 135(643): 1512-1519. 10.1002/qj.456 |
12 | NGUYEN K, O’CONNOR B. Posterior calibration and exploratory analysis for natural language processing models[C]// Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing. Stroudsburg, PA: ACL, 2015: 1587-1598. 10.18653/v1/d15-1182 |
13 | CARD D, SMITH N A. The importance of calibration for estimating proportions from annotations[C]// Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long Papers). Stroudsburg, PA: ACL, 2018: 1636-1646. 10.18653/v1/n18-1148 |
14 | GUO C, PLEISS G, SUN Y, et al. On calibration of modern neural networks[C]// Proceedings of the 34th International Conference on Machine Learning. New York: JMLR.org, 2017: 1321-1330. |
15 | KUMAR A, LIANG P, MA T Y. Verified uncertainty calibration[C/OL]// Proceedings of the 33rd International Conference on Neural Information Processing Systems. [2021-03-30].. |
16 | WAKAMIYA S, MORITA M, KANO Y, et al. Overview of the NTCIR-13: MedWeb task[C]// Proceedings of the 13th NTCIR Conference on Evaluation of Information Access Technologies. Tokyo: National Institute of Informatics, 2017: 40-49. |
17 | LIU T T, ZHU W D, LIU G Y. Advances in deep learning based text classification[J]. Electric Power Information and Communication Technology, 2018, 16(3): 1-7. 10.16543/j.2095-641x.electric.power.ict.2018.03.001 |
18 | HE L, ZHENG Z X, XIANG F T, et al. Research progress of text classification technology based on deep learning[J]. Computer Engineering, 2021, 47(2): 1-11. 10.19678/j.issn.1000-3428.0059099 |
19 | ZADROZNY B, ELKAN C. Obtaining calibrated probability estimates from decision trees and naive Bayesian classifiers[C]// Proceedings of the 18th International Conference on Machine Learning. San Francisco: Morgan Kaufmann Publishers Inc., 2001: 609-616. 10.1145/775047.775151 |
20 | NAEINI M P, COOPER G F, HAUSKRECHT M. Obtaining well calibrated probabilities using Bayesian binning[C]// Proceedings of the 29th AAAI Conference on Artificial Intelligence. Palo Alto: AAAI Press, 2015: 2901-2907. 10.1137/1.9781611974010.24 |
21 | PLATT J. Probabilistic outputs for support vector machines and comparisons to regularized likelihood methods[M]// SMOLA A J, BARTLETT P, SCHÖLKOPF B, et al. Advances in Large Margin Classifiers. Cambridge: MIT Press, 2000: 61-74. |
22 | OHASHI S, TAKAYAMA J, KAJIWARA T, et al. Text classification with negative supervision[C]// Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics. Stroudsburg, PA: ACL, 2020: 351-357. 10.18653/v1/2020.acl-main.33 |
23 | XU C W, PEI J X, WU H T, et al. MATINF: a jointly labeled large-scale dataset for classification, question answering and summarization[C]// Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics. Stroudsburg, PA: ACL, 2020: 3586-3596. 10.18653/v1/2020.acl-main.330 |
24 | KIM Y. Convolutional neural networks for sentence classification[C]// Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing. Stroudsburg, PA: ACL, 2014: 1746-1751. 10.3115/v1/d14-1181 |
25 | KALCHBRENNER N, GREFENSTETTE E, BLUNSOM P. A convolutional neural network for modelling sentences[C]// Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers). Stroudsburg, PA: ACL, 2014: 655-665. 10.3115/v1/p14-1062 |
26 | DU S J, YU H N, ZHANG H L. Survey of text classification methods based on deep learning[J]. Chinese Journal of Network and Information Security, 2020, 6(4): 1-13. 10.11959/j.issn.2096-109x.2020010 |
27 | LAI S W, XU L H, LIU K, et al. Recurrent convolutional neural networks for text classification[C]// Proceedings of the 29th AAAI Conference on Artificial Intelligence. Palo Alto, CA: AAAI Press, 2015: 2267-2273. 10.1609/aaai.v33i01.33017370 |
28 | JOULIN A, GRAVE E, BOJANOWSKI P, et al. Bag of tricks for efficient text classification[C]// Proceedings of the 15th Conference of the European Chapter of the Association for Computational Linguistics, Volume 2 (Short Papers). Stroudsburg, PA: ACL, 2017: 427-431. 10.18653/v1/e17-2068 |
29 | JOHNSON R, ZHANG T. Deep pyramid convolutional neural networks for text categorization[C]// Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics, Volume 1 (Long Papers). Stroudsburg, PA: ACL, 2017: 562-570. 10.18653/v1/p17-1052 |
30 | DEVLIN J, CHANG M W, LEE K, et al. BERT: pre-training of deep bidirectional transformers for language understanding[C]// Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers). Stroudsburg, PA: ACL, 2019: 4171-4186. 10.18653/v1/n19-1423 |
31 | XU C W, ZHOU W C S, GE T, et al. BERT-of-Theseus: compressing BERT by progressive module replacing[C]// Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing. Stroudsburg, PA: ACL, 2020: 7859-7869. 10.18653/v1/2020.emnlp-main.633 |
32 | ZHANG Z Y, HAN X, LIU Z Y, et al. ERNIE: enhanced language representation with informative entities[C]// Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics. Stroudsburg, PA: ACL, 2019: 1441-1451. 10.18653/v1/p19-1139 |