Journal of Computer Applications ›› 2022, Vol. 42 ›› Issue (1): 64-70.DOI: 10.11772/j.issn.1001-9081.2021020335
• Artificial intelligence •
Yu PENG, Xiaoyu LI, Shijie HU, Xiaolei LIU, Weizhong QIAN
Received: 2021-03-08
Revised: 2021-05-12
Accepted: 2021-05-17
Online: 2021-05-24
Published: 2022-01-10
Contact:
Xiaoyu LI
About authors:
PENG Yu, born in 1996 in Meishan, Sichuan, M. S. candidate. His research interests include deep learning and natural language processing.
HU Shijie, born in 1998, M. S. candidate. His research interests include deep learning and natural language processing.
Yu PENG, Xiaoyu LI, Shijie HU, Xiaolei LIU, Weizhong QIAN. Three-stage question answering model based on BERT[J]. Journal of Computer Applications, 2022, 42(1): 64-70.
URL: http://www.joca.cn/EN/10.11772/j.issn.1001-9081.2021020335
Dataset | Language | Training samples | Test samples |
---|---|---|---|
SQuAD2.0 | English | 130 319 | 11 873 |
CMRC2018 | Chinese | 10 321 | 3 351 |

Tab. 1 Statistics of experimental datasets
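Both SQuAD2.0 and CMRC2018 are distributed as JSON in the SQuAD layout (`data → paragraphs → qas`), so the sample counts in Tab. 1 can be reproduced by walking that structure. A minimal sketch, assuming the official file layout (not the authors' actual preprocessing code):

```python
import json

def count_qas(dataset):
    """Count question-answer pairs in a loaded SQuAD/CMRC-style "data" list."""
    return sum(len(p["qas"]) for article in dataset for p in article["paragraphs"])

def count_examples(path):
    """Count QA pairs in a SQuAD/CMRC-style JSON file.

    Assumes the official layout: {"data": [{"paragraphs": [{"qas": [...]}]}]}.
    """
    with open(path, encoding="utf-8") as f:
        return count_qas(json.load(f)["data"])
```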
Parameter | Value |
---|---|
epochs | 4 |
batch_size | 24 |
max_seq_length | 368 |
dropout | 0.1 |
learning rate | 0.00005 |
warm-up rate | 0.1 |

Tab. 2 Parameter setting
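The settings in Tab. 2 map directly onto a BERT fine-tuning configuration; a minimal sketch of how they might be grouped, where the field names follow common fine-tuning scripts and are illustrative rather than the authors' actual code:

```python
from dataclasses import dataclass

@dataclass
class FinetuneConfig:
    # Values taken from Tab. 2; field names are illustrative.
    epochs: int = 4
    batch_size: int = 24
    max_seq_length: int = 368
    dropout: float = 0.1
    learning_rate: float = 5e-5
    warmup_rate: float = 0.1  # fraction of total steps used for LR warm-up

    def warmup_steps(self, total_steps: int) -> int:
        """Number of optimizer steps spent linearly warming up the LR."""
        return int(total_steps * self.warmup_rate)
```

With, say, 1 000 total training steps, this configuration spends the first 100 steps warming up the learning rate.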
Model | V1.1 EM | V1.1 F1 | V2.0 EM | V2.0 F1 |
---|---|---|---|---|
Human performance | 80.3 | 90.5 | 86.3 | 89.0 |
BiDAF | 67.7 | 77.3 | 57.7 | 62.3 |
Match-LSTM | 67.6 | 76.8 | 60.3 | 63.5 |
SAN | 75.6 | 84.8 | 67.9 | 70.7 |
QANet | 73.6 | 82.7 | 62.5 | 66.4 |
BERTbase | 80.8 | 88.5 | 74.4 | 77.1 |
+BiDAF | 81.9 | 89.0 | 74.0 | 76.9 |
+SAN | 82.2 | 89.6 | 74.9 | 77.6 |
+Proposed model | 82.8 | 88.9 | 76.8 | 78.7 |

Tab. 3 Result comparison of different models on SQuAD dataset
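EM and F1 here are the standard span-extraction metrics: EM is exact string match after normalization, and F1 is token-level overlap between the predicted and reference answer spans. A minimal sketch using whitespace tokenization and simplified normalization (the official SQuAD evaluation script additionally strips articles and punctuation):

```python
from collections import Counter

def normalize(text: str) -> str:
    # Simplified normalization: lowercase and collapse whitespace.
    return " ".join(text.lower().split())

def exact_match(pred: str, gold: str) -> float:
    """1.0 if the normalized strings are identical, else 0.0."""
    return float(normalize(pred) == normalize(gold))

def f1_score(pred: str, gold: str) -> float:
    """Token-level F1 between predicted and gold answer spans."""
    pred_toks = normalize(pred).split()
    gold_toks = normalize(gold).split()
    common = Counter(pred_toks) & Counter(gold_toks)  # multiset intersection
    overlap = sum(common.values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_toks)
    recall = overlap / len(gold_toks)
    return 2 * precision * recall / (precision + recall)
```

For example, predicting "the cat sat" against the reference "the cat" gives precision 2/3 and recall 1, hence F1 = 0.8 but EM = 0.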
Model | EM | F1 |
---|---|---|
Human performance | 91.08 | 97.35 |
T-Reader | 39.43 | 62.41 |
SXU-Reader | 40.29 | 66.45 |
R-NET | 45.42 | 69.83 |
GM-Reader | 56.32 | 77.41 |
MCA-Reader | 63.90 | 82.62 |
BERTbase | 63.60 | 83.90 |
+Proposed model | 65.00 | 85.10 |

Tab. 4 Result comparison of different models on CMRC2018 dataset