Journal of Computer Applications ›› 2023, Vol. 43 ›› Issue (2): 365-373.DOI: 10.11772/j.issn.1001-9081.2021122167
Special Issue: Artificial Intelligence
Jie HU1,2, Xiaoxi CHEN1, Yan ZHANG1,2
Received: 2021-12-29
Revised: 2022-06-04
Accepted: 2022-06-10
Online: 2022-06-30
Published: 2023-02-10
Contact: Jie HU
About author: CHEN Xiaoxi, born in 1997, M. S. candidate. Her research interests include natural language processing.
Jie HU, Xiaoxi CHEN, Yan ZHANG. Answer selection model based on pooling and feature combination enhanced BERT[J]. Journal of Computer Applications, 2023, 43(2): 365-373.
URL: https://www.joca.cn/EN/10.11772/j.issn.1001-9081.2021122167
Tab. 1 Description of datasets

| Dataset | Training samples | Validation samples | Test samples | Average text length |
|---|---|---|---|---|
| SemEval-2016CQA | 20 340 | 2 440 | 3 270 | 42 |
| SemEval-2017CQA | 14 110 | 2 440 | 2 930 | 46 |
| MSRP | 3 576 | 500 | 1 725 | 18 |
Tab. 2 Parameter setting

| Parameter | Value |
|---|---|
| learning-rate | |
| optimization | Adam |
| epochs | 3 |
| hidden_size | 768 |
| batch_size | 16 |
| number of topics | 70 |
| LDA alpha | 50 |
Tab. 3 Comparison of accuracy and F1 scores of tBERT, tBERT-AT, tBERT-pooling and tBERT-AT-pooling models

| Model | SemEval-2016CQA Accuracy | SemEval-2016CQA F1 | SemEval-2017CQA Accuracy | SemEval-2017CQA F1 |
|---|---|---|---|---|
| tBERT | 77.6 | 74.1 | 78.3 | 76.8 |
| tBERT-AT | 78.6 | 74.9 | 79.4 | 77.9 |
| tBERT-pooling | 78.0 | 74.4 | 78.6 | 77.2 |
| tBERT-AT-pooling | 78.8 | 75.3 | 79.6 | 78.1 |
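The "pooling" variants in Tab. 3 condense BERT's per-token hidden states into a single sentence vector before classification. The paper's exact pooling operator is not reproduced here; the following is a minimal mean-pooling sketch in pure Python, with hidden_size trimmed to 4 for readability (the model in Tab. 2 uses 768):

```python
def mean_pool(hidden_states):
    """Average per-token vectors into one sentence vector.

    hidden_states: list of token vectors, each of length hidden_size.
    """
    n = len(hidden_states)
    dim = len(hidden_states[0])
    return [sum(tok[d] for tok in hidden_states) / n for d in range(dim)]

# Toy example: 3 tokens, hidden_size = 4 (illustrative values only).
tokens = [[1.0, 2.0, 0.0, 4.0],
          [3.0, 2.0, 2.0, 0.0],
          [2.0, 2.0, 4.0, 2.0]]
print(mean_pool(tokens))  # → [2.0, 2.0, 2.0, 2.0]
```

In practice the same averaging is applied to the 768-dimensional hidden states of the question and the candidate answer separately, and the pooled vectors are then fed to the matching layer.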
Tab. 4 Comparison of accuracy and F1 scores of tBERT, tBERT-AT, tBERT-pooling and tBERT-AT-pooling models before and after introducing topic information feature combination

| Model | SemEval-2016CQA Accuracy | SemEval-2016CQA F1 | SemEval-2017CQA Accuracy | SemEval-2017CQA F1 |
|---|---|---|---|---|
| tBERT | 77.6 | 74.1 | 78.3 | 76.8 |
| tBERT-feature combination | 77.9 | 74.3 | 78.5 | 77.0 |
| tBERT-AT | 78.6 | 74.9 | 79.4 | 77.9 |
| tBERT-AT-feature combination | 78.9 | 75.1 | 79.5 | 78.1 |
| tBERT-pooling | 78.0 | 74.4 | 78.6 | 77.2 |
| tBERT-pooling-feature combination | 78.4 | 74.7 | 78.8 | 77.5 |
| tBERT-AT-pooling | 78.8 | 75.3 | 79.6 | 78.1 |
| tBERT-AT-pooling-feature combination | 79.2 | 75.6 | 79.9 | 78.6 |
Tab. 5 Comparison of accuracy and F1 scores of tBERT, tBERT-AT-feature combination, tBERT-pooling-feature combination and tBERT-AT-pooling-feature combination models before and after improving activation function

| Model | SemEval-2016CQA Accuracy | SemEval-2016CQA F1 | SemEval-2017CQA Accuracy | SemEval-2017CQA F1 |
|---|---|---|---|---|
| tBERT-tanh | 77.6 | 74.1 | 78.3 | 76.8 |
| tBERT-improved activation function | 78.5 | 74.3 | 79.0 | 77.3 |
| tBERT-AT-feature combination-tanh | 78.9 | 75.1 | 79.5 | 78.1 |
| tBERT-AT-feature combination-improved activation function | 79.1 | 75.3 | 79.7 | 78.4 |
| tBERT-pooling-feature combination-tanh | 78.4 | 74.7 | 78.8 | 77.5 |
| tBERT-pooling-feature combination-improved activation function | 79.3 | 75.6 | 80.1 | 78.2 |
| tBERT-AT-pooling-feature combination-tanh | 79.2 | 75.6 | 79.9 | 78.6 |
| Proposed model | 80.7 | 76.1 | 80.5 | 79.9 |
Tab. 6 Comparison of accuracy and F1 scores of tBERT before and after improving activation function on MSRP dataset

| Model | MSRP Accuracy | MSRP F1 |
|---|---|---|
| tBERT-tanh | 89.5 | 88.4 |
| tBERT-improved activation function | 89.8 | 88.6 |
Tab. 7 Comparison of accuracy and F1 scores of related models

| Model | SemEval-2016CQA Accuracy | SemEval-2016CQA F1 | SemEval-2017CQA Accuracy | SemEval-2017CQA F1 |
|---|---|---|---|---|
| LDA topic model | 70.3 | 67.6 | 71.4 | 68.4 |
| ECNU | 74.3 | 66.7 | 78.4 | 77.6 |
| Siamese-BiLSTM | 74.6 | 68.7 | 75.3 | 67.1 |
| UIA-LSTM-CNN | 78.2 | 68.4 | 77.1 | 76.4 |
| AUANN | 80.5 | 74.5 | 78.5 | 79.8 |
| BERT | 75.6 | 71.9 | 76.2 | 70.4 |
| GMN-BERT | 76.7 | 72.8 | 77.5 | 71.6 |
| BERT-pooling | 76.1 | 72.5 | 77.1 | 71.1 |
| tBERT | 77.6 | 74.1 | 78.3 | 76.8 |
| Proposed model | 80.7 | 76.1 | 80.5 | 79.9 |
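Tabs. 3-7 report accuracy and F1 in percent. The paper does not spell out the averaging scheme; assuming the common choice for answer selection of binary F1 on the "relevant answer" class, the metric can be sketched in pure Python as:

```python
def precision_recall_f1(y_true, y_pred, positive=1):
    """Binary precision, recall and F1 for the positive class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p == positive)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t != positive and p == positive)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1

# Toy labels: 1 = relevant answer, 0 = irrelevant (illustrative data only).
y_true = [1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 1, 0]
p, r, f1 = precision_recall_f1(y_true, y_pred)
print(round(p, 3), round(r, 3), round(f1, 3))  # → 0.667 0.667 0.667
```

Accuracy is simply the fraction of matching labels; F1 is preferred alongside it here because the CQA datasets are class-imbalanced.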
Tab. 8 Comparison of attention visualization of the same example between tBERT and proposed model

| Model | Attention visualization example |
|---|---|
| tBERT model | Question: How much salary? Hi everyone I'm in the process of negotiating my salary but I have no idea how much should be the salary of mechanical engineer with grade 5 in a government company and the benefits. This will be my first time in Qatar. Kindly help me. Thanks in advance. |
| Proposed model | Question: How much salary? Hi everyone I'm in the process of negotiating my salary but I have no idea how much should be the salary of mechanical engineer with grade 5 in a government company and the benefits. This will be my first time in Qatar. Kindly help me. Thanks in advance. |
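The visualization in Tab. 8 shades each question token by its attention weight (the shading itself is not recoverable from the extracted text). As a reminder of where such weights come from, here is a minimal pure-Python sketch of scaled dot-product attention weights for one query over a sequence of keys, with toy 2-dimensional vectors standing in for token embeddings:

```python
import math

def attention_weights(query, keys):
    """Softmax-normalized scaled dot-product scores of a query over keys."""
    d = len(query)
    scores = [sum(q * k for q, k in zip(query, key)) / math.sqrt(d) for key in keys]
    m = max(scores)  # subtract max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

# Toy embeddings (illustrative only): the first key aligns with the query most.
query = [1.0, 0.0]
keys = [[1.0, 0.0], [0.5, 0.5], [0.0, 1.0]]
weights = attention_weights(query, keys)
print([round(w, 3) for w in weights])  # weights sum to 1; first token largest
```

Tokens whose keys align with the query receive larger weights, which is exactly what the darker highlighting in the visualization conveys.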
Tab. 9 Comparison of answers to the same question predicted by tBERT and proposed model

| Question | Answer (tBERT model) | Answer (proposed model) |
|---|---|---|
| How much salary? Hi everyone I'm in the process of negotiating my salary but I have no idea how much should be the salary of mechanical engineer with grade 5 in a government company and the benefits. This will be my first time in Qatar. Kindly help me. Thanks in advance. | Hey; I am a Mechanical Engineer as well and working in Qatar. You can email me and we can discus it further. | That should be around 12-15 and you should get free government housing and a 3 000 mobile and internet allowance. That's it. |
1 | LASKAR M T R, HUANG J X, HOQUE E. Contextualized embeddings based transformer encoder for sentence similarity modeling in answer selection task[C]// Proceedings of the 12th Language Resources and Evaluation Conference. [S.l.]: European Language Resources Association, 2020: 5505-5514. |
2 | YANG L, AI Q Y, GUO J F, et al. aNMM: ranking short answer texts with attention-based neural matching model[C]// Proceedings of the 25th ACM International Conference on Information and Knowledge Management. New York: ACM, 2016: 287-296. 10.1145/2983323.2983818 |
3 | YANG R Q, ZHANG J H, GAO X, et al. Simple and effective text matching with richer alignment features[C]// Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics. Stroudsburg, PA: ACL, 2019:4699-4709. 10.18653/v1/p19-1465 |
4 | NECULOIU P, VERSTEEGH M, ROTARU M. Learning text similarity with Siamese recurrent networks[C]// Proceedings of the 1st Workshop on Representation Learning for NLP. Stroudsburg, PA: ACL, 2016: 148-157. 10.18653/v1/w16-1617 |
5 | BLEI D M, NG A Y, JORDAN M I. Latent Dirichlet allocation[J]. The Journal of Machine Learning Research, 2003, 3: 993-1022. |
6 | MIHAYLOV T, NAKOV P. SemanticZ at SemEval-2016 Task 3: ranking relevant answers in community question answering using semantic similarity based on fine-tuned word embeddings[C]// Proceedings of the 10th International Workshop on Semantic Evaluation. Stroudsburg, PA: ACL, 2016: 879-886. 10.18653/v1/s16-1136 |
7 | WU G S, SHENG Y X, LAN M, et al. ECNU at SemEval-2017 task 3: using traditional and deep learning methods to address community question answering task[C]// Proceedings of the 11th International Workshop on Semantic Evaluation. Stroudsburg, PA: ACL, 2017: 365-369. 10.18653/v1/s17-2060 |
8 | WEN J H, MA J W, FENG Y L, et al. Hybrid attentive answer selection in CQA with deep users modelling[C]// Proceedings of the 32nd AAAI Conference on Artificial Intelligence. Palo Alto, CA: AAAI Press, 2018: 2556-2563. 10.1609/aaai.v32i1.11840 |
9 | XIE Y X, SHEN Y, LI Y L, et al. Attentive user-engaged adversarial neural network for community question answering[C]// Proceedings of the 34th AAAI Conference on Artificial Intelligence. Palo Alto, CA: AAAI Press, 2020:9322-9329. 10.1609/aaai.v34i05.6472 |
10 | MIKOLOV T, CHEN K, CORRADO G, et al. Efficient estimation of word representations in vector space[EB/OL]. (2013-09-07) [2021-01-06]. |
11 | PENNINGTON J, SOCHER R, MANNING C D. GloVe: global vectors for word representation[C]// Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing. Stroudsburg, PA: ACL, 2014: 1532-1543. 10.3115/v1/d14-1162 |
12 | DEVLIN J, CHANG M W, LEE K, et al. BERT: pre-training of deep bidirectional transformers for language understanding[C]// Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers). Stroudsburg, PA: ACL, 2019: 4171-4186. 10.18653/v1/N19-1423 |
13 | LASKAR M T R, HOQUE E, HUANG J X. Utilizing bidirectional encoder representations from transformers for answer selection[C]// Proceedings of the 2019 International Conference on Applied Mathematics, Modeling and Computational Science, PROMS 343. Cham: Springer, 2021: 693-703. 10.1007/978-3-030-63591-6_63 |
14 | CHEN L, ZHAO Y B, LV B, et al. Neural graph matching networks for Chinese short text matching[C]// Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics. Stroudsburg, PA: ACL, 2020:6152-6158. 10.18653/v1/2020.acl-main.547 |
15 | PEINELT N, NGUYEN D, LIAKATA M. tBERT: topic models and BERT joining forces for semantic similarity detection[C]// Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics. Stroudsburg, PA: ACL, 2020: 7047-7055. 10.18653/v1/2020.acl-main.630 |
16 | REIMERS N, GUREVYCH I. Sentence-BERT: sentence embeddings using Siamese BERT-networks[EB/OL]. (2019-08-27) [2020-03-24]. 10.18653/v1/d19-1410 |
17 | LUAN K X, DU X K, SUN C J, et al. Sentence ordering based on attention mechanism[J]. Journal of Chinese Information Processing, 2018, 32(1): 123-130. 10.3969/j.issn.1003-0077.2018.01.016 |
18 | NAKOV P, MÀRQUEZ L, MOSCHITTI A, et al. SemEval-2016 Task 3: community question answering[C]// Proceedings of the 10th International Workshop on Semantic Evaluation. Stroudsburg, PA: ACL, 2016: 525-545. 10.18653/v1/s16-1083 |
19 | NAKOV P, HOOGEVEEN D, MÀRQUEZ L, et al. SemEval-2017 Task 3: community question answering[C]// Proceedings of the 11th International Workshop on Semantic Evaluation. Stroudsburg, PA: ACL, 2017:27-48. 10.18653/v1/s17-2003 |