Journal of Computer Applications ›› 2024, Vol. 44 ›› Issue (10): 3281-3287. DOI: 10.11772/j.issn.1001-9081.2023101558
• The 40th CCF National Database Conference (NDBC 2023) •
Received:
2023-11-13
Revised:
2023-12-28
Accepted:
2024-01-02
Online:
2024-10-15
Published:
2024-10-10
Contact:
Xiaofei ZHU
About author:
GAO Zhaoze, born in 1996 in Zaozhuang, Shandong, M. S. candidate, CCF member. His research interests include natural language processing and stance detection.
Supported by:
Zhaoze GAO, Xiaofei ZHU, Nengqiang XIANG
Abstract:
Pseudo-label generation is an effective strategy for semi-supervised stance detection. In real applications, the quality of generated pseudo-labels varies; however, existing work treats all generated pseudo-labels as equally reliable, and does not fully account for the impact of class imbalance on pseudo-label quality. To address these two problems, a Semi-supervised stance Detection model based on category-aware Curriculum Learning (SDCL) was proposed. First, a pre-trained classification model was used to generate pseudo-labels for unlabeled tweets. Then, tweets were ranked by pseudo-label quality within each category, and the top k high-quality tweets of each category were selected. Finally, the selected tweets of all categories were merged and re-ranked, and the ranked pseudo-labeled tweets were fed back into the classification model to further optimize its parameters. Experimental results show that, compared with SANDS (Stance Analysis via Network Distant Supervision), the best-performing baseline, the proposed model improves the macro-averaged F1 (Mac-F1) score by 2, 1 and 3 percentage points on the StanceUS dataset, and by 1 percentage point on the StanceIN dataset, under three different splits (500, 1 000 and 1 500 labeled tweets in total), which validates the effectiveness of the proposed model.
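The three-step pipeline described in the abstract can be sketched as follows. This is an illustrative sketch, not the authors' implementation: all names are hypothetical, and the classifier's confidence score stands in for the paper's pseudo-label quality measure.

```python
from collections import defaultdict

def select_pseudo_labeled(tweets, classifier, k):
    """Category-aware top-k pseudo-label selection (illustrative sketch).

    `classifier(text)` is assumed to return (predicted_label, confidence);
    the confidence stands in for the paper's pseudo-label quality score.
    """
    # Step 1: generate a pseudo-label (with a quality score) for each
    # unlabeled tweet, grouping the tweets by predicted category.
    by_category = defaultdict(list)
    for text in tweets:
        label, confidence = classifier(text)
        by_category[label].append((confidence, text, label))

    # Step 2: within each category, rank by quality and keep the top k,
    # so minority classes are not crowded out by majority-class examples.
    selected = []
    for label, items in by_category.items():
        items.sort(key=lambda t: t[0], reverse=True)
        selected.extend(items[:k])

    # Step 3: merge the per-category selections and re-rank them globally
    # (high-confidence "easy" examples first, in curriculum order) before
    # feeding them back to the classifier for further training.
    selected.sort(key=lambda t: t[0], reverse=True)
    return [(text, label) for _, text, label in selected]
```

Selecting the top k per category rather than globally is what makes the curriculum category-aware: a global top-k would be dominated by the majority classes visible in Tab. 1.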
CLC number:
Zhaoze GAO, Xiaofei ZHU, Nengqiang XIANG. Semi-supervised stance detection based on category-aware curriculum learning[J]. Journal of Computer Applications, 2024, 44(10): 3281-3287.
| Dataset | Label category | Labeled tweets (Split 1) | Labeled tweets (Split 2) | Labeled tweets (Split 3) | Test samples |
|---|---|---|---|---|---|
| StanceUS | Pro-Dem | 320 | 654 | 981 | 1 543 |
| | Anti-Dem | 9 | 22 | 32 | 46 |
| | Pro-Rep | 133 | 253 | 381 | 576 |
| | Anti-Rep | 13 | 17 | 29 | 46 |
| | Other | 25 | 54 | 77 | 111 |
| | Total | 500 | 1 000 | 1 500 | 2 322 |
| StanceIN | Pro-BJP | 67 | 149 | 208 | 360 |
| | Anti-BJP | 136 | 275 | 425 | 680 |
| | Pro-INC | 24 | 60 | 83 | 158 |
| | Anti-INC | 2 | 3 | 6 | 15 |
| | Pro-AAP | 35 | 61 | 99 | 142 |
| | Anti-AAP | 52 | 101 | 163 | 210 |
| | Other | 184 | 351 | 516 | 1 120 |
| | Total | 500 | 1 000 | 1 500 | 2 685 |
Tab. 1 Distribution of training and test samples for labeled datasets under different splits
| Model | StanceUS Split 1 | StanceUS Split 2 | StanceUS Split 3 | StanceIN Split 1 | StanceIN Split 2 | StanceIN Split 3 |
|---|---|---|---|---|---|---|
| SiamNet | 0.39 | 0.43 | 0.42 | 0.12 | 0.14 | 0.13 |
| BICE | 0.27 | 0.30 | 0.33 | 0.16 | 0.17 | 0.23 |
| TAN | 0.38 | 0.46 | 0.45 | 0.14 | 0.14 | 0.17 |
| SVM | 0.37 | 0.37 | 0.45 | 0.13 | 0.13 | 0.16 |
| BERT | 0.39 | 0.50 | 0.51 | 0.17 | 0.17 | 0.21 |
| ConvNet | 0.37 | 0.43 | 0.45 | 0.35 | 0.40 | 0.41 |
| BLSTM | 0.35 | 0.43 | 0.44 | 0.31 | 0.39 | 0.38 |
| LS-SVM | 0.39 | 0.42 | 0.44 | 0.18 | 0.19 | 0.18 |
| ST-ConvNet | 0.13 | 0.15 | 0.16 | 0.10 | 0.11 | 0.11 |
| ST-BLSTM | 0.13 | 0.16 | 0.19 | 0.09 | 0.12 | 0.11 |
| UST | 0.35 | 0.42 | 0.41 | 0.12 | 0.16 | 0.16 |
| GCN-ConvNet | 0.41 | 0.45 | 0.47 | 0.33 | 0.35 | 0.40 |
| GCN-BLSTM | 0.39 | 0.42 | 0.46 | 0.36 | 0.41 | 0.42 |
| SANDS | 0.49 | 0.53 | 0.55 | 0.42 | 0.45 | 0.47 |
| SDCL | 0.51 | 0.54 | 0.58 | 0.43 | 0.46 | 0.48 |
Tab. 2 Mac-F1 scores of different models under different splits of labeled training data on StanceUS and StanceIN datasets
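Mac-F1 in Tab. 2 is the macro-averaged F1 score: F1 is computed per class and the unweighted mean is taken, so small classes such as Anti-INC (15 test samples) weigh as much as large ones. A minimal sketch of the metric (in practice a library implementation such as scikit-learn's would be used):

```python
def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 over the classes present in y_true."""
    classes = sorted(set(y_true))
    f1_scores = []
    for c in classes:
        # Per-class counts: true positives, false positives, false negatives.
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != c and p == c)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == c and p != c)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        # Harmonic mean of precision and recall; 0 if both are 0.
        f1_scores.append(2 * precision * recall / (precision + recall)
                         if precision + recall else 0.0)
    # Macro average: every class contributes equally, regardless of size.
    return sum(f1_scores) / len(f1_scores)
```

Because of this equal weighting, a model that ignores minority stances is penalized even when its overall accuracy is high, which is why Mac-F1 is the natural metric for the imbalanced splits in Tab. 1.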
| Model | StanceUS Split 1 | StanceUS Split 2 | StanceUS Split 3 | StanceIN Split 1 | StanceIN Split 2 | StanceIN Split 3 |
|---|---|---|---|---|---|---|
| SDCL | 0.505 | 0.541 | 0.576 | 0.432 | 0.461 | 0.483 |
| w/o curr | 0.499 | 0.539 | 0.575 | 0.424 | 0.455 | 0.476 |
| w/o pre | 0.470 | 0.529 | 0.560 | 0.408 | 0.443 | 0.458 |
| w/o c-a | 0.492 | 0.535 | 0.568 | 0.418 | 0.450 | 0.475 |
Tab. 3 Ablation experiment results (Mac-F1) of SDCL on two datasets
| Unlabeled tweets used/% | StanceUS Split 1 | StanceUS Split 2 | StanceUS Split 3 | StanceIN Split 1 | StanceIN Split 2 | StanceIN Split 3 |
|---|---|---|---|---|---|---|
| 100 | 0.505 | 0.541 | 0.576 | 0.432 | 0.461 | 0.483 |
| 80 | 0.494 | 0.540 | 0.574 | 0.424 | 0.456 | 0.477 |
| 50 | 0.490 | 0.536 | 0.570 | 0.423 | 0.456 | 0.475 |
| 30 | 0.465 | 0.526 | 0.554 | 0.418 | 0.450 | 0.469 |
| 10 | 0.454 | 0.516 | 0.550 | 0.407 | 0.442 | 0.466 |
Tab. 4 Mac-F1 of SDCL trained with different proportions of unlabeled data
1. AUGENSTEIN I, ROCKTÄSCHEL T, VLACHOS A, et al. Stance detection with bidirectional conditional encoding[C]// Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing. Stroudsburg: ACL, 2016: 876-885.
2. XU C, PARIS C, NEPAL S, et al. Cross-target stance classification with self-attention networks[C]// Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers). Stroudsburg: ACL, 2018: 778-783.
3. GRIMMINGER L, KLINGER R. Hate towards the political opponent: a Twitter corpus study of the 2020 US elections on the basis of offensive speech and stance detection[C]// Proceedings of the 11th Workshop on Computational Approaches to Subjectivity, Sentiment and Social Media Analysis. Stroudsburg: ACL, 2021: 171-180.
4. GRČAR M, CHEREPNALKOSKI D, MOZETIČ I, et al. Stance and influence of Twitter users regarding the Brexit referendum[J]. Computational Social Networks, 2017, 4: No.6.
5. ZHANG B, WANG L, YANG Y J. Process tracking multi-task rumor verification model combined with stance[J]. Journal of Computer Applications, 2022, 42(11): 3371-3378.
6. LI Q, LIU Y. Research on Twitter rumor standpoint analysis based on machine learning[J]. Electronic Design Engineering, 2019, 27(21): 36-39, 44.
7. JIANG L, YU M, ZHOU M, et al. Target-dependent Twitter sentiment classification[C]// Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies. Stroudsburg: ACL, 2011: 151-160.
8. WANG B, LIAKATA M, ZUBIAGA A, et al. TDParse: multi-target-specific sentiment recognition on Twitter[C]// Proceedings of the 15th Conference of the European Chapter of the Association for Computational Linguistics (Volume 1: Long Papers). Stroudsburg: ACL, 2017: 483-493.
9. GUPTA D, SINGH K, CHAKRABARTI S, et al. Multi-task learning for target-dependent sentiment classification[C]// Proceedings of the 2019 Pacific-Asia Conference on Knowledge Discovery and Data Mining, LNCS 11439. Cham: Springer, 2019: 185-197.
10. KUMAR S, CARLEY K M. Tree LSTMs with convolution units to predict stance and rumor veracity in social media conversations[C]// Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics. Stroudsburg: ACL, 2019: 5047-5058.
11. XUAN K, XIA R. Rumor stance classification via machine learning with text, user and propagation features[C]// Proceedings of the 2019 International Conference on Data Mining Workshops. Piscataway: IEEE, 2019: 560-566.
12. ZENG L, STARBIRD K, SPIRO E S. #Unconfirmed: classifying rumor stance in crisis-related social media messages[C]// Proceedings of the 2016 International AAAI Conference on Web and Social Media. Palo Alto, CA: AAAI Press, 2016: 747-750.
13. DARWISH K, STEFANOV P, AUPETIT M, et al. Unsupervised user stance detection on Twitter[C]// Proceedings of the 2020 International AAAI Conference on Web and Social Media. Palo Alto, CA: AAAI Press, 2020: 141-152.
14. STEFANOV P, DARWISH K, ATANASOV A, et al. Predicting the topical stance and political leaning of media using tweets[C]// Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics. Stroudsburg: ACL, 2020: 527-537.
15. MOHAMMAD S M, SOBHANI P, KIRITCHENKO S. Stance and sentiment in tweets[J]. ACM Transactions on Internet Technology, 2017, 17(3): No.26.
16. ZOTOVA E, AGERRI R, NUÑEZ M, et al. Multilingual stance detection in tweets: the Catalonia independence corpus[C]// Proceedings of the 12th Language Resources and Evaluation Conference. [S.l.]: European Language Resources Association, 2020: 1368-1375.
17. CONFORTI C, BERNDT J, PILEHVAR M T, et al. Will-They-Won't-They: a very large dataset for stance detection on Twitter[C]// Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics. Stroudsburg: ACL, 2020: 1715-1724.
18. DU J, XU R, HE Y, et al. Stance classification with target-specific neural attention networks[C]// Proceedings of the 26th International Joint Conferences on Artificial Intelligence. California: ijcai.org, 2017: 3988-3994.
19. DUTTA S, CAUR S, CHAKRABARTI S, et al. Semi-supervised stance detection of tweets via distant network supervision[C]// Proceedings of the 15th ACM International Conference on Web Search and Data Mining. New York: ACM, 2022: 241-251.
20. MAGDY W, DARWISH K, ABOKHODAIR N, et al. #ISISisNotIslam or #DeportAllMuslims? Predicting unspoken views[C]// Proceedings of the 8th ACM Conference on Web Science. New York: ACM, 2016: 95-106.
21. BLUM A, MITCHELL T. Combining labeled and unlabeled data with co-training[C]// Proceedings of the 11th Annual Conference on Computational Learning Theory. New York: ACM, 1998: 92-100.
22. KIRITCHENKO S, MATWIN S. Email classification with co-training[C]// Proceedings of the 2001 Conference of the Centre for Advanced Studies on Collaborative Research. Riverton, NJ: IBM Corporation, 2001: 1-10.
23. CHEN M, WEINBERGER K Q, CHEN Y. Automatic feature decomposition for single view co-training[C]// Proceedings of the 28th International Conference on Machine Learning. Madison, WI: Omnipress, 2011: 953-960.
24. WAN X. Bilingual co-training for sentiment classification of Chinese product reviews[J]. Computational Linguistics, 2011, 37(3): 587-616.
25. CHEN J, FENG J, SUN X, et al. Co-training semi-supervised deep learning for sentiment classification of MOOC forum posts[J]. Symmetry, 2020, 12(1): No.8.
26. BENGIO Y, LOURADOUR J, COLLOBERT R, et al. Curriculum learning[C]// Proceedings of the 26th Annual International Conference on Machine Learning. New York: ACM, 2009: 41-48.
27. SACHAN M, XING E. Easy questions first? A case study on curriculum learning for question answering[C]// Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers). Stroudsburg: ACL, 2016: 453-463.
28. TAY Y, WANG S, TUAN L A, et al. Simple and effective curriculum pointer-generator networks for reading comprehension over long narratives[C]// Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics. Stroudsburg: ACL, 2019: 4922-4931.
29. PLATANIOS E A, STRETCU O, NEUBIG G, et al. Competence-based curriculum learning for neural machine translation[C]// Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers). Stroudsburg: ACL, 2019: 1162-1172.
30. DEVLIN J, CHANG M W, LEE K, et al. BERT: pre-training of deep bidirectional transformers for language understanding[C]// Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers). Stroudsburg: ACL, 2019: 4171-4186.
31. ZHOU D, BOUSQUET O, LAL T N, et al. Learning with local and global consistency[C]// Proceedings of the 16th International Conference on Neural Information Processing Systems. Cambridge: MIT Press, 2003: 321-328.
32. MUKHERJEE S, AWADALLAH A H. Uncertainty-aware self-training for few-shot text classification[C]// Proceedings of the 34th International Conference on Neural Information Processing Systems. Red Hook, NY: Curran Associates Inc., 2020: 21199-21212.
33. KIPF T N, WELLING M. Semi-supervised classification with graph convolutional networks[EB/OL]. (2017-02-22) [2023-05-18].