Journal of Computer Applications ›› 2024, Vol. 44 ›› Issue (10): 3281-3287. DOI: 10.11772/j.issn.1001-9081.2023101558
• The 40th CCF National Database Conference (NDBC 2023) •
Received:
2023-11-13
Revised:
2023-12-28
Accepted:
2024-01-02
Online:
2024-10-15
Published:
2024-10-10
Contact:
Xiaofei ZHU
About author:
GAO Zhaoze, born in 1996 in Zaozhuang, Shandong, M. S. candidate, CCF member. His research interests include natural language processing and stance detection.
Supported by:
Zhaoze GAO, Xiaofei ZHU, Nengqiang XIANG
Abstract:
Pseudo-label generation is an effective strategy for semi-supervised stance detection. In real applications, the quality of generated pseudo-labels varies; however, existing work treats all generated pseudo-labels as equally reliable, and does not fully account for the impact of class imbalance on pseudo-label quality. To address these two problems, a Semi-supervised stance Detection model based on category-aware Curriculum Learning (SDCL) was proposed. First, a pre-trained classification model was used to generate pseudo-labels for unlabeled tweets. Then, tweets were ranked by pseudo-label quality within each category, and the top k high-quality tweets of each category were selected. Finally, the selected tweets of all categories were merged and re-ranked, and the ranked pseudo-labeled tweets were fed back into the classification model to further optimize its parameters. Experimental results show that, compared with SANDS (Stance Analysis via Network Distant Supervision), the best-performing baseline, the proposed model improves the macro-averaged F1 (Mac-F1) score by 2, 1 and 3 percentage points on the StanceUS dataset, and by 1 percentage point on the StanceIN dataset, under three different splits (500, 1 000 and 1 500 labeled tweets in total), which validates the effectiveness of the proposed model.
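The three-step pipeline described in the abstract can be sketched as follows. This is an illustrative sketch, not the authors' implementation: all names are hypothetical, and the classifier's confidence score stands in for the paper's pseudo-label quality measure.

```python
from collections import defaultdict

def select_pseudo_labeled(tweets, classifier, k):
    """Category-aware top-k pseudo-label selection (illustrative sketch).

    `classifier(text)` is assumed to return (predicted_label, confidence);
    the confidence stands in for the paper's pseudo-label quality score.
    """
    # Step 1: generate a pseudo-label (with a quality score) for each
    # unlabeled tweet, grouping the tweets by predicted category.
    by_category = defaultdict(list)
    for text in tweets:
        label, confidence = classifier(text)
        by_category[label].append((confidence, text, label))

    # Step 2: within each category, rank by quality and keep the top k,
    # so minority classes are not crowded out by majority-class examples.
    selected = []
    for label, items in by_category.items():
        items.sort(key=lambda t: t[0], reverse=True)
        selected.extend(items[:k])

    # Step 3: merge the per-category selections and re-rank them globally
    # (high-confidence "easy" examples first, in curriculum order) before
    # feeding them back to the classifier for further training.
    selected.sort(key=lambda t: t[0], reverse=True)
    return [(text, label) for _, text, label in selected]
```

Selecting the top k per category rather than globally is what makes the curriculum category-aware: a global top-k would be dominated by the majority classes visible in Tab. 1.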
CLC number:
Zhaoze GAO, Xiaofei ZHU, Nengqiang XIANG. Semi-supervised stance detection based on category-aware curriculum learning[J]. Journal of Computer Applications, 2024, 44(10): 3281-3287.
| Dataset | Label category | Labeled tweets (Split 1) | Labeled tweets (Split 2) | Labeled tweets (Split 3) | Test samples |
|---|---|---|---|---|---|
| StanceUS | Pro-Dem | 320 | 654 | 981 | 1 543 |
| | Anti-Dem | 9 | 22 | 32 | 46 |
| | Pro-Rep | 133 | 253 | 381 | 576 |
| | Anti-Rep | 13 | 17 | 29 | 46 |
| | Other | 25 | 54 | 77 | 111 |
| | Total | 500 | 1 000 | 1 500 | 2 322 |
| StanceIN | Pro-BJP | 67 | 149 | 208 | 360 |
| | Anti-BJP | 136 | 275 | 425 | 680 |
| | Pro-INC | 24 | 60 | 83 | 158 |
| | Anti-INC | 2 | 3 | 6 | 15 |
| | Pro-AAP | 35 | 61 | 99 | 142 |
| | Anti-AAP | 52 | 101 | 163 | 210 |
| | Other | 184 | 351 | 516 | 1 120 |
| | Total | 500 | 1 000 | 1 500 | 2 685 |
Tab. 1 Distribution of training and test samples for labeled datasets under different splits
| Model | StanceUS Split 1 | StanceUS Split 2 | StanceUS Split 3 | StanceIN Split 1 | StanceIN Split 2 | StanceIN Split 3 |
|---|---|---|---|---|---|---|
| SiamNet | 0.39 | 0.43 | 0.42 | 0.12 | 0.14 | 0.13 |
| BICE | 0.27 | 0.30 | 0.33 | 0.16 | 0.17 | 0.23 |
| TAN | 0.38 | 0.46 | 0.45 | 0.14 | 0.14 | 0.17 |
| SVM | 0.37 | 0.37 | 0.45 | 0.13 | 0.13 | 0.16 |
| BERT | 0.39 | 0.50 | 0.51 | 0.17 | 0.17 | 0.21 |
| ConvNet | 0.37 | 0.43 | 0.45 | 0.35 | 0.40 | 0.41 |
| BLSTM | 0.35 | 0.43 | 0.44 | 0.31 | 0.39 | 0.38 |
| LS-SVM | 0.39 | 0.42 | 0.44 | 0.18 | 0.19 | 0.18 |
| ST-ConvNet | 0.13 | 0.15 | 0.16 | 0.10 | 0.11 | 0.11 |
| ST-BLSTM | 0.13 | 0.16 | 0.19 | 0.09 | 0.12 | 0.11 |
| UST | 0.35 | 0.42 | 0.41 | 0.12 | 0.16 | 0.16 |
| GCN-ConvNet | 0.41 | 0.45 | 0.47 | 0.33 | 0.35 | 0.40 |
| GCN-BLSTM | 0.39 | 0.42 | 0.46 | 0.36 | 0.41 | 0.42 |
| SANDS | 0.49 | 0.53 | 0.55 | 0.42 | 0.45 | 0.47 |
| SDCL | 0.51 | 0.54 | 0.58 | 0.43 | 0.46 | 0.48 |
Tab. 2 Mac-F1 scores of different models under different splits of labeled training data on StanceUS and StanceIN datasets
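Mac-F1 in Tab. 2 is the macro-averaged F1 score: F1 is computed per class and the unweighted mean is taken, so small classes such as Anti-INC (15 test samples) weigh as much as large ones. A minimal sketch of the metric (in practice a library implementation such as scikit-learn's would be used):

```python
def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 over the classes present in y_true."""
    classes = sorted(set(y_true))
    f1_scores = []
    for c in classes:
        # Per-class counts: true positives, false positives, false negatives.
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != c and p == c)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == c and p != c)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        # Harmonic mean of precision and recall; 0 if both are 0.
        f1_scores.append(2 * precision * recall / (precision + recall)
                         if precision + recall else 0.0)
    # Macro average: every class contributes equally, regardless of size.
    return sum(f1_scores) / len(f1_scores)
```

Because of this equal weighting, a model that ignores minority stances is penalized even when its overall accuracy is high, which is why Mac-F1 is the natural metric for the imbalanced splits in Tab. 1.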
| Model | StanceUS Split 1 | StanceUS Split 2 | StanceUS Split 3 | StanceIN Split 1 | StanceIN Split 2 | StanceIN Split 3 |
|---|---|---|---|---|---|---|
| SDCL | 0.505 | 0.541 | 0.576 | 0.432 | 0.461 | 0.483 |
| w/o curr | 0.499 | 0.539 | 0.575 | 0.424 | 0.455 | 0.476 |
| w/o pre | 0.470 | 0.529 | 0.560 | 0.408 | 0.443 | 0.458 |
| w/o c-a | 0.492 | 0.535 | 0.568 | 0.418 | 0.450 | 0.475 |
Tab. 3 Ablation experiment results (Mac-F1) of SDCL on two datasets
| Unlabeled tweets used/% | StanceUS Split 1 | StanceUS Split 2 | StanceUS Split 3 | StanceIN Split 1 | StanceIN Split 2 | StanceIN Split 3 |
|---|---|---|---|---|---|---|
| 100 | 0.505 | 0.541 | 0.576 | 0.432 | 0.461 | 0.483 |
| 80 | 0.494 | 0.540 | 0.574 | 0.424 | 0.456 | 0.477 |
| 50 | 0.490 | 0.536 | 0.570 | 0.423 | 0.456 | 0.475 |
| 30 | 0.465 | 0.526 | 0.554 | 0.418 | 0.450 | 0.469 |
| 10 | 0.454 | 0.516 | 0.550 | 0.407 | 0.442 | 0.466 |
Tab. 4 Mac-F1 of SDCL trained with different proportions of unlabeled data
1. AUGENSTEIN I, ROCKTÄSCHEL T, VLACHOS A, et al. Stance detection with bidirectional conditional encoding[C]// Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing. Stroudsburg: ACL, 2016: 876-885.
2. XU C, PARIS C, NEPAL S, et al. Cross-target stance classification with self-attention networks[C]// Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers). Stroudsburg: ACL, 2018: 778-783.
3. GRIMMINGER L, KLINGER R. Hate towards the political opponent: a Twitter corpus study of the 2020 US elections on the basis of offensive speech and stance detection[C]// Proceedings of the 11th Workshop on Computational Approaches to Subjectivity, Sentiment and Social Media Analysis. Stroudsburg: ACL, 2021: 171-180.
4. GRČAR M, CHEREPNALKOSKI D, MOZETIČ I, et al. Stance and influence of Twitter users regarding the Brexit referendum[J]. Computational Social Networks, 2017, 4: No.6.
5. ZHANG B, WANG L, YANG Y J. Process tracking multi-task rumor verification model combined with stance[J]. Journal of Computer Applications, 2022, 42(11): 3371-3378.
6. LI Q, LIU Y. Research on Twitter rumor standpoint analysis based on machine learning[J]. Electronic Design Engineering, 2019, 27(21): 36-39, 44.
7. JIANG L, YU M, ZHOU M, et al. Target-dependent Twitter sentiment classification[C]// Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies. Stroudsburg: ACL, 2011: 151-160.
8. WANG B, LIAKATA M, ZUBIAGA A, et al. TDParse: multi-target-specific sentiment recognition on Twitter[C]// Proceedings of the 15th Conference of the European Chapter of the Association for Computational Linguistics (Volume 1: Long Papers). Stroudsburg: ACL, 2017: 483-493.
9. GUPTA D, SINGH K, CHAKRABARTI S, et al. Multi-task learning for target-dependent sentiment classification[C]// Proceedings of the 2019 Pacific-Asia Conference on Knowledge Discovery and Data Mining, LNCS 11439. Cham: Springer, 2019: 185-197.
10. KUMAR S, CARLEY K M. Tree LSTMs with convolution units to predict stance and rumor veracity in social media conversations[C]// Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics. Stroudsburg: ACL, 2019: 5047-5058.
11. XUAN K, XIA R. Rumor stance classification via machine learning with text, user and propagation features[C]// Proceedings of the 2019 International Conference on Data Mining Workshops. Piscataway: IEEE, 2019: 560-566.
12. ZENG L, STARBIRD K, SPIRO E S. #Unconfirmed: classifying rumor stance in crisis-related social media messages[C]// Proceedings of the 2016 International AAAI Conference on Web and Social Media. Palo Alto, CA: AAAI Press, 2016: 747-750.
13. DARWISH K, STEFANOV P, AUPETIT M, et al. Unsupervised user stance detection on Twitter[C]// Proceedings of the 2020 International AAAI Conference on Web and Social Media. Palo Alto, CA: AAAI Press, 2020: 141-152.
14. STEFANOV P, DARWISH K, ATANASOV A, et al. Predicting the topical stance and political leaning of media using tweets[C]// Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics. Stroudsburg: ACL, 2020: 527-537.
15. MOHAMMAD S M, SOBHANI P, KIRITCHENKO S. Stance and sentiment in tweets[J]. ACM Transactions on Internet Technology, 2017, 17(3): No.26.
16. ZOTOVA E, AGERRI R, NUÑEZ M, et al. Multilingual stance detection in tweets: the Catalonia independence corpus[C]// Proceedings of the 12th Language Resources and Evaluation Conference. [S.l.]: European Language Resources Association, 2020: 1368-1375.
17. CONFORTI C, BERNDT J, PILEHVAR M T, et al. Will-They-Won't-They: a very large dataset for stance detection on Twitter[C]// Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics. Stroudsburg: ACL, 2020: 1715-1724.
18. DU J, XU R, HE Y, et al. Stance classification with target-specific neural attention networks[C]// Proceedings of the 26th International Joint Conferences on Artificial Intelligence. California: ijcai.org, 2017: 3988-3994.
19. DUTTA S, CAUR S, CHAKRABARTI S, et al. Semi-supervised stance detection of tweets via distant network supervision[C]// Proceedings of the 15th ACM International Conference on Web Search and Data Mining. New York: ACM, 2022: 241-251.
20. MAGDY W, DARWISH K, ABOKHODAIR N, et al. #ISISisNotIslam or #DeportAllMuslims? Predicting unspoken views[C]// Proceedings of the 8th ACM Conference on Web Science. New York: ACM, 2016: 95-106.
21. BLUM A, MITCHELL T. Combining labeled and unlabeled data with co-training[C]// Proceedings of the 11th Annual Conference on Computational Learning Theory. New York: ACM, 1998: 92-100.
22. KIRITCHENKO S, MATWIN S. Email classification with co-training[C]// Proceedings of the 2001 Conference of the Centre for Advanced Studies on Collaborative Research. Riverton, NJ: IBM Corporation, 2001: 1-10.
23. CHEN M, WEINBERGER K Q, CHEN Y. Automatic feature decomposition for single view co-training[C]// Proceedings of the 28th International Conference on Machine Learning. Madison, WI: Omnipress, 2011: 953-960.
24. WAN X. Bilingual co-training for sentiment classification of Chinese product reviews[J]. Computational Linguistics, 2011, 37(3): 587-616.
25. CHEN J, FENG J, SUN X, et al. Co-training semi-supervised deep learning for sentiment classification of MOOC forum posts[J]. Symmetry, 2020, 12(1): No.8.
26. BENGIO Y, LOURADOUR J, COLLOBERT R, et al. Curriculum learning[C]// Proceedings of the 26th Annual International Conference on Machine Learning. New York: ACM, 2009: 41-48.
27. SACHAN M, XING E. Easy questions first? A case study on curriculum learning for question answering[C]// Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers). Stroudsburg: ACL, 2016: 453-463.
28. TAY Y, WANG S, TUAN L A, et al. Simple and effective curriculum pointer-generator networks for reading comprehension over long narratives[C]// Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics. Stroudsburg: ACL, 2019: 4922-4931.
29. PLATANIOS E A, STRETCU O, NEUBIG G, et al. Competence-based curriculum learning for neural machine translation[C]// Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers). Stroudsburg: ACL, 2019: 1162-1172.
30. DEVLIN J, CHANG M W, LEE K, et al. BERT: pre-training of deep bidirectional transformers for language understanding[C]// Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers). Stroudsburg: ACL, 2019: 4171-4186.
31. ZHOU D, BOUSQUET O, LAL T N, et al. Learning with local and global consistency[C]// Proceedings of the 16th International Conference on Neural Information Processing Systems. Cambridge: MIT Press, 2003: 321-328.
32. MUKHERJEE S, AWADALLAH A H. Uncertainty-aware self-training for few-shot text classification[C]// Proceedings of the 34th International Conference on Neural Information Processing Systems. Red Hook, NY: Curran Associates Inc., 2020: 21199-21212.
33. KIPF T N, WELLING M. Semi-supervised classification with graph convolutional networks[EB/OL]. (2017-02-22) [2023-05-18].