Journal of Computer Applications, 2024, Vol. 44, Issue 6: 1734-1742. DOI: 10.11772/j.issn.1001-9081.2023060851
Special Issue: The 38th CCF National Conference of Computer Applications (CCF NCCA 2023)
Xiaoxia JIANG, Ruizhang HUANG, Ruina BAI, Lina REN, Yanping CHEN
Received: 2023-07-04
Revised: 2023-08-03
Accepted: 2023-08-08
Online: 2023-08-23
Published: 2024-06-10
Contact: Ruizhang HUANG
About author: JIANG Xiaoxia, born in 1998, from Anshun, Guizhou, M.S. candidate. Her research interests include natural language processing, text mining, and machine learning.
Xiaoxia JIANG, Ruizhang HUANG, Ruina BAI, Lina REN, Yanping CHEN. Deep event clustering method based on event representation and contrastive learning[J]. Journal of Computer Applications, 2024, 44(6): 1734-1742.
URL: https://www.joca.cn/EN/10.11772/j.issn.1001-9081.2023060851
Split | DuEE sentences | DuEE entities | FewFC sentences | FewFC entities | Military sentences | Military entities | ACE2005 sentences | ACE2005 entities |
---|---|---|---|---|---|---|---|---|
Training set | 6 897 | 21 552 | 4 931 | 20 512 | 4 549 | 19 525 | 1 213 | 4 418 |
Test set | 5 027 | 15 546 | 2 655 | 11 016 | 2 448 | 10 559 | 654 | 2 453 |
Tab. 1 Statistical information of samples and entities in training sets and test sets
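Dataset | Event category | Events | Samples (dataset total) |
---|---|---|---|
DuEE | Product behavior (产品行为) | 894 | 4 882 |
DuEE | Finance/Trading (财经/交易) | 377 | |
DuEE | Organizational behavior (组织行为) | 189 | |
DuEE | Competition behavior (竞赛行为) | 860 | |
DuEE | Organizational relationship (组织关系) | 629 | |
DuEE | Disaster/Accident (灾害/意外) | 337 | |
DuEE | Judicial behavior (司法行为) | 667 | |
DuEE | Interaction (交往) | 203 | |
DuEE | Life (人生) | 726 | |
FewFC | Acquisition (收购) | 388 | 2 535 |
FewFC | Equity/share transfer (股权股份转让) | 133 | |
FewFC | Pledge (质押) | 390 | |
FewFC | Shareholding reduction (减持) | 152 | |
FewFC | Guarantee (担保) | 265 | |
FewFC | Contract signing (签署合同) | 238 | |
FewFC | Judgment (判决) | 323 | |
FewFC | Lawsuit (起诉) | 246 | |
FewFC | Investment (投资) | 93 | |
FewFC | Bid winning (中标) | 307 | |
Military | Display (展示) | 294 | 2 182 |
Military | Exercise (演习) | 1 357 | |
Military | Experiment (实验) | 306 | |
Military | Safeguard (保障) | 20 | |
Military | Accident (意外事故) | 99 | |
Military | Support (支援) | 35 | |
Military | Deployment (部署) | 71 | |
ACE2005 | Transaction | 33 | 580 |
ACE2005 | Personnel | 54 | |
ACE2005 | Conflict | 67 | |
ACE2005 | Contact | 58 | |
ACE2005 | Movement | 156 | |
ACE2005 | Justice | 108 | |
ACE2005 | Business | 45 | |
ACE2005 | Life | 59 | |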
Tab. 2 Statistical information for each type of events
Method | DuEE ACC | DuEE NMI | FewFC ACC | FewFC NMI | Military ACC | Military NMI | ACE2005 ACC | ACE2005 NMI |
---|---|---|---|---|---|---|---|---|
K-means | 41.39 | 40.93 | 39.46 | 38.05 | 34.65 | 20.15 | 32.95 | 20.55 |
DCN | 34.65 | 19.42 | 30.70 | 19.32 | 43.82 | 7.97 | 24.45 | 7.51 |
DEC | 57.28 | 43.18 | 51.85 | 43.18 | 38.13 | 20.49 | 34.88 | |
IDEC | 58.10 | 44.46 | 51.32 | 43.44 | 35.50 | 21.05 | ||
DEKM | 47.56 | 37.19 | 49.20 | 38.07 | 31.27 | 15.88 | 30.65 | 16.25 |
SDCN | 38.01 | 20.77 | 34.21 | 16.12 | 26.62 | 7.51 | ||
DCMSF | 57.37 | 39.23 | 45.66 | 33.40 | 35.99 | 20.58 | 21.83 | |
EDESC | 49.04 | 38.83 | 43.57 | 20.07 | 34.33 | 19.44 | ||
DEC_ERCL | 70.50 | 64.10 | 64.90 | 63.74 | 51.50 | 43.36 | 49.69 | 39.45 |
Tab. 3 Index values of different methods on various datasets
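The ACC and NMI columns in Tab. 3 are standard clustering metrics. The sketch below shows one common way to compute them, under assumptions that are not stated on this page: ACC is taken as accuracy under the best one-to-one mapping of cluster ids to labels (found here by brute force over permutations, which is only practical for a handful of clusters; the Hungarian algorithm is the usual choice at scale), and NMI is normalized by the arithmetic mean of the two entropies. The function names are illustrative, and this is not the authors' evaluation code.

```python
from collections import Counter
from itertools import permutations
from math import log


def clustering_accuracy(y_true, y_pred):
    """Accuracy under the best one-to-one mapping of cluster ids to labels.

    Assumption: brute-force search over mappings, fine for ~8 clusters or fewer.
    """
    labels = sorted(set(y_true))
    clusters = sorted(set(y_pred))
    # Pad with None so every cluster can be assigned something,
    # even when there are more clusters than true labels.
    labels += [None] * max(0, len(clusters) - len(labels))
    pair_counts = Counter(zip(y_pred, y_true))
    best_hits = 0
    for perm in permutations(labels, len(clusters)):
        hits = sum(pair_counts[(c, l)] for c, l in zip(clusters, perm))
        best_hits = max(best_hits, hits)
    return best_hits / len(y_true)


def entropy(values):
    """Shannon entropy (natural log) of a label sequence."""
    n = len(values)
    return -sum((c / n) * log(c / n) for c in Counter(values).values())


def normalized_mutual_info(y_true, y_pred):
    """NMI = I(U;V) / mean(H(U), H(V)); the normalization is an assumption."""
    n = len(y_true)
    true_counts = Counter(y_true)
    pred_counts = Counter(y_pred)
    joint = Counter(zip(y_true, y_pred))
    mi = sum(
        (c / n) * log(c * n / (true_counts[t] * pred_counts[p]))
        for (t, p), c in joint.items()
    )
    denom = (entropy(y_true) + entropy(y_pred)) / 2
    # Both partitions trivial (a single group each): treat as perfect agreement.
    return mi / denom if denom > 0 else 1.0


# Toy labels, not data from the paper.
truth = [0, 0, 1, 1, 2, 2]
pred = [0, 0, 0, 1, 2, 2]
print(f"ACC = {100 * clustering_accuracy(truth, pred):.2f}")
print(f"NMI = {100 * normalized_mutual_info(truth, pred):.2f}")
```

Both functions return values in [0, 1]; the table reports them on a 0 to 100 scale, so the example multiplies by 100 before printing.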
Method | DuEE ACC | DuEE NMI | FewFC ACC | FewFC NMI | Military ACC | Military NMI | ACE2005 ACC | ACE2005 NMI |
---|---|---|---|---|---|---|---|---|
DEC_ERCL | 70.50 | 64.10 | 64.90 | 63.74 | 53.34 | 45.14 | 49.69 | 39.45 |
λt=λr | 61.29 | 50.45 | 50.46 | 44.90 | 36.08 | 20.52 | 48.98 | 36.52 |
w/o cl | 65.69 | 58.95 | 60.36 | 60.08 | 50.47 | 43.40 | 47.49 | 37.61 |
w/o KL | 52.76 | 43.34 | 44.96 | 37.22 | 52.76 | 36.58 | 46.67 | 33.43 |
w/o W2V(Onehot) | 23.86 | 10.42 | 22.37 | 9.58 | 32.29 | 2.65 | 20.34 | 4.03 |
Tab. 4 Results of ablation experiments
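To read the ablation table more easily, the following sketch computes how far each variant's ACC falls below the full DEC_ERCL model on each dataset. The ACC values and variant names are copied verbatim from Tab. 4; the helper name `acc_drop` and the data layout are illustrative assumptions.

```python
# ACC values copied from Tab. 4, in column order.
DATASETS = ["DuEE", "FewFC", "Military", "ACE2005"]
ACC = {
    "DEC_ERCL": [70.50, 64.90, 53.34, 49.69],
    "λt=λr": [61.29, 50.46, 36.08, 48.98],
    "w/o cl": [65.69, 60.36, 50.47, 47.49],
    "w/o KL": [52.76, 44.96, 52.76, 46.67],
    "w/o W2V(Onehot)": [23.86, 22.37, 32.29, 20.34],
}


def acc_drop(variant):
    """ACC of the full model minus ACC of the variant, per dataset."""
    full = ACC["DEC_ERCL"]
    return {
        name: round(f - v, 2)
        for name, f, v in zip(DATASETS, full, ACC[variant])
    }


for variant in ACC:
    if variant != "DEC_ERCL":
        print(variant, acc_drop(variant))
```

A positive number means the variant scores lower than the full model on that dataset.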