Journal of Computer Applications, 2022, Vol. 42, Issue (11): 3513-3519. DOI: 10.11772/j.issn.1001-9081.2022010106
Special Issue: CCF 2021 China Digital Service Conference
• ChinaService 2021 •
Rui XIAO, Mingyi LIU, Zhiying TU, Zhongjie WANG
Received: 2022-01-27
Revised: 2022-03-20
Accepted: 2022-04-02
Online: 2022-11-14
Published: 2022-11-10
Contact: Zhongjie WANG
About author: XIAO Rui, born in 1997, a native of Chongqing, M. S. candidate. His research interests include service computing.
Rui XIAO, Mingyi LIU, Zhiying TU, Zhongjie WANG. Personal event detection method based on text mining in social media[J]. Journal of Computer Applications, 2022, 42(11): 3513-3519.
URL: https://www.joca.cn/EN/10.11772/j.issn.1001-9081.2022010106
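Weibo post | Contains a defined event | Event type | Event elements |
---|---|---|---|
赖在梳妆台上不肯走的大尾巴狗 (A big-tailed dog that refuses to leave the dressing table) | No | ― | ― |
昨天看完电影<盲山>,吓得半天没缓过来。 (Finished watching the film "Blind Mountain" yesterday and was too frightened to recover for a long while.) | Yes | audiovisual | (audiovisual)‒(User, 盲山 [Blind Mountain], 昨天 [yesterday], ‒, past) |
老爸生日,我们今天吃到了各种菌类:羊肚菌、黑松露 (Dad's birthday; today we got to eat all kinds of fungi: morels and black truffles) | Yes | birth, eat | (birth)‒(老爸 [Dad], 生日 [birthday], ‒, ‒, present) (eat)‒(我们 [we], 羊肚菌、黑松露 [morels, black truffles], 今天 [today], ‒, present) |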
Tab. 1 Examples of event extraction from Weibo text
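The event-element tuples in Tab. 1 can be read as a small record type. The sketch below is one possible representation, not the authors' implementation. It assumes the five slots are subject, object, time, place and tense; these field names do not appear on this page and are inferred from the Sub/Obj/Time/Place columns of Tab. 5 and the past/present value in the last slot. `PersonalEvent` and `parse_event` are hypothetical names.

```python
from dataclasses import dataclass
from typing import Optional

# Dash-like characters that may mark an empty slot or join type and elements.
DASHES = "-\u2012\u2013\u2014\u2015"


@dataclass(frozen=True)
class PersonalEvent:
    """One extracted event. Field names are illustrative assumptions."""
    event_type: str
    subject: Optional[str]
    obj: Optional[str]
    time: Optional[str]
    place: Optional[str]
    tense: Optional[str]


def parse_event(text: str) -> PersonalEvent:
    """Parse '(type)-(subject, object, time, place, tense)' into a record."""
    type_part, _, rest = text.partition(")")
    event_type = type_part.strip().lstrip("(").strip()
    # Drop the joining dash and the parentheses around the element list.
    inner = rest.strip().lstrip(DASHES).strip().strip("()")
    slots = [s.strip() for s in inner.split(",")]
    if len(slots) != 5:
        raise ValueError(f"expected 5 slots, got {len(slots)}: {text!r}")
    # A lone dash (or an empty string) means the slot is unfilled.
    cleaned = [None if (not s or s in set(DASHES)) else s for s in slots]
    return PersonalEvent(event_type, *cleaned)


event = parse_event("(audiovisual)\u2012(User, 盲山, 昨天,\u2012, 过去)")
print(event.event_type, event.obj, event.place)
```

A post with several events, such as the third row of Tab. 1, would simply yield one `PersonalEvent` per tuple.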
Model | ||||
---|---|---|---|---|
logistic regression | 75.16 | 73.86 | 56.32 | 63.91 |
naive Bayes | 67.83 | 63.45 | 65.33 | 64.38 |
BERT+FullConnect | 86.53 | 80.25 | 80.74 | 80.49 |
BERT+BiLSTM | 87.51 | 83.01 | 80.43 | 81.70 |
BERT+BiLSTM+Attention | 88.63 | 80.18 | 85.42 | 82.71 |
Tab. 2 Personal event detection results
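The metric names in the header of Tab. 2 did not survive on this page. As a sanity check, the sketch below tests whether the last three numeric columns behave like precision, recall and F1, that is, whether the fourth value equals the harmonic mean of the second and third. Reading the columns this way is an assumption; the first numeric column is left uninterpreted. `f1_from_pr` and `max_f1_gap` are hypothetical helper names.

```python
def f1_from_pr(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (both given in percent)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Last three numeric columns of Tab. 2, read as (precision, recall, F1).
# This reading is an assumption: the page does not name the columns.
TAB2 = {
    "logistic regression": (73.86, 56.32, 63.91),
    "naive Bayes": (63.45, 65.33, 64.38),
    "BERT+FullConnect": (80.25, 80.74, 80.49),
    "BERT+BiLSTM": (83.01, 80.43, 81.70),
    "BERT+BiLSTM+Attention": (80.18, 85.42, 82.71),
}


def max_f1_gap(rows: dict) -> float:
    """Largest absolute gap between the reported and the recomputed F1."""
    return max(abs(f1_from_pr(p, r) - f1) for p, r, f1 in rows.values())


for model, (p, r, f1) in TAB2.items():
    print(f"{model}: reported {f1:.2f}, recomputed {f1_from_pr(p, r):.2f}")
```

Every row agrees to within rounding of the two-decimal inputs, which supports the precision/recall/F1 reading without confirming it.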
Model | |||
---|---|---|---|
KNN | 86.21 | 46.34 | 60.28 |
random forest | 83.46 | 48.72 | 61.52 |
decision tree | 73.10 | 61.63 | 66.59 |
BERT+BiLSTM | 88.05 | 88.81 | 88.08 |
BERT+BiLSTM+Attention | 86.54 | 90.26 | 88.09 |
BERT+FullConnect | 91.18 | 92.88 | 91.67 |
Tab. 3 Personal event multi-label classification results
Model | |||
---|---|---|---|
CRF | 52.72 | 15.65 | 24.13 |
BiLSTM+CRF | 67.83 | 47.67 | 55.99 |
BERT+FullConnect | 77.13 | 73.23 | 74.90 |
BERT+CRF | 78.60 | 75.50 | 76.92 |
BERT+BiLSTM+CRF | 80.41 | 74.75 | 77.38 |
Tab. 4 Personal event element extraction results on word level
Model | Overall | Sub | Obj | Time | Place |
---|---|---|---|---|---|
CRF | 18.57 | 21.88 | 13.63 | 31.32 | 8.62 |
BiLSTM+CRF | 52.81 | 49.33 | 53.94 | 55.12 | 47.24 |
BERT+FullConnect | 68.12 | 64.43 | 70.05 | 62.34 | 66.08 |
BERT+CRF | 68.44 | 66.03 | 70.92 | 62.52 | 68.43 |
BERT+BiLSTM+CRF | 69.46 | 66.74 | 70.76 | 65.77 | 70.53 |
Tab. 5 Personal event element extraction results on entity level
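Tab. 4 and Tab. 5 score the same extraction models at word level and at entity level, and the entity-level figures are consistently lower. The sketch below illustrates one common way the two granularities differ. It assumes BIO tagging and exact-span matching, neither of which is stated on this page, and all function names are hypothetical.

```python
from typing import List, Set, Tuple

Span = Tuple[int, int, str]  # (start, end_exclusive, label)


def bio_spans(tags: List[str]) -> Set[Span]:
    """Collect (start, end, label) spans from a BIO tag sequence."""
    spans: Set[Span] = set()
    start, label = None, None
    for i, tag in enumerate(tags + ["O"]):  # sentinel flushes the last span
        inside = tag.startswith("I-") and label == tag[2:]
        if start is not None and not inside:
            spans.add((start, i, label))
            start, label = None, None
        if tag.startswith("B-"):
            start, label = i, tag[2:]
    return spans


def prf(n_correct: int, n_pred: int, n_gold: int) -> Tuple[float, float, float]:
    """Precision, recall and F1 from raw counts."""
    p = n_correct / n_pred if n_pred else 0.0
    r = n_correct / n_gold if n_gold else 0.0
    f1 = 2 * p * r / (p + r) if p + r else 0.0
    return p, r, f1


def word_level(gold: List[str], pred: List[str]) -> Tuple[float, float, float]:
    """Score each non-O tag independently."""
    correct = sum(g == p and g != "O" for g, p in zip(gold, pred))
    return prf(correct, sum(t != "O" for t in pred), sum(t != "O" for t in gold))


def entity_level(gold: List[str], pred: List[str]) -> Tuple[float, float, float]:
    """Score whole spans; a span counts only if boundaries and label match."""
    g, p = bio_spans(gold), bio_spans(pred)
    return prf(len(g & p), len(p), len(g))


gold = ["B-Obj", "I-Obj", "I-Obj", "O", "B-Time"]
pred = ["B-Obj", "I-Obj", "O", "O", "B-Time"]
print(word_level(gold, pred))
print(entity_level(gold, pred))
```

In this toy case a single missed token costs one word-level hit but invalidates the whole Obj span at entity level, which is the kind of effect that would explain the gap between the two tables.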
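Stage 1 model | | Stage 2 model | | Stage 3 model | |
---|---|---|---|---|---|
BERT+BiLSTM | 82.05 | BERT+BiLSTM+Attention | 79.61 | BERT+CRF | 69.82 |
BERT+BiLSTM | 82.05 | BERT+BiLSTM+Attention | 79.61 | BERT+BiLSTM+CRF | 71.87 |
BERT+BiLSTM | 82.05 | BERT+FullConnect | 81.12 | BERT+CRF | 70.61 |
BERT+BiLSTM | 82.05 | BERT+FullConnect | 81.12 | BERT+BiLSTM+CRF | 72.49 |
BERT+BiLSTM+Attention | 83.72 | BERT+BiLSTM+Attention | 80.08 | BERT+CRF | 70.30 |
BERT+BiLSTM+Attention | 83.72 | BERT+BiLSTM+Attention | 80.08 | BERT+BiLSTM+CRF | 72.13 |
BERT+BiLSTM+Attention | 83.72 | BERT+FullConnect | 82.78 | BERT+CRF | 70.43 |
BERT+BiLSTM+Attention | 83.72 | BERT+FullConnect | 82.78 | BERT+BiLSTM+CRF | 73.41 |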
Tab. 6 Extraction results of the whole pipeline system
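Tab. 6 lists eight three-stage combinations. As a reading aid, the sketch below stores each combination as one row and picks the one with the highest stage-3 score. It assumes that the merged cells of the table repeat the stage-1 and stage-2 entries above them and that a higher score is better; the metric itself is not named on this page. `best_pipeline` is a hypothetical helper.

```python
from typing import List, Tuple

Row = Tuple[str, float, str, float, str, float]

# (stage-1 model, score, stage-2 model, score, stage-3 model, score), from Tab. 6.
TAB6: List[Row] = [
    ("BERT+BiLSTM", 82.05, "BERT+BiLSTM+Attention", 79.61, "BERT+CRF", 69.82),
    ("BERT+BiLSTM", 82.05, "BERT+BiLSTM+Attention", 79.61, "BERT+BiLSTM+CRF", 71.87),
    ("BERT+BiLSTM", 82.05, "BERT+FullConnect", 81.12, "BERT+CRF", 70.61),
    ("BERT+BiLSTM", 82.05, "BERT+FullConnect", 81.12, "BERT+BiLSTM+CRF", 72.49),
    ("BERT+BiLSTM+Attention", 83.72, "BERT+BiLSTM+Attention", 80.08, "BERT+CRF", 70.30),
    ("BERT+BiLSTM+Attention", 83.72, "BERT+BiLSTM+Attention", 80.08, "BERT+BiLSTM+CRF", 72.13),
    ("BERT+BiLSTM+Attention", 83.72, "BERT+FullConnect", 82.78, "BERT+CRF", 70.43),
    ("BERT+BiLSTM+Attention", 83.72, "BERT+FullConnect", 82.78, "BERT+BiLSTM+CRF", 73.41),
]


def best_pipeline(rows: List[Row]) -> Row:
    """Return the combination with the highest final (stage-3) score."""
    return max(rows, key=lambda row: row[5])


s1, _, s2, _, s3, final = best_pipeline(TAB6)
print(f"{s1} -> {s2} -> {s3}: {final:.2f}")
```

Under these assumptions the best end-to-end combination is BERT+BiLSTM+Attention, then BERT+FullConnect, then BERT+BiLSTM+CRF, which matches the per-stage leaders on the last column of Tab. 2, Tab. 3 and Tab. 4.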