Journal of Computer Applications, 2020, Vol. 40, Issue (2): 473-478. DOI: 10.11772/j.issn.1001-9081.2019101768
• CCF NDBC 2019 •
Yang LI1, Wei ZHANG1, Chen PENG2
Received: 2019-09-18
Revised: 2019-10-18
Accepted: 2019-10-24
Online: 2019-10-31
Published: 2020-02-10
Contact: Wei ZHANG
About author: LI Yang, born in 1994, native of Yuncheng, Shanxi, M. S. candidate. His research interests include data mining.
Yang LI, Wei ZHANG, Chen PENG. Target-dependent method for authorship attribution[J]. Journal of Computer Applications, 2020, 40(2): 473-478.
URL: https://www.joca.cn/EN/10.11772/j.issn.1001-9081.2019101768
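| Symbol | Description | Symbol | Description |
|---|---|---|---|
| | Maximum text length | | Word embedding table |
| | Number of convolution kernels | | Document vector |
| | Activation function | | Product ID embedding table |
| | Two-dimensional convolution operation | | Product ID vector |
| | Word embedding dimension | | |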
Tab. 1 Symbol definition
| Dataset | Number of products | Number of users | Reviews per user | Reviews per product | Total reviews |
|---|---|---|---|---|---|
| Movie reviews | 250 | 610 | 37.37 | 91.17 | 22 793 |
| CD reviews | 600 | 800 | 51.27 | 38.45 | 30 763 |
Tab. 2 Dataset statistics
| Name | Number of layers | Value |
|---|---|---|
| Maximum length L | - | 1 000 |
| Vector dimension d | - | 300 |
| Convolution | 3 | |
| Fully connected | 1 | # of classes |
Tab. 3 Neural network architecture and hyperparameters
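The layer list in Tab. 3 can be read as a text CNN: an L × d matrix of word vectors is convolved with several kernels, passed through an activation function, and pooled into a document vector that feeds the fully connected layer. The following is a minimal pure-Python sketch of that reading, not the authors' implementation. The kernel height `k`, the ReLU activation, max-over-time pooling, and the function names are assumptions made for illustration; the convolution row in Tab. 3 has no value, so the kernel sizes used in the paper are not reproduced here.

```python
def relu(x):
    """Assumed activation function (the paper's choice is not shown here)."""
    return x if x > 0.0 else 0.0


def conv_max_pool(embedded, kernel, bias=0.0):
    """Slide one k x d kernel over an L x d text matrix.

    Returns the largest activated response over all positions
    (max-over-time pooling), i.e. one entry of the document vector.
    """
    k = len(kernel)
    best = 0.0  # ReLU outputs are never negative
    for start in range(len(embedded) - k + 1):
        response = bias
        for offset in range(k):
            row = embedded[start + offset]
            weights = kernel[offset]
            response += sum(a * b for a, b in zip(row, weights))
        best = max(best, relu(response))
    return best


def document_vector(embedded, kernels):
    """One pooled feature per kernel; the result feeds the classifier."""
    return [conv_max_pool(embedded, kernel) for kernel in kernels]


# Toy input: L = 3 words, d = 2 dimensions, two kernels of height k = 2.
toy_text = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
toy_kernels = [
    [[1.0, 1.0], [1.0, 1.0]],
    [[-1.0, -1.0], [-1.0, -1.0]],
]
print(document_vector(toy_text, toy_kernels))  # [3.0, 0.0]
```

With the values in Tab. 3, `embedded` would have 1 000 rows of 300 values each, and the document vector would have one entry per convolution kernel.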
| Method | Movie reviews Acc | Movie reviews Rmacro | Movie reviews F1macro | CD reviews Acc | CD reviews Rmacro | CD reviews F1macro |
|---|---|---|---|---|---|---|
| CNN-2 | 0.519 | 0.411 | 0.415 | 0.683 | 0.581 | 0.579 |
| LSTM-1 | 0.363 | 0.262 | 0.259 | 0.464 | 0.362 | 0.363 |
| SVM | 0.452 | 0.354 | 0.351 | 0.619 | 0.523 | 0.521 |
| RF | 0.307 | 0.209 | 0.205 | 0.492 | 0.401 | 0.399 |
| Syntax-CNN | 0.505 | 0.401 | 0.405 | 0.656 | 0.566 | 0.565 |
| LDA-S | 0.285 | 0.188 | 0.186 | 0.349 | 0.251 | 0.252 |
| CNN product | 0.018 | 0.006 | 0.003 | 0.012 | 0.003 | 0.004 |
| Early fusion | 0.556 | 0.449 | 0.443 | 0.708 | 0.612 | 0.608 |
| Late fusion | 0.569 | 0.467 | 0.465 | 0.725 | 0.621 | 0.622 |
Tab. 4 Comparison of evaluation results of different methods on two datasets
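Tab. 4 reports accuracy (Acc), macro-averaged recall (Rmacro) and macro-averaged F1 (F1macro). As a reminder of how such scores are usually obtained, the sketch below computes them from true and predicted author labels. It assumes the common convention in which macro-F1 is the unweighted mean of per-class F1 scores; the page does not state which convention the paper uses, and the function name `macro_metrics` is illustrative.

```python
def macro_metrics(y_true, y_pred):
    """Return (accuracy, macro recall, macro F1) for two label lists."""
    pairs = list(zip(y_true, y_pred))
    accuracy = sum(1 for t, p in pairs if t == p) / len(pairs)

    recalls, f1_scores = [], []
    for label in sorted(set(y_true)):
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        recall = tp / (tp + fn) if tp + fn else 0.0
        precision = tp / (tp + fp) if tp + fp else 0.0
        if precision + recall:
            f1 = 2 * precision * recall / (precision + recall)
        else:
            f1 = 0.0
        recalls.append(recall)
        f1_scores.append(f1)

    # Macro averaging: every author counts equally, however many reviews
    # that author wrote.
    macro_recall = sum(recalls) / len(recalls)
    macro_f1 = sum(f1_scores) / len(f1_scores)
    return accuracy, macro_recall, macro_f1


# Toy example with three authors.
true_authors = ["a", "a", "b", "b", "c", "c"]
pred_authors = ["a", "b", "b", "b", "c", "a"]
acc, r_macro, f1_macro = macro_metrics(true_authors, pred_authors)
print(round(acc, 3), round(r_macro, 3), round(f1_macro, 3))  # 0.667 0.667 0.656
```

Because macro averages weight every author equally, they can sit well below accuracy when some authors have few reviews, which matches the pattern in Tab. 4 where Rmacro and F1macro are lower than Acc for every method.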
| Method | Movie reviews | CD reviews |
|---|---|---|
| CNN-2 | 0.519 | 0.682 |
| Early fusion | 0.522 | 0.686 |
| Late fusion | 0.540 | 0.706 |
Tab. 5 Impact of target-dependence information on Acc based on n-gram feature
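The tables compare an early-fusion and a late-fusion variant of the target-dependent model. This page does not describe how either variant is wired, so the sketch below only illustrates the usual meaning of the two terms: early fusion joins the document vector and the product ID vector before a single classifier, while late fusion runs one classifier per input and combines their class probabilities. The linear classifiers, the mixing weight `alpha`, and all function names are assumptions, not the paper's architecture.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of scores."""
    top = max(logits)
    exps = [math.exp(v - top) for v in logits]
    total = sum(exps)
    return [v / total for v in exps]


def linear(weights, bias, x):
    """One fully connected layer: one weight row and one bias per class."""
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weights, bias)]


def early_fusion(doc_vec, product_vec, weights, bias):
    """Concatenate the two feature vectors, then classify once."""
    return softmax(linear(weights, bias, doc_vec + product_vec))


def late_fusion(doc_vec, product_vec, w_doc, b_doc, w_prod, b_prod, alpha=0.5):
    """Classify each input separately, then mix the class probabilities."""
    p_doc = softmax(linear(w_doc, b_doc, doc_vec))
    p_prod = softmax(linear(w_prod, b_prod, product_vec))
    return [alpha * a + (1.0 - alpha) * b for a, b in zip(p_doc, p_prod)]


# Toy example: 2 candidate authors, 2-dim document vector, 1-dim product vector.
doc, product = [1.0, 2.0], [3.0]
print(early_fusion(doc, product, [[1.0, 0.0, 0.0], [0.0, 0.0, 1.0]], [0.0, 0.0]))
print(late_fusion(doc, product,
                  [[1.0, 0.0], [0.0, 1.0]], [0.0, 0.0],
                  [[0.5], [-0.5]], [0.0, 0.0]))
```

In both variants the output is a probability distribution over candidate authors; the difference lies only in where the product information enters.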
| Method | Movie reviews | CD reviews |
|---|---|---|
| CNN-2 | 0.548 | 0.703 |
| Early fusion | 0.554 | 0.710 |
| Late fusion | 0.568 | 0.725 |
Tab. 6 Impact of target-dependence information on Acc based on pre-trained feature
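To make the size of the effect in Tab. 5 and Tab. 6 easier to read, the short script below subtracts the CNN-2 accuracy from each fusion variant. The numbers are copied from the two tables; the dictionary layout and the helper name `gains_over_baseline` are illustrative only.

```python
# Accuracy as (movie reviews, CD reviews), copied from Tab. 5 and Tab. 6.
ACCURACY = {
    "n-gram (Tab. 5)": {
        "CNN-2": (0.519, 0.682),
        "Early fusion": (0.522, 0.686),
        "Late fusion": (0.540, 0.706),
    },
    "pre-trained (Tab. 6)": {
        "CNN-2": (0.548, 0.703),
        "Early fusion": (0.554, 0.710),
        "Late fusion": (0.568, 0.725),
    },
}


def gains_over_baseline(table, baseline="CNN-2"):
    """Absolute accuracy gain of each method over the baseline row."""
    base = table[baseline]
    return {
        method: tuple(round(value - ref, 3) for value, ref in zip(values, base))
        for method, values in table.items()
        if method != baseline
    }


for feature, table in ACCURACY.items():
    print(feature, gains_over_baseline(table))
```

Under both feature types, late fusion improves accuracy over CNN-2 by roughly 0.02 on each dataset (0.021 and 0.024 with n-gram features, 0.020 and 0.022 with pre-trained features), while early fusion improves it by less than 0.01.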