Abstract:
To address the one-sidedness of the features used in many Protein-Protein Interaction (PPI) extraction methods, a new approach is proposed that extracts rich features from context information and syntactic structure. Lexical, position, distance, dependency-syntax, and deep-syntax features constitute the feature set, and a Support Vector Machine (SVM) classifier is used for PPI extraction. Experimental evaluation on multiple PPI corpora shows that the rich feature set draws on more comprehensive information and reduces the risk of missing important features. The method achieves state-of-the-art performance relative to comparable evaluations, with a 59.2% F-score and 85.6% Area Under Curve (AUC) on the AImed corpus.
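As a rough illustration of the kind of contextual features the abstract lists, the following minimal sketch builds lexical, position, and distance features for one candidate protein pair and computes an F-score over binary predictions. The function names, the window size, and the feature naming scheme are assumptions made for this example and are not taken from the paper; the dependency and deep-syntax features, which require a parser, and the SVM training step are omitted.

```python
from typing import Dict, List


def context_features(tokens: List[str], i: int, j: int, window: int = 2) -> Dict[str, float]:
    """Hypothetical feature builder for the protein pair at token indices i < j."""
    if not (0 <= i < j < len(tokens)):
        raise ValueError("expected 0 <= i < j < len(tokens)")
    feats: Dict[str, float] = {}
    # Lexical features: words before the first protein, between the pair,
    # and after the second protein (window size is an assumption).
    for w in tokens[max(0, i - window):i]:
        feats[f"before={w.lower()}"] = 1.0
    for w in tokens[i + 1:j]:
        feats[f"between={w.lower()}"] = 1.0
    for w in tokens[j + 1:j + 1 + window]:
        feats[f"after={w.lower()}"] = 1.0
    # Position features: relative position of each protein in the sentence.
    feats["pos_first"] = i / len(tokens)
    feats["pos_second"] = j / len(tokens)
    # Distance feature: number of tokens separating the two proteins.
    feats["distance"] = float(j - i - 1)
    return feats


def f_score(gold: List[int], pred: List[int]) -> float:
    """F-score (harmonic mean of precision and recall) for binary labels 0/1."""
    tp = sum(1 for g, p in zip(gold, pred) if g == 1 and p == 1)
    fp = sum(1 for g, p in zip(gold, pred) if g == 0 and p == 1)
    fn = sum(1 for g, p in zip(gold, pred) if g == 1 and p == 0)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    sentence = "PROT1 interacts with PROT2 in vivo".split()
    print(context_features(sentence, 0, 3))
    print(f_score([1, 1, 0, 0], [1, 0, 1, 0]))
```

Feature dictionaries of this shape could be vectorized and passed to any SVM implementation; which kernel and parameters the authors used is not stated in the abstract.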
WANG Jian, JI Ming-hui, LIN Hong-fei, YANG Zhi-hao. Protein-protein interaction extraction based on contextual and syntactic features[J]. Journal of Computer Applications, 2012, 32(04): 1074-1077.