Chinese cross document co-reference resolution based on SVM classification and semantics
ZHAO Zhiwei GU Jinghang HU Yanan QIAN Longhua ZHOU Guodong
Journal of Computer Applications
2013, 33 (04): 984-987.
DOI: 10.3724/SP.J.1087.2013.00984
Cross-Document Co-reference Resolution (CDCR) aims to merge words that are distributed across different texts but refer to the same entity into co-reference chains. Traditional research on CDCR uses clustering methods to address the name disambiguation problem posed in information retrieval. This paper recasts CDCR as a classification problem, using a Support Vector Machine (SVM) classifier to resolve both name disambiguation and variant consolidation, both of which are prevalent in information extraction. The method can effectively integrate various features, such as morphological, phonetic, and semantic knowledge collected from the corpus and the Internet. Experiments on a Chinese cross-document co-reference corpus show that the classification method outperforms clustering methods in both precision and recall.
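The classification framing can be illustrated with a minimal sketch: each pair of mentions receives a same-entity decision, and positive pairs are merged into chains. Everything here is an assumption made for illustration and not the paper's implementation: `same_entity` is a hand-written stand-in for a trained SVM, the two features, their weights, and the threshold are hypothetical, and merging by transitive closure (union-find) is one plausible way to form chains that the abstract does not specify.

```python
from itertools import combinations


def jaccard(a: set, b: set) -> float:
    """Jaccard overlap of two sets; 0.0 when both are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def same_entity(m1: dict, m2: dict) -> bool:
    """Hypothetical stand-in for a trained SVM's pairwise decision."""
    name_sim = jaccard(set(m1["name"]), set(m2["name"]))  # crude morphological cue
    ctx_sim = jaccard(m1["context"], m2["context"])       # crude semantic cue
    return 0.3 * name_sim + 0.7 * ctx_sim >= 0.4          # assumed weights and threshold


def build_chains(mentions: list) -> list:
    """Merge positively classified pairs into chains with union-find."""
    parent = list(range(len(mentions)))

    def find(i: int) -> int:
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    for i, j in combinations(range(len(mentions)), 2):
        if same_entity(mentions[i], mentions[j]):
            parent[find(i)] = find(j)

    chains = {}
    for i in range(len(mentions)):
        chains.setdefault(find(i), []).append(i)
    return sorted(chains.values())


# Toy mentions from three hypothetical documents.
mentions = [
    {"name": "李明", "context": {"大学", "教授", "论文"}},
    {"name": "李明教授", "context": {"大学", "论文", "实验室"}},
    {"name": "李明", "context": {"足球", "比赛", "球队"}},
]

print(build_chains(mentions))
```

In this toy input, the first two mentions differ in surface form but share context (variant consolidation), while the third shares the name but not the context (name disambiguation). A real system would replace `same_entity` with a classifier trained on labeled mention pairs.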