Journal of Computer Applications

• Database Technology • Previous Articles     Next Articles

Extraction of Chinese term based on Chi-square test

<a href="http://www.joca.cn/EN/article/advancedSearchResult.do?searchSQL=((([Author]) AND 1[Journal]) AND year[Order])" target="_blank"></a>W<a href="http://www.joca.cn/EN/article/advancedSearchResult.do?searchSQL=((([Author]) AND 1[Journal]) AND year[Order])" target="_blank"></a>e<a href="http://www.joca.cn/EN/article/advancedSearchResult.do?searchSQL=((([Author]) AND 1[Journal]) AND year[Order])" target="_blank"></a>n<a href="http://www.joca.cn/EN/article/advancedSearchResult.do?searchSQL=((([Author]) AND 1[Journal]) AND year[Order])" target="_blank"></a><a href="http://www.joca.cn/EN/article/advancedSearchResult.do?searchSQL=((([Author]) AND 1[Journal]) AND year[Order])" target="_blank"></a>m<a href="http://www.joca.cn/EN/article/advancedSearchResult.do?searchSQL=((([Author]) AND 1[Journal]) AND year[Order])" target="_blank"></a>i<a href="http://www.joca.cn/EN/article/advancedSearchResult.do?searchSQL=((([Author]) AND 1[Journal]) AND year[Order])" target="_blank"></a>n<a href="http://www.joca.cn/EN/article/advancedSearchResult.do?searchSQL=((([Author]) AND 1[Journal]) AND year[Order])" target="_blank"></a> <a href="http://www.joca.cn/EN/article/advancedSearchResult.do?searchSQL=((([Author]) AND 1[Journal]) AND year[Order])" target="_blank"></a>H<a href="http://www.joca.cn/EN/article/advancedSearchResult.do?searchSQL=((([Author]) AND 1[Journal]) AND year[Order])" target="_blank"></a>U<a href="http://www.joca.cn/EN/article/advancedSearchResult.do?searchSQL=((([Author]) AND 1[Journal]) AND year[Order])" target="_blank"></a> <a href="http://www.joca.cn/EN/article/advancedSearchResult.do?searchSQL=((([Author]) AND 1[Journal]) AND year[Order])" target="_blank"></a>T<a href="http://www.joca.cn/EN/article/advancedSearchResult.do?searchSQL=((([Author]) AND 1[Journal]) AND year[Order])" target="_blank"></a>i<a href="http://www.joca.cn/EN/article/advancedSearchResult.do?searchSQL=((([Author]) AND 1[Journal]) AND year[Order])" target="_blank"></a>n<a href="http://www.joca.cn/EN/article/advancedSearchResult.do?searchSQL=((([Author]) AND 1[Journal]) AND year[Order])" target="_blank"></a>g<a href="http://www.joca.cn/EN/article/advancedSearchResult.do?searchSQL=((([Author]) AND 1[Journal]) AND year[Order])" target="_blank"></a>-<a href="http://www.joca.cn/EN/article/advancedSearchResult.do?searchSQL=((([Author]) AND 1[Journal]) AND year[Order])" target="_blank"></a>t<a href="http://www.joca.cn/EN/article/advancedSearchResult.do?searchSQL=((([Author]) AND 1[Journal]) AND year[Order])" target="_blank"></a>i<a href="http://www.joca.cn/EN/article/advancedSearchResult.do?searchSQL=((([Author]) AND 1[Journal]) AND year[Order])" target="_blank"></a>n<a href="http://www.joca.cn/EN/article/advancedSearchResult.do?searchSQL=((([Author]) AND 1[Journal]) AND year[Order])" target="_blank"></a>g<a href="http://www.joca.cn/EN/article/advancedSearchResult.do?searchSQL=((([Author]) AND 1[Journal]) AND year[Order])" target="_blank"></a> <a href="http://www.joca.cn/EN/article/advancedSearchResult.do?searchSQL=((([Author]) AND 1[Journal]) AND year[Order])" target="_blank"></a>H<a href="http://www.joca.cn/EN/article/advancedSearchResult.do?searchSQL=((([Author]) AND 1[Journal]) AND year[Order])" target="_blank"></a>E<a href="http://www.joca.cn/EN/article/advancedSearchResult.do?searchSQL=((([Author]) AND 1[Journal]) AND year[Order])" target="_blank"></a> <a href="http://www.joca.cn/EN/article/advancedSearchResult.do?searchSQL=((([Author]) AND 1[Journal]) AND year[Order])" target="_blank"></a>Y<a href="http://www.joca.cn/EN/article/advancedSearchResult.do?searchSQL=((([Author]) AND 1[Journal]) AND year[Order])" target="_blank"></a>o<a href="http://www.joca.cn/EN/article/advancedSearchResult.do?searchSQL=((([Author]) AND 1[Journal]) AND year[Order])" target="_blank"></a>n<a href="http://www.joca.cn/EN/article/advancedSearchResult.do?searchSQL=((([Author]) AND 1[Journal]) AND year[Order])" target="_blank"></a>g<a href="http://www.joca.cn/EN/article/advancedSearchResult.do?searchSQL=((([Author]) AND 1[Journal]) AND year[Order])" target="_blank"></a> <a href="http://www.joca.cn/EN/article/advancedSearchResult.do?searchSQL=((([Author]) AND 1[Journal]) AND year[Order])" target="_blank"></a>Z<a href="http://www.joca.cn/EN/article/advancedSearchResult.do?searchSQL=((([Author]) AND 1[Journal]) AND year[Order])" target="_blank"></a>H<a href="http://www.joca.cn/EN/article/advancedSearchResult.do?searchSQL=((([Author]) AND 1[Journal]) AND year[Order])" target="_blank"></a>A<a href="http://www.joca.cn/EN/article/advancedSearchResult.do?searchSQL=((([Author]) AND 1[Journal]) AND year[Order])" target="_blank"></a>N<a href="http://www.joca.cn/EN/article/advancedSearchResult.do?searchSQL=((([Author]) AND 1[Journal]) AND year[Order])" target="_blank"></a>G<a href="http://www.joca.cn/EN/article/advancedSearchResult.do?searchSQL=((([Author]) AND 1[Journal]) AND year[Order])" target="_blank"></a>   

  • Received:2007-06-25 Revised:2007-08-20 Online:2007-12-01 Published:2007-12-01
  • Contact: Wen-min HU

基于卡方检验的汉语术语抽取

胡文敏 何婷婷 张勇   

  1. 华中师范大学计算机科学系;国家语言资源监测与研究中心网络媒体分中心 华中师范大学计算机科学系;国家语言资源监测与研究中心网络媒体分中心 华中师范大学计算机科学系;国家语言资源监测与研究中心网络媒体分中心
  • 通讯作者: 胡文敏

Abstract: Discovering the term has very important applications in Chinese information processing and language learning. A method for the extraction of Chinese term based on Chi-square test was proposed. First, download Web documents and build a corpus, then prime words were extracted by using the F-MI parameter improved by mutual-information,while combined words were extracted by the Chi-square test with the help of decomposition of prime string. The experiments show that the algorithm can effectively improve the precision in the extraction of Chinese term.

Key words: Chi-square test, decomposition of prime string, mutual information

摘要: 发现术语在中文信息处理和语言学习方面具有非常重要的作用和意义。提出了一种基于卡方检验的汉语术语抽取方法:先从网络上下载语料,然后使用改进的互信息参数(F-MI)抽取结构简单的质串,并在其基础上进一步使用卡方检验结合质子串分解方法抽取具有复杂结构的合串。实验结果显示,该算法有效地提高了汉语术语抽取的精确度。

关键词: 卡方检验, 质子串分解, 互信息

CLC Number: