Journal of Computer Applications, 2011, Vol. 31, Issue (04): 1070-1073. DOI: 10.3724/SP.J.1087.2011.01070
• Artificial intelligence •
Shi-chao CHEN,Bin YU
陈士超,郁滨
Abstract: To reduce the impact of problems inherent in the mutual information method on term filtering, a dual-threshold mutual information filtering method is proposed. A threshold determination algorithm based on a local evaluation indicator is also given; through data sampling, statistics, and computation, it determines the optimal upper and lower thresholds quickly and accurately. Unlike single-threshold mutual information filtering, the proposed method filters and extracts candidate terms by setting two thresholds, without changing the formula used to compute mutual information. Experimental results show that, under the same conditions, the proposed method significantly improves precision and F-measure.
Key words: term extraction, term filtration, mutual information, threshold, evaluating indicator
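The abstract does not spell out the filtering rule or the threshold search, so the following is only a minimal sketch under stated assumptions. It assumes that mutual information is computed as pointwise mutual information from co-occurrence counts, that a candidate is kept when its score lies between the lower and upper thresholds, and that the thresholds are chosen by an exhaustive search maximizing F-measure on a small hand-labeled sample. All function names (`pmi`, `dual_threshold_filter`, `f_measure`, `pick_thresholds`) are hypothetical and do not come from the paper.

```python
import math
from itertools import product


def pmi(pair_count, left_count, right_count, total):
    """Pointwise mutual information log2(P(x, y) / (P(x) * P(y))) from raw counts."""
    return math.log2(pair_count * total / (left_count * right_count))


def dual_threshold_filter(scores, lower, upper):
    """Keep candidates whose score lies in [lower, upper].

    Assumption: the paper's exact acceptance rule is not given in the
    abstract; a closed interval is used here for illustration only.
    """
    return {cand for cand, score in scores.items() if lower <= score <= upper}


def f_measure(selected, gold):
    """Balanced F-measure of a selected set against a gold-standard set."""
    true_pos = len(selected & gold)
    if true_pos == 0:
        return 0.0
    precision = true_pos / len(selected)
    recall = true_pos / len(gold)
    return 2 * precision * recall / (precision + recall)


def pick_thresholds(sample_scores, sample_gold):
    """Search observed score values for the (lower, upper) pair with the best F.

    Assumption: this brute-force search stands in for the paper's
    threshold determination algorithm, which is not described here.
    """
    values = sorted(set(sample_scores.values()))
    best_f, best_lower, best_upper = 0.0, values[0], values[-1]
    for lower, upper in product(values, repeat=2):
        if lower > upper:
            continue
        kept = dual_threshold_filter(sample_scores, lower, upper)
        score = f_measure(kept, sample_gold)
        if score > best_f:
            best_f, best_lower, best_upper = score, lower, upper
    return best_lower, best_upper


# Toy labeled sample: candidates "b" and "c" are true terms.
sample_scores = {"a": 0.5, "b": 2.0, "c": 3.0, "d": 9.0}
sample_gold = {"b", "c"}
lower, upper = pick_thresholds(sample_scores, sample_gold)
kept = dual_threshold_filter(sample_scores, lower, upper)
```

In this toy sample a single lower threshold cannot exclude the high-scoring non-term "d" without also dropping true terms, whereas an upper bound can; whether this matches the motivation in the paper is an assumption.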
CLC Number: TP182
Shi-chao CHEN, Bin YU. Method of mutual information filtration with dual-threshold for term extraction[J]. Journal of Computer Applications, 2011, 31(04): 1070-1073.
URL: https://www.joca.cn/EN/10.3724/SP.J.1087.2011.01070
https://www.joca.cn/EN/Y2011/V31/I04/1070