Journal of Computer Applications, 2005, Vol. 25, Issue (05): 1039-1041. DOI: 10.3724/SP.J.1087.2005.1039
• Artificial intelligence and simulation •
ZHU Xu-ju,YANG Jian-gang
朱旭巨,杨建刚
Abstract: BCF (Blank Count Feature), a new feature for constructing projection profiles, was proposed for Chinese text analysis. BVSA (BCF Vector Smoothing Algorithm) was applied to the projection profile as a preprocessing step before analysis. After BVSA, the projection profile of Chinese text showed an important phenomenon: the central convergence of the BCF vector. Statistical experiments showed that this phenomenon is common and stable, and that it persists when the font style, the font size, or even the writing style (printed or handwritten) changes. A text line extraction algorithm based on this phenomenon achieved good results.
Key words: Chinese text analysis, projection profile, BCF, BVSA
摘要: 针对汉字文本分析,提出了一种新的文本特征———空白线特征(BCF)来进行文本投影轮廓生成。在对生成的投影轮廓进行分析之前,应用BCF矢量平滑算法(BVSA)对它进行预处理。处理后的投影轮廓揭示了汉字文本的一个重要现象,就是BCF矢量中间聚集现象。通过统计实验验证,这是一个稳定的现象,也就是说,不同字体、不同字号、印刷体和手写体等等文本风格的不同,都不影响汉字文本的BCF矢量中间聚集现象。应用这个现象对汉字文本进行行分离,取得了良好效果。
关键词: 汉字文本分析, 投影轮廓, BCF, BCF矢量平滑算法
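The abstract does not define BCF or BVSA in detail, so the following is only a rough sketch of the general pipeline it describes (build a projection profile, smooth it, then split text lines). It assumes, purely for illustration, that the profile is the number of blank pixels in each pixel row of a binarized page, and it substitutes a plain moving average for BVSA. The function names `blank_count_profile`, `smooth`, and `extract_lines` and the `blank_ratio` threshold are hypothetical and do not come from the paper.

```python
from typing import List, Sequence, Tuple


def blank_count_profile(image: Sequence[Sequence[int]]) -> List[int]:
    """Count blank pixels (value 0) in each row; 1 marks ink.

    Assumption: this per-row blank count stands in for the paper's BCF.
    """
    return [sum(1 for px in row if px == 0) for row in image]


def smooth(profile: Sequence[float], window: int = 3) -> List[float]:
    """Centered moving average, truncated at the edges.

    Assumption: a stand-in for BVSA, whose actual definition is not
    given in the abstract.
    """
    half = window // 2
    out = []
    for i in range(len(profile)):
        lo, hi = max(0, i - half), min(len(profile), i + half + 1)
        out.append(sum(profile[lo:hi]) / (hi - lo))
    return out


def extract_lines(
    profile: Sequence[float], width: int, blank_ratio: float = 0.9
) -> List[Tuple[int, int]]:
    """Return (start, end) row ranges, end exclusive, of text lines.

    A row counts as text when fewer than blank_ratio * width of its
    pixels are blank (a hypothetical threshold rule).
    """
    lines = []
    start = None
    for y, value in enumerate(profile):
        is_text = value < blank_ratio * width
        if is_text and start is None:
            start = y
        elif not is_text and start is not None:
            lines.append((start, y))
            start = None
    if start is not None:
        lines.append((start, len(profile)))
    return lines
```

In this sketch, smoothing widens each detected line by roughly half the window on either side, so the window size trades noise suppression against the risk of merging closely spaced lines.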
CLC Number: TP391.12
ZHU Xu-ju,YANG Jian-gang. Central convergence and stability of BCF vector for Chinese text[J]. Journal of Computer Applications, 2005, 25(05): 1039-1041.
朱旭巨,杨建刚. 汉字文本BCF矢量中间聚集现象及其稳定性[J]. 计算机应用, 2005, 25(05): 1039-1041.
URL: http://www.joca.cn/EN/10.3724/SP.J.1087.2005.1039
http://www.joca.cn/EN/Y2005/V25/I05/1039