Journal of Computer Applications ›› 2012, Vol. 32 ›› Issue (08): 2309-2312. DOI: 10.3724/SP.J.1087.2012.02309

• Graphics and image technology •

Uighur characters recognition based on locality preserving projection and hidden Markov model

LIU Wei¹, LI He-cheng²

  1. Department of Physics, Qinghai Normal University, Xining Qinghai 810008, China
  2. Department of Mathematics, Qinghai Normal University, Xining Qinghai 810008, China
  • Received: 2012-01-16 Revised: 2012-03-22 Online: 2012-08-28 Published: 2012-08-01
  • Contact: LIU Wei

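  • About the authors: LIU Wei (刘卫, born 1975), male, from Tengzhou, Shandong, associate professor; main research interests: image processing, pattern recognition, machine learning;
    LI He-cheng (李和成, born 1973), male, from Ledu, Qinghai, professor, Ph.D.; main research interests: evolutionary computation, data mining.
  • Supported by:
    Research on high-performance trusted evolutionary algorithms for complex bilevel programming problems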

Abstract: To address the shortcomings of the classical Hidden Markov Model (HMM) in modeling handwritten Uighur characters, namely widely varying character widths, slow training convergence, and a tendency to converge prematurely to local extrema, a new Uighur character recognition algorithm combining Locality Preserving Projection (LPP) with HMM was proposed. Firstly, the aspect ratio of the original image was preserved by height normalization; sub-images were obtained with a sliding window, and observation sequences were extracted from these windows. Secondly, the observation sequences were mapped into a low-dimensional space by LPP, and the size of the adjacency matrix was reduced by random sampling. Finally, the HMM was trained on the resulting observation sequences. The algorithm reduces the dimension of the observation vectors, accelerates convergence, and lowers the risk of premature convergence. The simulation results show that the LPP-HMM algorithm reduces both the average number of convergence steps and the error rate, indicating that it is efficient and robust.

Key words: Hidden Markov Model (HMM), Locality Preserving Projection (LPP), Uighur characters recognition, normalization, convergence
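
The feature-extraction stages summarized in the abstract (height normalization, sliding-window observation vectors, LPP with a randomly sampled adjacency graph) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names, nearest-neighbour resampling, the window width and step, the k-nearest-neighbour graph with heat-kernel weights, and the PCA step used to avoid a singular eigenproblem are all choices made for this sketch, and fitting the projection on a random subset of frames is only one possible reading of "random sampling". HMM training on the projected sequences is left out.

```python
import numpy as np


def normalize_height(img, target_h):
    """Resize to a fixed height and scale the width by the same factor
    (nearest-neighbour resampling), so the aspect ratio is preserved."""
    h, w = img.shape
    new_w = max(1, int(round(w * target_h / h)))
    rows = np.arange(target_h) * h // target_h
    cols = np.arange(new_w) * w // new_w
    return img[np.ix_(rows, cols)]


def sliding_window_frames(img, win_w, step):
    """Slide a full-height window across the image; each window is
    flattened into one observation vector."""
    if img.shape[1] < win_w:  # pad very narrow characters to one window
        img = np.pad(img, ((0, 0), (0, win_w - img.shape[1])))
    starts = range(0, img.shape[1] - win_w + 1, step)
    return np.stack([img[:, s:s + win_w].ravel() for s in starts])


def fit_lpp(X, n_components, k=5, sample_size=None, rng=None):
    """Fit an LPP projection on (optionally subsampled) rows of X.
    Returns (mean, P); project new vectors with (x - mean) @ P."""
    rng = np.random.default_rng() if rng is None else rng
    if sample_size is not None and sample_size < len(X):
        # Assumption: a random subset keeps the adjacency matrix small.
        X = X[rng.choice(len(X), size=sample_size, replace=False)]
    mean = X.mean(axis=0)
    Xc = X - mean
    # PCA step (an addition for this sketch): drop zero-variance
    # directions so that Z^T D Z below is positive definite.
    _, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    V = Vt[s > 1e-10 * s[0]].T
    Z = Xc @ V
    n = len(Z)
    k = min(k, n - 1)
    # Squared Euclidean distances; the diagonal is excluded from the search.
    sq = np.sum(Z ** 2, axis=1)
    d2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * Z @ Z.T, 0.0)
    np.fill_diagonal(d2, np.inf)
    rows = np.repeat(np.arange(n), k)
    cols = np.argsort(d2, axis=1)[:, :k].ravel()
    # Symmetric k-NN adjacency matrix with heat-kernel weights.
    t = max(d2[rows, cols].mean(), 1e-12)
    W = np.zeros((n, n))
    W[rows, cols] = np.exp(-d2[rows, cols] / t)
    W = np.maximum(W, W.T)
    d = W.sum(axis=1)
    L = np.diag(d) - W  # graph Laplacian
    # Generalized eigenproblem (Z^T L Z) a = lambda (Z^T D Z) a,
    # reduced to a standard symmetric one via a Cholesky factor.
    A = Z.T @ L @ Z
    B = Z.T @ (d[:, None] * Z)
    Ci = np.linalg.inv(np.linalg.cholesky(B))
    M = Ci @ A @ Ci.T
    _, vecs = np.linalg.eigh((M + M.T) / 2.0)
    # Eigenvectors with the smallest eigenvalues preserve locality best.
    P = V @ (Ci.T @ vecs[:, :n_components])
    return mean, P
```

Under these assumptions, a character image would be passed through `normalize_height` and `sliding_window_frames`, and each frame would then be projected with `(frames - mean) @ P` to give the low-dimensional observation sequence on which an HMM could be trained.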


CLC Number: