Journal of Computer Applications, 2012, Vol. 32, Issue (08): 2299-2304. DOI: 10.3724/SP.J.1087.2012.02299

• Graphics and Image Technology •

Digit recognition algorithm based on distance distribution histogram

吴少泓,王云宽,孙涛,李兵   

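  1. Institute of Automation, Chinese Academy of Sciences, Beijing 100190, China
  • Received: 2012-02-15 Revised: 2012-03-21 Online: 2012-08-28 Published: 2012-08-01
  • Corresponding author: WU Shao-hong
  • About the authors: WU Shao-hong (1986-), female, born in Jingdezhen, Jiangxi, Ph.D. candidate; research interests: machine vision, intelligent control;
    WANG Yun-kuan (1966-), male, born in Xinzhou, Shanxi, research professor and Ph.D. supervisor; research interests: machine vision, intelligent control;
    SUN Tao (1984-), male, born in Shenyang, Liaoning, M.S. candidate; research interests: machine vision, intelligent transportation;
    LI Bing (1982-), male, born in Harbin, Heilongjiang, Ph.D.; research interests: machine vision, intelligent control.
  • Supported by:
    National Natural Science Foundation of China (61174175)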

Digit recognition based on distance distribution histogram

WU Shao-hong, WANG Yun-kuan, SUN Tao, LI Bing

  1. Institute of Automation, Chinese Academy of Sciences, Beijing 100190, China
  • Received: 2012-02-15 Revised: 2012-03-21 Online: 2012-08-28 Published: 2012-08-01
  • Contact: WU Shao-hong

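Abstract: Because free-font and handwritten digits vary widely in shape, algorithms that achieved high accuracy in earlier work often sacrificed speed, while algorithms fast enough for real-time use tended to raise the error rate. To address this problem, a feature descriptor suited to fast digit recognition, the Distance Distribution Histogram (DDH), is proposed, together with the Shape Accumulate Histogram (SAH), a descriptor built on shape context that is both easy to implement and reasonably robust. These two features are then combined with other improved topological features to form the final multi-feature vector; because its sub-vectors are distinct features extracted by different methods, they are complementary. In addition, three Support Vector Machines (SVMs) are trained on three feature combinations and used as classifiers, and their outputs and confidence values are combined to give the final classification. In experiments on a self-built data set, MNIST, and USPS, the highest average accuracy reached 99.21%, demonstrating the efficiency and robustness of the algorithm.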

Key words: feature extraction, distance map, relative chain code, shape context, support vector machine
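
The abstract names the Distance Distribution Histogram but does not define it. As a rough illustration only, the sketch below takes one plausible reading suggested by the "distance map" keyword: compute a city-block distance map of a binary digit image and summarize the foreground distances as a normalized histogram. The function names `cityblock_distance_map` and `distance_histogram`, the choice of city-block distance, and the binning rule are all assumptions made here, not the authors' definition.

```python
def cityblock_distance_map(image):
    """City-block distance from each pixel to the nearest background (0) pixel.

    `image` is a list of rows of 0/1 values. Uses the classic two-pass scan.
    """
    rows, cols = len(image), len(image[0])
    inf = rows + cols  # larger than any possible city-block distance
    dist = [[0 if image[r][c] == 0 else inf for c in range(cols)]
            for r in range(rows)]
    # Forward pass: propagate distances from the top and left neighbours.
    for r in range(rows):
        for c in range(cols):
            if r > 0:
                dist[r][c] = min(dist[r][c], dist[r - 1][c] + 1)
            if c > 0:
                dist[r][c] = min(dist[r][c], dist[r][c - 1] + 1)
    # Backward pass: propagate distances from the bottom and right neighbours.
    for r in reversed(range(rows)):
        for c in reversed(range(cols)):
            if r < rows - 1:
                dist[r][c] = min(dist[r][c], dist[r + 1][c] + 1)
            if c < cols - 1:
                dist[r][c] = min(dist[r][c], dist[r][c + 1] + 1)
    return dist


def distance_histogram(image, bins=4):
    """Normalized histogram of foreground distances (hypothetical DDH-like feature)."""
    dist = cityblock_distance_map(image)
    values = [dist[r][c]
              for r in range(len(image))
              for c in range(len(image[0]))
              if image[r][c] != 0]
    if not values:
        return [0.0] * bins
    top = max(values)
    hist = [0] * bins
    for d in values:
        # Foreground distances start at 1; map the range [1, top] onto the bins.
        hist[min((d - 1) * bins // top, bins - 1)] += 1
    return [count / len(values) for count in hist]


# A 3x3 stroke block inside a 5x5 frame: eight edge pixels and one centre pixel.
digit = [
    [0, 0, 0, 0, 0],
    [0, 1, 1, 1, 0],
    [0, 1, 1, 1, 0],
    [0, 1, 1, 1, 0],
    [0, 0, 0, 0, 0],
]
print(distance_histogram(digit, bins=2))
```

A histogram of this kind has a fixed length regardless of image size, which is one reason such a feature could be cheap to feed to a classifier; whether the paper's DDH is built this way cannot be determined from the abstract.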

Abstract: Due to the variability of unconstrained or handwritten digits, most algorithms in previous studies either sacrificed ease of implementation for high accuracy, or vice versa. This paper proposes a new feature descriptor named Distance Distribution Histogram (DDH) and an adapted Shape Accumulate Histogram (SAH) feature descriptor based on shape context, which is not only easy to implement but also robust to noise and distortion. To make the hybrid features more comprehensive, some other adapted topological features are combined with them. The resulting combined features are complementary because they are formed from different original feature sets extracted by different means. Moreover, they are not complicated. Meanwhile, three Support Vector Machines (SVMs) with different feature vectors are used as classifiers, and their results are integrated to obtain the final classification. The average accuracy over several experiments on self-established data sets, MNIST and USPS is as high as 99.21%, which demonstrates that the proposed algorithm is robust and effective.

Key words: feature extraction, distance map, relative chain code, shape context, Support Vector Machine (SVM)
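
The abstract states that the outputs of three classifiers are integrated into a final decision but does not give the combination rule. The following is a minimal sketch of one simple possibility, assuming each classifier reports a `(label, confidence)` pair and that confidences are summed per label. The function name `fuse_predictions`, the summation rule, and the tie-breaking are hypothetical and are not taken from the paper.

```python
from collections import defaultdict


def fuse_predictions(predictions):
    """Combine (label, confidence) pairs from several classifiers.

    Assumed rule: sum the confidences per label and return the label with
    the largest total; ties go to the smallest label so the result is
    deterministic.
    """
    totals = defaultdict(float)
    for label, confidence in predictions:
        totals[label] += confidence
    # Highest total first, then smallest label.
    best = min(totals, key=lambda lab: (-totals[lab], lab))
    return best, totals[best]


# Hypothetical outputs of three classifiers for one digit image.
outputs = [(7, 0.62), (1, 0.55), (7, 0.48)]
label, score = fuse_predictions(outputs)
print(label, round(score, 2))
```

Summing confidences lets two moderately confident classifiers that agree outvote a single classifier that disagrees, which is the kind of complementarity the abstract attributes to the combined features.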
