Journal of Computer Applications, 2017, Vol. 37, Issue 9: 2648-2651. DOI: 10.11772/j.issn.1001-9081.2017.09.2648


Text image restoration algorithm based on sparse coding and ridge regression

WANG Zhiyi1, BI Duyan1, XIONG Lei1, FAN Zunlin1, ZHANG Xiaoyu2   

    1. College of Aeronautics and Astronautics Engineering, Air Force Engineering University, Xi'an Shaanxi 710038, China;
    2. Command Automation Station, Xinjiang Military Area, Urumqi Xinjiang 830042, China
  • Received: 2017-03-09; Revised: 2017-03-21; Online: 2017-09-10; Published: 2017-09-13
  • Supported by:
    This work is partially supported by the National Natural Science Foundation of China (61372167,61379104).

基于稀疏编码和岭回归的文本图像复原算法

王之毅1, 毕笃彦1, 熊磊1, 凡遵林1, 张晓瑜2   

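  • Corresponding author: WANG Zhiyi, 1970696669@qq.com
  • About the authors: WANG Zhiyi (born 1982), male, from Dancheng, Henan; assistant engineer and M.S. candidate; research interest: image processing. BI Duyan (born 1962), male, from Fufeng, Shaanxi; professor, Ph.D.; research interests: image processing and pattern recognition. XIONG Lei (born 1976), male, from Nanchang, Jiangxi; associate professor, Ph.D.; research interests: image processing and computer vision. FAN Zunlin (born 1991), male, from Chenzhou, Hunan; Ph.D. candidate; research interests: image processing and pattern recognition. ZHANG Xiaoyu (born 1983), male, from Zhenping, Henan; assistant engineer, M.S.; research interests: machine learning and artificial intelligence.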

Abstract: Sparse coding for text image restoration suffers from the limited expressive power of dictionary atoms and from high computational complexity. To address both problems, a text image restoration algorithm based on sparse coding and ridge regression was proposed. Firstly, in the training stage, sample image patches were used to train a dictionary for sparse representation, and the sample patches were clustered according to the Euclidean distances between the patches and the dictionary atoms. Then, ridge regressors between low-quality text image patches and clear text image patches were constructed in the local manifold space, which achieves a local multi-linear expansion of the dictionary atoms and enables fast calculation. Finally, in the testing stage, the clear text image patches were calculated directly by searching for the dictionary atom most similar to each low-quality text image patch, without computing the sparse coding of the low-quality patches. The experimental results show that, compared with existing sparse coding algorithms, the proposed algorithm improves the Peak Signal-to-Noise Ratio (PSNR) by 0.3 to 1.1 dB and reduces the computing time by one to two orders of magnitude. The method therefore provides a fast and effective solution for text image restoration.
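The three stages described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function names, the flattened-row patch layout, the regularization weight `lam`, and the fallback to a single global regressor for empty clusters are all assumptions made for illustration. The dictionary `atoms` is assumed to have been trained beforehand by some sparse-coding method, which the abstract does not name.

```python
import numpy as np

def assign_to_atoms(patches, atoms):
    """Return, for each patch (row), the index of its nearest atom (Euclidean)."""
    # Squared distances via ||p||^2 - 2 p.a + ||a||^2, shape (n_patches, n_atoms).
    d2 = (
        (patches ** 2).sum(axis=1, keepdims=True)
        - 2.0 * patches @ atoms.T
        + (atoms ** 2).sum(axis=1)
    )
    return d2.argmin(axis=1)

def ridge(X, Y, lam):
    """Closed-form ridge solution W = (X^T X + lam I)^{-1} X^T Y."""
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ Y)

def fit_ridge_regressors(low, high, atoms, lam=0.1):
    """Training stage: one ridge regressor per atom, fitted on its cluster.

    low, high: paired low-quality / clear patches, one flattened patch per row.
    lam: hypothetical regularization weight (not taken from the paper).
    """
    labels = assign_to_atoms(low, atoms)
    # Assumption of this sketch: empty clusters fall back to a global regressor.
    w_global = ridge(low, high, lam)
    regressors = np.repeat(w_global[None, :, :], len(atoms), axis=0)
    for k in range(len(atoms)):
        mask = labels == k
        if mask.any():
            regressors[k] = ridge(low[mask], high[mask], lam)
    return regressors

def restore_patches(low, atoms, regressors):
    """Testing stage: nearest-atom lookup, then one matrix product per patch."""
    labels = assign_to_atoms(low, atoms)
    # (n, d_low) x (n, d_low, d_high) -> (n, d_high); no sparse codes are computed.
    return np.einsum("nd,ndh->nh", low, regressors[labels])

def psnr(reference, estimate, peak=1.0):
    """Peak Signal-to-Noise Ratio in dB for signals with maximum value `peak`."""
    mse = np.mean((reference - estimate) ** 2)
    return 10.0 * np.log10(peak ** 2 / mse)
```

In this sketch the test-time cost per patch is one nearest-neighbour search over the atoms plus one matrix-vector product, which is consistent with the abstract's claim that skipping the sparse-coding step makes restoration much faster. A caller would pass paired training patches to `fit_ridge_regressors`, then call `restore_patches` on new low-quality patches and compare against ground truth with `psnr`.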

Key words: text image restoration, sparse coding, manifold space, ridge regression, clustering

