Journal of Computer Applications ›› 2018, Vol. 38 ›› Issue (1): 264-269. DOI: 10.11772/j.issn.1001-9081.2017061389

• Virtual Reality and Multimedia Computing •

Natural scene text detection based on maximally stable extremal region in color space

FAN Yihua, DENG Dexiang, YAN Jia

  1. College of Electronic Information, Wuhan University, Wuhan, Hubei 430072, China
  • Received: 2017-06-07 Revised: 2017-08-05 Online: 2018-01-10 Published: 2018-01-22
  • Corresponding author: DENG Dexiang
  • About the authors: FAN Yihua (born 1993), female, from Xinxiang, Henan, M.S. candidate; main research interests: image natural language processing, character recognition. DENG Dexiang (born 1961), male, from Jingzhou, Hubei, professor, M.S.; main research interests: computer vision, object tracking. YAN Jia (born 1983), male, from Tianmen, Hubei, lecturer, Ph.D.; main research interests: object tracking, image quality assessment.

Abstract: To solve the problem that the traditional Maximally Stable Extremal Regions (MSER) method cannot extract text regions well from low-contrast images, a novel scene text detection method based on edge enhancement was proposed. Firstly, the MSER method was improved with Histogram of Oriented Gradients (HOG) information to make it more robust to low-contrast images, and MSERs were extracted separately in color space. Secondly, a Bayesian model was used to classify the candidate regions: three translation- and rotation-invariant features (stroke width, edge gradient direction, and corner points) were used to remove non-character regions. Finally, the remaining characters were grouped into text lines according to their geometric characteristics. The performance of the algorithm was evaluated on the public International Conference on Document Analysis and Recognition (ICDAR) 2003 and ICDAR 2013 datasets. The experimental results demonstrate that edge-enhanced MSER in color space can correctly extract text regions from scene images with complex backgrounds and low contrast. The Bayesian-model-based classification method screens characters well even with a small sample set and achieves high recall. Compared with traditional MSER-based text detection, the proposed method improves the detection rate and real-time performance of the system.

Key words: text detection, edge enhancement, Maximally Stable Extremal Region (MSER), color space, Bayesian model
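
The abstract does not state which form of Bayesian model is used to filter candidate regions, so the following is only a minimal sketch that assumes a Gaussian naive Bayes classifier over three precomputed features per region (a stroke-width variation score, an edge-gradient-direction score, and a corner-point count). The function names, class labels, and all feature values are hypothetical illustrations and are not taken from the paper.

```python
import math
from collections import defaultdict


def fit_gaussian_nb(samples, labels):
    """Estimate a log prior and per-feature mean/variance for each class."""
    grouped = defaultdict(list)
    for features, label in zip(samples, labels):
        grouped[label].append(features)

    total = len(samples)
    model = {}
    for label, rows in grouped.items():
        n = len(rows)
        columns = list(zip(*rows))
        means = [sum(col) / n for col in columns]
        # A small floor keeps the variance positive for constant features.
        variances = [
            sum((x - m) ** 2 for x in col) / n + 1e-6
            for col, m in zip(columns, means)
        ]
        model[label] = (math.log(n / total), means, variances)
    return model


def log_posterior(model, features):
    """Return the unnormalised log posterior of every class."""
    scores = {}
    for label, (log_prior, means, variances) in model.items():
        score = log_prior
        for x, m, v in zip(features, means, variances):
            score += -0.5 * math.log(2 * math.pi * v) - (x - m) ** 2 / (2 * v)
        scores[label] = score
    return scores


def predict(model, features):
    """Pick the class with the highest log posterior."""
    scores = log_posterior(model, features)
    return max(scores, key=scores.get)


# Hypothetical training set (assumed values, not data from the paper):
# (stroke-width variation, gradient-direction score, corner-point count)
train_x = [
    (0.20, 0.90, 4), (0.25, 0.85, 6), (0.30, 0.80, 5), (0.15, 0.95, 3),
    (0.90, 0.30, 20), (0.80, 0.40, 15), (1.10, 0.20, 25), (0.95, 0.35, 18),
]
train_y = ["char"] * 4 + ["non-char"] * 4

model = fit_gaussian_nb(train_x, train_y)

# Keep only the candidate regions classified as characters.
candidates = [(0.22, 0.88, 5), (1.00, 0.25, 22)]
kept = [c for c in candidates if predict(model, c) == "char"]
```

A naive Bayes model is used here only because it needs few parameters per class, which fits the abstract's remark about small sample sets; the model actually used in the paper may differ.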

CLC number: