
Natural scene text detection based on color-space MSER

Yi-Hua Fan, De-Xiang Deng, Jia Yan

  1. Electronic Information School, Wuhan University
  • Received: 2017-06-07 Revised: 2017-08-23 Published: 2017-08-23
  • Corresponding author: Yi-Hua Fan

Scene text detection based on MSER in color space

  • Received: 2017-06-07 Revised: 2017-08-23 Online: 2017-08-23
  • Contact: Yi-Hua Fan

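Abstract: Conventional maximally stable extremal regions (MSER) cannot reliably extract text regions from low-contrast images. To address this, a new scene text detection method based on edge enhancement is proposed. Prior knowledge tells us that gradient values are largest at image edges, so an orientation gradient operator (HOG) is used to obtain the gradient information of every pixel; a correlation operation between the orientation gradient values and the original image then effectively enhances the edge information and thereby raises the image contrast. Converting the original image to grayscale loses color information, and the color space contains much valuable texture information. To make the improved MSER more effective, MSER is applied in the HSI space, which is closer to human visual perception, so that more candidate text regions of interest are extracted. Next, a Bayesian model is used for classification, relying mainly on three features (stroke width, edge gradient direction, and corner points) to remove non-character regions. Finally, the remaining characters are grouped into text lines. The algorithm is evaluated on the public ICDAR 2003 and ICDAR 2013 datasets. The experimental results show that the edge-enhanced MSER method extracts text regions more effectively, is robust and real-time on scene images with complex backgrounds and low contrast, and achieves a relatively high recall rate.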

Keywords: text detection, edge enhancement, maximally stable extremal regions, color space, Bayesian model

Abstract: Conventional Maximally Stable Extremal Regions (MSER) extraction performs poorly on text regions in low-contrast images. To address this, a novel scene text detection method based on edge enhancement is proposed. Because the gradient is largest at image edges, the gradient of every pixel is obtained with the HOG (Histogram of Oriented Gradients) feature descriptor, and a correlation operation between the orientation gradient and the original image enhances the edge information and thereby improves image contrast. Converting the original image to grayscale discards color information, yet the color space contains a wealth of valuable information. To make the improved MSER more effective, MSER is therefore applied in the HSI color space, which matches human visual perception more closely, so that more candidate text regions are extracted. Next, a Bayesian model classifies the candidates, using three main features (stroke width (SW), edge gradient direction, and inflexion points) to filter out non-character regions. Finally, the remaining characters are grouped into text lines. Evaluated on the standard ICDAR 2003 and ICDAR 2013 benchmarks, the edge-enhanced method extracts more text regions of interest, is robust to images with complex backgrounds and low contrast, and achieves a higher recall rate.

Key words: text detection, edge enhancement, Maximally Stable Extremal Regions (MSER), color space, Bayesian model
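
The two preprocessing ideas in the abstract, gradient-based edge enhancement and conversion to HSI, can be illustrated with a minimal pure-Python sketch. This is not the authors' implementation. The abstract does not define the "correlation operation", so `enhance_edges` assumes one plausible reading: each pixel is scaled by its normalized gradient magnitude. The function names, the `alpha` parameter, and the central-difference gradient (used here in place of a full HOG descriptor) are all assumptions made for illustration; `rgb_to_hsi` uses the standard geometric HSI formula.

```python
import math


def rgb_to_hsi(r, g, b):
    """Convert one RGB pixel (components in [0, 1]) to (H, S, I).

    H is in degrees [0, 360); S and I are in [0, 1].
    """
    i = (r + g + b) / 3.0
    if i == 0.0:
        return 0.0, 0.0, 0.0  # black: hue and saturation are undefined
    s = 1.0 - min(r, g, b) / i
    num = 0.5 * ((r - g) + (r - b))
    den = math.sqrt((r - g) ** 2 + (r - b) * (g - b))
    if den == 0.0:
        return 0.0, s, i  # gray pixel: hue is undefined, report 0
    # Clamp to guard against rounding slightly outside [-1, 1].
    theta = math.degrees(math.acos(max(-1.0, min(1.0, num / den))))
    h = theta if b <= g else 360.0 - theta
    return h % 360.0, s, i


def gradient_magnitude(img):
    """Central-difference gradient magnitude of a 2D list of floats.

    Border pixels reuse their nearest in-bounds neighbor.
    """
    rows, cols = len(img), len(img[0])
    out = [[0.0] * cols for _ in range(rows)]
    for y in range(rows):
        for x in range(cols):
            gx = img[y][min(x + 1, cols - 1)] - img[y][max(x - 1, 0)]
            gy = img[min(y + 1, rows - 1)][x] - img[max(y - 1, 0)][x]
            out[y][x] = math.hypot(gx, gy)
    return out


def enhance_edges(img, alpha=1.0):
    """Assumed enhancement rule: out = img * (1 + alpha * g / max(g)).

    Pixels near strong edges are amplified; the result is clipped to [0, 1].
    A perfectly flat image (zero gradient everywhere) is returned unchanged.
    """
    grad = gradient_magnitude(img)
    g_max = max(max(row) for row in grad)
    if g_max == 0.0:
        return [row[:] for row in img]
    return [
        [min(1.0, p * (1.0 + alpha * g / g_max)) for p, g in zip(prow, grow)]
        for prow, grow in zip(img, grad)
    ]


# Usage: a low-contrast vertical step edge (0.2 on the left, 0.4 on the right).
step = [[0.2, 0.2, 0.4, 0.4] for _ in range(3)]
enhanced = enhance_edges(step)
hue, sat, inten = rgb_to_hsi(0.0, 0.0, 1.0)  # pure blue
```

In a full pipeline, the H, S, and I channels would each be passed to an MSER detector and the resulting candidate regions merged; that step is omitted here because it depends on the detector implementation.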

CLC number: