Journal of Computer Applications ›› 2012, Vol. 32 ›› Issue (08): 2305-2312. DOI: 10.3724/SP.J.1087.2012.02305

• Graphics and image technology •

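  • About the authors: SU Chang (born 1979), female, from Anshan, Liaoning, associate professor, Ph.D.; main research interests: wireless sensor network communication protocols, Internet of Things;
    HU Xiao-dong (born 1986), male, from Xinmi, Henan, master's student; main research interests: image processing, character recognition;
    WANG Bin-fu (born 1990), male, from Luohe, Henan; main research interest: image processing;
    SHANG Feng-jun (born 1972), male, from Ningcheng, Inner Mongolia, associate professor, Ph.D.; main research interests: mobile IPv6, wireless sensor networks.
  • Supported by:
    Science and Technology Research Project of the Chongqing Municipal Education Commission (KJ110504); Natural Science Foundation Project of the Chongqing Science and Technology Commission (2009BB2081); Scientific Research Foundation for Returned Overseas Chinese Scholars, Ministry of Education (教外司留[2010]1174)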

Video image character recognition based on stroke-related weight

SU Chang1,2, HU Xiao-dong1, WANG Bin-fu1, SHANG Feng-jun1

  1. College of Computer Science and Technology, Chongqing University of Posts and Telecommunications, Chongqing 400065, China
  2. Department of Computer Science, Cornell University, Ithaca, NY 14853, USA
  • Received: 2011-11-14 Revised: 2012-01-09 Online: 2012-08-28 Published: 2012-08-01
  • Contact: HU Xiao-dong

Abstract: To extract subtitle information from film and television video images, a robust method is proposed. First, image edge features are used to locate the caption region, and a binarization method that incorporates edge information is applied to the text image. Second, projection is combined with region generation to locate individual characters. Finally, taking full account of the topological structure of character strokes, the stroke correlation between adjacent sub-grids is determined, and stroke fuzzy membership is used to extract the elastic grid features. The method effectively obtains binary character images from complex backgrounds and keeps the extracted features stable and robust. Experimental results show that the method is effective: the recognition rate on binarized film and television subtitles reaches 92.1%.

Key words: video image, character recognition, text location, binarization, sub-grid feature, stroke correlation
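
The projection step mentioned in the abstract can be illustrated with a minimal sketch. It assumes a text line has already been binarized into rows of 0 (background) and 1 (stroke) pixels, and it splits the line wherever the vertical projection drops to zero. The function names, the `min_width` parameter, and the sample data are hypothetical; the abstract does not give the authors' actual procedure, which also involves region generation.

```python
def column_projection(binary):
    """Count foreground (1) pixels in each column of a binary image."""
    if not binary:
        return []
    width = len(binary[0])
    return [sum(row[x] for row in binary) for x in range(width)]


def split_characters(binary, min_width=1):
    """Return (start, end) column ranges, end exclusive, for character candidates.

    A candidate is a maximal run of columns whose projection is non-zero.
    Runs narrower than min_width (an assumed noise filter) are dropped.
    """
    profile = column_projection(binary)
    spans = []
    start = None
    for x, count in enumerate(profile):
        if count > 0 and start is None:
            start = x  # entering a run of stroke columns
        elif count == 0 and start is not None:
            if x - start >= min_width:
                spans.append((start, x))
            start = None  # back in a blank gap
    # Close a run that touches the right border of the image.
    if start is not None and len(profile) - start >= min_width:
        spans.append((start, len(profile)))
    return spans


# Hypothetical 4x9 binarized text line containing two glyphs.
line = [
    [0, 1, 1, 0, 0, 0, 1, 0, 0],
    [0, 1, 0, 0, 0, 1, 1, 1, 0],
    [0, 1, 1, 0, 0, 0, 1, 0, 0],
    [0, 0, 0, 0, 0, 0, 0, 0, 0],
]
print(split_characters(line))
```

Projection alone tends to over-split characters made of separated left and right components, which is one plausible reason to pair it with a region-based step as the abstract describes.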
