Journal of Computer Applications ›› 2015, Vol. 35 ›› Issue (1): 215-219. DOI: 10.11772/j.issn.1001-9081.2015.01.0215

• Virtual Reality and Digital Media •

Object localization method based on fusion of visual saliency and superpixels

SHAO Mingzheng, QI Jianfeng, WANG Xiwu, WANG Lu

  1. Information Engineering Department, Ordnance Engineering College, Shijiazhuang Hebei 050003, China
  • Received: 2014-07-31 Revised: 2014-09-19 Online: 2015-01-01 Published: 2015-01-26
  • Corresponding author: SHAO Mingzheng
  • About the authors: SHAO Mingzheng (1986-), male, born in Jining, Shandong, M.S. candidate; main research interests: computer vision, image processing. QI Jianfeng (1969-), male, born in Shijiazhuang, Hebei, associate professor, Ph.D.; main research interests: machine learning, computer vision. WANG Xiwu (1966-), male, born in Hengshui, Hebei, associate professor, Ph.D.; main research interests: data mining.

Abstract:

To address the large number of localization windows required by the selective search algorithm, an improved object localization method based on the fusion of visual saliency and superpixels is proposed. First, a visual saliency map is used to coarsely estimate object positions. Then, starting from these initial positions, adjacent superpixels are merged according to the appearance features of the image, and a background analysis step is introduced to avoid over-merging. Finally, a greedy algorithm combines the merged regions and generates the final localization windows (bounding boxes). Experimental results on the Pascal VOC 2007 dataset show that, at the same detection criterion (recall of 0.91), the proposed method uses 20% fewer windows than selective search while reaching an overlap rate of 0.77. Because it localizes objects from coarse to fine, the method maintains a high overlap rate and recall with a comparatively small number of windows.
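The coarse-to-fine idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the `Superpixel` fields, the color-distance test, the border-based `is_background` rule, and all thresholds are hypothetical stand-ins chosen only to show saliency-seeded merging of adjacent superpixels with a background check. The final greedy combination of merged regions is omitted.

```python
from dataclasses import dataclass


@dataclass
class Superpixel:
    mean_color: tuple  # mean (r, g, b), each in [0, 1]
    saliency: float    # mean saliency value in [0, 1]
    bbox: tuple        # (x_min, y_min, x_max, y_max)
    on_border: bool    # True if the superpixel touches the image border


def color_distance(a, b):
    """Euclidean distance between two mean colors (assumed appearance feature)."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5


def is_background(sp, saliency_floor=0.2):
    """Hypothetical background test: low saliency and touching the border."""
    return sp.on_border and sp.saliency < saliency_floor


def union_bbox(a, b):
    """Smallest box enclosing both input boxes."""
    return (min(a[0], b[0]), min(a[1], b[1]), max(a[2], b[2]), max(a[3], b[3]))


def grow_region(seed, superpixels, adjacency, max_dist=0.25):
    """Absorb adjacent, similar, non-background superpixels starting from a seed."""
    region = {seed}
    bbox = superpixels[seed].bbox
    frontier = [seed]
    while frontier:
        current = frontier.pop()
        for nb in adjacency[current]:
            if nb in region or is_background(superpixels[nb]):
                continue  # already merged, or rejected to avoid over-merging
            dist = color_distance(superpixels[current].mean_color,
                                  superpixels[nb].mean_color)
            if dist <= max_dist:
                region.add(nb)
                bbox = union_bbox(bbox, superpixels[nb].bbox)
                frontier.append(nb)
    return region, bbox


def propose_windows(superpixels, adjacency, seed_saliency=0.6):
    """Use highly salient superpixels as coarse positions, then grow each one."""
    seeds = sorted((i for i, sp in superpixels.items() if sp.saliency >= seed_saliency),
                   key=lambda i: -superpixels[i].saliency)
    windows, covered = [], set()
    for s in seeds:
        if s in covered:
            continue  # this seed was already absorbed by an earlier region
        region, bbox = grow_region(s, superpixels, adjacency)
        covered |= region
        windows.append(bbox)
    return windows


# Toy example: a salient red patch (0, 1) between two border regions (2, 3).
superpixels = {
    0: Superpixel((0.90, 0.10, 0.10), 0.80, (10, 10, 20, 20), False),
    1: Superpixel((0.85, 0.15, 0.10), 0.50, (20, 10, 30, 20), False),
    2: Superpixel((0.10, 0.10, 0.90), 0.10, (30, 0, 40, 40), True),
    3: Superpixel((0.80, 0.10, 0.10), 0.05, (0, 0, 10, 40), True),
}
adjacency = {0: [1, 3], 1: [0, 2], 2: [1], 3: [0]}
windows = propose_windows(superpixels, adjacency)
print(windows)
```

In the toy data, superpixel 3 has a color close to the seed but is rejected by the background test, which is the role the abstract assigns to background analysis: stopping the merge before it spills into the surroundings.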

Key words: object localization, visual saliency, superpixel, sliding window, object recognition
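The recall and overlap rate quoted in the abstract can be made concrete with a small sketch. The abstract does not define either metric, so two assumptions are made here: the overlap between a window and a ground-truth box is intersection over union (the Pascal VOC overlap measure), and an object counts as recalled when its best-overlapping window reaches a threshold of 0.5. The function names and the toy boxes are illustrative only.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x_min, y_min, x_max, y_max)."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def best_overlaps(ground_truth, windows):
    """For each ground-truth box, the highest IoU achieved by any window."""
    return [max((iou(g, w) for w in windows), default=0.0) for g in ground_truth]


def recall(ground_truth, windows, threshold=0.5):
    """Fraction of ground-truth boxes covered by a window with IoU >= threshold (assumed)."""
    best = best_overlaps(ground_truth, windows)
    return sum(o >= threshold for o in best) / len(best) if best else 0.0


def mean_best_overlap(ground_truth, windows):
    """Average of the best IoU per ground-truth box (assumed meaning of 'overlap rate')."""
    best = best_overlaps(ground_truth, windows)
    return sum(best) / len(best) if best else 0.0


# Toy example: the first object is well covered, the second is missed.
ground_truth = [(0, 0, 10, 10), (20, 20, 30, 30)]
windows = [(0, 0, 10, 8), (50, 50, 60, 60)]
print(recall(ground_truth, windows))             # 0.5
print(mean_best_overlap(ground_truth, windows))  # 0.4
```

Under these assumptions, comparing two proposal methods "at the same recall" means shrinking each method's window list until both reach the same `recall` value and then comparing the list lengths.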

CLC number: