Journal of Computer Applications (计算机应用) ›› 2014, Vol. 34 ›› Issue (10): 2934-2937. DOI: 10.11772/j.issn.1001-9081.2014.10.2934

• Virtual Reality and Digital Media •

Motion detection based on deep auto-encoder networks (基于深度自编码网络的运动目标检测)

XU Pei (徐培), CAI Xiaolu (蔡小路), HE Wenwei (何文伟), XIE Yidao (谢易道)

  1. School of Computer Science and Engineering, University of Electronic Science and Technology of China, Chengdu, Sichuan 611731, China
  • Received: 2014-05-05  Revised: 2014-06-16  Online: 2014-10-01  Published: 2014-10-30
  • Contact: XU Pei
  • About the authors: XU Pei (1986-), male, born in Zigong, Sichuan, Ph.D. candidate, research interests: computer vision and machine learning; CAI Xiaolu (1990-), male, born in Huanggang, Hubei, M.S. candidate, research interests: computer vision and machine learning; HE Wenwei (1988-), male, born in Luzhou, Sichuan, M.S. candidate, research interests: computer vision and machine learning; XIE Yidao (1988-), male, born in Chengdu, Sichuan, M.S. candidate, research interests: computer vision and machine learning.
  • Supported by: the Fundamental Research Funds for the Central Universities

Abstract:

To address the poor performance of foreground extraction from dynamic backgrounds, a motion detection method based on deep auto-encoder networks was proposed. Firstly, background images containing no moving objects were extracted from the video frames by a three-layer deep auto-encoder network whose cost function took the background image as a variable. Then, a separating function was constructed to obtain the background of each input frame, and another three-layer deep auto-encoder network was used to learn the extracted background images. To enable the network to extract moving objects online during learning, an online learning algorithm was also proposed: weights with low sensitivity to the cost function were located and merged, so that more video frames could be processed. The experimental results show that, in extracting foreground moving objects from dynamic backgrounds, the proposed method improves detection accuracy by 6% and reduces the false alarm rate by 4.5% compared with the foreground detection work of Lu et al. (LU C, SHI J, JIA J. Online robust dictionary learning. Proceedings of the 2013 IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE Press, 2013: 415-422). The method also achieves better foreground-background separation in practical applications, laying a better foundation for research such as video analysis.
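
A minimal NumPy sketch may help make the summarized idea concrete: a three-layer (input-hidden-output) auto-encoder whose cost couples the reconstruction both to the input frame and to an explicit background variable B, plus a simple separating function that thresholds the residual against the learned background. The cost form J(x) = ||x_hat - x||^2 + lam * ||x_hat - B||^2, the threshold tau, and all names (BackgroundAutoEncoder, separate_foreground, lam, lr) are illustrative assumptions for this sketch, not the authors' exact formulation.

    import numpy as np

    rng = np.random.default_rng(0)

    def sigmoid(z):
        return 1.0 / (1.0 + np.exp(-z))

    class BackgroundAutoEncoder:
        """Input -> hidden -> output auto-encoder with an explicit background variable B."""

        def __init__(self, n_pixels, n_hidden, lam=0.5, lr=0.05):
            self.W1 = 0.01 * rng.standard_normal((n_hidden, n_pixels))
            self.b1 = np.zeros(n_hidden)
            self.W2 = 0.01 * rng.standard_normal((n_pixels, n_hidden))
            self.b2 = np.zeros(n_pixels)
            self.B = np.zeros(n_pixels)   # background image, optimized together with the weights
            self.lam = lam                # weight of the background-coupling term (assumed form)
            self.lr = lr

        def forward(self, x):
            h = sigmoid(self.W1 @ x + self.b1)
            x_hat = sigmoid(self.W2 @ h + self.b2)
            return h, x_hat

        def step(self, x):
            """One gradient step on J(x) = ||x_hat - x||^2 + lam * ||x_hat - B||^2."""
            h, x_hat = self.forward(x)
            d_out = 2 * (x_hat - x) + 2 * self.lam * (x_hat - self.B)   # dJ/dx_hat
            delta2 = d_out * x_hat * (1 - x_hat)                        # output-layer error
            delta1 = (self.W2.T @ delta2) * h * (1 - h)                 # hidden-layer error
            self.W2 -= self.lr * np.outer(delta2, h)
            self.b2 -= self.lr * delta2
            self.W1 -= self.lr * np.outer(delta1, x)
            self.b1 -= self.lr * delta1
            self.B -= self.lr * (-2 * self.lam * (x_hat - self.B))      # dJ/dB pulls B toward x_hat
            return x_hat

    def separate_foreground(frame, background, tau=0.15):
        """Illustrative separating function: threshold the residual against the background."""
        return (np.abs(frame - background) > tau).astype(np.uint8)

    # Toy usage on random 8x8 "frames" flattened to length-64 vectors in [0, 1);
    # with real video, x would be a flattened, normalized grayscale frame.
    ae = BackgroundAutoEncoder(n_pixels=64, n_hidden=16)
    for _ in range(200):
        ae.step(rng.random(64))
    mask = separate_foreground(rng.random(64), ae.B)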

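For the online learning step, the abstract only states that weights with low sensitivity to the cost function are found and merged so that more frames can be processed. The sketch below shows one plausible reading under stated assumptions: per-hidden-unit sensitivity is approximated by the accumulated gradient magnitude of the cost with respect to the encoder weights, and the two least sensitive hidden units are merged into one. The sensitivity measure, the merging rule, and the helper name merge_least_sensitive_units are all hypothetical, not taken from the paper.

    import numpy as np

    def merge_least_sensitive_units(W1, b1, W2, grad_W1_acc):
        """Merge the two hidden units whose weights show the lowest accumulated
        cost-function sensitivity, shrinking the hidden layer by one unit.

        W1: (n_hidden, n_in) encoder weights      b1: (n_hidden,) encoder biases
        W2: (n_in, n_hidden) decoder weights
        grad_W1_acc: (n_hidden, n_in) accumulated |dJ/dW1| over recent frames,
                     used here as the sensitivity measure (an assumption).
        """
        sensitivity = np.abs(grad_W1_acc).sum(axis=1)   # per-unit sensitivity score
        i, j = np.argsort(sensitivity)[:2]              # the two least sensitive units
        W1[i] = 0.5 * (W1[i] + W1[j])                   # average their encoder parameters
        b1[i] = 0.5 * (b1[i] + b1[j])
        W2[:, i] = W2[:, i] + W2[:, j]                  # preserve their joint reconstruction
        keep = np.ones(W1.shape[0], dtype=bool)
        keep[j] = False                                 # drop the merged-away unit
        return W1[keep], b1[keep], W2[:, keep]

    # Toy usage: a 16-unit hidden layer shrinks to 15 units after one merge.
    rng = np.random.default_rng(0)
    W1, b1, W2 = rng.random((16, 64)), np.zeros(16), rng.random((64, 16))
    W1, b1, W2 = merge_least_sensitive_units(W1, b1, W2, rng.random((16, 64)))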
CLC Number: