Journal of Computer Applications ›› 2019, Vol. 39 ›› Issue (12): 3697-3702. DOI: 10.11772/j.issn.1001-9081.2019050916

• Frontier, Interdisciplinary and Comprehensive Applications •

Shipping monitoring event recognition based on three-dimensional convolutional neural network

WANG Zhongjie1,2, ZHANG Hong1,2

  1. College of Computer Science and Technology, Wuhan University of Science and Technology, Wuhan Hubei 430065, China;
  2. Hubei Province Key Laboratory of Intelligent Information Processing and Real-time Industrial System (Wuhan University of Science and Technology), Wuhan Hubei 430065, China
  • Received: 2019-05-31 Revised: 2019-07-02 Published: 2019-12-10 Released: 2019-09-10
  • Contact: WANG Zhongjie
  • About the authors: WANG Zhongjie (born 1994), male, from Wuhan, Hubei, M.S. candidate; research interests: machine learning, video recognition, deep learning. ZHANG Hong (born 1979), female, from Xiangyang, Hubei, professor, Ph.D.; research interests: cross-media retrieval, machine learning, data mining.
  • Supported by:
    This work is partially supported by the National Natural Science Foundation of China (61373109).

Abstract: Traditional machine learning algorithms classify large volumes of shipping surveillance video poorly, and existing three-dimensional (3D) convolutional models have low recognition accuracy. To address both problems, a VGG-Inception 3D Convolutional neural network (VIC3D) model was proposed for intelligent recognition of real-time monitoring video of shipping goods. Building on the 3D convolutional neural network model, VIC3D is a 3D convolutional network based on VGG-16 (Visual Geometry Group) into which the Inception module of GoogLeNet is introduced. Firstly, the video data acquired from the cameras were converted into images. Then, the frame sequences sampled at equal intervals were sorted by category, and the training set and the testing set were constructed. Finally, under the same operating environment and the same training procedure, the combined VIC3D model and the original models were trained separately, and the models were compared on the results of the testing set. The experimental results show that VIC3D improves recognition accuracy over the original models: its accuracy is 11.1 percentage points higher than that of the Group-constrained Convolutional Recurrent Neural Network (GCRNN) model, with the time required for each recognition reduced by 1.349 s, and it is 14.6 and 4.2 percentage points higher, respectively, than the two C3D models. The VIC3D model can be applied effectively to shipping video surveillance projects.

Key words: intelligent shipping monitoring, video recognition, deep learning, three-dimensional (3D) convolution, neural network
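The data preparation described in the abstract (equal-interval frame sampling, then building per-category training and testing sets) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names `sample_frame_indices` and `split_by_class`, the clip length of 16 frames, the 20% test ratio, and the category labels in the usage note are all hypothetical, since the abstract does not give these details.

```python
import random
from collections import defaultdict


def sample_frame_indices(num_frames, clip_len):
    """Pick clip_len frame indices spread at equal intervals over a video.

    clip_len is an assumed parameter; the abstract does not state it.
    """
    if num_frames <= 0 or clip_len <= 0:
        raise ValueError("num_frames and clip_len must be positive")
    if num_frames <= clip_len:
        # Short video: keep every frame rather than repeating any.
        return list(range(num_frames))
    step = num_frames / clip_len
    # Take the first frame of each of clip_len equal-width segments.
    return [int(i * step) for i in range(clip_len)]


def split_by_class(samples, test_ratio=0.2, seed=0):
    """Split (clip_id, label) pairs into train/test sets, class by class.

    test_ratio and seed are assumed values chosen for illustration.
    """
    by_label = defaultdict(list)
    for clip_id, label in samples:
        by_label[label].append(clip_id)
    rng = random.Random(seed)
    train, test = [], []
    for label in sorted(by_label):
        clips = by_label[label]
        rng.shuffle(clips)
        # Hold out at least one clip per class when the class has several.
        n_test = max(1, round(len(clips) * test_ratio)) if len(clips) > 1 else 0
        test += [(c, label) for c in clips[:n_test]]
        train += [(c, label) for c in clips[n_test:]]
    return train, test
```

For example, `sample_frame_indices(10, 5)` returns `[0, 2, 4, 6, 8]`, and `split_by_class` applied to ten clips evenly divided between two hypothetical categories such as `"loading"` and `"idle"` holds out one clip per category for testing. Splitting class by class keeps every category represented in both sets, which matters when some events are rare in the footage.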

CLC Number: