Journal of Computer Applications (《计算机应用》), sole official website


基于不确定度感知的帧关联短视频事件检测方法

李云1,王富铕2,井佩光3,王粟4,肖澳5   

  1.广西财经学院 大数据与人工智能学院 2.中国铁路设计集团有限公司 电信电化院 3.天津大学 电气与信息工程学院 4.广西民族大学 电子信息学院 5.广西大学 计算机与电子信息学院
  • 收稿日期:2023-09-11 修回日期:2023-12-09 发布日期:2024-03-15 出版日期:2024-03-15
  • 通讯作者: 王富铕
  • 作者简介:李云(1978—),女(壮族),广西南宁人,教授,博士,CCF会员(L4818M),主要研究方向:大数据、人工智能;王富铕(1997—),男,天津人,硕士研究生,主要研究方向:多媒体计算、多模态融合;井佩光(1988—),男,天津人,副教授,博士,主要研究方向:多媒体计算、水下图像处理;王粟(1998—),男,江苏扬州人,硕士研究生,主要研究方向:多模态融合;肖澳(1999—),男,湖南衡阳人,硕士研究生,主要研究方向:多模态融合。
  • 基金资助:
    国家自然科学基金(61861014)资助项目;博士启动基金(BS2021025)

Uncertainty-based frame associated short video event detection method

LI Yun1, WANG Fuyou2, JING Peiguang3, WANG Su4, XIAO Ao5

  1. School of Big Data and Artificial Intelligence, Guangxi University of Finance and Economics 2. China Railway Design Group Co., Ltd, Telecom Electrification Institute 3. School of Electrical Automation and Information Engineering, Tianjin University 4. College of Electronic Information, Guangxi Minzu University 5. School of Computer and Electronic Information, Guangxi University
  • Received:2023-09-11 Revised:2023-12-09 Online:2024-03-15 Published:2024-03-15
  • Contact: WANG Fuyou
  • About author: LI Yun, born in 1978, Ph.D., professor. Her research interests include big data and artificial intelligence. WANG Fuyou, born in 1997, M.S. candidate. His research interests include multimedia computing and multimodal fusion. JING Peiguang, born in 1988, Ph.D., associate professor. His research interests include multimedia computing and underwater image processing. WANG Su, born in 1998, M.S. candidate. His research interests include multimodal fusion. XIAO Ao, born in 1999, M.S. candidate. His research interests include multimodal fusion.
  • Supported by:
    National Natural Science Foundation of China (61861014); Doctoral Start-up Fund (BS2021025)

摘要: 针对如何联合短视频的帧不确定度和时序关联性,增强事件检测的问题,提出一种基于不确定度感知的帧关联短视频事件检测方法。首先,利用2D卷积神经网络提取短视频每一帧的特征,再将所提特征多次前向传播通过贝叶斯变分层获得特征均值和与特征对应的不确定度信息;然后,利用模型构建的不确定度感知模块将特征均值和不确定度信息进行融合,将融合后所得的各帧特征通过时序关联模块加强时域上的联系;最后,将时域关联后的特征通过分类网络实现短视频事件检测。利用从Flickr平台上爬取到的短视频事件检测数据集开展实验对比,实验结果表明,支持向量机(SVM)等子空间学习方法分类性能最差,对高级语义表示的探索不充分,而深度学习对于事件检测的精确度明显更优。相较于SviTT方法,所提方法的准确率、平均召回率和平均精度分别提高了3.37%、2.55%和2.09%,验证了所提方法在短视频事件检测任务上的有效性。

关键词: 时序关联性, 帧关联短视频事件, 卷积神经网络, 贝叶斯神经网络, 不确定度

Abstract: To address the problem of jointly exploiting frame-level uncertainty and temporal correlation in short videos to improve event detection, an uncertainty-aware frame-associated short video event detection method was proposed. Firstly, a 2D convolutional neural network was used to extract the features of each frame of a short video, and the extracted features were forward-propagated several times through a Bayesian variational layer to obtain the feature mean and the corresponding uncertainty information. Then, the uncertainty-aware module constructed in the model was used to fuse the feature mean with the uncertainty information, and the fused frame features were passed through a temporal correlation module to strengthen their connections in the time domain. Finally, the temporally associated features were fed into a classification network to perform short video event detection. Comparative experiments were carried out on a short video event detection dataset crawled from the Flickr platform. The experimental results show that subspace learning methods such as Support Vector Machine (SVM) have the poorest classification performance and do not adequately explore high-level semantic representations, while deep learning methods achieve significantly better event detection accuracy. Compared with the SviTT method, the accuracy, average recall, and average precision of the proposed method are improved by 3.37%, 2.55%, and 2.09%, respectively, which verifies the effectiveness of the proposed method for the short video event detection task.
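The multi-pass step described in the abstract can be illustrated with a small, dependency-free sketch. It is not the authors' implementation: the Gaussian weight layer, the fusion rule `mean * exp(-variance)`, the neighbour-averaging stand-in for the temporal correlation module, and all function names are assumptions made for illustration only, since the abstract does not specify these details.

```python
import math
import random
from statistics import fmean, pvariance


def sample_linear(x, w_mu, w_sigma, rng):
    """One stochastic forward pass: draw each weight from N(mu, sigma^2)."""
    out = []
    for mu_row, sigma_row in zip(w_mu, w_sigma):
        acc = 0.0
        for xi, m, s in zip(x, mu_row, sigma_row):
            acc += rng.gauss(m, s) * xi
        out.append(acc)
    return out


def mc_mean_and_uncertainty(x, w_mu, w_sigma, passes=20, seed=0):
    """Repeat the stochastic pass and return per-dimension mean and variance."""
    rng = random.Random(seed)
    samples = [sample_linear(x, w_mu, w_sigma, rng) for _ in range(passes)]
    dims = list(zip(*samples))  # group the samples by output dimension
    mean = [fmean(d) for d in dims]
    var = [pvariance(d) for d in dims]  # variance used as the uncertainty
    return mean, var


def fuse(mean, var):
    """Hypothetical fusion rule (an assumption): damp uncertain dimensions."""
    return [m * math.exp(-v) for m, v in zip(mean, var)]


def temporal_smooth(frames):
    """Stand-in for temporal association: average each frame with its neighbours."""
    out = []
    for t in range(len(frames)):
        window = frames[max(0, t - 1): t + 2]
        out.append([fmean(col) for col in zip(*window)])
    return out


# Toy usage: three frames with 2-D features (made-up numbers, not real data).
W_MU = [[1.0, 0.0], [0.5, 0.5]]
W_SIGMA = [[0.1, 0.1], [0.2, 0.2]]
frames = [[1.0, 2.0], [0.5, 1.5], [2.0, 0.0]]
fused = [fuse(*mc_mean_and_uncertainty(f, W_MU, W_SIGMA)) for f in frames]
associated = temporal_smooth(fused)
```

In the paper's pipeline the per-frame features would come from a 2D convolutional network and the associated features would go to a classification network; both are omitted here to keep the sketch focused on how repeated stochastic passes yield a mean and an uncertainty estimate.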

Key words: temporal correlation, frame associated short video event, convolutional neural network, Bayesian neural network, uncertainty

中图分类号: