Journal of Computer Applications, 2024, Vol. 44, Issue (9): 2903-2910. DOI: 10.11772/j.issn.1001-9081.2023091242
• Multimedia computing and computer simulation •
Yun LI, Fuyou WANG, Peiguang JING, Su WANG, Ao XIAO
Received: 2023-09-18
Revised: 2023-12-11
Accepted: 2023-12-12
Online: 2024-03-15
Published: 2024-09-10
Contact: Fuyou WANG
About author: LI Yun, born in 1978 in Nanning, Guangxi (female, Zhuang ethnicity), Ph. D., professor, CCF member. Her research interests include big data and artificial intelligence.
Yun LI, Fuyou WANG, Peiguang JING, Su WANG, Ao XIAO. Uncertainty-based frame associated short video event detection method[J]. Journal of Computer Applications, 2024, 44(9): 2903-2910.
URL: https://www.joca.cn/EN/10.11772/j.issn.1001-9081.2023091242
Frames | Model | AR | AP | ACC |
---|---|---|---|---|
4 | Model A | 0.533 | 0.540 | 0.554 |
4 | Model B | 0.654 | 0.662 | 0.674 |
8 | Model A | 0.573 | 0.588 | 0.602 |
8 | Model B | 0.701 | 0.712 | 0.720 |
12 | Model A | 0.598 | 0.602 | 0.610 |
12 | Model B | 0.738 | 0.752 | 0.760 |
16 | Model A | 0.608 | 0.622 | 0.615 |
16 | Model B | 0.763 | 0.782 | 0.798 |
20 | Model A | 0.610 | 0.633 | 0.622 |
20 | Model B | 0.763 | 0.782 | 0.798 |
Tab. 1 Influence of number of video frames on proposed model performance
Layers | AR | AP | ACC |
---|---|---|---|
1 | 0.714 | 0.721 | 0.732 |
2 | 0.742 | 0.756 | 0.755 |
3 | 0.763 | 0.782 | 0.798 |
4 | 0.732 | 0.747 | 0.750 |
5 | 0.716 | 0.712 | 0.707 |
6 | 0.691 | 0.685 | 0.686 |
Tab. 2 Influence of variable layering with different layers on proposed model performance
Step size | AR | AP | ACC |
---|---|---|---|
1 | 0.763 | 0.782 | 0.798 |
2 | 0.742 | 0.756 | 0.755 |
3 | 0.736 | 0.728 | 0.744 |
4 | 0.722 | 0.713 | 0.726 |
Tab. 3 Influence of frame association with different step sizes on proposed model performance
Method | AR | AP | ACC |
---|---|---|---|
I3D + residual module | 0.692 | 0.713 | 0.722 |
EASTERN + residual module | 0.744 | 0.770 | 0.778 |
Proposed method | 0.763 | 0.782 | 0.798 |
Tab. 4 Experimental results of residual module
Method | AR | AP | ACC |
---|---|---|---|
Self-Attention | 0.682 | 0.708 | 0.712 |
Multi-Scale Attention | 0.726 | 0.733 | 0.745 |
Proposed method | 0.763 | 0.782 | 0.798 |
Tab. 5 Experimental results of different attention mechanisms
Method | AR | AP | ACC |
---|---|---|---|
noB+U | 0.658 | 0.671 | 0.688 |
noB+T | 0.624 | 0.645 | 0.650 |
noU+T | 0.573 | 0.588 | 0.576 |
noBNN | 0.738 | 0.747 | 0.752 |
noUPM | 0.710 | 0.728 | 0.733 |
noTCM | 0.680 | 0.695 | 0.704 |
Proposed method | 0.763 | 0.782 | 0.798 |
Tab. 6 Results of ablation experiments
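The ablation figures are easier to compare as accuracy drops relative to the full model. The following is a minimal sketch, not part of the original article: it copies the Tab. 6 values verbatim, keeps the variant labels exactly as printed (their definitions are not given on this page), and uses illustrative names (`ablation`, `full`, `acc_drop`) that are assumptions of this example.

```python
# (AR, AP, ACC) per ablated variant, copied from Tab. 6.
ablation = {
    "noB+U": (0.658, 0.671, 0.688),
    "noB+T": (0.624, 0.645, 0.650),
    "noU+T": (0.573, 0.588, 0.576),
    "noBNN": (0.738, 0.747, 0.752),
    "noUPM": (0.710, 0.728, 0.733),
    "noTCM": (0.680, 0.695, 0.704),
}
# Full model ("Proposed method" row of Tab. 6).
full = (0.763, 0.782, 0.798)


def acc_drop(variant: str) -> float:
    """ACC lost by the ablated variant relative to the full model."""
    return round(full[2] - ablation[variant][2], 3)


drops = {name: acc_drop(name) for name in ablation}
# Variants ordered from largest to smallest ACC drop.
ranked = sorted(drops, key=drops.get, reverse=True)

for name in ranked:
    print(f"{name}: -{drops[name]:.3f} ACC")
```

Under this reading, the variant with the largest drop is the one whose removed components the reported accuracy depends on most; the script only reorders the published numbers and makes no claim beyond them.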
Type | Method | AR | AP | ACC |
---|---|---|---|---|
Subspace learning | SVM | 0.442 | 0.461 | 0.482 |
Subspace learning | SRRS | 0.487 | 0.491 | 0.501 |
Subspace learning | DTSL | 0.513 | 0.524 | 0.533 |
Deep learning | C3D | 0.552 | 0.571 | 0.582 |
Deep learning | GoogleNet | 0.617 | 0.651 | 0.661 |
Deep learning | ResNet3D | 0.632 | 0.671 | 0.672 |
Deep learning | I3D | 0.671 | 0.692 | 0.712 |
Deep learning | EASTERN | 0.738 | 0.760 | 0.765 |
Deep learning | SViTT | 0.744 | 0.766 | 0.772 |
Deep learning | PSN | 0.722 | 0.733 | 0.728 |
Deep learning | Proposed method | 0.763 | 0.782 | 0.798 |
Tab. 7 Comparison of short video event classification performance of different methods on Flickr dataset
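To read the comparison at a glance, one can compute the margin of the proposed method over the strongest baseline on each metric. This is an illustrative sketch only: the numbers are copied from Tab. 7, while the helper name `margin_over_best` and the data layout are assumptions made for this example.

```python
# (AR, AP, ACC) per baseline, copied from Tab. 7.
baselines = {
    "SVM": (0.442, 0.461, 0.482),
    "SRRS": (0.487, 0.491, 0.501),
    "DTSL": (0.513, 0.524, 0.533),
    "C3D": (0.552, 0.571, 0.582),
    "GoogleNet": (0.617, 0.651, 0.661),
    "ResNet3D": (0.632, 0.671, 0.672),
    "I3D": (0.671, 0.692, 0.712),
    "EASTERN": (0.738, 0.760, 0.765),
    "SViTT": (0.744, 0.766, 0.772),
    "PSN": (0.722, 0.733, 0.728),
}
proposed = (0.763, 0.782, 0.798)
metrics = ("AR", "AP", "ACC")


def margin_over_best(idx: int) -> tuple[str, float]:
    """Return the best baseline on metric `idx` and the proposed method's margin over it."""
    best_name = max(baselines, key=lambda name: baselines[name][idx])
    return best_name, round(proposed[idx] - baselines[best_name][idx], 3)


margins = {metric: margin_over_best(i) for i, metric in enumerate(metrics)}

for metric, (name, gap) in margins.items():
    print(f"{metric}: best baseline {name}, margin +{gap:.3f}")
```

With the table values as given, SViTT is the strongest baseline on all three metrics, so the margins are measured against it.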
Method | Test time/s | Floating-point operations/GFLOPs |
---|---|---|
I3D | 264 | 112.4 |
SViTT | 292 | 296.2 |
PSN | 206 | 82.3 |
Proposed method | 372 | 442.6 |
Tab. 8 Complexity experimental results of different methods
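The complexity figures can be set against the accuracy figures to see what the extra computation buys. The sketch below is a hedged illustration rather than the authors' analysis: test times and GFLOPs are copied from Tab. 8, ACC values from Tab. 7, and the function `compare` and its output keys are hypothetical names introduced here.

```python
# (test time in s, GFLOPs) copied from Tab. 8.
cost = {"I3D": (264, 112.4), "SViTT": (292, 296.2), "PSN": (206, 82.3)}
# ACC for the same methods, copied from Tab. 7.
acc = {"I3D": 0.712, "SViTT": 0.772, "PSN": 0.728}

proposed_cost = (372, 442.6)
proposed_acc = 0.798


def compare(name: str) -> dict[str, float]:
    """Cost of the proposed method relative to `name`, plus its ACC gain."""
    time_s, gflops = cost[name]
    return {
        "time_ratio": proposed_cost[0] / time_s,
        "gflops_ratio": proposed_cost[1] / gflops,
        "acc_gain": round(proposed_acc - acc[name], 3),
    }


report = {name: compare(name) for name in cost}

for name, row in report.items():
    print(
        f"{name}: {row['time_ratio']:.2f}x time, "
        f"{row['gflops_ratio']:.2f}x GFLOPs, +{row['acc_gain']:.3f} ACC"
    )
```

The ratios show that the proposed method has the highest test time and GFLOPs of the four methods listed, so the accuracy gains in Tab. 7 come at a higher computational cost.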