Clustering relational network for group activity recognition

doi:10.11772/j.issn.1001-9081.2020010019

Journal of Computer Applications, 2020, Vol. 40, Issue (9): 2507-2513. DOI: 10.11772/j.issn.1001-9081.2020010019

RONG Wei, JIANG Zheyuan, XIE Zhao, WU Kewei

College of Computer Science and Information Engineering, Hefei University of Technology, Hefei Anhui 230601, China

Received: 2020-01-14 Revised: 2020-04-17 Online: 2020-04-28 Published: 2020-09-10
Corresponding author: JIANG Zheyuan
About the authors: RONG Wei (born 1994), male, from Hefei, Anhui, M.S. candidate, CCF member; research interests: image processing, deep learning. JIANG Zheyuan (born 1965), male, from Chaohu, Anhui, associate research fellow, Ph.D., CCF senior member; research interests: software theory and intelligence, service-oriented software engineering. XIE Zhao (born 1980), male, from Hefei, Anhui, associate research fellow, Ph.D.; research interests: computer vision, image understanding. WU Kewei (born 1984), male, from Hefei, Anhui, associate research fellow, Ph.D.; research interests: computer vision, image understanding.
Supported by:
This work is partially supported by the Natural Science Foundation of Anhui Province (1808085MF168).

Abstract: Current group behavior recognition methods do not make full use of group relational information, which limits their recognition accuracy. To address this problem, a deep neural network model with a hierarchical relational module based on the Affinity Propagation (AP) algorithm was proposed, named Clustering Relational Network (CRN). First, a Convolutional Neural Network (CNN) was used to extract scene features, and regional feature aggregation was used to extract the features of the persons in the scene. Second, the AP-based hierarchical relational network module was used to extract group relational information. Finally, a Long Short-Term Memory (LSTM) network was used to fuse the individual feature sequences with the group relational information and produce the final group recognition result. Compared with the Multi-Stream Convolutional Neural Network (MSCNN), CRN improved the recognition accuracy by 5.39 and 3.33 percentage points on the Volleyball dataset and the Collective Activity dataset, respectively. Compared with the Confidence-Energy Recurrent Network (CERN), CRN improved the recognition accuracy by 8.70 and 3.14 percentage points on the Volleyball dataset and the Collective Activity dataset, respectively. Experimental results show that CRN achieves higher recognition accuracy in group behavior recognition tasks.

Key words: group behavior recognition, Clustering Relational Network (CRN), group relational information, Affinity Propagation (AP) algorithm, Long Short-Term Memory (LSTM) network
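
The abstract names Affinity Propagation (AP) as the clustering step behind the group relational module, but does not say how CRN builds similarities or consumes the resulting clusters. As an illustration of AP alone, the following is a minimal pure-Python sketch under stated assumptions: the 2-D `features` are made-up stand-ins for per-person feature vectors, similarity is taken to be negative squared Euclidean distance, and the exemplar preference is set to the median similarity (a common default, not something the abstract specifies). The function names `similarity_matrix` and `affinity_propagation` are hypothetical.

```python
from statistics import median


def similarity_matrix(features):
    """Negative squared Euclidean distance; diagonal = median off-diagonal value."""
    n = len(features)
    sim = [[-sum((a - b) ** 2 for a, b in zip(features[i], features[j]))
            for j in range(n)] for i in range(n)]
    preference = median(sim[i][j] for i in range(n) for j in range(n) if i != j)
    for k in range(n):
        sim[k][k] = preference  # assumed preference choice, not from the paper
    return sim


def affinity_propagation(sim, damping=0.8, iterations=400):
    """Return, for each item, the index of the exemplar it is assigned to."""
    n = len(sim)
    resp = [[0.0] * n for _ in range(n)]   # responsibilities r(i, k)
    avail = [[0.0] * n for _ in range(n)]  # availabilities a(i, k)

    for _ in range(iterations):
        # Responsibility: how well k suits i compared with the best rival.
        for i in range(n):
            scores = [avail[i][k] + sim[i][k] for k in range(n)]
            for k in range(n):
                rival = max(scores[j] for j in range(n) if j != k)
                resp[i][k] = (damping * resp[i][k]
                              + (1 - damping) * (sim[i][k] - rival))

        # Availability: accumulated positive support for k as an exemplar.
        for k in range(n):
            support = [max(0.0, resp[i][k]) for i in range(n)]
            total = sum(support) - support[k]  # support from items other than k
            for i in range(n):
                if i == k:
                    new = total
                else:
                    new = min(0.0, resp[k][k] + total - support[i])
                avail[i][k] = damping * avail[i][k] + (1 - damping) * new

    exemplars = [k for k in range(n) if resp[k][k] + avail[k][k] > 0]
    if not exemplars:  # fallback: every item is its own cluster
        return list(range(n))
    return [i if i in exemplars else max(exemplars, key=lambda k: sim[i][k])
            for i in range(n)]


# Toy stand-ins for person features: two well-separated subgroups.
features = [(0, 0), (1, 0), (2, 0), (10, 10), (11, 10), (12, 10)]
labels = affinity_propagation(similarity_matrix(features))
print(labels)  # one exemplar index per person
```

Persons that share an exemplar index form one subgroup. AP does not need the number of clusters in advance, which suits scenes where the number of interacting subgroups varies; the preference value controls how many exemplars emerge.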

CLC Number: TP391

RONG Wei, JIANG Zheyuan, XIE Zhao, WU Kewei. Clustering relational network for group activity recognition[J]. Journal of Computer Applications, 2020, 40(9): 2507-2513.
