Journal of Computer Applications ›› 2023, Vol. 43 ›› Issue (10): 3236-3243. DOI: 10.11772/j.issn.1001-9081.2022101473
Special Issue: Multimedia Computing and Computer Simulation
Suolan LIU1,2, Zhenzhen TIAN1, Hongyuan WANG1, Long LIN1, Yan WANG1
Received: 2022-10-11
Revised: 2022-12-29
Accepted: 2023-01-03
Online: 2023-04-12
Published: 2023-10-10
Contact: Hongyuan WANG
About author: LIU Suolan, born in 1980 in Taizhou, Jiangsu, Ph.D., associate professor, CCF member. Her research interests include computer vision and artificial intelligence.
Suolan LIU, Zhenzhen TIAN, Hongyuan WANG, Long LIN, Yan WANG. Human action recognition method based on multi-scale feature fusion of single mode[J]. Journal of Computer Applications, 2023, 43(10): 3236-3243.
URL: https://www.joca.cn/EN/10.11772/j.issn.1001-9081.2022101473
| Method | Accuracy/% | Parameters/10⁶ |
|---|---|---|
| RA-GCN(3s) | 87.3 | 6.21 |
| Shift-GCN(1s) | 87.8 | 0.72 |
| ST-TR(1s) | 88.7 | 6.48 |
| DGNN(2s) | 89.9 | 26.20 |
| PL-GCN | 89.2 | 20.70 |
| PB-GCN | 87.5 | 3.55 |
| Proposed method | 89.0 | 4.10 |
Tab. 1 Accuracy comparison of different methods on NTU RGB+D60 (X-sub protocol)
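The accuracy-versus-size trade-off in Tab. 1 is easier to judge when accuracy is normalized by parameter count. The short script below is not part of the paper; it is only an illustrative calculation over the figures reported in Tab. 1 (accuracy in %, parameters in millions).

```python
# Illustrative only: accuracy-vs-parameter trade-off computed from the Tab. 1 figures.
# Each entry: (method, top-1 accuracy in %, parameter count in millions).
results = [
    ("RA-GCN(3s)",      87.3,  6.21),
    ("Shift-GCN(1s)",   87.8,  0.72),
    ("ST-TR(1s)",       88.7,  6.48),
    ("DGNN(2s)",        89.9, 26.20),
    ("PL-GCN",          89.2, 20.70),
    ("PB-GCN",          87.5,  3.55),
    ("Proposed method", 89.0,  4.10),
]

# Sort by accuracy per million parameters (a crude efficiency proxy).
for name, acc, params in sorted(results, key=lambda r: r[1] / r[2], reverse=True):
    print(f"{name:16s} acc={acc:.1f}% params={params:.2f}M acc/params={acc / params:.1f}")
```

By this crude ratio Shift-GCN(1s) is by far the smallest model, while the proposed method reaches accuracy within 0.9 percentage points of DGNN(2s) with roughly one sixth of its parameters.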
| Features | Method | X-sub accuracy/% | X-view accuracy/% |
|---|---|---|---|
| Single feature | ST-GCN | 81.5 | 88.3 |
| | Global feature graph | 86.7 | 93.1 |
| | 3-subgraph | 86.8 | 93.3 |
| | 4-subgraph | 87.4 | 93.7 |
| | 5-subgraph | 86.9 | 93.4 |
| | 6-subgraph | 87.0 | 93.2 |
| Multi-feature fusion | Global feature graph + 3-subgraph | 88.8 | 94.2 |
| | Global feature graph + 4-subgraph | 89.0 | 94.2 |
| | Global feature graph + 5-subgraph | 88.2 | 94.1 |
| | Global feature graph + 6-subgraph | 88.7 | 93.6 |
Tab. 2 Results of ablation experiments on NTU RGB+D60 dataset
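The ablation in Tab. 2 shows that fusing the global feature graph with a k-subgraph feature outperforms either feature alone, with the 4-subgraph combination performing best. As a reading aid only, the sketch below shows one common way such a two-branch combination can be realized, namely score-level (late) fusion; the function names, the fusion weight alpha, and the toy inputs are assumptions for illustration and should not be taken as the authors' implementation.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax over the class dimension.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def fuse_scores(global_logits, subgraph_logits, alpha=0.5):
    """Score-level fusion of two branches (hypothetical, for illustration).

    global_logits, subgraph_logits: arrays of shape (batch, num_classes).
    alpha: weight on the global-feature-graph branch.
    """
    return alpha * softmax(global_logits) + (1.0 - alpha) * softmax(subgraph_logits)

# Toy usage: 2 samples, 60 action classes (as in NTU RGB+D 60).
rng = np.random.default_rng(0)
g = rng.normal(size=(2, 60))   # stand-in for the global-feature-graph branch output
s = rng.normal(size=(2, 60))   # stand-in for the 4-subgraph branch output
print(fuse_scores(g, s).argmax(axis=1))   # fused class predictions
```

Score-level averaging is only one option; concatenating the two feature maps before the classifier is an equally common design for this kind of fusion.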
| Method | X-sub accuracy/% | X-view accuracy/% |
|---|---|---|
| ST-GCN | 81.5 | 88.3 |
| PB-GCN | 87.5 | 93.2 |
| SAN | 87.2 | 92.7 |
| SGN | 89.0 | 94.5 |
| PGCN-TCA | 88.0 | 93.6 |
| ST-TR(1s) | 88.7 | 95.6 |
| RA-GCN(3s) | 87.3 | 93.6 |
| MST-GCN(1s) | 89.0 | 95.1 |
| Shift-GCN(1s) | 87.8 | 95.1 |
| SkeleMixCLR(3s) | 87.7 | 94.0 |
| Proposed method | 89.0 | 94.2 |
Tab. 3 Recognition accuracies of different methods on NTU RGB+D60 dataset
| Method | X-sub accuracy/% | X-setup accuracy/% |
|---|---|---|
| GVFE+AS-GCN with DH-TCN | 78.3 | 79.8 |
| Gimme Signals | 70.8 | 71.6 |
| SkeleMixCLR(3s) | 82.0 | 82.9 |
| Shift-GCN(1s) | 80.9 | 83.2 |
| MST-GCN(1s) | 82.8 | 84.5 |
| RA-GCN(3s) | 81.1 | 82.7 |
| ST-TR(1s) | 81.9 | 84.1 |
| SGN | 79.2 | 81.5 |
| Proposed method | 83.3 | 85.0 |
Tab. 4 Recognition accuracies of different methods on NTU RGB+D120 dataset
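To read Tab. 4 at a glance, the snippet below (again, illustrative arithmetic only, not from the paper) prints the margin of the proposed method over each listed baseline under both NTU RGB+D 120 protocols.

```python
# Illustrative only: margins of the proposed method over each baseline,
# computed from the Tab. 4 figures (NTU RGB+D 120, accuracy in %).
baselines = {
    "GVFE+AS-GCN with DH-TCN": (78.3, 79.8),
    "Gimme Signals":           (70.8, 71.6),
    "SkeleMixCLR(3s)":         (82.0, 82.9),
    "Shift-GCN(1s)":           (80.9, 83.2),
    "MST-GCN(1s)":             (82.8, 84.5),
    "RA-GCN(3s)":              (81.1, 82.7),
    "ST-TR(1s)":               (81.9, 84.1),
    "SGN":                     (79.2, 81.5),
}
proposed = (83.3, 85.0)  # (X-sub, X-setup)

for name, (xsub, xsetup) in baselines.items():
    print(f"{name:26s} dX-sub={proposed[0] - xsub:+.1f} dX-setup={proposed[1] - xsetup:+.1f}")
```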
| 1 | SI C, CHEN W, WANG W, et al. An attention enhanced graph convolutional LSTM network for skeleton-based action recognition[C]// Proceedings of the 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2019: 1227-1236. 10.1109/cvpr.2019.00132 | 
| 2 | van den OORD A, KALCHBRENNER N, KAVUKCUOGLU K. Pixel recurrent neural networks[C]// Proceedings of the 33rd International Conference on Machine Learning. New York: JMLR.org, 2016: 1747-1756. | 
| 3 | DEFFERRARD M, BRESSON X, VANDERGHEYNST P. Convolutional neural networks on graphs with fast localized spectral filtering[C]// Proceedings of the 30th International Conference on Neural Information Processing Systems. Red Hook, NY: Curran Associates Inc., 2016: 3844-3852. | 
| 4 | YANG H, YAN D, ZHANG L, et al. Feedback graph convolutional network for skeleton-based action recognition[J]. IEEE Transactions on Image Processing, 2022, 31: 164-175. 10.1109/tip.2021.3129117 | 
| 5 | YAN S, XIONG Y, LIN D. Spatial temporal graph convolutional networks for skeleton-based action recognition[C]// Proceedings of the 32nd AAAI Conference on Artificial Intelligence. Palo Alto, CA: AAAI Press, 2018: 7444-7452. 10.1609/aaai.v32i1.12328 | 
| 6 | SHI L, ZHANG Y, CHENG J, et al. Decoupled spatial-temporal attention network for skeleton-based action recognition[C]// Proceedings of the 2020 Asian Conference on Computer Vision, LNCS 12626. Cham: Springer, 2021: 38-53. | 
| 7 | CHEN Y, ZHANG Z, YUAN C, et al. Channel-wise topology refinement graph convolution for skeleton-based action recognition[C]// Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision. Piscataway: IEEE, 2021: 13339-13348. 10.1109/iccv48922.2021.01311 | 
| 8 | LI C, CUI Z, ZHENG W, et al. Action-attending graphic neural network[J]. IEEE Transactions on Image Processing, 2018, 27(7): 3657-3670. 10.1109/tip.2018.2815744 | 
| 9 | PENG W, HONG X, CHEN H, et al. Learning graph convolutional network for skeleton-based human action recognition by neural searching[C]// Proceedings of the 34th AAAI Conference on Artificial Intelligence. Palo Alto, CA: AAAI Press, 2020: 2669-2676. 10.1609/aaai.v34i03.5652 | 
| 10 | ZHAO R, WANG K, SU H, et al. Bayesian graph convolution LSTM for skeleton based action recognition[C]// Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision. Piscataway: IEEE, 2019: 6882-6892. 10.1109/iccv.2019.00698 | 
| 11 | GAO J, HE T, ZHOU X, et al. Focusing and diffusion: bidirectional attentive graph convolutional networks for skeleton-based action recognition[EB/OL]. (2019-12-24) [2022-08-13]. 10.1109/lsp.2021.3116513 | 
| 12 | KIPF T N, WELLING M. Semi-supervised classification with graph convolutional networks[EB/OL]. (2017-02-22) [2022-09-10]. 10.48550/arXiv.1609.02907 | 
| 13 | LIU Z, ZHANG H, CHEN Z, et al. Disentangling and unifying graph convolutions for skeleton-based action recognition[C]// Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2020: 143-152. 10.1109/cvpr42600.2020.00022 | 
| 14 | CHENG K, ZHANG Y, HE X, et al. Skeleton-based action recognition with shift graph convolutional network[C]// Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2020: 180-189. 10.1109/cvpr42600.2020.00026 | 
| 15 | SONG Y F, ZHANG Z, SHAN C, et al. Richly activated graph convolutional network for robust skeleton-based action recognition[J]. IEEE Transactions on Circuits and Systems for Video Technology, 2021, 31(5): 1915-1925. 10.1109/tcsvt.2020.3015051 | 
| 16 | CHO S, MAQBOOL M H, LIU F, et al. Self-attention network for skeleton-based human action recognition[C]// Proceedings of the 2020 IEEE Winter Conference on Applications of Computer Vision. Piscataway: IEEE, 2020: 624-633. 10.1109/wacv45572.2020.9093639 | 
| 17 | YU W, YANG K, YAO H, et al. Exploiting the complementary strengths of multi-layer CNN features for image retrieval[J]. Neurocomputing, 2017, 237: 235-241. 10.1016/j.neucom.2016.12.002 | 
| 18 | LIU W B, ZOU Z Y, XING W W. Feature fusion method in pattern classification[J]. Journal of Beijing University of Posts and Telecommunications, 2017, 40(4): 1-8. (in Chinese) | 
| 19 | SHI L, ZHANG Y, CHENG J, et al. Two-stream adaptive graph convolutional networks for skeleton-based action recognition[C]// Proceedings of the 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2019: 12018-12027. 10.1109/cvpr.2019.01230 | 
| 20 | CHEN Y, ROHRBACH M, YAN Z, et al. Graph-based global reasoning networks[C]// Proceedings of the 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2019: 433-442. 10.1109/cvpr.2019.00052 | 
| 21 | SHAHROUDY A, LIU J, NG T T, et al. NTU RGB+D: a large scale dataset for 3D human activity analysis[C]// Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2016: 1010-1019. 10.1109/cvpr.2016.115 | 
| 22 | LIU J, SHAHROUDY A, PEREZ M, et al. NTU RGB+D 120: a large-scale benchmark for 3D human activity understanding[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2020, 42(10): 2684-2701. 10.1109/tpami.2019.2916873 | 
| 23 | PASZKE A, GROSS S, CHINTALA S, et al. Automatic differentiation in PyTorch[EB/OL]. (2017-10-29) [2020-12-01]. | 
| 24 | HUANG L, HUANG Y, OUYANG W, et al. Part-level graph convolutional network for skeleton-based action recognition[C]// Proceedings of the 34th AAAI Conference on Artificial Intelligence. Palo Alto, CA: AAAI Press, 2020: 11045-11052. 10.1609/aaai.v34i07.6759 | 
| 25 | SHI L, ZHANG Y, CHENG J, et al. Skeleton-based action recognition with directed graph neural networks[C]// Proceedings of the 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2019: 7904-7913. 10.1109/cvpr.2019.00810 | 
| 26 | THAKKAR K, NARAYANAN P J. Part-based graph convolutional network for action recognition[EB/OL]. (2018-09-13) [2022-08-13]. | 
| 27 | YANG H, GU Y, ZHU J, et al. PGCN-TCA: pseudo graph convolutional network with temporal and channel-wise attention for skeleton-based action recognition[J]. IEEE Access, 2020, 8: 10040-10047. 10.1109/access.2020.2964115 | 
| 28 | PLIZZARI C, CANNICI M, MATTEUCCI M. Skeleton-based action recognition via spatial and temporal transformer networks[J]. Computer Vision and Image Understanding, 2021, 208/209: No.103219. 10.1016/j.cviu.2021.103219 | 
| 29 | ZHANG P, LAN C, ZENG W, et al. Semantics-guided neural networks for efficient skeleton-based human action recognition[C]// Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2020: 1109-1118. 10.1109/cvpr42600.2020.00119 | 
| 30 | CHEN Z, LIU H, GUO T, et al. Contrastive learning from spatio-temporal mixed skeleton sequences for self-supervised skeleton-based action recognition[EB/OL]. (2022-07-07) [2022-10-23]. | 
| 31 | CHEN Z, LI S, YANG B, et al. Multi-scale spatial temporal graph convolutional network for skeleton-based action recognition[C]// Proceedings of the 35th AAAI Conference on Artificial Intelligence. Palo Alto, CA: AAAI Press, 2021: 1113-1122. 10.1609/aaai.v35i2.16197 | 
| 32 | PAPADOPOULOS K, GHORBEL E, AOUADA D, et al. Vertex feature encoding and hierarchical temporal modeling in a spatial-temporal graph convolutional network for action recognition[C]// Proceedings of the 25th International Conference on Pattern Recognition. Piscataway: IEEE, 2021: 452-458. 10.1109/icpr48806.2021.9413189 | 
| 33 | MEMMESHEIMER R, THEISEN N, PAULUS D. Gimme signals: discriminative signal encoding for multimodal activity recognition[C]// Proceedings of the 2020 IEEE/RSJ International Conference on Intelligent Robots and Systems. Piscataway: IEEE, 2020: 10394-10401. 10.1109/iros45743.2020.9341699 | 