Journal of Computer Applications ›› 2023, Vol. 43 ›› Issue (10): 3236-3243. DOI: 10.11772/j.issn.1001-9081.2022101473
Special Issue: Multimedia computing and computer simulation

Human action recognition method based on multi-scale feature fusion of single mode
Suolan LIU1,2, Zhenzhen TIAN1, Hongyuan WANG1, Long LIN1, Yan WANG1
Received: 2022-10-11
Revised: 2022-12-29
Accepted: 2023-01-03
Online: 2023-04-12
Published: 2023-10-10
Contact: Hongyuan WANG
About author: LIU Suolan, born in 1980 in Taizhou, Jiangsu, Ph. D., associate professor, CCF member. Her research interests include computer vision and artificial intelligence.
Suolan LIU, Zhenzhen TIAN, Hongyuan WANG, Long LIN, Yan WANG. Human action recognition method based on multi-scale feature fusion of single mode[J]. Journal of Computer Applications, 2023, 43(10): 3236-3243.
URL: https://www.joca.cn/EN/10.11772/j.issn.1001-9081.2022101473
Tab. 1 Accuracy comparison of different methods on NTU RGB+D60 (X-sub protocol)
| Method | Accuracy/% | Parameters/10⁶ |
| --- | --- | --- |
| RA-GCN(3s) | 87.3 | 6.21 |
| Shift-GCN(1s) | 87.8 | 0.72 |
| ST-TR(1s) | 88.7 | 6.48 |
| DGNN(2s) | 89.9 | 26.20 |
| PL-GCN | 89.2 | 20.70 |
| PB-GCN | 87.5 | 3.55 |
| Proposed method | 89.0 | 4.10 |
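The Parameters/10⁶ column reports model size in millions of trainable parameters. For readers reproducing these figures, the sketch below shows the conventional way to obtain such a count for a PyTorch model (the experiments are implemented in PyTorch per the cited framework); the stand-in network here is purely illustrative and is not the paper's released code.

```python
import torch.nn as nn

def count_parameters_millions(model: nn.Module) -> float:
    """Number of trainable parameters, in units of 10^6 (as in the table)."""
    return sum(p.numel() for p in model.parameters() if p.requires_grad) / 1e6

# Illustrative stand-in; substitute the actual skeleton-based recognition network.
toy = nn.Sequential(nn.Linear(75, 256), nn.ReLU(), nn.Linear(256, 60))
print(f"{count_parameters_millions(toy):.2f}M parameters")
```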
Tab. 2 Results of ablation experiments on NTU RGB+D60 dataset
| Feature setting | Method | X-sub/% | X-view/% |
| --- | --- | --- | --- |
| Single feature | ST-GCN | 81.5 | 88.3 |
| | Global feature graph | 86.7 | 93.1 |
| | 3 subgraphs | 86.8 | 93.3 |
| | 4 subgraphs | 87.4 | 93.7 |
| | 5 subgraphs | 86.9 | 93.4 |
| | 6 subgraphs | 87.0 | 93.2 |
| Multi-feature fusion | Global feature graph + 3 subgraphs | 88.8 | 94.2 |
| | Global feature graph + 4 subgraphs | 89.0 | 94.2 |
| | Global feature graph + 5 subgraphs | 88.2 | 94.1 |
| | Global feature graph + 6 subgraphs | 88.7 | 93.6 |
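The ablation rows combine a global feature graph with a fixed number of skeleton subgraphs, and the global graph plus 4 subgraphs performs best. As a minimal PyTorch-style sketch of this kind of global-plus-subgraph fusion, the example below assumes a placeholder branch backbone, a hypothetical 4-way joint partition, and score-level fusion by summation; it illustrates the idea only and does not reproduce the paper's exact architecture.

```python
import torch
import torch.nn as nn

class BranchGCN(nn.Module):
    """Stand-in for a spatial-temporal GCN branch; replace with the real backbone."""
    def __init__(self, in_channels: int, num_classes: int):
        super().__init__()
        self.embed = nn.Linear(in_channels, 64)
        self.classifier = nn.Linear(64, num_classes)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, frames, joints, channels) skeleton sequence for this branch
        feat = self.embed(x).mean(dim=(1, 2))  # pool over time and joints
        return self.classifier(feat)           # per-class scores

class GlobalPlusSubgraphFusion(nn.Module):
    """Fuse a global-graph branch with part-subgraph branches by score summation."""
    def __init__(self, joint_groups, in_channels=3, num_classes=60):
        super().__init__()
        self.joint_groups = joint_groups       # e.g. 4 subgraphs of body joints
        self.global_branch = BranchGCN(in_channels, num_classes)
        self.part_branches = nn.ModuleList(
            [BranchGCN(in_channels, num_classes) for _ in joint_groups]
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        scores = self.global_branch(x)
        for branch, joints in zip(self.part_branches, self.joint_groups):
            scores = scores + branch(x[:, :, joints, :])
        return scores

# Hypothetical 4-way partition of the 25 NTU RGB+D joints (indices are illustrative).
groups = [list(range(0, 7)), list(range(7, 13)), list(range(13, 19)), list(range(19, 25))]
model = GlobalPlusSubgraphFusion(groups)
out = model(torch.randn(2, 64, 25, 3))         # (batch=2, 64 frames, 25 joints, xyz)
print(out.shape)                               # torch.Size([2, 60])
```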
Tab. 3 Recognition accuracies of different methods on NTU RGB+D60 dataset
| Method | X-sub/% | X-view/% |
| --- | --- | --- |
| ST-GCN | 81.5 | 88.3 |
| PB-GCN | 87.5 | 93.2 |
| SAN | 87.2 | 92.7 |
| SGN | 89.0 | 94.5 |
| PGCN-TCA | 88.0 | 93.6 |
| ST-TR(1s) | 88.7 | 95.6 |
| RA-GCN(3s) | 87.3 | 93.6 |
| MST-GCN(1s) | 89.0 | 95.1 |
| Shift-GCN(1s) | 87.8 | 95.1 |
| SkeleMixCLR(3s) | 87.7 | 94.0 |
| Proposed method | 89.0 | 94.2 |
Tab. 4 Recognition accuracies of different methods on NTU RGB+D120 dataset
| Method | X-sub/% | X-setup/% |
| --- | --- | --- |
| GVFE+AS-GCN with DH-TCN | 78.3 | 79.8 |
| Gimme Signals | 70.8 | 71.6 |
| SkeleMixCLR(3s) | 82.0 | 82.9 |
| Shift-GCN(1s) | 80.9 | 83.2 |
| MST-GCN(1s) | 82.8 | 84.5 |
| RA-GCN(3s) | 81.1 | 82.7 |
| ST-TR(1s) | 81.9 | 84.1 |
| SGN | 79.2 | 81.5 |
| Proposed method | 83.3 | 85.0 |
1 | SI C, CHEN W, WANG W, et al. An attention enhanced graph convolutional LSTM network for skeleton-based action recognition[C]// Proceedings of the 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2019: 1227-1236. 10.1109/cvpr.2019.00132 |
2 | van den OORD A, KALCHBRENNER N, KAVUKCUOGLU K. Pixel recurrent neural networks[C]// Proceedings of the 33rd International Conference on Machine Learning. New York: JMLR.org, 2016: 1747-1756. |
3 | DEFFERRARD M, BRESSON X, VANDERGHEYNST P. Convolutional neural networks on graphs with fast localized spectral filtering[C]// Proceedings of the 30th International Conference on Neural Information Processing Systems. Red Hook, NY: Curran Associates Inc., 2016: 3844-3852. |
4 | YANG H, YAN D, ZHANG L, et al. Feedback graph convolutional network for skeleton-based action recognition[J]. IEEE Transactions on Image Processing, 2022, 31: 164-175. 10.1109/tip.2021.3129117 |
5 | YAN S, XIONG Y, LIN D. Spatial temporal graph convolutional networks for skeleton-based action recognition[C]// Proceedings of the 32nd AAAI Conference on Artificial Intelligence. Palo Alto, CA: AAAI Press, 2018: 7444-7452. 10.1609/aaai.v32i1.12328 |
6 | SHI L, ZHANG Y, CHENG J, et al. Decoupled spatial-temporal attention network for skeleton-based action recognition[C]// Proceedings of the 2020 Asian Conference on Computer Vision, LNCS 12626. Cham: Springer, 2021: 38-53. |
7 | CHEN Y, ZHANG Z, YUAN C, et al. Channel-wise topology refinement graph convolution for skeleton-based action recognition[C]// Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision. Piscataway: IEEE, 2021: 13339-13348. 10.1109/iccv48922.2021.01311 |
8 | LI C, CUI Z, ZHENG W, et al. Action-attending graphic neural network[J]. IEEE Transactions on Image Processing, 2018, 27(7): 3657-3670. 10.1109/tip.2018.2815744 |
9 | PENG W, HONG X, CHEN H, et al. Learning graph convolutional network for skeleton-based human action recognition by neural searching[C]// Proceedings of the 34th AAAI Conference on Artificial Intelligence. Palo Alto, CA: AAAI Press, 2020: 2669-2676. 10.1609/aaai.v34i03.5652 |
10 | ZHAO R, WANG K, SU H, et al. Bayesian graph convolution LSTM for skeleton based action recognition[C]// Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision. Piscataway: IEEE, 2019: 6882-6892. 10.1109/iccv.2019.00698 |
11 | GAO J, HE T, ZHOU X, et al. Focusing and diffusion: bidirectional attentive graph convolutional networks for skeleton-based action recognition[EB/OL]. (2019-12-24) [2022-08-13]. 10.1109/lsp.2021.3116513 |
12 | KIPF T N, WELLING M. Semi-supervised classification with graph convolutional networks[EB/OL]. (2017-02-22) [2022-09-10]. 10.48550/arXiv.1609.02907 |
13 | LIU Z, ZHANG H, CHEN Z, et al. Disentangling and unifying graph convolutions for skeleton-based action recognition[C]// Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2020: 143-152. 10.1109/cvpr42600.2020.00022 |
14 | CHENG K, ZHANG Y, HE X, et al. Skeleton-based action recognition with shift graph convolutional network[C]// Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2020: 180-189. 10.1109/cvpr42600.2020.00026 |
15 | SONG Y F, ZHANG Z, SHAN C, et al. Richly activated graph convolutional network for robust skeleton-based action recognition[J]. IEEE Transactions on Circuits and Systems for Video Technology, 2021, 31(5): 1915-1925. 10.1109/tcsvt.2020.3015051 |
16 | CHO S, MAQBOOL M H, LIU F, et al. Self-attention network for skeleton-based human action recognition[C]// Proceedings of the 2020 IEEE Winter Conference on Applications of Computer Vision. Piscataway: IEEE, 2020: 624-633. 10.1109/wacv45572.2020.9093639 |
17 | YU W, YANG K, YAO H, et al. Exploiting the complementary strengths of multi-layer CNN features for image retrieval[J]. Neurocomputing, 2017, 237: 235-241. 10.1016/j.neucom.2016.12.002 |
18 | LIU W B, ZOU Z Y, XING W W. Feature fusion method in pattern classification[J]. Journal of Beijing University of Posts and Telecommunications, 2017, 40(4): 1-8. |
19 | SHI L, ZHANG Y, CHENG J, et al. Two-stream adaptive graph convolutional networks for skeleton-based action recognition[C]// Proceedings of the 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2019: 12018-12027. 10.1109/cvpr.2019.01230 |
20 | CHEN Y, ROHRBACH M, YAN Z, et al. Graph-based global reasoning networks[C]// Proceedings of the 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2019: 433-442. 10.1109/cvpr.2019.00052 |
21 | SHAHROUDY A, LIU J, NG T T, et al. NTU RGB+D: a large scale dataset for 3D human activity analysis[C]// Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2016: 1010-1019. 10.1109/cvpr.2016.115 |
22 | LIU J, SHAHROUDY A, PEREZ M, et al. NTU RGB+D 120: a large-scale benchmark for 3D human activity understanding[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2020, 42(10): 2684-2701. 10.1109/tpami.2019.2916873 |
23 | PASZKE A, GROSS S, CHINTALA S, et al. Automatic differentiation in PyTorch[EB/OL]. (2017-10-29) [2020-12-01]. |
24 | HUANG L, HUANG Y, OUYANG W, et al. Part-level graph convolutional network for skeleton-based action recognition[C]// Proceedings of the 34th AAAI Conference on Artificial Intelligence. Palo Alto, CA: AAAI Press, 2020: 11045-11052. 10.1609/aaai.v34i07.6759 |
25 | SHI L, ZHANG Y, CHENG J, et al. Skeleton-based action recognition with directed graph neural networks[C]// Proceedings of the 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2019: 7904-7913. 10.1109/cvpr.2019.00810 |
26 | THAKKAR K, NARAYANAN P J. Part-based graph convolutional network for action recognition[EB/OL]. (2018-09-13) [2022-08-13]. |
27 | YANG H, GU Y, ZHU J, et al. PGCN-TCA: pseudo graph convolutional network with temporal and channel-wise attention for skeleton-based action recognition[J]. IEEE Access, 2020, 8: 10040-10047. 10.1109/access.2020.2964115 |
28 | PLIZZARI C, CANNICI M, MATTEUCCI M. Skeleton-based action recognition via spatial and temporal transformer networks[J]. Computer Vision and Image Understanding, 2021, 208/209: No.103219. 10.1016/j.cviu.2021.103219 |
29 | ZHANG P, LAN C, ZENG W, et al. Semantics-guided neural networks for efficient skeleton-based human action recognition[C]// Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2020: 1109-1118. 10.1109/cvpr42600.2020.00119 |
30 | CHEN Z, LIU H, GUO T, et al. Contrastive learning from spatio-temporal mixed skeleton sequences for self-supervised skeleton-based action recognition[EB/OL]. (2022-07-07) [2022-10-23]. |
31 | CHEN Z, LI S, YANG B, et al. Multi-scale spatial temporal graph convolutional network for skeleton-based action recognition[C]// Proceedings of the 35th AAAI Conference on Artificial Intelligence. Palo Alto, CA: AAAI Press, 2021: 1113-1122. 10.1609/aaai.v35i2.16197 |
32 | PAPADOPOULOS K, GHORBEL E, AOUADA D, et al. Vertex feature encoding and hierarchical temporal modeling in a spatial-temporal graph convolutional network for action recognition[C]// Proceedings of the 25th International Conference on Pattern Recognition. Piscataway: IEEE, 2021: 452-458. 10.1109/icpr48806.2021.9413189 |
33 | MEMMESHEIMER R, THEISEN N, PAULUS D. Gimme signals: discriminative signal encoding for multimodal activity recognition[C]// Proceedings of the 2020 IEEE/RSJ International Conference on Intelligent Robots and Systems. Piscataway: IEEE, 2020: 10394-10401. 10.1109/iros45743.2020.9341699 |