Journal of Computer Applications ›› 2023, Vol. 43 ›› Issue (8): 2581-2587. DOI: 10.11772/j.issn.1001-9081.2022071105
Special topic: Multimedia computing and computer simulation
Doudou LI, Wanggen LI, Yichun XIA, Yang SHU, Kun GAO
Received: 2022-07-29
Revised: 2022-11-18
Accepted: 2022-11-30
Online: 2023-01-15
Published: 2023-08-10
Contact: Wanggen LI
About author: LI Doudou, born in 1996, M. S. candidate. His research interests include deep learning and skeleton-based action recognition.

Abstract:
Current skeleton-based action recognition still suffers from unreasonable data preprocessing, large model parameter counts, and low recognition accuracy. To address these problems, a skeleton-based action recognition method based on feature interaction and adaptive fusion, AFFGCN, was proposed. First, an adaptive pooling data-preprocessing algorithm was proposed to deal with the uneven distribution of data frames and their poor representativeness. Second, a multi-information feature-interaction method was introduced to mine deeper features and improve model performance. Finally, an Adaptive Feature Fusion (AFF) module was proposed for graph-convolution feature fusion to further improve model performance. Experimental results show that the proposed method outperforms the baseline Lightweight Multi-Information Graph Convolutional Neural Network (LMI-GCN) by 1.2 percentage points under both the Cross-Subject (CS) and Cross-View (CV) evaluation settings on the NTU-RGB+D 60 dataset, and by 1.5 and 1.4 percentage points under the CS and Cross-Setup (SS) evaluation settings, respectively, on the NTU-RGB+D 120 dataset. Results on single-stream and multi-stream networks show that, compared with current mainstream skeleton-based action recognition methods such as the Semantics-Guided Neural network (SGN), the proposed method has fewer parameters and higher accuracy, giving it a clear performance advantage and making it better suited to deployment on mobile devices.
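The AFF idea in the abstract — letting the network decide, per sample, how much of each graph-convolution branch to keep — can be illustrated with a short PyTorch sketch. This is a minimal reconstruction under assumptions: the class name `AdaptiveFeatureFusion`, the squeeze-style gate, and the (N, C, T, V) tensor layout are illustrative, not the paper's released implementation.

```python
import torch
import torch.nn as nn

class AdaptiveFeatureFusion(nn.Module):
    """Hypothetical sketch: fuse two equal-shape GCN feature streams
    with learned, input-dependent branch weights."""

    def __init__(self, channels: int, reduction: int = 4):
        super().__init__()
        hidden = max(channels // reduction, 8)
        # Gate that maps pooled global context to one weight per branch
        self.gate = nn.Sequential(
            nn.Linear(channels, hidden),
            nn.ReLU(inplace=True),
            nn.Linear(hidden, 2),
        )

    def forward(self, a: torch.Tensor, b: torch.Tensor) -> torch.Tensor:
        # a, b: (N, C, T, V) outputs of two graph-convolution branches
        context = (a + b).mean(dim=(2, 3))            # (N, C) global pooling
        w = torch.softmax(self.gate(context), dim=1)  # (N, 2) fusion weights
        return w[:, 0, None, None, None] * a + w[:, 1, None, None, None] * b
```

A module like this would sit wherever two branch outputs of the same shape meet, e.g. `fused = aff(joint_features, bone_features)`.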
Doudou LI, Wanggen LI, Yichun XIA, Yang SHU, Kun GAO. Skeleton-based action recognition based on feature interaction and adaptive fusion[J]. Journal of Computer Applications, 2023, 43(8): 2581-2587.
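As background for the "AD" (adaptive pooling) rows in the ablation tables below: the abstract describes drawing frames evenly from a variable-length skeleton sequence so that every part of the action is represented. A minimal NumPy sketch of that idea — the function name, segment count, and mean-pooling choice are assumptions for illustration, not the paper's exact algorithm:

```python
import numpy as np

def adaptive_pool_frames(frames: np.ndarray, num_out: int = 20) -> np.ndarray:
    """Sketch: reduce a (T, V, C) skeleton sequence to num_out frames by
    averaging within num_out evenly spread segments, so every part of the
    action contributes a representative frame."""
    t = frames.shape[0]
    bounds = np.linspace(0, t, num_out + 1, dtype=int)  # segment boundaries
    pooled = [frames[s:e].mean(axis=0) if e > s else frames[min(s, t - 1)]
              for s, e in zip(bounds[:-1], bounds[1:])]
    return np.stack(pooled)
```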
Tab. 1 Comparison of using and not using group shuffle convolution

| Method | CS/% | CV/% | Parameters/10⁶ |
| --- | --- | --- | --- |
| AFFGCN | 91.0 | 95.7 | 0.730 |
| AFFGCN* | 90.8 | 95.6 | 0.503 |
Tab. 2 Verification of the effectiveness of the three proposed methods

| Method | CS/% | CV/% | Parameters/10⁶ |
| --- | --- | --- | --- |
| LMI-GCN* | 89.6 | 94.4 | 0.376 |
| AD | 89.8 | 94.7 | 0.376 |
| MI | 89.9 | 94.7 | 0.385 |
| AF | 90.0 | 94.8 | 0.494 |
| AD+MI | 90.3 | 95.0 | 0.385 |
| AD+MI+AF | 90.8 | 95.6 | 0.503 |
Tab. 3 Comparison of multi-information experiments

| Method | CS/% | CV/% | Parameters/10⁶ |
| --- | --- | --- | --- |
| P+B | 88.5 | 94.2 | 0.485 |
| P+B+P'+B' | 89.8 | 94.9 | 0.494 |
| AM | 90.2 | 95.1 | 0.503 |
| I | 90.3 | 95.3 | 0.503 |
| AMI | 90.8 | 95.6 | 0.503 |
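The P, B, P' and B' streams in Tab. 3 follow the usual multi-stream convention in skeleton-based recognition: joint positions, bones (child joint minus parent joint), and their frame-to-frame motion. A minimal sketch under assumptions — the (T, V, C) layout is NTU-style, and the bone list here is a tiny illustrative subset, not the full 25-joint topology:

```python
import numpy as np

# Illustrative (child, parent) joint pairs; a real model uses the full skeleton.
BONE_PAIRS = [(1, 0), (2, 1), (3, 2)]

def multi_information(pos: np.ndarray):
    """From joint positions P of shape (T, V, C), derive the bone stream B
    and the motion streams P', B' used as extra inputs in Tab. 3."""
    bone = np.zeros_like(pos)
    for child, parent in BONE_PAIRS:
        bone[:, child] = pos[:, child] - pos[:, parent]
    vel_p = np.diff(pos, axis=0, prepend=pos[:1])    # P': joint motion
    vel_b = np.diff(bone, axis=0, prepend=bone[:1])  # B': bone motion
    return bone, vel_p, vel_b
```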
Tab. 4 Comparison of the proposed method with current mainstream methods on NTU-RGB+D 60 dataset

| Method | Parameters/10⁶ | CS/% | CV/% |
| --- | --- | --- | --- |
| ST-GCN[4] | 3.10 | 81.5 | 88.3 |
| 2s-AGCN[9] | 6.94 | 88.5 | 95.1 |
| SGN[11] | 0.69 | 89.0 | 94.5 |
| NAS-GCN[21] | 6.57 | 89.4 | 95.7 |
| PR-GCN[22] | 0.50 | 85.2 | 91.7 |
| ShiftGCN++[10] | 0.45 | 87.9 | 94.8 |
| 4s ShiftGCN++ | 2.76 | 90.7 | 96.5 |
| EfficientGCN-B0 | 0.32 | 89.9 | 94.7 |
| Sybio-GNN[23] | 14.85 | 90.1 | 95.4 |
| LMI-GCN* | 0.38 | 89.6 | 94.4 |
| MS-SGN[24] | 1.50 | 90.1 | 95.2 |
| ED-GCN[25] | — | 88.7 | 95.2 |
| 2S-EGCN[26] | — | 89.1 | 95.5 |
| ST-GCN++[27] | 1.39 | 90.1 | 95.5 |
| 1s AFFGCN* | 0.50 | 90.8 | 95.6 |
| 1s AFFGCN | 0.73 | 91.0 | 95.7 |
| 2s AFFGCN* | 1.00 | 91.4 | 95.9 |
| 3s AFFGCN* | 1.50 | 91.6 | 96.1 |
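The "Parameters/10⁶" columns above can be reproduced for any PyTorch model with a small helper; this assumes only standard `nn.Module` conventions, nothing specific to AFFGCN:

```python
import torch.nn as nn

def count_params_millions(model: nn.Module) -> float:
    # Trainable parameters, reported in units of 10^6 to match the tables.
    return sum(p.numel() for p in model.parameters() if p.requires_grad) / 1e6
```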
Tab. 5 Comparison of the proposed method with current mainstream methods on NTU-RGB+D 120 dataset

| Method | FLOPs/GFLOPs | CS/% | SS/% |
| --- | --- | --- | --- |
| ST-GCN[4] | 16.20 | 70.7 | 73.2 |
| 2s-AGCN[9] | 35.80 | 82.5 | 84.2 |
| SGN[11] | 0.80 | 79.2 | 81.5 |
| LMI-GCN[13] | 0.90 | 84.6 | 86.2 |
| LMI-GCN* | 0.57 | 84.2 | 85.8 |
| MS-SGN[24] | — | 84.5 | 85.6 |
| ShiftGCN++[10] | 0.40 | 80.5 | 83.0 |
| 4s-ShiftGCN++ | 1.70 | 85.6 | 87.2 |
| EfficientGCN-B0[12] | — | 85.9 | 84.3 |
| SparseShiftGCN[28] | 3.80 | 82.2 | 83.9 |
| 4s-SparseShiftGCN | 15.30 | 86.6 | 88.1 |
| ST-GCN++ | 2.80 | 85.6 | 87.5 |
| 1s AFFGCN* | 0.80 | 85.7 | 87.2 |
| 1s AFFGCN | 1.20 | 86.4 | 87.7 |
| 2s AFFGCN* | 1.60 | 86.6 | 88.1 |
| 3s AFFGCN* | 2.40 | 87.0 | 88.5 |
References:

[1] AHMAD T, JIN L W, ZHANG X, et al. Graph convolutional neural network for human action recognition: a comprehensive survey[J]. IEEE Transactions on Artificial Intelligence, 2021, 2(2): 128-145. 10.1109/tai.2021.3076974
[2] MA L Q, JIA X, SUN Q R, et al. Pose guided person image generation[C]// Proceedings of the 31st International Conference on Neural Information Processing Systems. Red Hook, NY: Curran Associates Inc., 2017: 405-415.
[3] LIU J W, LIU Y, LUO X L. Research and development on deep learning[J]. Application Research of Computers, 2014, 31(7): 1921-1930, 1942. (in Chinese) 10.3969/j.issn.1001-3695.2014.07.001
[4] YAN S J, XIONG Y J, LIN D H. Spatial temporal graph convolutional networks for skeleton-based action recognition[C]// Proceedings of the 32nd AAAI Conference on Artificial Intelligence. Palo Alto, CA: AAAI Press, 2018: 7444-7452. 10.1609/aaai.v32i1.12328
[5] CHENG K, ZHANG Y F, HE X Y, et al. Skeleton-based action recognition with shift graph convolutional network[C]// Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2020: 180-189. 10.1109/cvpr42600.2020.00026
[6] DU Y, WANG W, WANG L. Hierarchical recurrent neural network for skeleton based action recognition[C]// Proceedings of the 2015 IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2015: 1110-1118. 10.1109/cvpr.2015.7298714
[7] KE Q H, BENNAMOUN M, AN S J, et al. A new representation of skeleton sequences for 3D action recognition[C]// Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2017: 4570-4579. 10.1109/cvpr.2017.486
[8] LI G H, MÜLLER M, THABET A, et al. DeepGCNs: can GCNs go as deep as CNNs?[C]// Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision. Piscataway: IEEE, 2019: 9266-9275. 10.1109/iccv.2019.00936
[9] SHI L, ZHANG Y F, CHENG J, et al. Two-stream adaptive graph convolutional networks for skeleton-based action recognition[C]// Proceedings of the 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2019: 12018-12027. 10.1109/cvpr.2019.01230
[10] CHENG K, ZHANG Y F, HE X Y, et al. Extremely lightweight skeleton-based action recognition with ShiftGCN++[J]. IEEE Transactions on Image Processing, 2021, 30: 7333-7348. 10.1109/tip.2021.3104182
[11] ZHANG P F, LAN C L, ZENG W J, et al. Semantics-guided neural networks for efficient skeleton-based human action recognition[C]// Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2020: 1109-1118. 10.1109/cvpr42600.2020.00119
[12] SONG Y F, ZHANG Z, SHAN C F, et al. Constructing stronger and faster baselines for skeleton-based action recognition[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2023, 45(2): 1474-1488. 10.1109/tpami.2022.3157033
[13] JING W, LI W G, SHEN G P, et al. Lightweight multi-information graph convolution neural network action recognition method[J]. Application Research of Computers, 2022, 39(4): 1247-1252. (in Chinese)
[14] ZHANG X Y, ZHOU X Y, LIN M X, et al. ShuffleNet: an extremely efficient convolutional neural network for mobile devices[C]// Proceedings of the 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2018: 6848-6856. 10.1109/cvpr.2018.00716
[15] Anhui Normal University. Skeletal action recognition method based on multi-stream group shuffle graph convolutional neural network: 202210031468.1[P]. 2022-04-15. (in Chinese)
[16] SU Y X, ZHANG R, ERFANI S, et al. Detecting beneficial feature interactions for recommender systems[C]// Proceedings of the 35th AAAI Conference on Artificial Intelligence. Palo Alto, CA: AAAI Press, 2021: 4357-4365. 10.1609/aaai.v35i5.16561
[17] LI X, WANG W H, HU X L, et al. Selective kernel networks[C]// Proceedings of the 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2019: 510-519. 10.1109/cvpr.2019.00060
[18] ZHANG H, WU C R, ZHANG Z Y, et al. ResNeSt: split-attention networks[C]// Proceedings of the 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops. Piscataway: IEEE, 2022: 2735-2745. 10.1109/cvprw56347.2022.00309
[19] SHAHROUDY A, LIU J, NG T T, et al. NTU RGB+D: a large scale dataset for 3D human activity analysis[C]// Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2016: 1010-1019. 10.1109/cvpr.2016.115
[20] LIU J, SHAHROUDY A, PEREZ M, et al. NTU RGB+D 120: a large-scale benchmark for 3D human activity understanding[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2020, 42(10): 2684-2701. 10.1109/tpami.2019.2916873
[21] PENG W, HONG X P, CHEN H Y, et al. Learning graph convolutional network for skeleton-based human action recognition by neural searching[C]// Proceedings of the 34th AAAI Conference on Artificial Intelligence. Palo Alto, CA: AAAI Press, 2020: 2669-2676. 10.1609/aaai.v34i03.5652
[22] ZHOU G Y, WANG H Q, CHEN J X, et al. PR-GCN: a deep graph convolutional network with point refinement for 6D pose estimation[C]// Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision. Piscataway: IEEE, 2021: 2773-2782. 10.1109/iccv48922.2021.00279
[23] LI M S, CHEN S H, CHEN X, et al. Symbiotic graph neural networks for 3D skeleton-based human action recognition and motion prediction[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2022, 44(6): 3316-3333. 10.1109/tpami.2021.3053765
[24] ZHANG P F, LAN C L, ZENG W J, et al. Multi-scale semantics-guided neural networks for efficient skeleton-based human action recognition[EB/OL]. (2021-11-07) [2022-06-25].
[25] ALSARHAN T, ALI U, LU H T. Enhanced discriminative graph convolutional network with adaptive temporal modelling for skeleton-based action recognition[J]. Computer Vision and Image Understanding, 2022, 216: No.103348. 10.1016/j.cviu.2021.103348
[26] WANG Q Y, ZHANG K X, ASGHAR M A. Skeleton-based ST-GCN for human action recognition with extended skeleton graph and partitioning strategy[J]. IEEE Access, 2022, 10: 41403-41410. 10.1109/access.2022.3164711
[27] DUAN H D, WANG J Q, CHEN K, et al. PYSKL: towards good practices for skeleton action recognition[EB/OL]. (2022-05-19) [2022-06-02]. 10.1145/3503161.3548546
[28] ZANG Y, YANG D S, LIU T J, et al. SparseShift-GCN: high precision skeleton-based action recognition[J]. Pattern Recognition Letters, 2022, 153: 136-143. 10.1016/j.patrec.2021.12.005