Journal of Computer Applications ›› 2023, Vol. 43 ›› Issue (8): 2581-2587. DOI: 10.11772/j.issn.1001-9081.2022071105
Special topic: Multimedia computing and computer simulation
Doudou LI, Wanggen LI, Yichun XIA, Yang SHU, Kun GAO
Received: 2022-07-29
Revised: 2022-11-18
Accepted: 2022-11-30
Online: 2023-01-15
Published: 2023-08-10
Contact: Wanggen LI
About author: LI Doudou, born in 1996, M. S. candidate. His research interests include deep learning and skeleton-based action recognition.

Abstract:
Current skeleton-based action recognition still suffers from unreasonable data preprocessing, large model parameter counts, and low recognition accuracy. To address these problems, a skeleton-based action recognition method based on feature interaction and adaptive fusion, AFFGCN, was proposed. First, an adaptive pooling data-preprocessing algorithm was proposed to deal with the uneven distribution of data frames and their poor representativeness. Second, a multi-information feature-interaction method was introduced to mine deeper features and improve model performance. Finally, an Adaptive Feature Fusion (AFF) module was proposed for graph-convolution feature fusion to further improve model performance. Experimental results show that the proposed method outperforms the baseline Lightweight Multi-Information Graph Convolutional Neural Network (LMI-GCN) by 1.2 percentage points under both the Cross-Subject (CS) and Cross-View (CV) evaluation settings on the NTU-RGB+D 60 dataset, and by 1.5 and 1.4 percentage points under the CS and Cross-Setup (SS) evaluation settings, respectively, on the NTU-RGB+D 120 dataset. Results on single-stream and multi-stream networks show that, compared with current mainstream skeleton-based action recognition methods such as the Semantics-Guided Neural network (SGN), the proposed method has fewer parameters and higher accuracy, giving it a clear performance advantage and making it better suited to deployment on mobile devices.
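The AFF idea in the abstract — letting the network decide, per sample, how much of each graph-convolution branch to keep — can be illustrated with a short PyTorch sketch. This is a minimal reconstruction under assumptions: the class name `AdaptiveFeatureFusion`, the squeeze-style gate, and the (N, C, T, V) tensor layout are illustrative, not the paper's released implementation.

```python
import torch
import torch.nn as nn

class AdaptiveFeatureFusion(nn.Module):
    """Hypothetical sketch: fuse two equal-shape GCN feature streams
    with learned, input-dependent branch weights."""

    def __init__(self, channels: int, reduction: int = 4):
        super().__init__()
        hidden = max(channels // reduction, 8)
        # Gate that maps pooled global context to one weight per branch
        self.gate = nn.Sequential(
            nn.Linear(channels, hidden),
            nn.ReLU(inplace=True),
            nn.Linear(hidden, 2),
        )

    def forward(self, a: torch.Tensor, b: torch.Tensor) -> torch.Tensor:
        # a, b: (N, C, T, V) outputs of two graph-convolution branches
        context = (a + b).mean(dim=(2, 3))            # (N, C) global pooling
        w = torch.softmax(self.gate(context), dim=1)  # (N, 2) fusion weights
        return w[:, 0, None, None, None] * a + w[:, 1, None, None, None] * b
```

A module like this would sit wherever two branch outputs of the same shape meet, e.g. `fused = aff(joint_features, bone_features)`.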
Doudou LI, Wanggen LI, Yichun XIA, Yang SHU, Kun GAO. Skeleton-based action recognition based on feature interaction and adaptive fusion[J]. Journal of Computer Applications, 2023, 43(8): 2581-2587.
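As background for the "AD" (adaptive pooling) rows in the ablation tables below: the abstract describes drawing frames evenly from a variable-length skeleton sequence so that every part of the action is represented. A minimal NumPy sketch of that idea — the function name, segment count, and mean-pooling choice are assumptions for illustration, not the paper's exact algorithm:

```python
import numpy as np

def adaptive_pool_frames(frames: np.ndarray, num_out: int = 20) -> np.ndarray:
    """Sketch: reduce a (T, V, C) skeleton sequence to num_out frames by
    averaging within num_out evenly spread segments, so every part of the
    action contributes a representative frame."""
    t = frames.shape[0]
    bounds = np.linspace(0, t, num_out + 1, dtype=int)  # segment boundaries
    pooled = [frames[s:e].mean(axis=0) if e > s else frames[min(s, t - 1)]
              for s, e in zip(bounds[:-1], bounds[1:])]
    return np.stack(pooled)
```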
Tab. 1 Comparison of using and not using group shuffle convolution

| Method | CS/% | CV/% | Parameters/10⁶ |
| --- | --- | --- | --- |
| AFFGCN | 91.0 | 95.7 | 0.730 |
| AFFGCN* | 90.8 | 95.6 | 0.503 |
Tab. 2 Verification of the effectiveness of the three proposed methods

| Method | CS/% | CV/% | Parameters/10⁶ |
| --- | --- | --- | --- |
| LMI-GCN* | 89.6 | 94.4 | 0.376 |
| AD | 89.8 | 94.7 | 0.376 |
| MI | 89.9 | 94.7 | 0.385 |
| AF | 90.0 | 94.8 | 0.494 |
| AD+MI | 90.3 | 95.0 | 0.385 |
| AD+MI+AF | 90.8 | 95.6 | 0.503 |
Tab. 3 Comparison of multi-information experiments

| Method | CS/% | CV/% | Parameters/10⁶ |
| --- | --- | --- | --- |
| P+B | 88.5 | 94.2 | 0.485 |
| P+B+P'+B' | 89.8 | 94.9 | 0.494 |
| AM | 90.2 | 95.1 | 0.503 |
| I | 90.3 | 95.3 | 0.503 |
| AMI | 90.8 | 95.6 | 0.503 |
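The P, B, P' and B' streams in Tab. 3 follow the usual multi-stream convention in skeleton-based recognition: joint positions, bones (child joint minus parent joint), and their frame-to-frame motion. A minimal sketch under assumptions — the (T, V, C) layout is NTU-style, and the bone list here is a tiny illustrative subset, not the full 25-joint topology:

```python
import numpy as np

# Illustrative (child, parent) joint pairs; a real model uses the full skeleton.
BONE_PAIRS = [(1, 0), (2, 1), (3, 2)]

def multi_information(pos: np.ndarray):
    """From joint positions P of shape (T, V, C), derive the bone stream B
    and the motion streams P', B' used as extra inputs in Tab. 3."""
    bone = np.zeros_like(pos)
    for child, parent in BONE_PAIRS:
        bone[:, child] = pos[:, child] - pos[:, parent]
    vel_p = np.diff(pos, axis=0, prepend=pos[:1])    # P': joint motion
    vel_b = np.diff(bone, axis=0, prepend=bone[:1])  # B': bone motion
    return bone, vel_p, vel_b
```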
Tab. 4 Comparison of the proposed method with current mainstream methods on NTU-RGB+D 60 dataset

| Method | Parameters/10⁶ | CS/% | CV/% |
| --- | --- | --- | --- |
| ST-GCN[4] | 3.10 | 81.5 | 88.3 |
| 2s-AGCN[9] | 6.94 | 88.5 | 95.1 |
| SGN[11] | 0.69 | 89.0 | 94.5 |
| NAS-GCN[21] | 6.57 | 89.4 | 95.7 |
| PR-GCN[22] | 0.50 | 85.2 | 91.7 |
| ShiftGCN++[10] | 0.45 | 87.9 | 94.8 |
| 4s ShiftGCN++ | 2.76 | 90.7 | 96.5 |
| EfficientGCN-B0 | 0.32 | 89.9 | 94.7 |
| Sybio-GNN[23] | 14.85 | 90.1 | 95.4 |
| LMI-GCN* | 0.38 | 89.6 | 94.4 |
| MS-SGN[24] | 1.50 | 90.1 | 95.2 |
| ED-GCN[25] | — | 88.7 | 95.2 |
| 2S-EGCN[26] | — | 89.1 | 95.5 |
| ST-GCN++[27] | 1.39 | 90.1 | 95.5 |
| 1s AFFGCN* | 0.50 | 90.8 | 95.6 |
| 1s AFFGCN | 0.73 | 91.0 | 95.7 |
| 2s AFFGCN* | 1.00 | 91.4 | 95.9 |
| 3s AFFGCN* | 1.50 | 91.6 | 96.1 |
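The "Parameters/10⁶" columns above can be reproduced for any PyTorch model with a small helper; this assumes only standard `nn.Module` conventions, nothing specific to AFFGCN:

```python
import torch.nn as nn

def count_params_millions(model: nn.Module) -> float:
    # Trainable parameters, reported in units of 10^6 to match the tables.
    return sum(p.numel() for p in model.parameters() if p.requires_grad) / 1e6
```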
Tab. 5 Comparison of the proposed method with current mainstream methods on NTU-RGB+D 120 dataset

| Method | FLOPs/GFLOPs | CS/% | SS/% |
| --- | --- | --- | --- |
| ST-GCN[4] | 16.20 | 70.7 | 73.2 |
| 2s-AGCN[9] | 35.80 | 82.5 | 84.2 |
| SGN[11] | 0.80 | 79.2 | 81.5 |
| LMI-GCN[13] | 0.90 | 84.6 | 86.2 |
| LMI-GCN* | 0.57 | 84.2 | 85.8 |
| MS-SGN[24] | — | 84.5 | 85.6 |
| ShiftGCN++[10] | 0.40 | 80.5 | 83.0 |
| 4s-ShiftGCN++ | 1.70 | 85.6 | 87.2 |
| EfficientGCN-B0[12] | — | 85.9 | 84.3 |
| SparseShiftGCN[28] | 3.80 | 82.2 | 83.9 |
| 4s-SparseShiftGCN | 15.30 | 86.6 | 88.1 |
| ST-GCN++ | 2.80 | 85.6 | 87.5 |
| 1s AFFGCN* | 0.80 | 85.7 | 87.2 |
| 1s AFFGCN | 1.20 | 86.4 | 87.7 |
| 2s AFFGCN* | 1.60 | 86.6 | 88.1 |
| 3s AFFGCN* | 2.40 | 87.0 | 88.5 |
References:

[1] AHMAD T, JIN L W, ZHANG X, et al. Graph convolutional neural network for human action recognition: a comprehensive survey[J]. IEEE Transactions on Artificial Intelligence, 2021, 2(2): 128-145. 10.1109/tai.2021.3076974
[2] MA L Q, JIA X, SUN Q R, et al. Pose guided person image generation[C]// Proceedings of the 31st International Conference on Neural Information Processing Systems. Red Hook, NY: Curran Associates Inc., 2017: 405-415.
[3] LIU J W, LIU Y, LUO X L. Research and development on deep learning[J]. Application Research of Computers, 2014, 31(7): 1921-1930, 1942. (in Chinese) 10.3969/j.issn.1001-3695.2014.07.001
[4] YAN S J, XIONG Y J, LIN D H. Spatial temporal graph convolutional networks for skeleton-based action recognition[C]// Proceedings of the 32nd AAAI Conference on Artificial Intelligence. Palo Alto, CA: AAAI Press, 2018: 7444-7452. 10.1609/aaai.v32i1.12328
[5] CHENG K, ZHANG Y F, HE X Y, et al. Skeleton-based action recognition with shift graph convolutional network[C]// Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2020: 180-189. 10.1109/cvpr42600.2020.00026
[6] DU Y, WANG W, WANG L. Hierarchical recurrent neural network for skeleton based action recognition[C]// Proceedings of the 2015 IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2015: 1110-1118. 10.1109/cvpr.2015.7298714
[7] KE Q H, BENNAMOUN M, AN S J, et al. A new representation of skeleton sequences for 3D action recognition[C]// Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2017: 4570-4579. 10.1109/cvpr.2017.486
[8] LI G H, MÜLLER M, THABET A, et al. DeepGCNs: can GCNs go as deep as CNNs?[C]// Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision. Piscataway: IEEE, 2019: 9266-9275. 10.1109/iccv.2019.00936
[9] SHI L, ZHANG Y F, CHENG J, et al. Two-stream adaptive graph convolutional networks for skeleton-based action recognition[C]// Proceedings of the 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2019: 12018-12027. 10.1109/cvpr.2019.01230
[10] CHENG K, ZHANG Y F, HE X Y, et al. Extremely lightweight skeleton-based action recognition with ShiftGCN++[J]. IEEE Transactions on Image Processing, 2021, 30: 7333-7348. 10.1109/tip.2021.3104182
[11] ZHANG P F, LAN C L, ZENG W J, et al. Semantics-guided neural networks for efficient skeleton-based human action recognition[C]// Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2020: 1109-1118. 10.1109/cvpr42600.2020.00119
[12] SONG Y F, ZHANG Z, SHAN C F, et al. Constructing stronger and faster baselines for skeleton-based action recognition[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2023, 45(2): 1474-1488. 10.1109/tpami.2022.3157033
[13] JING W, LI W G, SHEN G P, et al. Lightweight multi-information graph convolution neural network action recognition method[J]. Application Research of Computers, 2022, 39(4): 1247-1252. (in Chinese)
[14] ZHANG X Y, ZHOU X Y, LIN M X, et al. ShuffleNet: an extremely efficient convolutional neural network for mobile devices[C]// Proceedings of the 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2018: 6848-6856. 10.1109/cvpr.2018.00716
[15] Anhui Normal University. Skeletal action recognition method based on multi-stream group shuffle graph convolutional neural network: 202210031468.1[P]. 2022-04-15. (in Chinese)
[16] SU Y X, ZHANG R, ERFANI S, et al. Detecting beneficial feature interactions for recommender systems[C]// Proceedings of the 35th AAAI Conference on Artificial Intelligence. Palo Alto, CA: AAAI Press, 2021: 4357-4365. 10.1609/aaai.v35i5.16561
[17] LI X, WANG W H, HU X L, et al. Selective kernel networks[C]// Proceedings of the 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2019: 510-519. 10.1109/cvpr.2019.00060
[18] ZHANG H, WU C R, ZHANG Z Y, et al. ResNeSt: split-attention networks[C]// Proceedings of the 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops. Piscataway: IEEE, 2022: 2735-2745. 10.1109/cvprw56347.2022.00309
[19] SHAHROUDY A, LIU J, NG T T, et al. NTU RGB+D: a large scale dataset for 3D human activity analysis[C]// Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2016: 1010-1019. 10.1109/cvpr.2016.115
[20] LIU J, SHAHROUDY A, PEREZ M, et al. NTU RGB+D 120: a large-scale benchmark for 3D human activity understanding[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2020, 42(10): 2684-2701. 10.1109/tpami.2019.2916873
[21] PENG W, HONG X P, CHEN H Y, et al. Learning graph convolutional network for skeleton-based human action recognition by neural searching[C]// Proceedings of the 34th AAAI Conference on Artificial Intelligence. Palo Alto, CA: AAAI Press, 2020: 2669-2676. 10.1609/aaai.v34i03.5652
[22] ZHOU G Y, WANG H Q, CHEN J X, et al. PR-GCN: a deep graph convolutional network with point refinement for 6D pose estimation[C]// Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision. Piscataway: IEEE, 2021: 2773-2782. 10.1109/iccv48922.2021.00279
[23] LI M S, CHEN S H, CHEN X, et al. Symbiotic graph neural networks for 3D skeleton-based human action recognition and motion prediction[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2022, 44(6): 3316-3333. 10.1109/tpami.2021.3053765
[24] ZHANG P F, LAN C L, ZENG W J, et al. Multi-scale semantics-guided neural networks for efficient skeleton-based human action recognition[EB/OL]. (2021-11-07) [2022-06-25].
[25] ALSARHAN T, ALI U, LU H T. Enhanced discriminative graph convolutional network with adaptive temporal modelling for skeleton-based action recognition[J]. Computer Vision and Image Understanding, 2022, 216: No.103348. 10.1016/j.cviu.2021.103348
[26] WANG Q Y, ZHANG K X, ASGHAR M A. Skeleton-based ST-GCN for human action recognition with extended skeleton graph and partitioning strategy[J]. IEEE Access, 2022, 10: 41403-41410. 10.1109/access.2022.3164711
[27] DUAN H D, WANG J Q, CHEN K, et al. PYSKL: towards good practices for skeleton action recognition[EB/OL]. (2022-05-19) [2022-06-02]. 10.1145/3503161.3548546
[28] ZANG Y, YANG D S, LIU T J, et al. SparseShift-GCN: high precision skeleton-based action recognition[J]. Pattern Recognition Letters, 2022, 153: 136-143. 10.1016/j.patrec.2021.12.005