Journal of Computer Applications ›› 2023, Vol. 43 ›› Issue (8): 2581-2587. DOI: 10.11772/j.issn.1001-9081.2022071105
Special Issue: Multimedia Computing and Computer Simulation
• Multimedia computing and computer simulation •
					
Skeleton-based action recognition based on feature interaction and adaptive fusion

Doudou LI, Wanggen LI, Yichun XIA, Yang SHU, Kun GAO
Received: 2022-07-29
Revised: 2022-11-18
Accepted: 2022-11-30
Online: 2023-01-15
Published: 2023-08-10
Contact: Wanggen LI
About author: LI Doudou, born in 1996, M. S. candidate. His research interests include deep learning and skeleton-based action recognition.
Doudou LI, Wanggen LI, Yichun XIA, Yang SHU, Kun GAO. Skeleton-based action recognition based on feature interaction and adaptive fusion[J]. Journal of Computer Applications, 2023, 43(8): 2581-2587.
URL: https://www.joca.cn/EN/10.11772/j.issn.1001-9081.2022071105
| Method | CS/% | CV/% | Parameters/10⁶ |
|---|---|---|---|
| AFFGCN | 91.0 | 95.7 | 0.730 |
| AFFGCN* | 90.8 | 95.6 | 0.503 |
Tab. 1 Comparison of results with and without group shuffle convolution
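In Tab. 1, AFFGCN* replaces ordinary convolutions with group shuffle convolutions, trading about 0.2 percentage points of accuracy for roughly 30% fewer parameters. For readers unfamiliar with the operation, the sketch below illustrates the ShuffleNet-style idea it refers to (reference [14]): a grouped pointwise convolution cuts parameters by roughly the group count, and a channel shuffle lets information cross group boundaries. This is a minimal illustration; the class name, (N, C, T, V) tensor layout, and hyperparameters are assumptions, not the authors' implementation.

```python
# Hypothetical sketch of a group shuffle convolution block (ShuffleNet-style, ref. [14]).
import torch
import torch.nn as nn


def channel_shuffle(x: torch.Tensor, groups: int) -> torch.Tensor:
    """Interleave channels across groups: (N, C, T, V) -> (N, C, T, V)."""
    n, c, t, v = x.size()
    x = x.view(n, groups, c // groups, t, v)   # split channels into groups
    x = x.transpose(1, 2).contiguous()         # swap group and channel axes
    return x.view(n, c, t, v)                  # flatten back to the original layout


class GroupShuffleConv(nn.Module):
    """Grouped pointwise convolution followed by a channel shuffle."""

    def __init__(self, in_channels: int, out_channels: int, groups: int = 4):
        super().__init__()
        self.groups = groups
        self.conv = nn.Conv2d(in_channels, out_channels, kernel_size=1, groups=groups)
        self.bn = nn.BatchNorm2d(out_channels)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        x = self.bn(self.conv(x))              # grouped conv uses ~1/groups of the parameters
        return channel_shuffle(x, self.groups)


if __name__ == "__main__":
    # Skeleton feature map: batch=2, channels=64, frames=30, joints=25.
    feats = torch.randn(2, 64, 30, 25)
    print(GroupShuffleConv(64, 64, groups=4)(feats).shape)  # torch.Size([2, 64, 30, 25])
```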
| Method | CS/% | CV/% | Parameters/10⁶ |
|---|---|---|---|
| LMI-GCN* | 89.6 | 94.4 | 0.376 |
| AD | 89.8 | 94.7 | 0.376 |
| MI | 89.9 | 94.7 | 0.385 |
| AF | 90.0 | 94.8 | 0.494 |
| AD + MI | 90.3 | 95.0 | 0.385 |
| AD + MI + AF | 90.8 | 95.6 | 0.503 |
Tab. 2 Verification of the effectiveness of the three proposed methods
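The AF row of Tab. 2 corresponds to the adaptive fusion named in the title. The paper cites selective-kernel networks [17] and split-attention networks [18], so a reasonable mental model is channel attention that learns per-branch weights before summing several feature branches. The sketch below shows only that generic pattern, under the assumption that AF behaves like SK-style fusion; the class name and every hyperparameter are hypothetical, not the paper's exact module.

```python
# Generic SK-style adaptive fusion sketch (in the spirit of refs. [17]-[18]), not the paper's AF module.
import torch
import torch.nn as nn


class AdaptiveFusion(nn.Module):
    """Softmax channel weights over several feature branches, then a weighted sum."""

    def __init__(self, channels: int, num_branches: int = 2, reduction: int = 4):
        super().__init__()
        self.num_branches = num_branches
        hidden = max(channels // reduction, 8)
        self.squeeze = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),                       # global context per channel
            nn.Conv2d(channels, hidden, kernel_size=1),
            nn.ReLU(inplace=True),
        )
        # One score per branch and channel; softmax is taken over branches.
        self.excite = nn.Conv2d(hidden, channels * num_branches, kernel_size=1)

    def forward(self, branches):
        summed = torch.stack(branches).sum(dim=0)          # (N, C, T, V)
        scores = self.excite(self.squeeze(summed))         # (N, C*B, 1, 1)
        n = scores.size(0)
        scores = scores.view(n, self.num_branches, -1, 1, 1)
        weights = torch.softmax(scores, dim=1)             # branch weights sum to 1 per channel
        return sum(w * b for w, b in zip(weights.unbind(dim=1), branches))


if __name__ == "__main__":
    a = torch.randn(2, 64, 30, 25)                         # e.g. one feature branch
    b = torch.randn(2, 64, 30, 25)                         # e.g. another feature branch
    print(AdaptiveFusion(64, num_branches=2)([a, b]).shape)  # torch.Size([2, 64, 30, 25])
```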
| Method | CS/% | CV/% | Parameters/10⁶ |
|---|---|---|---|
| P + B | 88.5 | 94.2 | 0.485 |
| P + B + P′ + B′ | 89.8 | 94.9 | 0.494 |
| AM | 90.2 | 95.1 | 0.503 |
| I | 90.3 | 95.3 | 0.503 |
| AMI | 90.8 | 95.6 | 0.503 |
Tab. 3 Comparison of multi-information experiments
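The row labels in Tab. 3 follow the usual multi-stream convention in skeleton-based recognition, where P and B plausibly denote joint positions and bone vectors, and P′ and B′ their frame-to-frame differences (motion). Under that assumption, the snippet below shows how the four kinds of information can be derived from raw joint coordinates; the parent list is a toy example, not the NTU RGB+D skeleton hierarchy.

```python
# Minimal sketch of the assumed multi-information streams: positions, bones, and their motions.
import torch


def build_streams(joints: torch.Tensor, parents: list):
    """joints: (N, 3, T, V) xyz coordinates for T frames and V joints."""
    bones = joints - joints[:, :, :, parents]             # B: joint minus its parent joint
    joint_motion = torch.zeros_like(joints)               # P′: temporal difference of positions
    joint_motion[:, :, 1:] = joints[:, :, 1:] - joints[:, :, :-1]
    bone_motion = torch.zeros_like(bones)                 # B′: temporal difference of bones
    bone_motion[:, :, 1:] = bones[:, :, 1:] - bones[:, :, :-1]
    return joints, bones, joint_motion, bone_motion


if __name__ == "__main__":
    clip = torch.randn(2, 3, 30, 5)                       # toy clip with 5 joints
    parents = [0, 0, 1, 2, 3]                             # hypothetical parent of each joint
    p, b, p_m, b_m = build_streams(clip, parents)
    print(p.shape, b.shape, p_m.shape, b_m.shape)
```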
| Method | Parameters/10⁶ | CS/% | CV/% |
|---|---|---|---|
| ST-GCN[4] | 3.10 | 81.5 | 88.3 |
| 2s-AGCN[9] | 6.94 | 88.5 | 95.1 |
| SGN[11] | 0.69 | 89.0 | 94.5 |
| NAS-GCN[21] | 6.57 | 89.4 | 95.7 |
| PR-GCN[22] | 0.50 | 85.2 | 91.7 |
| ShiftGCN++[10] | 0.45 | 87.9 | 94.8 |
| 4s ShiftGCN++ | 2.76 | 90.7 | 96.5 |
| EfficientGCN-B0 | 0.32 | 89.9 | 94.7 |
| Sybio-GNN[23] | 14.85 | 90.1 | 95.4 |
| LMI-GCN* | 0.38 | 89.6 | 94.4 |
| MS-SGN[24] | 1.50 | 90.1 | 95.2 |
| ED-GCN[25] | — | 88.7 | 95.2 |
| 2S-EGCN[26] | — | 89.1 | 95.5 |
| ST-GCN++[27] | 1.39 | 90.1 | 95.5 |
| 1s AFFGCN* | 0.50 | 90.8 | 95.6 |
| 1s AFFGCN | 0.73 | 91.0 | 95.7 |
| 2s AFFGCN* | 1.00 | 91.4 | 95.9 |
| 3s AFFGCN* | 1.50 | 91.6 | 96.1 |
Tab. 4 Comparison of the proposed method with current mainstream methods on NTU-RGB+D 60 dataset
| Method | FLOPs/GFLOPs | CS/% | SS/% |
|---|---|---|---|
| ST-GCN[4] | 16.20 | 70.7 | 73.2 |
| 2s-AGCN[9] | 35.80 | 82.5 | 84.2 |
| SGN[11] | 0.80 | 79.2 | 81.5 |
| LMI-GCN[13] | 0.90 | 84.6 | 86.2 |
| LMI-GCN* | 0.57 | 84.2 | 85.8 |
| MS-SGN[24] | — | 84.5 | 85.6 |
| ShiftGCN++[10] | 0.40 | 80.5 | 83.0 |
| 4s-ShiftGCN++ | 1.70 | 85.6 | 87.2 |
| EfficientGCN-B0[12] | — | 85.9 | 84.3 |
| SparseShiftGCN[28] | 3.80 | 82.2 | 83.9 |
| 4s-SparseShiftGCN | 15.30 | 86.6 | 88.1 |
| ST-GCN++ | 2.80 | 85.6 | 87.5 |
| 1s AFFGCN* | 0.80 | 85.7 | 87.2 |
| 1s AFFGCN | 1.20 | 86.4 | 87.7 |
| 2s AFFGCN* | 1.60 | 86.6 | 88.1 |
| 3s AFFGCN* | 2.40 | 87.0 | 88.5 |
Tab. 5 Comparison of the proposed method with current mainstream methods on NTU-RGB+D 120 dataset
| 1 | AHMAD T, JIN L W, ZHANG X, et al. Graph convolutional neural network for human action recognition: a comprehensive survey[J]. IEEE Transactions on Artificial Intelligence, 2021, 2(2):128-145. 10.1109/tai.2021.3076974 | 
| 2 | MA L Q, JIA X, SUN Q R, et al. Pose guided person image generation[C]// Proceedings of the 31st International Conference on Neural Information Processing Systems. Red Hook, NY: Curran Associates Inc., 2017:405-415. | 
| 3 | LIU J W, LIU Y, LUO X L. Research and development on deep learning[J]. Application Research of Computers, 2014, 31(7): 1921-1930, 1942. (in Chinese) 10.3969/j.issn.1001-3695.2014.07.001 | 
| 4 | YAN S J, XIONG Y J, LIN D H. Spatial temporal graph convolutional networks for skeleton-based action recognition[C]// Proceedings of the 32nd AAAI Conference on Artificial Intelligence. Palo Alto, CA: AAAI Press, 2018: 7444-7452. 10.1609/aaai.v32i1.12328 | 
| 5 | CHENG K, ZHANG Y F, HE X Y, et al. Skeleton-based action recognition with shift graph convolutional network[C]// Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2020: 180-189. 10.1109/cvpr42600.2020.00026 | 
| 6 | DU Y, WANG W, WANG L. Hierarchical recurrent neural network for skeleton based action recognition[C]// Proceedings of the 2015 IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2015: 1110-1118. 10.1109/cvpr.2015.7298714 | 
| 7 | KE Q H, BENNAMOUN M, AN S J, et al. A new representation of skeleton sequences for 3D action recognition[C]// Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2017: 4570-4579. 10.1109/cvpr.2017.486 | 
| 8 | LI G H, MÜLLER M, THABET A, et al. DeepGCNs: can GCNs go as deep as CNNs?[C]// Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision. Piscataway: IEEE, 2019: 9266-9275. 10.1109/iccv.2019.00936 | 
| 9 | SHI L, ZHANG Y F, CHENG J, et al. Two-stream adaptive graph convolutional networks for skeleton-based action recognition[C]// Proceedings of the 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2019: 12018-12027. 10.1109/cvpr.2019.01230 | 
| 10 | CHENG K, ZHANG Y F, HE X Y, et al. Extremely lightweight skeleton-based action recognition with ShiftGCN++[J]. IEEE Transactions on Image Processing, 2021, 30: 7333-7348. 10.1109/tip.2021.3104182 | 
| 11 | ZHANG P F, LAN C L, ZENG W J, et al. Semantics-guided neural networks for efficient skeleton-based human action recognition[C]// Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2020:1109-1118. 10.1109/cvpr42600.2020.00119 | 
| 12 | SONG Y F, ZHANG Z, SHAN C F, et al. Constructing stronger and faster baselines for skeleton-based action recognition[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2023, 45(2): 1474-1488. 10.1109/tpami.2022.3157033 | 
| 13 | JING W, LI W G, SHEN G P, et al. Lightweight multi-information graph convolution neural network action recognition method[J]. Application Research of Computers, 2022, 39(4): 1247-1252. (in Chinese) | 
| 14 | ZHANG X Y, ZHOU X Y, LIN M X, et al. ShuffleNet: an extremely efficient convolutional neural network for mobile devices[C]// Proceedings of the 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2018: 6848-6856. 10.1109/cvpr.2018.00716 | 
| 15 | Anhui Normal University. Skeletal action recognition method based on multi-stream group shuffle graph convolutional neural network: 202210031468.1[P]. 2022-04-15. (in Chinese) | 
| 16 | SU Y X, ZHANG R, ERFANI S, et al. Detecting beneficial feature interactions for recommender systems[C]// Proceedings of the 35th AAAI Conference on Artificial Intelligence. Palo Alto, CA: AAAI Press, 2021: 4357-4365. 10.1609/aaai.v35i5.16561 | 
| 17 | LI X, WANG W H, HU X L, et al. Selective kernel networks[C]// Proceedings of the 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2019: 510-519. 10.1109/cvpr.2019.00060 | 
| 18 | ZHANG H, WU C R, ZHANG Z Y, et al. ResNeSt: Split-attention networks[C]// Proceedings of the 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2022: 2735-2745. 10.1109/cvprw56347.2022.00309 | 
| 19 | SHAHROUDY A, LIU J, NG T T, et al. NTU RGB+D: a large scale dataset for 3D human activity analysis[C]// Proceedings of the 2016 Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2016:1010-1019. 10.1109/cvpr.2016.115 | 
| 20 | LIU J, SHAHROUDY A, PEREZ M, et al. NTU RGB+D 120: a large-scale benchmark for 3D human activity understanding[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2020, 42(10): 2684-2701. 10.1109/tpami.2019.2916873 | 
| 21 | PENG W, HONG X P, CHEN H Y, et al. Learning graph convolutional network for skeleton-based human action recognition by neural searching[C]// Proceedings of the 34th AAAI Conference on Artificial Intelligence. Palo Alto, CA: AAAI Press, 2020: 2669-2676. 10.1609/aaai.v34i03.5652 | 
| 22 | ZHOU G Y, WANG H Q, CHEN J X, et al. PR-GCN: a deep graph convolutional network with point refinement for 6D pose estimation[C]// Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision. Piscataway: IEEE, 2021: 2773-2782. 10.1109/iccv48922.2021.00279 | 
| 23 | LI M S, CHEN S H, CHEN X, et al. Symbiotic graph neural networks for 3D skeleton-based human action recognition and motion prediction[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2022, 44(6): 3316-3333. 10.1109/tpami.2021.3053765 | 
| 24 | ZHANG P F, LAN C L, ZENG W J, et al. Multi-scale semantics-guided neural networks for efficient skeleton-based human action recognition[EB/OL]. (2021-11-07) [2022-06-25]. 10.1109/cvpr42600.2020.00119 | 
| 25 | ALSARHAN T, ALI U, LU H T. Enhanced discriminative graph convolutional network with adaptive temporal modelling for skeleton-based action recognition[J]. Computer Vision and Image Understanding, 2022, 216: No.103348. 10.1016/j.cviu.2021.103348 | 
| 26 | WANG Q Y, ZHANG K X, ASGHAR M A. Skeleton-based ST-GCN for human action recognition with extended skeleton graph and partitioning strategy[J]. IEEE Access, 2022, 10: 41403-41410. 10.1109/access.2022.3164711 | 
| 27 | DUAN H D, WANG J Q, CHEN K, et al. PYSKL: towards good practices for skeleton action recognition[EB/OL]. (2022-05-19) [2022-06-02]. 10.1145/3503161.3548546 | 
| 28 | ZANG Y, YANG D S, LIU T J, et al. SparseShift-GCN: high precision skeleton-based action recognition[J]. Pattern Recognition Letters, 2022, 153: 136-143. 10.1016/j.patrec.2021.12.005 | 