Journal of Computer Applications, 2024, Vol. 44, Issue (11): 3639-3646. DOI: 10.11772/j.issn.1001-9081.2023101379
• Frontier and comprehensive applications •
Received: 2023-10-13
Revised: 2024-01-16
Accepted: 2024-01-18
Online: 2024-11-13
Published: 2024-11-10
Contact: Hui YANG
About author: LIANG Ruiyan, born in 1998 in Foshan, Guangdong, M. S. candidate. His research interests include pose estimation and graph convolutional networks.
Ruiyan LIANG, Hui YANG. Lightweight fall detection algorithm framework based on RPEpose and XJ-GCN[J]. Journal of Computer Applications, 2024, 44(11): 3639-3646.
URL: https://www.joca.cn/EN/10.11772/j.issn.1001-9081.2023101379
Position embedding | AP/% | AR/% |
---|---|---|
2D Sine Position Embedding | 71.7 | 77.1 |
Bias Mode | 72.9 | 77.4 |
Contextual Mode | 73.3 | 77.6 |
RPE-I (ours) | 74.3 | 78.2 |
Tab. 1 Comparison of different position embeddings
Model | Resolution | Computation/GFLOPs | AP/% | AR/% |
---|---|---|---|---|
TransPose-H-A4 | 256×192 | 10.2 | 74.2 | 78.0 |
CPN+ | 384×288 | 29.2 | 73.0 | 79.0 |
AlphaPose | 320×256 | 26.7 | 72.3 | - |
Simple Baseline | 384×288 | 35.6 | 72.3 | 79.0 |
OpenPose | - | - | 65.3 | - |
YOLO-Pose | 960×960 | - | 68.5 | 75.0 |
OpenPifPaf | - | - | 71.9 | - |
RPEpose | 256×192 | 8.2 | 74.3 | 78.2 |
Tab. 2 Performance comparison of different joint keypoint detection models
Dimension | X-Sub Top-1 Accuracy/% | X-View Top-1 Accuracy/% |
---|---|---|
2D | 88.4 | 95.2 |
3D | 89.6 | 94.6 |
Tab. 3 Top-1 Accuracy comparison of XJ-GCN on different dimensional datasets
Model | Parameters/MB | X-Sub Top-1 Accuracy/% | X-View Top-1 Accuracy/% |
---|---|---|---|
S-TR | 3.1 | 86.8 | 93.8 |
HCN | 1.1 | 86.5 | 91.1 |
ST-GCN | 3.1 | 81.5 | 88.3 |
2s-AGCN | 7.1 | 88.5 | 95.1 |
AS-GCN | 7.6 | 86.8 | 94.2 |
SR-TSL | 19.2 | 84.8 | 92.4 |
AGC-LSTM | 23.4 | 87.5 | 93.5 |
VA-CNN | 24.1 | 88.7 | 94.3 |
CoST-GCN | 3.1 | 86.0 | 93.4 |
XJ-GCN | 1.4 | 89.6 | 94.6 |
Tab. 4 Performance comparison of different models on NTU RGB+D dataset
Fall detection algorithm framework | Accuracy/% |
---|---|
OpenPose+CoST-GCN | 85.1 |
OpenPose+XJ-GCN | 85.9 |
OpenPifPaf+CoST-GCN | 85.8 |
OpenPifPaf+XJ-GCN | 86.3 |
RPEpose+CoST-GCN | 86.4 |
RPEpose+XJ-GCN | 87.2 |
Tab. 5 Accuracy comparison of different fall detection algorithm frameworks
1 | PIERLEONI P, BELLI A, PALMA L, et al. A high reliability wearable device for elderly fall detection[J]. IEEE Sensors Journal, 2015, 15(8): 4544-4553. |
2 | CAO Z, HIDALGO G, SIMON T, et al. OpenPose: realtime multi-person 2D pose estimation using part affinity fields[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2021, 43(1): 172-186. |
3 | MAJI D, NAGORI S, MATHEW M, et al. YOLO-Pose: enhancing YOLO for multi person pose estimation using object keypoint similarity loss[C]// Proceedings of the 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2022: 2636-2645. |
4 | CHEN Y, WANG Z, PENG Y, et al. Cascaded pyramid network for multi-person pose estimation[C]// Proceedings of the 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2018: 7103-7112. |
5 | YANG S, QUAN Z, NIE M, et al. TransPose: keypoint localization via Transformer[C]// Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision. Piscataway: IEEE, 2021: 11782-11792. |
6 | RAMACHANDRAN P, PARMAR N, VASWANI A, et al. Stand-alone self-attention in vision models[C]// Proceedings of the 33rd International Conference on Neural Information Processing Systems. Red Hook, NY: Curran Associates Inc., 2019: 68-80. |
7 | DOSOVITSKIY A, BEYER L, KOLESNIKOV A, et al. An image is worth 16x16 words: Transformers for image recognition at scale [EB/OL]. [2023-10-11]. . |
8 | LIN T-Y, MAIRE M, BELONGIE S, et al. Microsoft COCO: common objects in context[C]// Proceedings of the 13th European Conference on Computer Vision. Cham: Springer, 2014: 740-755. |
9 | YAN S, XIONG Y, LIN D. Spatial temporal graph convolutional networks for skeleton-based action recognition[C]// Proceedings of the 32nd AAAI Conference on Artificial Intelligence. Palo Alto: AAAI Press, 2018: 7444-7452. |
10 | LI M, CHEN S, CHEN X, et al. Actional-structural graph convolutional networks for skeleton-based action recognition[C]// Proceedings of the 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2019: 3590-3598. |
11 | HEDEGAARD L, HEIDARI N, IOSIFIDIS A. Continual spatio-temporal graph convolutional networks[J]. Pattern Recognition, 2023, 140: 109528. |
12 | SHAHROUDY A, LIU J, NG T-T, et al. NTU RGB+D: a large scale dataset for 3D human activity analysis[C]// Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2016: 1010-1019. |
13 | XU Y, ZHANG J, ZHANG Q, et al. ViTPose: simple vision Transformer baselines for human pose estimation[C]// Proceedings of the 36th International Conference on Neural Information Processing Systems. Red Hook, NY: Curran Associates Inc., 2022: 38571-38584. |
14 | YUAN Y, FU R, HUANG L, et al. HRFormer: high-resolution vision Transformer for dense predict[C]// Proceedings of the 35th International Conference on Neural Information Processing Systems. Red Hook, NY: Curran Associates Inc., 2021: 7281-7293. |
15 | CAO J R, LYU J J, WU X Y, et al. Fall detection algorithm integrating motion features and deep learning[J]. Journal of Computer Applications, 2021, 41(2): 583-589. |
16 | MA J Q, LEI H, CHEN M Y. Fall behavior detection algorithm for the elderly based on AlphaPose optimization model[J]. Journal of Computer Applications, 2022, 42(1): 294-301. |
17 | DENG J, DONG W, SOCHER R, et al. ImageNet: a large-scale hierarchical image database[C]// Proceedings of the 2009 IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2009: 248-255. |
18 | WU K, PENG H, CHEN M, et al. Rethinking and improving relative position encoding for vision Transformer[C]// Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision. Piscataway: IEEE, 2021: 10033-10041. |
19 | FANG H-S, XIE S, TAI Y-W, et al. RMPE: regional multi-person pose estimation[C]// Proceedings of the 2017 IEEE International Conference on Computer Vision. Piscataway: IEEE, 2017: 2353-2362. |
20 | XIAO B, WU H, WEI Y. Simple baselines for human pose estimation and tracking[C]// Proceedings of the 15th European Conference on Computer Vision. Cham: Springer, 2018: 472-487. |
21 | KREISS S, BERTONI L, ALAHI A. OpenPifPaf: composite fields for semantic keypoint detection and spatio-temporal association[J]. IEEE Transactions on Intelligent Transportation Systems, 2022, 23(8): 13498-13511. |
22 | PLIZZARI C, CANNICI M, MATTEUCCI M. Skeleton-based action recognition via spatial and temporal Transformer networks[J]. Computer Vision and Image Understanding, 2021, 208/209: 103219. |
23 | LI C, ZHONG Q, XIE D, et al. Co-occurrence feature learning from skeleton data for action recognition and detection with hierarchical aggregation [EB/OL]. [2023-08-22]. . |
24 | SHI L, ZHANG Y, CHENG J, et al. Two-stream adaptive graph convolutional networks for skeleton-based action recognition[C]// Proceedings of the 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2019: 12018-12027. |
25 | SI C, JING Y, WANG W, et al. Skeleton-based action recognition with spatial reasoning and temporal stack learning[C]// Proceedings of the 15th European Conference on Computer Vision. Cham: Springer, 2018: 106-121. |
26 | SI C, CHEN W, WANG W, et al. An attention enhanced graph convolutional LSTM network for skeleton-based action recognition[C]// Proceedings of the 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2019: 1227-1236. |
27 | ZHANG P, LAN C, XING J, et al. View adaptive neural networks for high performance skeleton-based human action recognition[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2019, 41(8): 1963-1978. |