Journal of Computer Applications ›› 2024, Vol. 44 ›› Issue (7): 2047-2054. DOI: 10.11772/j.issn.1001-9081.2023081172
• Artificial intelligence •
Sailong SHI, Zhiwen FANG
Received: 2023-09-01
Revised: 2023-11-15
Accepted: 2023-11-24
Online: 2024-07-18
Published: 2024-07-10
Contact: Zhiwen FANG
About author: SHI Sailong, born in 2000, a native of Nantong, Jiangsu, M. S. candidate. His research interests include computer vision and gaze analysis.
Sailong SHI, Zhiwen FANG. Gaze estimation model based on multi-scale aggregation and shared attention[J]. Journal of Computer Applications, 2024, 44(7): 2047-2054.
URL: https://www.joca.cn/EN/10.11772/j.issn.1001-9081.2023081172
Model | MPIIFaceGaze | Gaze360 | Gaze360_Processed | GAFA-Head |
---|---|---|---|---|
FullFace | 4.93 | 22.06 | 14.99 | 33.02 |
Gaze360 | 4.06 | 15.60 | 11.04 | 27.78 |
CA-Net | 4.27 | N/A | 11.20 | N/A |
MANet | 4.30 | N/A | 13.20 | N/A |
GEDDnet | 4.50 | N/A | N/A | N/A |
GazeTR | 4.18 | 15.39 | 11.00 | 28.53 |
CADSE | 4.04 | N/A | 10.70 | N/A |
GazeNas-ETH | 3.96 | N/A | 10.52 | N/A |
Proposed model | 3.94 | 14.76 | 10.47 | 25.52 |
Tab. 1 AAE comparison results of predicted gaze angle by various models on different datasets
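The page does not define AAE. Assuming it denotes the mean angle, in degrees, between predicted and ground-truth 3D gaze direction vectors (a common metric in gaze-estimation benchmarks), a minimal sketch of the computation might look like the following. The function names are hypothetical and this is not the authors' evaluation code.

```python
import math


def angular_error_deg(pred, truth):
    """Angle in degrees between two 3D gaze direction vectors."""
    dot = sum(p * t for p, t in zip(pred, truth))
    norm_p = math.sqrt(sum(p * p for p in pred))
    norm_t = math.sqrt(sum(t * t for t in truth))
    # Clamp to [-1, 1] so rounding error cannot push acos out of its domain.
    cos_sim = max(-1.0, min(1.0, dot / (norm_p * norm_t)))
    return math.degrees(math.acos(cos_sim))


def average_angular_error(preds, truths):
    """Mean angular error over paired predictions and ground truths (assumed AAE)."""
    errors = [angular_error_deg(p, t) for p, t in zip(preds, truths)]
    return sum(errors) / len(errors)
```

Under this assumption, a lower value means the predicted gaze directions lie closer to the annotated ones, which is why the smallest number in each column marks the best model.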
Model | Gaze360: all angles | Gaze360: front 180° | Gaze360: front-facing | Gaze360: back 180° | GAFA-Head: all angles | GAFA-Head: front 180° | GAFA-Head: front-facing | GAFA-Head: back 180° |
---|---|---|---|---|---|---|---|---|
FullFace | 22.06 | 17.82 | 18.44 | 37.33 | 33.02 | 22.87 | 21.20 | 41.59 |
Gaze360 | 15.60 | 13.40 | 13.40 | 23.50 | 27.78 | 19.75 | 19.15 | 34.55 |
GazeTR | 15.39 | 13.27 | 13.60 | 23.00 | 28.53 | 21.31 | 20.71 | 34.63 |
Proposed model | 14.76 | 12.68 | 12.78 | 21.92 | 25.52 | 18.87 | 18.64 | 31.15 |
Tab. 2 AAE comparison for different range subsets in Gaze360 and GAFA-Head
Shunted self-attention | Shared attention | AAE on MPIIFaceGaze | AAE on Gaze360 | AAE on Gaze360_Processed | AAE on GAFA-Head |
---|---|---|---|---|---|
No | No | 4.06 | 15.60 | 11.21 | 27.78 |
Yes | No | 3.98 | 15.31 | 10.70 | 27.45 |
No | Yes | 4.03 | 15.33 | 10.79 | 26.21 |
Yes | Yes | 3.94 | 14.76 | 10.47 | 25.52 |
Tab. 3 Ablation study results
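To put the ablation numbers in perspective, the short sketch below computes the relative AAE reduction of the full model (both modules enabled) over the baseline (neither module), using only the values listed in Tab. 3. The variable and function names are illustrative, and treating the values as degrees is an assumption.

```python
# AAE values copied from Tab. 3 (assumed to be in degrees).
baseline = {  # neither shunted self-attention nor shared attention
    "MPIIFaceGaze": 4.06,
    "Gaze360": 15.60,
    "Gaze360_Processed": 11.21,
    "GAFA-Head": 27.78,
}
full_model = {  # both modules enabled
    "MPIIFaceGaze": 3.94,
    "Gaze360": 14.76,
    "Gaze360_Processed": 10.47,
    "GAFA-Head": 25.52,
}


def relative_reduction(before, after):
    """Percentage by which `after` is lower than `before`."""
    return 100.0 * (before - after) / before


reductions = {
    name: relative_reduction(baseline[name], full_model[name])
    for name in baseline
}

for name, pct in reductions.items():
    print(f"{name}: {pct:.1f}% lower AAE than the baseline")
```

On these figures the combined model lowers AAE by roughly 3% on MPIIFaceGaze and by roughly 8% on GAFA-Head, with the two Gaze360 variants falling in between.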
1 | EMERY N J. The eyes have it: the neuroethology, function and evolution of social gaze [J]. Neuroscience & Biobehavioral Reviews, 2000, 24(6): 581-604. |
2 | TERZIOĞLU Y, MUTLU B, ŞAHIN E. Designing social cues for collaborative robots: the role of gaze and breathing in human-robot collaboration [C]// Proceedings of the 2020 ACM/IEEE International Conference on Human-Robot Interaction. New York: ACM, 2020: 343-357. |
3 | TOPAL C, GUNAL S, KOÇDEVIREN O, et al. A low-computational approach on gaze estimation with eye touch system [J]. IEEE Transactions on Cybernetics, 2014, 44(2): 228-239. |
4 | HU W T, ZHOU X Z, SHENG Y, et al. On implementation mechanism of intelligent interface based on gaze tracking [J]. Computer Applications and Software, 2016, 33(1): 134-137. (in Chinese) |
5 | CHONG E, CLARK-WHITNEY E, SOUTHERLAND A, et al. Detection of eye contact with deep neural networks is as accurate as human experts [J]. Nature Communications, 2020, 11(1): 6386. |
6 | LI J, CHEN Z, ZHONG Y, et al. Appearance-based gaze estimation for ASD diagnosis [J]. IEEE Transactions on Cybernetics, 2022, 52(7): 6504-6517. |
7 | GUO A H, PAN X P. Eye tracking research on Alzheimer’s disease [J]. Guangdong Medical Journal, 2021, 42(9): 1132-1135. (in Chinese) |
8 | VINNIKOV M, ALLISON R S, FERNANDES S. Gaze-contingent auditory displays for improved spatial attention in virtual reality [J]. ACM Transactions on Computer-Human Interaction, 2017, 24(3): No. 19. |
9 | PATNEY A, SALVI M, KIM J, et al. Towards foveated rendering for gaze-tracked virtual reality [J]. ACM Transactions on Graphics, 2016, 35(6): No. 179. |
10 | HOU S M, JIA C L, ZHANG M M. Review of eye movement-based interaction techniques for virtual reality systems [J]. Journal of Computer Applications, 2022, 42(11): 3534-3543. (in Chinese) |
11 | LIU Y, ZHOU L, BAI X, et al. Goal-oriented gaze estimation for zero-shot learning [C]// Proceedings of the 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2021: 3793-3802. |
12 | ZHANG C, CHI J N, ZHANG Z H, et al. A novel eye gaze tracking technique based on pupil center cornea reflection technique [J]. Chinese Journal of Computers, 2010, 33(7): 1272-1285. (in Chinese) |
13 | XIONG C S, HUANG L, LIU C P. A novel gaze estimation method with one-point calibration [J]. Acta Automatica Sinica, 2014, 40(3): 459-470. (in Chinese) |
14 | GOU C, ZHUO Y, WANG K, et al. Research advances and prospects of eye tracking [J]. Acta Automatica Sinica, 2022, 48(5): 1173-1192. (in Chinese) |
15 | ZHANG X, SUGANO Y, FRITZ M, et al. Appearance-based gaze estimation in the wild [C]// Proceedings of the 2015 IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2015: 4511-4520. |
16 | WANG K, ZHAO R, JI Q. A hierarchical generative model for eye image synthesis and eye gaze estimation [C]// Proceedings of the 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2018: 440-448. |
17 | REED S, AKATA Z, YAN X, et al. Generative adversarial text to image synthesis [C]// Proceedings of the 33rd International Conference on Machine Learning. New York: ACM, 2016: 1060-1069. |
18 | LIU G, YU Y, MORA K A F, et al. A differential approach for gaze estimation [J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2021, 43(3): 1092-1099. |
19 | SUN Y, ZENG J, SHAN S, et al. Cross-encoder for unsupervised gaze representation learning [C]// Proceedings of the 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2021: 3682-3691. |
20 | ZHANG X, SUGANO Y, FRITZ M, et al. It's written all over your face: full-face appearance-based gaze estimation [C]// Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition Workshops. Piscataway: IEEE, 2017: 2299-2308. |
21 | CHENG Y, HUANG S, WANG F, et al. A coarse-to-fine adaptive network for appearance-based gaze estimation [C]// Proceedings of the 34th AAAI Conference on Artificial Intelligence. Menlo Park: AAAI, 2020: 10623-10630. |
22 | ZHANG X, SUGANO Y, BULLING A, et al. Learning-based region selection for end-to-end gaze estimation [C]// Proceedings of the 31st British Machine Vision Conference. Nottingham, UK: BMVA Press, 2020: No. 86. |
23 | KELLNHOFER P, RECASENS A, STENT S, et al. Gaze360: physically unconstrained gaze estimation in the wild [C]// Proceedings of the 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2019: 6911-6920. |
24 | KOTHARI R, DE MELLO S, IQBAL U, et al. Weakly-supervised physically unconstrained gaze estimation [C]// Proceedings of the 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2021: 9975-9984. |
25 | NONAKA S, NOBUHARA S, NISHINO K. Dynamic 3D gaze from afar: deep gaze estimation from temporal eye-head-body coordination [C]// Proceedings of the 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2022: 2182-2191. |
26 | WU Y, LI G, LIU Z, et al. Gaze estimation via modulation-based adaptive network with auxiliary self-learning [J]. IEEE Transactions on Circuits and Systems for Video Technology, 2022, 32(8): 5510-5520. |
27 | CHEN Z, SHI B E. Towards high performance low complexity calibration in appearance based gaze estimation [J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2023, 45(1): 1174-1188. |
28 | CHENG Y, LU F. Gaze estimation using transformer [C]// Proceedings of the 2022 26th International Conference on Pattern Recognition. Piscataway: IEEE, 2022: 3341-3347. |
29 | OH J O, CHANG H J, CHOI S I. Self-attention with convolution and deconvolution for efficient eye gaze estimation from a full face image [C]// Proceedings of the 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops. Piscataway: IEEE, 2022: 4988-4996. |
30 | NAGPURE V, OKUMA K. Searching efficient neural architecture with multi-resolution fusion transformer for appearance-based gaze estimation [C]// Proceedings of the 2023 IEEE/CVF Winter Conference on Applications of Computer Vision. Piscataway: IEEE, 2023: 890-899. |
31 | REN S, ZHOU D, HE S, et al. Shunted self-attention via multi-scale token aggregation [C]// Proceedings of the 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2022: 10843-10852. |
32 | DOSOVITSKIY A, BEYER L, KOLESNIKOV A, et al. An image is worth 16x16 words: Transformers for image recognition at scale [EB/OL]. (2021-06-03) [2022-10-14]. |
33 | CHENG Y, WANG H, BAO Y, et al. Appearance-based gaze estimation with deep learning: a review and benchmark [EB/OL]. (2021-04-26) [2023-08-22]. |