Journal of Computer Applications ›› 2024, Vol. 44 ›› Issue (7): 2047-2054. DOI: 10.11772/j.issn.1001-9081.2023081172
• Artificial intelligence •
					
Sailong SHI1,2,3, Zhiwen FANG1,2,3
Received: 2023-09-01
Revised: 2023-11-15
Accepted: 2023-11-24
Online: 2024-07-18
Published: 2024-07-10
Contact: Zhiwen FANG
About author: SHI Sailong, born in 2000, M. S. candidate. His research interests include computer vision and gaze analysis.
Sailong SHI, Zhiwen FANG. Gaze estimation model based on multi-scale aggregation and shared attention[J]. Journal of Computer Applications, 2024, 44(7): 2047-2054.
URL: https://www.joca.cn/EN/10.11772/j.issn.1001-9081.2023081172
| Model | MPIIFaceGaze | Gaze360 | Gaze360_Processed | GAFA-Head |
|---|---|---|---|---|
| FullFace | 4.93 | 22.06 | 14.99 | 33.02 |
| Gaze360 | 4.06 | 15.60 | 11.04 | 27.78 |
| CA-Net | 4.27 | N/A | 11.20 | N/A |
| MANet | 4.30 | N/A | 13.20 | N/A |
| GEDDnet | 4.50 | N/A | N/A | N/A |
| GazeTR | 4.18 | 15.39 | 11.00 | 28.53 |
| CADSE | 4.04 | N/A | 10.70 | N/A |
| GazeNas-ETH | 3.96 | N/A | 10.52 | N/A |
| Proposed model | 3.94 | 14.76 | 10.47 | 25.52 |

Tab. 1 AAE (°) comparison of gaze angles predicted by various models on different datasets
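The AAE reported in the tables is the average angular error, in degrees, between predicted and ground-truth 3D gaze direction vectors. As an illustrative sketch only (not the authors' evaluation code), it can be computed like this:

```python
import numpy as np

def average_angular_error(pred, gt):
    """Mean angular error in degrees between predicted and ground-truth
    gaze direction vectors, each given as an (N, 3) array."""
    # Normalize both sets of vectors to unit length.
    pred = pred / np.linalg.norm(pred, axis=1, keepdims=True)
    gt = gt / np.linalg.norm(gt, axis=1, keepdims=True)
    # Cosine of the angle between each pair; clip to guard against
    # floating-point values slightly outside [-1, 1].
    cos_sim = np.clip(np.sum(pred * gt, axis=1), -1.0, 1.0)
    return float(np.degrees(np.arccos(cos_sim)).mean())
```

For example, an identical pair of vectors yields 0°, while orthogonal vectors yield 90°.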
| Model | Gaze360: Full | Gaze360: Front 180° | Gaze360: Frontal | Gaze360: Back 180° | GAFA-Head: Full | GAFA-Head: Front 180° | GAFA-Head: Frontal | GAFA-Head: Back 180° |
|---|---|---|---|---|---|---|---|---|
| FullFace | 22.06 | 17.82 | 18.44 | 37.33 | 33.02 | 22.87 | 21.20 | 41.59 |
| Gaze360 | 15.60 | 13.40 | 13.40 | 23.50 | 27.78 | 19.75 | 19.15 | 34.55 |
| GazeTR | 15.39 | 13.27 | 13.60 | 23.00 | 28.53 | 21.31 | 20.71 | 34.63 |
| Proposed model | 14.76 | 12.68 | 12.78 | 21.92 | 25.52 | 18.87 | 18.64 | 31.15 |

Tab. 2 AAE (°) comparison on different angular-range subsets of Gaze360 and GAFA-Head
| Shunted self-attention | Shared attention | MPIIFaceGaze | Gaze360 | Gaze360_Processed | GAFA-Head |
|---|---|---|---|---|---|
| No | No | 4.06 | 15.60 | 11.21 | 27.78 |
| Yes | No | 3.98 | 15.31 | 10.70 | 27.45 |
| No | Yes | 4.03 | 15.33 | 10.79 | 26.21 |
| Yes | Yes | 3.94 | 14.76 | 10.47 | 25.52 |

Tab. 3 Ablation study results (AAE, °)
| 1 | EMERY N J. The eyes have it: the neuroethology, function and evolution of social gaze [J]. Neuroscience & Biobehavioral Reviews, 2000, 24(6): 581-604. | 
| 2 | TERZIOĞLU Y, MUTLU B, ŞAHIN E. Designing social cues for collaborative robots: the role of gaze and breathing in human-robot collaboration [C]// Proceedings of the 2020 ACM/IEEE International Conference on Human-Robot Interaction. New York: ACM, 2020: 343-357. | 
| 3 | TOPAL C, GUNAL S, KOÇDEVIREN O, et al. A low-computational approach on gaze estimation with eye touch system [J]. IEEE Transactions on Cybernetics, 2014, 44(2): 228-239. | 
| 4 | HU W T, ZHOU X Z, SHENG Y, et al. On implementation mechanism of intelligent interface based on gaze tracking [J]. Computer Applications and Software, 2016, 33(1): 134-137. | 
| 5 | CHONG E, CLARK-WHITNEY E, SOUTHERLAND A, et al. Detection of eye contact with deep neural networks is as accurate as human experts [J]. Nature Communications, 2020, 11(1): 6386. | 
| 6 | LI J, CHEN Z, ZHONG Y, et al. Appearance-based gaze estimation for ASD diagnosis [J]. IEEE Transactions on Cybernetics, 2022, 52(7): 6504-6517. | 
| 7 | GUO A H, PAN X P. Eye tracking research on Alzheimer's disease [J]. Guangdong Medical Journal, 2021, 42(9): 1132-1135. | 
| 8 | VINNIKOV M, ALLISON R S, FERNANDES S. Gaze-contingent auditory displays for improved spatial attention in virtual reality [J]. ACM Transactions on Computer-Human Interaction, 2017, 24(3): No. 19. | 
| 9 | PATNEY A, SALVI M, KIM J, et al. Towards foveated rendering for gaze-tracked virtual reality [J]. ACM Transactions on Graphics, 2016, 35(6): No. 179. | 
| 10 | HOU S M, JIA C L, ZHANG M M. Review of eye movement-based interaction techniques for virtual reality systems [J]. Journal of Computer Applications, 2022, 42(11): 3534-3543. | 
| 11 | LIU Y, ZHOU L, BAI X, et al. Goal-oriented gaze estimation for zero-shot learning [C]// Proceedings of the 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2021: 3793-3802. | 
| 12 | ZHANG C, CHI J N, ZHANG Z H, et al. A novel eye gaze tracking technique based on pupil center cornea reflection technique [J]. Chinese Journal of Computers, 2010, 33(7): 1272-1285. | 
| 13 | XIONG C S, HUANG L, LIU C P. A novel gaze estimation method with one-point calibration [J]. Acta Automatica Sinica, 2014, 40(3): 459-470. | 
| 14 | GOU C, ZHUO Y, WANG K, et al. Research advances and prospects of eye tracking [J]. Acta Automatica Sinica, 2022, 48(5): 1173-1192. | 
| 15 | ZHANG X, SUGANO Y, FRITZ M, et al. Appearance-based gaze estimation in the wild [C]// Proceedings of the 2015 IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2015: 4511-4520. | 
| 16 | WANG K, ZHAO R, JI Q. A hierarchical generative model for eye image synthesis and eye gaze estimation [C]// Proceedings of the 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2018: 440-448. | 
| 17 | REED S, AKATA Z, YAN X, et al. Generative adversarial text to image synthesis [C]// Proceedings of the 33rd International Conference on Machine Learning. New York: ACM, 2016: 1060-1069. | 
| 18 | LIU G, YU Y, MORA K A F, et al. A differential approach for gaze estimation [J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2021, 43(3): 1092-1099. | 
| 19 | SUN Y, ZENG J, SHAN S, et al. Cross-encoder for unsupervised gaze representation learning [C]// Proceedings of the 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2021: 3682-3691. | 
| 20 | ZHANG X, SUGANO Y, FRITZ M, et al. It's written all over your face: full-face appearance-based gaze estimation [C]// Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition Workshops. Piscataway: IEEE, 2017: 2299-2308. | 
| 21 | CHENG Y, HUANG S, WANG F, et al. A coarse-to-fine adaptive network for appearance-based gaze estimation [C]// Proceedings of the 34th AAAI Conference on Artificial Intelligence. Menlo Park: AAAI, 2020: 10623-10630. | 
| 22 | ZHANG X, SUGANO Y, BULLING A, et al. Learning-based region selection for end-to-end gaze estimation [C]// Proceedings of the 31st British Machine Vision Conference. Nottingham, UK: BMVA Press, 2020: No. 86. | 
| 23 | KELLNHOFER P, RECASENS A, STENT S, et al. Gaze360: physically unconstrained gaze estimation in the wild [C]// Proceedings of the 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2019: 6911-6920. | 
| 24 | KOTHARI R, DE MELLO S, IQBAL U, et al. Weakly-supervised physically unconstrained gaze estimation [C]// Proceedings of the 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2021: 9975-9984. | 
| 25 | NONAKA S, NOBUHARA S, NISHINO K. Dynamic 3D gaze from afar: deep gaze estimation from temporal eye-head-body coordination [C]// Proceedings of the 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2022: 2182-2191. | 
| 26 | WU Y, LI G, LIU Z, et al. Gaze estimation via modulation-based adaptive network with auxiliary self-learning [J]. IEEE Transactions on Circuits and Systems for Video Technology, 2022, 32(8): 5510-5520. | 
| 27 | CHEN Z, SHI B E. Towards high performance low complexity calibration in appearance based gaze estimation [J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2023, 45(1): 1174-1188. | 
| 28 | CHENG Y, LU F. Gaze estimation using transformer [C]// Proceedings of the 2022 26th International Conference on Pattern Recognition. Piscataway: IEEE, 2022: 3341-3347. | 
| 29 | OH J O, CHANG H J, CHOI S-I. Self-attention with convolution and deconvolution for efficient eye gaze estimation from a full face image [C]// Proceedings of the 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops. Piscataway: IEEE, 2022: 4988-4996. | 
| 30 | NAGPURE V, OKUMA K. Searching efficient neural architecture with multi-resolution fusion transformer for appearance-based gaze estimation [C]// Proceedings of the 2023 IEEE/CVF Winter Conference on Applications of Computer Vision. Piscataway: IEEE, 2023: 890-899. | 
| 31 | REN S, ZHOU D, HE S, et al. Shunted self-attention via multi-scale token aggregation [C]// Proceedings of the 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2022: 10843-10852. | 
| 32 | DOSOVITSKIY A, BEYER L, KOLESNIKOV A, et al. An image is worth 16x16 words: Transformers for image recognition at scale [EB/OL]. (2021-06-03) [2022-10-14]. | 
| 33 | CHENG Y, WANG H, BAO Y, et al. Appearance-based gaze estimation with deep learning: a review and benchmark [EB/OL]. (2021-04-26) [2023-08-22]. | 