Journal of Computer Applications, 2025, Vol. 45, Issue (1): 253-260. DOI: 10.11772/j.issn.1001-9081.2024010098
• Multimedia computing and computer simulation •
Jie XU1, Yong ZHONG2, Yang WANG3, Changfu ZHANG4, Guanci YANG1,3
Received: 2024-01-26
Revised: 2024-03-28
Accepted: 2024-04-01
Online: 2024-05-09
Published: 2025-01-10
Contact: Guanci YANG
About author: XU Jie, born in 1997 in Fuyang, Anhui, M. S. candidate, CCF member. His research interests include intelligent autonomous systems.
Jie XU, Yong ZHONG, Yang WANG, Changfu ZHANG, Guanci YANG. Facial attribute estimation and expression recognition based on contextual channel attention mechanism[J]. Journal of Computer Applications, 2025, 45(1): 253-260.
URL: https://www.joca.cn/EN/10.11772/j.issn.1001-9081.2024010098
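Task | Dataset | Training samples | Test samples | Feature description |
---|---|---|---|---|
FAE | CelebA | 162 770 | 19 962 | 40 binary facial attribute categories |
FER | RAF-DB | 12 271 | 3 068 | 7 basic expression categories |
FER | AffectNet | 283 901 | 3 500 | 7 basic expression categories |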
Tab. 1 Datasets for facial attribute estimation and facial expression recognition
Module | Accuracy on AffectNet (%) | Accuracy on RAF-DB (%) |
---|---|---|
Baseline | 64.77 | 90.74 |
Baseline + CC Attention | 65.03 | 90.91 |
Baseline + Center Loss | 66.66 | 91.75 |
Tab. 2 Ablation experimental results of different modules
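For readers unfamiliar with the "Center Loss" row in Tab. 2, the following is a minimal sketch of the standard center loss of Wen et al. (reference 19), assuming that is the formulation meant here. The function name `center_loss`, the toy embeddings, and the class centers are illustrative and not taken from the paper; the loss is averaged over the batch, which is one common implementation choice.

```python
def center_loss(features, labels, centers):
    """Half the batch-averaged squared Euclidean distance between
    each feature vector and the center of its ground-truth class."""
    total = 0.0
    for feat, label in zip(features, labels):
        center = centers[label]
        total += sum((f - c) ** 2 for f, c in zip(feat, center))
    return 0.5 * total / len(features)


# Toy 2-D embeddings for two classes (hypothetical values).
features = [[1.0, 2.0], [3.0, 4.0]]
labels = [0, 1]
centers = {0: [1.0, 1.0], 1: [3.0, 3.0]}

loss = center_loss(features, labels, centers)
print(loss)
```

In training, a term of this kind is typically added to the classification loss with a small weight so that features of the same class are pulled toward a shared center.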
Loss function | Mean accuracy, Baseline (%) | Mean accuracy, FAER (%) |
---|---|---|
Asymmetric Loss | 90.69 | 91.23 |
BCE Loss | 91.03 | 91.71 |
Focal Loss | 91.66 | 91.87 |
Tab. 3 Comparison of average accuracies with different loss functions
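To make the difference between the BCE and Focal Loss rows of Tab. 3 concrete, here is a minimal sketch of the two losses for a single binary attribute, following the focal loss definition of Lin et al. (reference 41). The function names are illustrative, and the defaults `gamma=2.0` and `alpha=0.25` are the commonly used values from that reference, not settings confirmed by this page.

```python
import math


def binary_cross_entropy(prob, target):
    """BCE for one binary label; prob is the predicted probability of class 1."""
    p_t = prob if target == 1 else 1.0 - prob
    return -math.log(p_t)


def binary_focal_loss(prob, target, gamma=2.0, alpha=0.25):
    """Focal loss: BCE scaled by alpha_t * (1 - p_t) ** gamma,
    which shrinks the contribution of well-classified examples."""
    p_t = prob if target == 1 else 1.0 - prob
    alpha_t = alpha if target == 1 else 1.0 - alpha
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)


# A confidently correct prediction versus a badly wrong one (positive label).
easy = binary_focal_loss(0.9, 1)
hard = binary_focal_loss(0.1, 1)
print(easy, hard)
```

With `gamma=0` and `alpha=0.5` the focal loss reduces to half the BCE, so the focusing term is the only thing that separates the two rows conceptually.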
Model | Accuracy on RAF-DB (%) | Accuracy on AffectNet (%) |
---|---|---|
DACL | 87.87 | 65.20 |
VTFF | 88.14 | 61.85 |
EfficientFace | 88.36 | 63.70 |
MA-Net | 88.42 | 64.53 |
PSR | 88.98 | 63.77 |
AMP-Net | 89.25 | 64.54 |
DAN | 89.70 | 65.69 |
EAC | 90.35 | 65.32 |
ARM | 90.42 | 65.20 |
TransFER | 90.91 | 66.23 |
Baseline | 90.74 | 64.77 |
FAER | 91.75 | 66.66 |
Tab. 4 Accuracy performance comparison for FER tasks on RAF-DB and AffectNet datasets
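The margin of FAER over the strongest compared method can be read off Tab. 4 with a few lines of arithmetic. The sketch below copies the accuracies from the table; the helper name `margin_over_best` is illustrative.

```python
# Accuracies (%) copied from Tab. 4: model -> (RAF-DB, AffectNet).
prior_methods = {
    "DACL": (87.87, 65.20),
    "VTFF": (88.14, 61.85),
    "EfficientFace": (88.36, 63.70),
    "MA-Net": (88.42, 64.53),
    "PSR": (88.98, 63.77),
    "AMP-Net": (89.25, 64.54),
    "DAN": (89.70, 65.69),
    "EAC": (90.35, 65.32),
    "ARM": (90.42, 65.20),
    "TransFER": (90.91, 66.23),
}
faer = (91.75, 66.66)


def margin_over_best(prior, ours, column):
    """Return (best prior model, margin in percentage points) for one dataset column."""
    best_model = max(prior, key=lambda name: prior[name][column])
    return best_model, round(ours[column] - prior[best_model][column], 2)


raf_db_margin = margin_over_best(prior_methods, faer, 0)
affectnet_margin = margin_over_best(prior_methods, faer, 1)
print(raf_db_margin, affectnet_margin)
```

On both datasets the strongest compared method is TransFER, and FAER leads it by 0.84 percentage points on RAF-DB and 0.43 on AffectNet.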
Dataset | Model | Anger | Disgust | Fear | Happiness | Neutral | Sadness | Surprise |
---|---|---|---|---|---|---|---|---|
RAF-DB | Baseline | 88.20 | 72.80 | 75.39 | 95.74 | 90.86 | 87.65 | 89.97 |
RAF-DB | FAER | 90.06 | 79.62 | 72.37 | 97.11 | 86.99 | 92.94 | 92.72 |
AffectNet | Baseline | 61.39 | 66.18 | 67.23 | 77.20 | 53.05 | 71.07 | 59.26 |
AffectNet | FAER | 62.25 | 72.86 | 70.22 | 80.97 | 56.57 | 66.33 | 59.93 |
Tab. 5 Class-wise accuracy statistics for FER tasks on RAF-DB and AffectNet datasets
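The class-wise figures in Tab. 5 are easier to interpret as differences between FAER and the baseline. The sketch below copies the table values and computes the per-class change in percentage points; the variable and function names are illustrative.

```python
CLASSES = ["Anger", "Disgust", "Fear", "Happiness", "Neutral", "Sadness", "Surprise"]

# Class-wise accuracies (%) copied from Tab. 5.
table5 = {
    "RAF-DB": {
        "Baseline": [88.20, 72.80, 75.39, 95.74, 90.86, 87.65, 89.97],
        "FAER": [90.06, 79.62, 72.37, 97.11, 86.99, 92.94, 92.72],
    },
    "AffectNet": {
        "Baseline": [61.39, 66.18, 67.23, 77.20, 53.05, 71.07, 59.26],
        "FAER": [62.25, 72.86, 70.22, 80.97, 56.57, 66.33, 59.93],
    },
}


def class_deltas(rows):
    """FAER minus Baseline for every expression class, in percentage points."""
    return {
        name: round(faer - base, 2)
        for name, base, faer in zip(CLASSES, rows["Baseline"], rows["FAER"])
    }


deltas = {dataset: class_deltas(rows) for dataset, rows in table5.items()}
# Classes where FAER scores below the baseline.
regressions = {
    dataset: [name for name, delta in per_class.items() if delta < 0]
    for dataset, per_class in deltas.items()
}
print(deltas)
print(regressions)
```

The largest gain on both datasets is for Disgust (6.82 points on RAF-DB, 6.68 on AffectNet), while FAER falls below the baseline for Fear and Neutral on RAF-DB and for Sadness on AffectNet.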
1 | ZHANG X H, TIAN Q C, LIAN L, et al. Review of research on facial landmark detection [J]. Computer Engineering and Applications, 2024, 60(12): 48-60. |
2 | ZHANG B, LAN Y T, XIAN H, et al. Research on robot interaction of facial expression recognition based on channel attention mechanism [J]. Electronic Measurement Technology, 2021, 44(11): 169-174. |
3 | LIN J, LI Y, YANG G. FPGAN: face de-identification method with generative adversarial networks for social robots [J]. Neural Networks, 2021, 133: 132-147. |
4 | LIU Z, LUO P, WANG X, et al. Large-scale CelebFaces attributes (CelebA) dataset [EB/OL]. [2023-10-22]. |
5 | LIU Z, LUO P, WANG X, et al. Deep learning face attributes in the wild [C]// Proceedings of the 2015 IEEE International Conference on Computer Vision. Piscataway: IEEE, 2015: 3730-3738. |
6 | HAN H, JAIN A K, WANG F, et al. Heterogeneous face attribute estimation: a deep multi-task learning approach [J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2018, 40(11): 2597-2609. |
7 | HE S, LUO H, WANG P, et al. TransReID: Transformer-based object re-identification [C]// Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision. Piscataway: IEEE, 2021: 14993-15002. |
8 | NGUYEN H M, LY N Q, PHUNG T T T. Large-scale face image retrieval system at attribute level based on facial attribute ontology and deep neuron network [C]// Proceedings of the 2018 Asian Conference on Intelligent Information and Database Systems, LNCS 10752. Cham: Springer, 2018: 539-549. |
9 | CAO J, LI Y, ZHANG Z. Partially shared multi-task convolutional neural network with local constraint for face attribute learning [C]// Proceedings of the 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2018: 4290-4299. |
10 | QIN L, WANG M, DENG C, et al. SwinFace: a multi-task transformer for face recognition, expression recognition, age estimation and attribute estimation [J]. IEEE Transactions on Circuits and Systems for Video Technology, 2024, 34(4): 2223-2234. |
11 | LI W, CAO Z, FENG J, et al. Label2Label: a language modeling framework for multi-attribute learning [C]// Proceedings of the 2022 European Conference on Computer Vision, LNCS 13672. Cham: Springer, 2022: 562-579. |
12 | DOSOVITSKIY A, BEYER L, KOLESNIKOV A, et al. An image is worth 16x16 words: Transformers for image recognition at scale [EB/OL]. [2023-10-20]. |
13 | DAI G Q, ZHANG S L, YUAN Y B. Aged facial data extraction method by using skin color saliency [J]. Journal of Computer Applications, 2022, 42(S2): 217-223. |
14 | HE K, ZHANG X, REN S, et al. Deep residual learning for image recognition [C]// Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2016: 770-778. |
15 | LIU Z, MAO H, WU C Y, et al. A ConvNet for the 2020s [C]// Proceedings of the 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2022: 11966-11976. |
16 | EKMAN P, FRIESEN W V. Constants across cultures in the face and emotion [J]. Journal of Personality and Social Psychology, 1971, 17(2): 124-129. |
17 | MOLLAHOSSEINI A, HASANI B, MAHOOR M H. AffectNet: a database for facial expression, valence, and arousal computing in the wild [J]. IEEE Transactions on Affective Computing, 2019, 10(1): 18-31. |
18 | LI S, DENG W, DU J. Reliable crowdsourcing and deep locality-preserving learning for expression recognition in the wild [C]// Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2017: 2584-2593. |
19 | WEN Y, ZHANG K, LI Z, et al. A discriminative feature learning approach for deep face recognition [C]// Proceedings of the 2016 European Conference on Computer Vision, LNCS 9911. Cham: Springer, 2016: 499-515. |
20 | WAN W, ZHONG Y, LI T, et al. Rethinking feature distribution for loss functions in image classification [C]// Proceedings of the 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2018: 9117-9126. |
21 | FARZANEH A H, QI X. Facial expression recognition in the wild via deep attentive center loss [C]// Proceedings of the 2021 IEEE Winter Conference on Applications of Computer Vision. Piscataway: IEEE, 2021: 2401-2410. |
22 | ZHANG Y, WANG C, LING X, et al. Learn from all: erasing attention consistency for noisy label facial expression recognition [C]// Proceedings of the 2022 European Conference on Computer Vision, LNCS 13686. Cham: Springer, 2022: 418-434. |
23 | LIU X W, GONG X Y, ZHAO H X, et al. Dynamic facial expression recognition based on hybrid attention mechanism [J]. Journal of Computer Applications, 2023, 43(S1): 1-7. |
24 | FERNANDEZ P D M, PEÑA F A G, REN T I, et al. FERAtt: facial expression recognition with attention net [C]// Proceedings of the 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops. Piscataway: IEEE, 2019: 837-846. |
25 | WANG K, PENG X, YANG J, et al. Region attention networks for pose and occlusion robust facial expression recognition [J]. IEEE Transactions on Image Processing, 2020, 29: 4057-4069. |
26 | MA F, SUN B, LI S. Facial expression recognition with visual transformers and attentional selective fusion [J]. IEEE Transactions on Affective Computing, 2023, 14(2): 1236-1248. |
27 | XUE F, WANG Q, GUO G. TransFER: learning relation-aware facial expression representations with Transformers [C]// Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision. Piscataway: IEEE, 2021: 3581-3590. |
28 | VO T H, LEE G S, YANG H J, et al. Pyramid with super resolution for in-the-wild facial expression recognition [J]. IEEE Access, 2020, 8: 131988-132001. |
29 | RUDD E M, GÜNTHER M, BOULT T E. MOON: a mixed objective optimization network for the recognition of facial attributes [C]// Proceedings of the 2016 European Conference on Computer Vision, LNCS 9909. Cham: Springer, 2016: 19-35. |
30 | HAND E M, CHELLAPPA R. Attributes for improved attributes: a multi-task network utilizing implicit and explicit relationships for facial attribute classification [C]// Proceedings of the 31st AAAI Conference on Artificial Intelligence. Palo Alto: AAAI Press, 2017: 4068-4074. |
31 | ZHUANG N, YAN Y, CHEN S, et al. Multi-task learning of cascaded CNN for facial attribute classification [C]// Proceedings of the 24th International Conference on Pattern Recognition. Piscataway: IEEE, 2018: 2069-2074. |
32 | MAO L, YAN Y, XUE J H, et al. Deep multi-task multi-label CNN for effective facial attribute classification [J]. IEEE Transactions on Affective Computing, 2022, 13(2): 818-828. |
33 | SAVCHENKO A V. Facial expression and attributes recognition based on multi-task learning of lightweight neural networks [C]// Proceedings of the IEEE 19th International Symposium on Intelligent Systems and Informatics. Piscataway: IEEE, 2021: 119-124. |
34 | ZHAO Z, LIU Q, ZHOU F. Robust lightweight facial expression recognition network with label distribution training [C]// Proceedings of the 35th AAAI Conference on Artificial Intelligence. Palo Alto: AAAI Press, 2021: 3510-3519. |
35 | ZHAO Z, LIU Q, WANG S. Learning deep global multi-scale and local attention features for facial expression recognition in the wild [J]. IEEE Transactions on Image Processing, 2021, 30: 6544-6556. |
36 | WEN Z, LIN W, WANG T, et al. Distract your attention: multi-head cross attention network for facial expression recognition [J]. Biomimetics, 2023, 8(2): No.199. |
37 | LIU H, CAI H, LIN Q, et al. Adaptive multilayer perceptual attention network for facial expression recognition [J]. IEEE Transactions on Circuits and Systems for Video Technology, 2022, 32(9): 6253-6266. |
38 | SHI J, ZHU S, LIANG Z. Learning to amend facial expression representation via de-albino and affinity [EB/OL]. [2023-10-22]. |
39 | HU J, SHEN L, SUN G. Squeeze-and-excitation networks [C]// Proceedings of the 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2018: 7132-7141. |
40 | LI Y, YAO T, PAN Y, et al. Contextual Transformer networks for visual recognition [J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2023, 45(2): 1489-1500. |
41 | LIN T, GOYAL P, GIRSHICK R, et al. Focal loss for dense object detection [C]// Proceedings of the 2017 IEEE International Conference on Computer Vision. Piscataway: IEEE, 2017: 2999-3007. |
42 | VASWANI A, SHAZEER N, PARMAR N, et al. Attention is all you need [C]// Proceedings of the 31st International Conference on Neural Information Processing Systems. Red Hook: Curran Associates Inc., 2017: 6000-6010. |
43 | GUO Y, ZHANG L, HU Y, et al. MS-Celeb-1M: a dataset and benchmark for large-scale face recognition [C]// Proceedings of the 2016 European Conference on Computer Vision, LNCS 9907. Cham: Springer, 2016: 87-102. |
44 | RIDNIK T, BEN-BARUCH E, ZAMIR N, et al. Asymmetric loss for multi-label classification [C]// Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision. Piscataway: IEEE, 2021: 82-91. |
45 | MAHBUB U, SARKAR S, CHELLAPPA R. Segment-based methods for facial attribute detection from partial faces [J]. IEEE Transactions on Affective Computing, 2020, 11(4): 601-613. |
46 | GILDENBLAT J. PyTorch library for CAM methods [EB/OL]. [2023-10-22]. |