Journal of Computer Applications, 2025, Vol. 45, Issue (1): 253-260. DOI: 10.11772/j.issn.1001-9081.2024010098

• Multimedia computing and computer simulation •

Facial attribute estimation and expression recognition based on contextual channel attention mechanism

Jie XU1, Yong ZHONG2, Yang WANG3, Changfu ZHANG4, Guanci YANG1,3

  1. Key Laboratory of Advanced Manufacturing Technology of the Ministry of Education (Guizhou University), Guiyang, Guizhou 550025, China
  2. Chengdu Institute of Computer Application, Chinese Academy of Sciences, Chengdu, Sichuan 610213, China
  3. State Key Laboratory of Public Big Data (Guizhou University), Guiyang, Guizhou 550025, China
  4. School of Mechanical Engineering, Guizhou University, Guiyang, Guizhou 550025, China
  • Received: 2024-01-26 Revised: 2024-03-28 Accepted: 2024-04-01 Online: 2024-05-09 Published: 2025-01-10
  • Contact: Guanci YANG
  • About the authors: XU Jie, born in 1997, M.S. candidate. His research interests include intelligent autonomous systems.
    ZHONG Yong, born in 1966, Ph.D., research fellow. His research interests include big data and its intelligent processing, cloud computing, and software engineering.
    WANG Yang, born in 1987, Ph.D., senior engineer. His research interests include artificial intelligence, computer vision, and intelligent analysis of big data.
    ZHANG Changfu, born in 1990, senior engineer. His research interests include industrial big data and artificial intelligence.
  • Supported by:
    National Natural Science Foundation of China (62373116); Guizhou Province Science and Technology Program (Qiankehe Zhicheng [2023] Yiban 118, Qiankehe Pingtairencai [2020]6007-2)

Facial attribute estimation and expression recognition based on a contextual channel attention mechanism

XU Jie1, ZHONG Yong2, WANG Yang3, ZHANG Changfu4, YANG Guanci1,3

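  1. Key Laboratory of Advanced Manufacturing Technology of the Ministry of Education (Guizhou University), Guiyang 550025
  2. Chengdu Institute of Computer Application, Chinese Academy of Sciences, Chengdu 610213
  3. State Key Laboratory of Public Big Data (Guizhou University), Guiyang 550025
  4. School of Mechanical Engineering, Guizhou University, Guiyang 550025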
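  • Corresponding author: YANG Guanci (杨观赐)
  • About the authors: XU Jie (徐杰), born in 1997, male, from Fuyang, Anhui, M.S. candidate, CCF member. His research interests include autonomous intelligent systems.
    ZHONG Yong (钟勇), born in 1966, male, from Yuechi, Sichuan, research fellow, Ph.D. His research interests include big data and its intelligent processing, cloud computing, and software engineering.
    WANG Yang (王阳), born in 1987, male, from Hebi, Henan, senior engineer, Ph.D. His research interests include artificial intelligence, computer vision, and intelligent analysis of big data.
    ZHANG Changfu (张昌福), born in 1990, male, from Weng'an, Guizhou, senior engineer. His research interests include industrial big data and artificial intelligence.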
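  • Supported by:
    National Natural Science Foundation of China (62373116); Guizhou Province Science and Technology Program (Qiankehe Zhicheng [2023] Yiban 118, Qiankehe Pingtairencai [2020]6007-2)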

Abstract:

Facial features carry rich information and are valuable for facial attribute and expression analysis, but their diversity and complexity make these tasks difficult. To address this problem, a model for Facial Attribute estimation and Expression Recognition based on a contextual channel attention mechanism (FAER) was proposed from the perspective of fine-grained facial features. Firstly, a local feature encoding backbone network based on ConvNeXt was constructed, and the backbone's effectiveness in encoding local features was used to fully represent the differences among local facial features. Secondly, a Contextual Channel Attention (CC Attention) mechanism was introduced. By dynamically and adaptively adjusting the weights of the feature channels, it represents both the global and local information in deep features, compensating for the backbone network's limited ability to encode global features. Finally, different classification strategies were designed: for the Facial Attribute Estimation (FAE) and Facial Expression Recognition (FER) tasks, different combinations of loss functions were employed to encourage the model to learn more fine-grained facial features. Experimental results show that the proposed model achieves an average accuracy of 91.87% on the facial attribute dataset CelebA (CelebFaces Attributes), surpassing the second-best model SwinFace (Swin transformer for Face) by 0.55 percentage points, and achieves accuracies of 91.75% and 66.66% on the facial expression datasets RAF-DB and AffectNet respectively, surpassing the second-best model TransFER (Transformers for Facial Expression Recognition) by 0.84 and 0.43 percentage points respectively.
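The abstract does not spell out how CC Attention computes its channel weights, so the following is only a generic squeeze-and-excitation style sketch of the broader idea of reweighting feature channels adaptively; it is not the authors' mechanism. The function name `channel_attention`, the weight matrices `w1` and `w2`, and the tensor sizes are all assumptions made for illustration, and the weights are random rather than learned.

```python
import math
import random


def channel_attention(feature_map, w1, w2):
    """Generic SE-style channel reweighting on a feature map shaped [C][H][W].

    Illustrative only: this is NOT the paper's CC Attention.
    """
    # Squeeze: global average pooling gives one descriptor per channel.
    squeezed = [
        sum(sum(row) for row in channel) / (len(channel) * len(channel[0]))
        for channel in feature_map
    ]
    # Excitation, step 1: bottleneck linear layer (R x C) followed by ReLU.
    hidden = [
        max(0.0, sum(w * s for w, s in zip(weights, squeezed)))
        for weights in w1
    ]
    # Excitation, step 2: expand back to C channels, sigmoid maps into (0, 1).
    gates = [
        1.0 / (1.0 + math.exp(-sum(w * h for w, h in zip(weights, hidden))))
        for weights in w2
    ]
    # Scale: multiply every spatial position of a channel by that channel's gate.
    scaled = [
        [[gate * value for value in row] for row in channel]
        for channel, gate in zip(feature_map, gates)
    ]
    return scaled, gates


random.seed(0)
C, H, W, R = 4, 3, 3, 2  # channels, height, width, bottleneck width (hypothetical sizes)
features = [[[random.random() for _ in range(W)] for _ in range(H)] for _ in range(C)]
w1 = [[random.uniform(-1, 1) for _ in range(C)] for _ in range(R)]  # R x C, random stand-in
w2 = [[random.uniform(-1, 1) for _ in range(R)] for _ in range(C)]  # C x R, random stand-in

scaled, gates = channel_attention(features, w1, w2)
print([round(g, 3) for g in gates])  # one weight per channel, each in (0, 1)
```

Because each gate is computed from a pooled summary of the whole feature map, the per-channel weights depend on the input rather than being fixed, which is the sense in which such reweighting is "dynamic and adaptive". In a real model the two weight matrices would be trained jointly with the backbone.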

Key words: Facial Attribute Estimation (FAE), Facial Expression Recognition (FER), attention mechanism, fine-grained feature, feature difference

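Abstract (translated from Chinese):

Facial features contain abundant information and are of great value in facial attribute and emotion analysis tasks, yet their diversity and complexity make face analysis difficult. To address this difficulty, a Facial Attribute estimation and Expression Recognition (FAER) model based on a contextual channel attention mechanism is proposed from the perspective of fine-grained facial features. First, a local feature encoding backbone network based on ConvNeXt is built, and the backbone's effectiveness in encoding local features is used to fully represent the differences among local facial features. Second, a Contextual Channel Attention (CC Attention) mechanism is proposed, which dynamically and adaptively adjusts the weight information on the feature channels to represent the global and local features of deep features, thereby compensating for the backbone network's insufficient ability to encode global features. Finally, different classification strategies are designed: for the Facial Attribute Estimation (FAE) and Facial Expression Recognition (FER) tasks, different combinations of loss functions are adopted to drive the model to learn more fine-grained facial features. Experimental results show that the proposed FAER model achieves an average accuracy of 91.87% on the facial attribute dataset CelebA (CelebFaces Attributes), 0.55 percentage points higher than the second-best model SwinFace (Swin transformer for Face); on the facial expression datasets RAF-DB and AffectNet it achieves accuracies of 91.75% and 66.66% respectively, 0.84 and 0.43 percentage points higher than the second-best model TransFER (Transformers for Facial Expression Recognition).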
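The reported margins are given in percentage points, so the accuracy of each second-best model follows by plain subtraction. The short sketch below does that arithmetic and also converts each margin to a relative gain, to show how the two measures differ. The baseline figures it prints are derived here, not quoted from the abstract, and the helper names `implied_baseline` and `relative_gain` are illustrative.

```python
# Reported accuracy (%) and margin over the second-best model (percentage points),
# copied from the abstract.
reported = {
    "CelebA": (91.87, 0.55),     # margin over SwinFace
    "RAF-DB": (91.75, 0.84),     # margin over TransFER
    "AffectNet": (66.66, 0.43),  # margin over TransFER
}


def implied_baseline(accuracy, margin_pp):
    """Percentage points subtract directly: baseline = accuracy - margin."""
    return round(accuracy - margin_pp, 2)


def relative_gain(accuracy, margin_pp):
    """Improvement expressed as a percentage of the baseline accuracy."""
    baseline = accuracy - margin_pp
    return round(100.0 * margin_pp / baseline, 2)


implied = {name: implied_baseline(acc, m) for name, (acc, m) in reported.items()}
for name, (acc, m) in reported.items():
    print(name, implied[name], relative_gain(acc, m))
```

A margin in percentage points is an absolute difference between two accuracies, whereas the relative gain divides that difference by the baseline, so the same margin counts for more on a dataset where the baseline accuracy is lower.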

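Key words (translated from Chinese): facial attribute estimation, facial expression recognition, attention mechanism, fine-grained feature, feature difference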

CLC Number: