Adaptive hybrid network for affective computing in student classroom

doi:10.11772/j.issn.1001-9081.2023091303

Journal of Computer Applications ›› 2024, Vol. 44 ›› Issue (9): 2919-2930.DOI: 10.11772/j.issn.1001-9081.2023091303

• Multimedia computing and computer simulation • Previous Articles Next Articles

Adaptive hybrid network for affective computing in student classroom

Yan RONG¹^,², Jiawen LIU¹, Xinlei LI¹()

^1.School of Statistics and Information，Shanghai University of International Business and Economics，Shanghai 201620，China
^2.The Hong Kong University of Science and Technology （Guangzhou），Guangzhou Guangdong 511458，China

Received:2023-09-20 Revised:2023-11-24 Accepted:2023-12-01 Online:2024-01-31 Published:2024-09-10
Contact: Xinlei LI
About author:RONG Yan， born in 2001， M. S. candidate. Her research interests include computer vision， affective computing.
LIU Jiawen， born in 1998， M. S. candidate. Her research interests include multimodal learning.
Supported by:
“Technology Innovation Action Plan” Sailing Program of Science and Technology Commission of Shanghai Municipality(22YF1415000);Shanghai University Young Teacher Training Program of Educational Commission of Shanghai Municipality(B3A010023045001);National College Students Innovation and Entrepreneurship Training Program(202210273042)

面向学生课堂情感计算的自适应混合网络

戎妍¹^,², 刘嘉雯¹, 李馨蕾¹()

^1.上海对外经贸大学统计与信息学院，上海 201620
^2.香港科技大学（广州），广州 511458

通讯作者: 李馨蕾
作者简介:戎妍（2001—），女，江苏丹阳人，硕士研究生，CCF会员，主要研究方向：计算机视觉、情感计算
刘嘉雯（1998—），女，江苏常州人，硕士研究生，主要研究方向：多模态学习；
基金资助:
上海市科委“科技创新行动计划”启明星项目（扬帆专项）(22YF1415000);上海市教委上海高校青年教师培养资助计划项目(B3A010023045001);国家级大学生创新创业训练计划项目(202210273042)

Abstract

Abstract:

Affective computing can provide a better teaching effectiveness and learning experience for intelligent education. Current research on affective computing in classroom domain still suffers from limited adaptability and weak perception on complex scenarios. To address these challenges， a novel hybrid architecture was proposed， namely SC-ACNet， aiming at accurate affective computing for students in classroom. In the architecture， the followings were included： a multi-scale student face detection module capable of adapting to small targets， an affective computing module with an adaptive spatial structure that can adapt to different facial postures to recognize five emotions （calm， confused， jolly， sleepy， and surprised） of students in classroom， and a self-attention module that visualized the regions of the model contributing most to the results. In addition， a new student classroom dataset， SC-ACD， was constructed to alleviate the lack of face emotion image datasets in classroom. Experimental results on SC-ACD dataset show that SC-ACNet improves the mean Average Precision （mAP） by 4.2 percentage points and the accuracy of affective computing by 9.1 percentage points compared with the baseline method YOLOv7. Furthermore， SC-ACNet has the accuracies of 0.972 and 0.994 on common sentiment datasets， namely KDEF and RaFD， validating the viability of the proposed method as a promising solution to elevate the quality of teaching and learning in intelligent classroom.

Key words: affective computing, face detection, hybrid architecture, intelligent classroom, multi-scale feature

摘要：

情感计算可以为智慧教育提供更好的教学效果和学习体验。目前针对课堂领域的情感计算研究仍存在有限的适应性与对复杂场景的感知能力较弱的问题。针对这一挑战，提出一种混合架构SC-ACNet，旨在对学生课堂进行准确的情感计算。该架构包含一个能适应小目标的多尺度学生面部检测模块；一个能适应不同面部姿态的、具有自适应空间结构的情感计算模块，对学生的5种课堂情感（平静、困惑、愉悦、困倦和惊讶）进行准确识别；以及一个自注意力模块，以可视化模型中对结果产生主要贡献的区域。此外，为缓解课堂环境下学生面部情绪图像数据集匮乏的问题，构建了一个学生课堂数据集SC-ACD。在SC-ACD数据集上的实验结果表明，与基线方法YOLOv7相比，SC-ACNet的平均精度均值（mAP）提升了4.2个百分点，情感计算准确率提升了9.1个百分点；此外，SC-ACNet在KDEF和RaFD数据集上的准确率分别达到了0.972和0.994，验证了SC-ACNet可作为提高智慧课堂教学质量的有前途的解决方案。

关键词: 情感计算, 人脸检测, 混合架构, 智慧课堂, 多尺度特征

CLC Number:

TP391.4

Yan RONG, Jiawen LIU, Xinlei LI. Adaptive hybrid network for affective computing in student classroom[J]. Journal of Computer Applications, 2024, 44(9): 2919-2930.

戎妍, 刘嘉雯, 李馨蕾. 面向学生课堂情感计算的自适应混合网络[J]. 《计算机应用》唯一官方网站, 2024, 44(9): 2919-2930.

Figures/Tables 25

Fig. 1 Examples of some sample quality differences

Fig. 2 Structure of SC-ACNet

Fig. 3 Structure of improved backbone network and various modules

Fig. 4 Neck module of improved detection network

Fig. 5 Structure of GAM attention

Fig. 6 Structure of CoordRep module

Fig. 7 Examples of face alignment

Fig. 8 Structure of DCN-MobileViT module

Fig. 9 Some sample images of students in classroom

Fig. 10 Sample images of student facial emotion labels

Fig. 11 Sample images of different head postures and non-target components

Tab. 1 Influence of SAC-ELANBLOCK module on model performance

$S B S 1$	$S B S 2$	P	R	mAP	$m A P 50$
		0.969	0.949	0.556	0.980
􀳫		0.968	0.950	0.570	0.979
	􀳫	0.970	0.954	0.571	0.983
􀳫	􀳫	0.980	0.953	0.571	0.985

Tab. 1 Influence of SAC-ELANBLOCK module on model performance

$S B S 1$	$S B S 2$	P	R	mAP	$m A P 50$
		0.969	0.949	0.556	0.980
􀳫		0.968	0.950	0.570	0.979
	􀳫	0.970	0.954	0.571	0.983
􀳫	􀳫	0.980	0.953	0.571	0.985

Tab. 2 Influence of GAM structure on model performance

GAM₁	GAM₂	GAM₃	P	R	mAP	$m A P 50$
			0.969	0.949	0.556	0.980
􀳫			0.966	0.971	0.587	0.989
	􀳫		0.970	0.968	0.605	0.987
		􀳫	0.987	0.967	0.548	0.990
􀳫	􀳫	􀳫	0.984	0.978	0.576	0.991

Tab. 2 Influence of GAM structure on model performance

GAM₁	GAM₂	GAM₃	P	R	mAP	$m A P 50$
			0.969	0.949	0.556	0.980
􀳫			0.966	0.971	0.587	0.989
	􀳫		0.970	0.968	0.605	0.987
		􀳫	0.987	0.967	0.548	0.990
􀳫	􀳫	􀳫	0.984	0.978	0.576	0.991

Tab. 3 Comparison results of different attention mechanisms

注意力机制	P	R	mAP	$m A P 50$
SE^［34］	0.973	0.951	0.556	0.985
CBAM^［27］	0.957	0.946	0.552	0.977
Polarized Self-Attention^［35］	0.975	0.951	0.529	0.980
CoordAttention^［36］	0.976	0.949	0.566	0.982
Sequential Self-Attention^［37］	0.975	0.970	0.570	0.984
SimAM^［38］	0.974	0.953	0.560	0.979
TripletAttention^［39］	0.979	0.938	0.542	0.979
GAM	0.984	0.978	0.576	0.991

Tab. 3 Comparison results of different attention mechanisms

注意力机制	P	R	mAP	$m A P 50$
SE^［34］	0.973	0.951	0.556	0.985
CBAM^［27］	0.957	0.946	0.552	0.977
Polarized Self-Attention^［35］	0.975	0.951	0.529	0.980
CoordAttention^［36］	0.976	0.949	0.566	0.982
Sequential Self-Attention^［37］	0.975	0.970	0.570	0.984
SimAM^［38］	0.974	0.953	0.560	0.979
TripletAttention^［39］	0.979	0.938	0.542	0.979
GAM	0.984	0.978	0.576	0.991

Tab. 4 Comparison results of different loss functions in face detection module

损失函数	P	R	mAP	$m A P 50$
CIoU	0.969	0.949	0.556	0.980
SIoU^［40］	0.975	0.949	0.556	0.980
AlphaIoU^［41］	0.944	0.938	0.556	0.975
FocalEIoU^［42］	0.984	0.946	0.572	0.981
EIoU	0.979	0.956	0.567	0.983
CIoU+NWD	0.979	0.949	0.565	0.982
WiseIoU^［43］+NWD	0.972	0.944	0.526	0.974
NWD-EIoU	0.985	0.949	0.581	0.984

Tab. 4 Comparison results of different loss functions in face detection module

损失函数	P	R	mAP	$m A P 50$
CIoU	0.969	0.949	0.556	0.980
SIoU^［40］	0.975	0.949	0.556	0.980
AlphaIoU^［41］	0.944	0.938	0.556	0.975
FocalEIoU^［42］	0.984	0.946	0.572	0.981
EIoU	0.979	0.956	0.567	0.983
CIoU+NWD	0.979	0.949	0.565	0.982
WiseIoU^［43］+NWD	0.972	0.944	0.526	0.974
NWD-EIoU	0.985	0.949	0.581	0.984

Tab. 5 Ablation experiment results of improvement modules in face detection module

SEB	GAM	CoordRep	NWD-EIoU	P	R	mAP	$m A P 50$
				0.969	0.949	0.556	0.980
􀳫				0.980	0.953	0.571	0.985
	􀳫			0.984	0.978	0.576	0.991
		􀳫		0.970	0.968	0.605	0.987
			􀳫	0.985	0.949	0.581	0.984
􀳫	􀳫	􀳫	􀳫	0.994	0.986	0.598	0.994

Tab. 5 Ablation experiment results of improvement modules in face detection module

SEB	GAM	CoordRep	NWD-EIoU	P	R	mAP	$m A P 50$
				0.969	0.949	0.556	0.980
􀳫				0.980	0.953	0.571	0.985
	􀳫			0.984	0.978	0.576	0.991
		􀳫		0.970	0.968	0.605	0.987
			􀳫	0.985	0.949	0.581	0.984
􀳫	􀳫	􀳫	􀳫	0.994	0.986	0.598	0.994

Tab. 6 Comparison results of different loss functions in affective computing module

损失函数	Acc	R	F1
Cross EntropyLoss	0.828	0.824	0.814
LabelSmoothLoss	0.826	0.813	0.810
SeesawLoss	0.829	0.845	0.830
Focal Loss	0.832	0.832	0.820
ASL（clip=0.5）	0.849	0.814	0.808
ASL	0.853	0.843	0.841

Tab. 7 Ablation experiment results of DCNv2， feature fusion， and loss function

DCNv2	特征融合	ASL	Acc	R	F1
			0.832	0.832	0.820
		􀳫	0.853	0.843	0.841
	􀳫		0.845	0.842	0.823
􀳫			0.861	0.857	0.846
􀳫	􀳫	􀳫	0.923	0.912	0.907

Fig. 12 Comparison of heat maps for each module

Tab. 8 Comparison results of different object detection algorithms in face detection module

算法	one-stage	Anchor	Backbone	mAP	$m A P 50$	$m A P 75$	浮点运算量/GFLOPs	Params/10⁶
DETR^［7］	􀳫		R-50	0.437	0.868	0.386	5.170	41.280
FSAF^［8］	􀳫		R-50	0.725	0.988	0.866	9.950	36.010
YOLOX^［9］	􀳫		YOLOX-S	0.301	0.679	0.209	1.630	8.940
YOLOv6^［10］	􀳫	􀳫	YOLOv6-s	0.689	0.983	0.815	2.681	17.187
RetinaNet^［12］	􀳫	􀳫	R-50-FPN	0.718	0.968	0.866	10.050	36.100
Grid RCNN^［13］		􀳫	R-50	0.698	0.979	0.855	136.830	64.240
Cascade RCNN^［14］		􀳫	R-50-FPN	0.743	0.979	0.907	51.150	68.930
Faster RCNN^［15］		􀳫	R-50-FPN	0.718	0.980	0.885	23.350	23.350
TOOD^［16］	􀳫		R-50	0.732	0.989	0.890	8.860	31.790
FCOS^［17］	􀳫		R-50	0.705	0.979	0.847	9.660	31.840
Deformable DETR^［44］	􀳫		R-50	0.529	0.928	0.537	11.010	39.820
SC-ACNet	􀳫	􀳫	本文网络	0.748	0.995	0.902	6.338	36.503

Tab. 8 Comparison results of different object detection algorithms in face detection module

算法	one-stage	Anchor	Backbone	mAP	$m A P 50$	$m A P 75$	浮点运算量/GFLOPs	Params/10⁶
DETR^［7］	􀳫		R-50	0.437	0.868	0.386	5.170	41.280
FSAF^［8］	􀳫		R-50	0.725	0.988	0.866	9.950	36.010
YOLOX^［9］	􀳫		YOLOX-S	0.301	0.679	0.209	1.630	8.940
YOLOv6^［10］	􀳫	􀳫	YOLOv6-s	0.689	0.983	0.815	2.681	17.187
RetinaNet^［12］	􀳫	􀳫	R-50-FPN	0.718	0.968	0.866	10.050	36.100
Grid RCNN^［13］		􀳫	R-50	0.698	0.979	0.855	136.830	64.240
Cascade RCNN^［14］		􀳫	R-50-FPN	0.743	0.979	0.907	51.150	68.930
Faster RCNN^［15］		􀳫	R-50-FPN	0.718	0.980	0.885	23.350	23.350
TOOD^［16］	􀳫		R-50	0.732	0.989	0.890	8.860	31.790
FCOS^［17］	􀳫		R-50	0.705	0.979	0.847	9.660	31.840
Deformable DETR^［44］	􀳫		R-50	0.529	0.928	0.537	11.010	39.820
SC-ACNet	􀳫	􀳫	本文网络	0.748	0.995	0.902	6.338	36.503

Tab. 9 Comparison results of different algorithms in affective computing module

模型	SC-ACD			KDEF			RaFD			浮点运算量/GFLOPs	Params/10⁶
模型	Acc	R	F1	Acc	R	F1	Acc	R	F1	浮点运算量/GFLOPs	Params/10⁶
ConvNext^［45］	0.525	0.429	0.409	0.652	0.653	0.652	0.155	0.278	0.187	5.84	27.83
MobileNet v2^［46］	0.863	0.814	0.818	0.848	0.848	0.847	0.741	0.714	0.683	0.41	2.23
EfficientNet^［47］	0.847	0.814	0.818	0.956	0.955	0.955	0.989	0.992	0.990	0.52	4.02
ShuffleNet v2^［48］	0.863	0.857	0.857	0.924	0.919	0.918	0.941	0.945	0.911	0.19	1.26
DenseNet^［49］	0.133	0.283	0.181	0.632	0.636	0.622	0.131	0.136	0.109	3.74	6.96
CSPNet^［50］	0.825	0.845	0.819	0.666	0.666	0.666	0.975	0.975	0.974	6.57	27.64
VAN^［51］	0.850	0.825	0.819	0.954	0.954	0.953	0.991	0.989	0.990	1.13	3.85
PoolFormer^［52］	0.888	0.858	0.860	0.962	0.961	0.961	0.974	0.972	0.972	2.38	11.41
MViTv2^［53］	0.775	0.736	0.730	0.968	0.968	0.977	0.995	0.996	0.996	6.41	23.41
Swin Transformer v2^［54］	0.875	0.843	0.843	0.975	0.978	0.976	0.964	0.972	0.967	5.96	27.58
ConvMixer^［55］	0.825	0.775	0.773	0.728	0.701	0.689	0.954	0.961	0.956	28.83	20.35
Twins^［56］	0.775	0.712	0.706	0.576	0.574	0.569	0.013	0.125	0.024	5.06	23.60
SC-ACNet	0.923	0.913	0.908	0.972	0.972	0.971	0.994	0.997	0.996	2.03	4.94

Fig. 13 Visualization results of face detection module on test set

Fig. 14 Affection classification results of affective computing module on test set

Fig. 15 Affection classification results of SC-ACNet in different head postures

Fig. 16 Prediction results and attention graphs of SC-ACNet on KDEF dataset

References 59

1	WANG Y， SONG W， TAO W， et al. A systematic review on affective computing： emotion models， databases， and recent advances ［J］. Information Fusion， 2022， 83： 19-52.
2	MEHRABIAN A. Communication without words ［M］// MORTENSEN C D. Communication Theory. 2nd ed. New York： Routledge， 2008： 193-200.
3	周进，叶俊民，李超. 多模态学习情感计算：动因、框架与建议［J］. 电化教育研究， 2021， 42（7）： 26-32.
	ZHOU J， YE J M， LI C. Multimodal learning affective computing： motivations， frameworks， and suggestions ［J］. e-Education Research， 2021， 42（7）： 26-32.
4	WEN J， JIANG D， TU G， et al. Dynamic interactive multiview memory network for emotion recognition in conversation ［J］. Information Fusion， 2023， 91： 123-133.
5	SUN B， WU Y， ZHAO K， et al. Student class behavior dataset： a video dataset for recognizing， detecting， and captioning students’ behaviors in classroom scenes ［J］. Neural Computing and Applications， 2021， 33（14）： 8335-8354.
6	MASUD U， SAEED T， MALAIKAH H M， et al. Smart assistive system for visually impaired people obstruction avoidance through object detection and classification ［J］. IEEE Access， 2022， 10： 13428-13441.
7	CARION N， MASSA F， SYNNAEVE G， et al. End-to-end object detection with Transformers ［C］// Proceedings of the 16th European Conference on Computer Vision， LNCS 12346. Cham： Springer， 2020： 213-229.
8	ZHU C， HE Y， SAVVIDES M. Feature selective anchor-free module for single-shot object detection ［C］// Proceedings of the 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway： IEEE， 2019： 840-849.
9	GE Z， LIU S， WANG F， et al. YOLOX： exceeding YOLO series in 2021 ［EB/OL］. ［2023-11-15］. .
10	LI C， LI L， JIANG H， et al. YOLOv6： a single-stage object detection framework for industrial applications ［EB/OL］. （2022-09-07）［2023-11-18］. .
11	WANG C Y， BOCHKOVSKIY A， LIAO H Y M. YOLOv7： trainable bag-of-freebies sets new state-of-the-art for real-time object detectors ［C］// Proceedings of the 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway： IEEE， 2023： 7464-7475.
12	LIN T Y， GOYAL P， GIRSHICK R， et al. Focal loss for dense object detection ［C］// Proceedings of the 2017 IEEE International Conference on Computer Vision. Piscataway： IEEE， 2017： 2999-3007.
13	LU X， LI B， YUE Y， et al. Grid R-CNN ［C］// Proceedings of the 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway： IEEE， 2019： 7355-7364.
14	CAI Z， VASCONCELOS N. Cascade R-CNN： delving into high quality object detection ［C］// Proceedings of the 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway： IEEE， 2018： 6154-6162.
15	REN S， HE K， GIRSHICK R， et al. Faster R-CNN： towards real-time object detection with region proposal networks ［C］// Proceedings of the 28th International Conference on Neural Information Processing Systems. Cambridge： MIT Press， 2015： 91-99.
16	FENG C， ZHONG Y， GAO Y， et al. TOOD： task-aligned one-stage object detection ［C］// Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision. Piscataway： IEEE， 2021： 3490-3499.
17	TIAN Z， SHEN C， CHEN H， et al. FCOS： fully convolutional one-stage object detection ［C］// Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision. Piscataway： IEEE， 2019： 9626-9635.
18	ZHONG P， WANG D， MIAO C. EEG-based emotion recognition using regularized graph neural networks ［J］. IEEE Transactions on Affective Computing， 2022， 13（3）： 1290-1301.
19	YE F， PU S， ZHONG Q， et al. Dynamic GCN： context-enriched topology learning for skeleton-based action recognition ［C］// Proceedings of the 28th ACM International Conference on Multimedia. New York： ACM， 2020： 55-63.
20	GUPTA S， KUMAR P， TEKCHANDANI R K. Facial emotion recognition based real-time learner engagement detection system in online learning context using deep learning models ［J］. Multimedia Tools and Applications， 2023， 82（8）： 11365-11394.
21	HOU C， AI J， LIN Y， et al. Evaluation of online teaching quality based on facial expression recognition ［J］. Future Internet， 2022， 14（6）： No.177.
22	DONG Z， JI X， LAI C S， et al. Memristor-based hierarchical attention network for multimodal affective computing in mental health monitoring ［J］. IEEE Consumer Electronics Magazine， 2023， 12（4）： 94-106.
23	CALVO M G， LUNDQVIST D. Facial expressions of emotion （KDEF）： identification under different display-duration conditions［J］. Behavior Research Methods， 2008， 40（1）： 109-115.
24	LANGNER O， DOTSCH R， BIJLSTRA G， et al. Presentation and validation of the Radboud faces database ［J］. Cognition and Emotion， 2010， 24（8）： 1377-1388.
25	QIAO S， CHEN L C， YUILLE A. DetectoRS： detecting objects with recursive feature pyramid and switchable atrous convolution ［C］// Proceedings of the 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway： IEEE， 2021： 10208-10219.
26	LIU Y， SHAO Z， HOFFMANN N. Global attention mechanism： retain information to enhance channel-spatial interactions［EB/OL］. ［2023-10-13］. .
27	WOO S， PARK J， LEE J Y， et al. CBAM： convolutional block attention module ［C］// Proceedings of the 2018 European Conference on Computer Vision. Cham： Springer， 2018： 3-19.
28	LIU R， LEHMAN J， MOLINO P， et al. An intriguing failing of convolutional neural networks and the CoordConv solution ［C］// Proceedings of the 32nd International Conference on Neural Information Processing Systems. Red Hook： Curran Associates Inc.， 2018： 9628-9639.
29	YANG Z， WANG X， LI J. EIoU： an improved vehicle detection algorithm based on VehicleNet neural network ［J］. Journal of Physics： Conference Series， 2021， 1924： No.012001.
30	XU C， WANG J， YANG W， et al. Detecting tiny objects in aerial images： a normalized Wasserstein distance and a new benchmark［J］. ISPRS Journal of Photogrammetry and Remote Sensing， 2022， 190： 79-93.
31	MEHTA S， RASTEGARI M. MobileViT： light-weight， general-purpose， and mobile-friendly vision transformer［EB/OL］. （2022-03-04）［2023-08-02］. .
32	ZHU X， HU H， LIN S， et al. Deformable ConvNets v2： more deformable， better results ［C］// Proceedings of the 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway： IEEE， 2019： 9300-9308.
33	RIDNIK T， BEN-BARUCH E， ZAMIR N， et al. Asymmetric loss for multi-label classification ［C］// Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision. Piscataway： IEEE， 2021： 82-91.
34	HU J， SHEN L， SUN G. Squeeze-and-excitation networks ［C］// Proceedings of the 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway： IEEE， 2018： 7132-7141.
35	LIU H， LIU F， FAN X， et al. Polarized self-attention： towards high-quality pixel-wise mapping ［J］. Neurocomputing， 2022， 506： 158-167.
36	HOU Q， ZHOU D， FENG J. Coordinate attention for efficient mobile network design ［C］// Proceedings of the 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway： IEEE， 2021： 13708-13717.
37	FAN Z， LIU Z， WANG Y， et al. Sequential recommendation via stochastic self-attention ［C］// Proceedings of the ACM Web Conference 2022. New York： ACM， 2022： 2036-2047.
38	YANG L， ZHANG R Y， LI L， et al. SimAM： a simple， parameter-free attention module for convolutional neural networks［C］// Proceedings of the 38th International Conference on Machine Learning. New York： JMLR.org， 2021： 11863-11874.
39	MISRA D， NALAMADA T， ARASANIPALAI A U， et al. Rotate to attend： convolutional triplet attention module ［C］// Proceedings of the 2021 IEEE Winter Conference on Applications of Computer Vision. Piscataway： IEEE， 2021： 3138-3147.
40	GEVORGYAN Z. SIoU loss： more powerful learning for bounding box regression［EB/OL］. （2022-05-25）［2023-03-29］. .
41	HE J， ERFANI S， MA X， et al. Alpha-IoU： a family of power intersection over union losses for bounding box regression ［C］// Proceedings of the 35th Conference on Neural Information Processing Systems. Red Hook： Curran Associates Inc.， 2021： 20230-20242.
42	ZHANG Y F， REN W， ZHANG Z， et al. Focal and efficient IOU loss for accurate bounding box regression ［J］. Neurocomputing， 2022， 506： 146-157.
43	TONG Z， CHEN Y， XU Z， et al. Wise-IoU： bounding box regression loss with dynamic focusing mechanism ［EB/OL］. ［2023-10-14］. .
44	ZHU X， SU W， LU L， et al. Deformable DETR： deformable Transformers for end-to-end object detection ［EB/OL］. （2021-03-18）［2023-11-11］. .
45	LIU Z， MAO H， WU C Y， et al. A ConvNet for the 2020s ［C］// Proceedings of the 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway： IEEE， 2022： 11966-11976.
46	SANDLER M， HOWARD A， ZHU M， et al. MobileNetV2： inverted residuals and linear bottlenecks ［C］// Proceedings of the 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway： IEEE， 2018： 4510-4520.
47	TAN M， LE Q V. EfficientNet： rethinking model scaling for convolutional neural networks ［C］// Proceedings of the 36th International Conference on Machine Learning. New York： JMLR.org， 2019： 6105-6114.
48	MA N， ZHANG X， ZHENG H T， et al. ShuffleNet V2： practical guidelines for efficient CNN architecture design ［C］// Proceedings of the 15th European Conference on Computer Vision， LNCS 11218. Cham： Springer， 2018： 122-138.
49	HUANG G， LIU Z， VAN DER MAATEN L， et al. Densely connected convolutional networks ［C］// Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition. Piscataway： IEEE， 2017： 2261-2269.
50	WANG C Y， LIAO H Y M， WU Y H， et al. CSPNet： a new backbone that can enhance learning capability of CNN ［C］// Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops. Piscataway： IEEE， 2020： 1571-1580.
51	GUO M H， LU C Z， LIU Z N， et al. Visual attention network ［J］. Computational Visual Media， 2023， 9（4）： 733-752.
52	YU W， LUO M， ZHOU P， et al. MetaFormer is actually what you need for vision ［C］// Proceedings of the 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway： IEEE， 2022： 10809-10819.
53	LI Y， WU C Y， FAN H， et al. MViTv2： improved multiscale vision Transformers for classification and detection ［C］// Proceedings of the 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway： IEEE， 2022： 4794-4804.
54	LIU Z， HU H， LIN Y， et al. Swin Transformer V2： scaling up capacity and resolution ［C］// Proceedings of the 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway： IEEE， 2022： 11999-12009.
55	NG D， CHEN Y， TIAN B， et al. ConvMixer： feature interactive convolution with curriculum learning for small footprint and noisy far-field keyword spotting ［C］// Proceedings of the 2022 IEEE International Conference on Acoustics， Speech and Signal Processing. Piscataway： IEEE， 2022： 3603-3607.
56	CHU X， TIAN Z， WANG Y， et al. Twins： revisiting the design of spatial attention in vision Transformers ［C］// Proceedings of the 35th Conference on Neural Information Processing Systems. Red Hook： Curran Associates Inc.， 2021： 9355-9366.
57	ZHANG C， ZHANG C， ZHENG S， et al. A complete survey on generative AI （AIGC）： is ChatGPT from GPT-4 to GPT-5 all you need？［EB/OL］. ［2023-03-21］. .
58	KIRILLOV A， MINTUN E， RAVI N， et al. Segment anything［EB/OL］. ［2023-06-11］. .
59	AMIN M M， CAMBRIA E， SCHULLER B W. Will affective computing emerge from foundation models and general artificial intelligence？ A first evaluation of ChatGPT ［J］. IEEE Intelligent Systems， 2023， 38（2）： 15-23.

[1]	Tong CHEN, Fengyu YANG, Yu XIONG, Hong YAN, Fuxing QIU. Construction method of voiceprint library based on multi-scale frequency-channel attention fusion [J]. Journal of Computer Applications, 2024, 44(8): 2407-2413.
[2]	Zhonghua LI, Yunqi BAI, Xuejin WANG, Leilei HUANG, Chujun LIN, Shiyu LIAO. Low illumination face detection based on image enhancement [J]. Journal of Computer Applications, 2024, 44(8): 2588-2594.
[3]	Hongtian LI, Xinhao SHI, Weiguo PAN, Cheng XU, Bingxin XU, Jiazheng YUAN. Few-shot object detection via fusing multi-scale and attention mechanism [J]. Journal of Computer Applications, 2024, 44(5): 1437-1444.
[4]	Zhanjun JIANG, Baijing WU, Long MA, Jing LIAN. Faster-RCNN water-floating garbage recognition based on multi-scale feature and polarized self-attention [J]. Journal of Computer Applications, 2024, 44(3): 938-944.
[5]	Hao YANG, Yi ZHANG. Feature pyramid network algorithm based on context information and multi-scale fusion importance awareness [J]. Journal of Computer Applications, 2023, 43(9): 2727-2734.
[6]	Hong WANG, Qing QIAN, Huan WANG, Yong LONG. Lightweight image tamper localization algorithm based on large kernel attention convolution [J]. Journal of Computer Applications, 2023, 43(9): 2692-2699.
[7]	Shuai ZHENG, Xiaolong ZHANG, He DENG, Hongwei REN. 3D liver image segmentation method based on multi-scale feature fusion and grid attention mechanism [J]. Journal of Computer Applications, 2023, 43(7): 2303-2310.
[8]	Chunlan ZHAN, Anzhi WANG, Minghui WANG. Camouflage object segmentation method based on channel attention and edge fusion [J]. Journal of Computer Applications, 2023, 43(7): 2166-2172.
[9]	Zhouhua ZHU, Qi QI. Automatic detection and recognition of electric vehicle helmet based on improved YOLOv5s [J]. Journal of Computer Applications, 2023, 43(4): 1291-1296.
[10]	You YANG, Ruhui ZHANG, Pengcheng XU, Kang KANG, Hao ZHAI. Improved U-Net for seal segmentation of Republican archives [J]. Journal of Computer Applications, 2023, 43(3): 943-948.
[11]	Xin ZHAO, Qianqian ZHU, Cong ZHAO, Jialing WU. Segmentation of breast nodules in ultrasound images based on multi-scale and cross-spatial fusion [J]. Journal of Computer Applications, 2023, 43(11): 3599-3606.
[12]	LYU Yuchao, JIANG Xi, XU Yinghao, ZHU Xijun. Improved brachial plexus nerve segmentation method based on multi-scale feature fusion [J]. Journal of Computer Applications, 2023, 43(1): 273-279.
[13]	Zanxia QIANG, Xianfu BAO. Residual attention deraining network based on convolutional long short-term memory [J]. Journal of Computer Applications, 2022, 42(9): 2858-2864.
[14]	Tianhao QIU, Shurong CHEN. EfficientNet based dual-branch multi-scale integrated learning for pedestrian re-identification [J]. Journal of Computer Applications, 2022, 42(7): 2065-2071.
[15]	HAN Jiandong, LI Xiaoyu. Pedestrian re-identification method based on multi-scale feature fusion [J]. Journal of Computer Applications, 2021, 41(10): 2991-2996.

Adaptive hybrid network for affective computing in student classroom

面向学生课堂情感计算的自适应混合网络

RichHTML

PDF

Knowledge

Abstract

Cite this article

share this article

Figures/Tables 25

References 59

Related Articles 15

Recommended Articles

Metrics