Gradient-discriminative and feature norm-driven open-world object detection

doi:10.11772/j.issn.1001-9081.2024070944

Journal of Computer Applications, 2025, Vol. 45, Issue (7): 2203-2210. DOI: 10.11772/j.issn.1001-9081.2024070944


Yingjun ZHANG¹, Weiwei YAN¹, Binhong XIE¹, Rui ZHANG¹, Wangdong LU²

1. College of Computer Science and Technology, Taiyuan University of Science and Technology, Taiyuan Shanxi 030024, China
2. Shanxi Tianhe Cloud Computing Company Limited, Lyuliang Shanxi 033000, China

Received: 2024-07-08 Revised: 2024-10-09 Accepted: 2024-10-09 Online: 2025-07-10 Published: 2025-07-10
Contact: Weiwei YAN (yww1374670805@163.com)
About the authors:
ZHANG Yingjun, born in 1969, M. S., professor of engineering. His research interests include intelligent software and software architecture.
YAN Weiwei, born in 1999, M. S. candidate. Her research interests include open-world object detection.
XIE Binhong, born in 1971, M. S., professor. His research interests include intelligent software and machine learning.
ZHANG Rui, born in 1987, Ph. D., associate professor. His research interests include intelligent information processing.
LU Wangdong, born in 1970, M. S., senior engineer. His research interests include signal and information systems.
Supported by: Basic Research Program of Shanxi Province (20210302123216); High-Level Scientific and Technological Talents Introduction Key Research and Development Project of Lyuliang City (2022RC08)


Abstract:

Open-World Object Detection (OWOD) extends object detection to real, changing environments: a model must accurately identify both known and unknown objects and learn new classes incrementally. To address the low recall for unknown classes and the false identification problem in existing OWOD methods, a Gradient-Discriminative and Feature Norm-driven OWOD (GDFN-OWOD) network model was proposed. To raise unknown-class recall, a Gradient-Discriminative Representation Module (GDRM) was proposed, which uses differences in backpropagated gradients to distinguish unknown classes from the background. In addition, a Graph Segmentation-based Bounding box Clustering (GSBC) algorithm was introduced, which models the determination of object bounding boxes as a graph decomposition problem, thereby reducing redundant bounding boxes and, in turn, the computational cost of the model. To reduce false identification of unknown classes, a Feature Norm-Based Classifier (FN-BC) was employed, which selects the best-performing convolutional layer for distinguishing known from unknown classes and thus achieves higher identification precision. Experimental results on the M-OWODB dataset show that, compared with the best-performing comparison model in tasks T₁, T₂, and T₃, GDFN-OWOD increases unknown-class recall by 1.1, 2.1, and 0.9 percentage points, respectively, and reduces the Absolute Open-Set Error (A-OSE) by 35.1%, 28.7%, and 12.2%, respectively. Compared with existing OWOD methods, the proposed method therefore effectively alleviates both the low recall for unknown classes and false identification.

Key words: Open-World Object Detection (OWOD), backpropagation gradient, graph segmentation algorithm, feature norm, Convolutional Neural Network (CNN)
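The abstract describes GSBC only at a high level: bounding-box selection is treated as a graph decomposition problem so that redundant boxes are merged. The sketch below illustrates that general idea and is not the authors' algorithm. It assumes that boxes are graph nodes, that an edge joins two boxes whose IoU exceeds a threshold, and that each connected component keeps its highest-scoring box. The names `iou`, `cluster_boxes`, and `iou_threshold` are hypothetical.

```python
from typing import Dict, List, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2)


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def cluster_boxes(boxes: List[Box], scores: List[float],
                  iou_threshold: float = 0.5) -> List[int]:
    """Return the index of the best-scoring box in each graph component.

    Assumption: connected components of the IoU graph stand in for the
    graph decomposition step; the paper's actual criterion may differ.
    """
    n = len(boxes)
    parent = list(range(n))  # union-find forest over box indices

    def find(i: int) -> int:
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    # Add an edge (merge components) when two boxes overlap strongly.
    for i in range(n):
        for j in range(i + 1, n):
            if iou(boxes[i], boxes[j]) > iou_threshold:
                parent[find(i)] = find(j)

    best: Dict[int, int] = {}  # component root -> best box index
    for i in range(n):
        root = find(i)
        if root not in best or scores[i] > scores[best[root]]:
            best[root] = i
    return sorted(best.values())


if __name__ == "__main__":
    demo_boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (50, 50, 60, 60)]
    demo_scores = [0.9, 0.8, 0.7]
    print(cluster_boxes(demo_boxes, demo_scores))
```

In the demo, the first two boxes overlap heavily and collapse into one cluster, while the third box stays on its own, so two representatives remain out of three candidates.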


CLC Number: TP391.4

Yingjun ZHANG, Weiwei YAN, Binhong XIE, Rui ZHANG, Wangdong LU. Gradient-discriminative and feature norm-driven open-world object detection[J]. Journal of Computer Applications, 2025, 45(7): 2203-2210.


References

[1]	GUO Q M, LIU N B, WANG Z X, et al. Review of deep learning based object detection algorithms [J]. Journal of Detection and Control, 2023, 45(6): 10-20.
[2]	JOSEPH K J, KHAN S, KHAN F S, et al. Towards open world object detection [C]// Proceedings of the 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2021: 5826-5836.
[3]	GUPTA A, NARAYAN S, JOSEPH K J, et al. OW-DETR: open-world detection transformer [C]// Proceedings of the 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2022: 9225-9234.
[4]	MA S, WANG Y, WEI Y, et al. CAT: loCalization and identificAtion cascade detection Transformer for open-world object detection [C]// Proceedings of the 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2023: 19681-19690.
[5]	ZOHAR O, WANG K C, YEUNG S. PROB: probabilistic objectness for open world object detection [C]// Proceedings of the 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2023: 11444-11453.
[6]	TIAN L, LI H, LI L X, et al. Open world object detection based on feature disentanglement [J]. Journal of Chongqing University of Technology (Natural Science), 2023, 37(10): 166-173.
[7]	XIE B H, ZHANG P J, ZHANG R. Open world object detection combining graph-FPN and robust optimization [J]. Journal of Frontiers of Computer Science and Technology, 2023, 17(12): 2954-2966.
[8]	XIE B H, TANG B, ZHANG R. UBA-OWDT: a novel network of open world object detection [J/OL]. Computer Engineering and Applications [2024-06-07].
[9]	WU Z, LU Y, CHEN X, et al. UC-OWOD: unknown-classified open world object detection [C]// Proceedings of the 2022 European Conference on Computer Vision, LNCS 13670. Cham: Springer, 2022: 193-210.
[10]	ZEILER M D, FERGUS R. Visualizing and understanding convolutional networks [C]// Proceedings of the 2014 European Conference on Computer Vision, LNCS 8689. Cham: Springer, 2014: 818-833.
[11]	SPRINGENBERG J T, DOSOVITSKIY A, BROX T, et al. Striving for simplicity: the all convolutional net [EB/OL]. [2024-05-22].
[12]	GOODFELLOW I J, SHLENS J, SZEGEDY C. Explaining and harnessing adversarial examples [EB/OL]. [2024-05-22].
[13]	KURAKIN A, GOODFELLOW I, BENGIO S. Adversarial machine learning at scale [EB/OL]. [2024-05-22].
[14]	PERRONNIN F, DANCE C. Fisher kernels on visual vocabularies for image categorization [C]// Proceedings of the 2007 IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2007: 1-8.
[15]	HENDRYCKS D, GIMPEL K. A baseline for detecting misclassified and out-of-distribution examples in neural networks [EB/OL]. [2024-05-22].
[16]	LIU W, WANG X, OWENS J D, et al. Energy-based out-of-distribution detection [EB/OL]. [2024-05-22].
[17]	KIM M, JAIN A K, LIU X. AdaFace: quality adaptive margin for face recognition [C]// Proceedings of the 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2022: 18729-18738.
[18]	DHAMIJA A R, GÜNTHER M, BOULT T E. Reducing network agnostophobia [C]// Proceedings of the 32nd International Conference on Neural Information Processing Systems. Red Hook: Curran Associates Inc., 2018: 9175-9186.
[19]	YU J, MA L, LI Z, et al. Open-world object detection via discriminative class prototype learning [C]// Proceedings of the 2022 IEEE International Conference on Image Processing. Piscataway: IEEE, 2022: 626-630.
[20]	CHEN Y, MA L, JING L, et al. BSDP: brain-inspired streaming dual-level perturbations for online open world object detection [J]. Pattern Recognition, 2024, 152: No.110472.

Task	Semantic split	# training images	# training instances	# test images	# test instances
T₁	VOC Classes	16 551	47 223	4 952	14 976
T₂	Outdoor, Accessories, Appliance, Truck	45 520	113 741	1 914	4 966
T₃	Sports, Food	39 402	114 452	1 642	4 826
T₄	Electronic, Indoor, Kitchen, Furniture	40 260	138 996	1 738	6 039


Dataset	Images	Known-class instances	Unknown-class instances
VOC_Pretest	200	5.09	0.00
COCO-OOD	504	0.00	3.28


Task	Metric	Classes	Faster R-CNN	Faster R-CNN+Finetuning	ORE-EBUI [2]	UC-OWOD [9]	OCPL [19]	OW-DETR [3]	BSDP [20]	GDFN-OWOD
T₁	U-Recall/%		—	—	4.9	2.4	8.3	7.5	8.3	9.4
T₁	mAP/%	Current known	60.3	60.3	56.0	50.7	56.6	59.2	56.2	61.2
T₂	U-Recall/%		—	—	2.9	3.4	7.7	6.2	7.4	9.8
T₂	mAP/%	Previously known	0.7	57.6	52.7	33.1	50.6	53.6	53.3	54.8
T₂	mAP/%	Current known	35.2	34.0	26.0	30.5	27.5	33.5	23.7	35.4
T₂	mAP/%	Both	17.9	45.3	39.4	31.8	39.1	42.9	38.5	44.6
T₃	U-Recall/%		—	—	3.9	8.7	11.9	5.7	10.3	12.8
T₃	mAP/%	Previously known	0.3	43.8	38.2	28.8	38.7	38.3	39.3	40.5
T₃	mAP/%	Current known	23.5	22.3	12.7	16.3	14.7	15.8	13.4	17.6
T₃	mAP/%	Both	8.0	36.6	29.7	24.6	30.7	30.8	30.7	32.1
T₄	mAP/%	Previously known	0.7	35.6	29.6	25.6	30.7	31.4	30.7	32.6
T₄	mAP/%	Current known	20.1	19.5	12.4	15.9	14.4	17.1	12.9	18.3
T₄	mAP/%	Both	5.5	31.5	25.3	23.2	26.7	27.8	26.3	28.4


Model	T₁ U-Recall/%	T₁ WI	T₁ A-OSE	T₂ U-Recall/%	T₂ WI	T₂ A-OSE	T₃ U-Recall/%	T₃ WI	T₃ A-OSE
ORE-EBUI [2]	4.9	0.062 1	10 459	2.9	0.028 2	10 445	3.9	0.021 1	7 990
OW-DETR [3]	7.5	0.057 1	10 240	6.2	0.027 8	8 441	5.7	0.015 6	6 803
OCPL [19]	8.3	0.041 3	5 670	7.6	0.022 0	5 690	11.9	0.016 2	5 166
BSDP [20]	8.3	0.042 7	5 520	7.4	0.024 3	5 386	10.3	0.016 8	4 308
GDFN-OWOD	9.4	0.030 6	3 581	9.8	0.021 1	3 842	12.8	0.015 3	3 783
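The A-OSE reductions quoted in the abstract can be checked against this table. The short calculation below assumes that "best performance of comparison models" means the lowest A-OSE among the four compared models in each task (BSDP in all three cases) and that the reduction is relative to that value. The function name `relative_reduction` is only illustrative.

```python
def relative_reduction(baseline: float, ours: float) -> float:
    """Percentage by which `ours` is lower than `baseline`."""
    return (baseline - ours) / baseline * 100.0


# Lowest A-OSE among the compared models in each task (BSDP), from the table.
best_baseline = {"T1": 5520, "T2": 5386, "T3": 4308}
# A-OSE of GDFN-OWOD, from the table.
gdfn_owod = {"T1": 3581, "T2": 3842, "T3": 3783}

reductions = {
    task: round(relative_reduction(best_baseline[task], gdfn_owod[task]), 1)
    for task in gdfn_owod
}

if __name__ == "__main__":
    for task, value in reductions.items():
        print(f"{task}: A-OSE reduced by {value}%")
```

Under these assumptions the calculation gives 35.1%, 28.7%, and 12.2% for T₁, T₂, and T₃, matching the figures in the abstract.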

Network architecture	Selected block	Output size	Depth
ResNet50	Block 4.2	2 048×7×7	N-1
VGG16	Layer 13	512×14×14	N
MobileNetV3	Block 17	960×7×7	N
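The table lists the block whose output FN-BC reads for each backbone, but this page does not give the exact norm. As a rough illustration only, the sketch below assumes one common definition of a feature norm: rectify the C×H×W activation map, take the L2 norm of each channel, and average over channels. It also assumes that a low norm indicates an unknown object. The names `feature_norm`, `is_unknown`, and `threshold` are hypothetical and do not come from the paper.

```python
import math
from typing import List

FeatureMap = List[List[List[float]]]  # channels x height x width


def feature_norm(fmap: FeatureMap) -> float:
    """Mean over channels of the L2 norm of each ReLU-rectified channel."""
    per_channel = []
    for channel in fmap:
        squared_sum = sum(max(v, 0.0) ** 2 for row in channel for v in row)
        per_channel.append(math.sqrt(squared_sum))
    return sum(per_channel) / len(per_channel)


def is_unknown(fmap: FeatureMap, threshold: float) -> bool:
    """Assumed decision rule: a weak response marks the object as unknown."""
    return feature_norm(fmap) < threshold


if __name__ == "__main__":
    # Toy 2x1x2 activation map; a real block output would be e.g. 2048x7x7.
    toy = [[[3.0, 4.0]], [[-1.0, 0.0]]]
    print(feature_norm(toy), is_unknown(toy, threshold=3.0))
```

In practice the threshold would have to be calibrated on held-out data, and the block would be chosen by comparing candidate layers, as the "Selected block" column suggests.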

Experiment	G	B	F	T₁ U-Recall/%	T₁ mAP/%	T₁ A-OSE	T₂ U-Recall/%	T₂ mAP/%	T₂ A-OSE	T₃ U-Recall/%	T₃ mAP/%	T₃ A-OSE	T₄ mAP/%
Control				—	60.5	10 459	—	17.9	10 440	—	8.0	7 990	5.5
Experiment 1	✓	✓		—	60.3	8 955	—	18.9	9 792	—	12.5	7 245	12.8
Experiment 2		✓	✓	2.8	60.4	3 862	2.6	40.5	3 957	3.0	31.4	3 829	26.7
Experiment 3	✓		✓	9.4	61.1	3 580	9.8	44.6	3 824	12.8	32.1	3 782	28.4
Experiment 4	✓	✓	✓	9.4	61.2	3 581	9.8	44.6	3 842	12.8	32.1	3 783	28.4


Model	RPC	Inference speed/(frame·s⁻¹)
GDFN-OWOD-B+N	16.92	8.21
GDFN-OWOD	4.95	13.86
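To make the gap between the two rows concrete, the ratios can be computed directly from the table values. Nothing is assumed here beyond the numbers shown; the variable names are illustrative, and the table itself does not expand the abbreviation RPC.

```python
# Values copied from the table above.
rpc_variant, fps_variant = 16.92, 8.21  # row "GDFN-OWOD-B+N"
rpc_full, fps_full = 4.95, 13.86        # row "GDFN-OWOD"

# Relative drop in RPC and relative gain in inference speed.
rpc_reduction = (rpc_variant - rpc_full) / rpc_variant * 100.0
speedup = fps_full / fps_variant

if __name__ == "__main__":
    print(f"RPC reduced by {rpc_reduction:.1f}%")
    print(f"Inference speed ratio: {speedup:.2f}x")
```

From these values, GDFN-OWOD has an RPC about 70.7% lower than the GDFN-OWOD-B+N row and runs at roughly 1.69 times its frame rate.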
