Journal of Computer Applications ›› 2022, Vol. 42 ›› Issue (11): 3558-3563. DOI: 10.11772/j.issn.1001-9081.2021122122

• ChinaVR 2021 •

Object detection algorithm combined with optimized feature extraction structure

Nan XIANG, Chuanzhong PAN, Gaoxiang YU

  1. Liangjiang International College, Chongqing University of Technology, Chongqing 401135, China
  • Received: 2021-12-17 Revised: 2022-02-13 Accepted: 2022-02-14 Online: 2022-03-02 Published: 2022-11-10
  • Contact: Nan XIANG
  • About author: XIANG Nan, born in 1984, Ph. D., associate professor. His research interests include affective computing, social computing, and object detection.
    PAN Chuanzhong, born in 1995, M. S. candidate. His research interests include object detection.
    YU Gaoxiang, born in 1995, M. S. candidate. His research interests include object detection.
  • Supported by:
    National Natural Science Foundation of China (61872051); Science and Technology Research Program of Chongqing Municipal Education Commission (KJQN202001118); Application Research Project of Banan Science and Technology Commission (2018TJ02)

融合优化特征提取结构的目标检测算法

向南, 潘传忠, 虞高翔

  1. 重庆理工大学 两江国际学院,重庆 401135
  • 通讯作者: 向南
  • 作者简介:向南(1984—),男,陕西旬阳人,副教授,博士,CCF会员,主要研究方向:情感计算、社交计算、目标检测 xiangnan@cqut.edu.cn
    潘传忠(1995—),男,湖北咸宁人,硕士研究生,主要研究方向:目标检测
    虞高翔(1995—),男,江西上饶人,硕士研究生,主要研究方向:目标检测。
  • 基金资助:
    国家自然科学基金资助项目(61872051);重庆市教委科学技术研究计划项目(KJQN202001118);巴南区科委应用研究项目(2018TJ02)

Abstract:

To address the low detection precision of DEtection TRansformer (DETR) on small targets, an object detection algorithm with an optimized feature extraction structure, called CF-DETR (DETR combined with CSP-Darknet53 and Feature Pyramid Network), was proposed on the basis of DETR. Firstly, CSP-Darknet53, which incorporates an optimized Cross Stage Partial (CSP) network, was used to extract features from the original image and output feature maps at 4 scales. Secondly, the Feature Pyramid Network (FPN) was used to down-sample and up-sample the 4 feature maps and then splice and fuse them into a single 52×52 feature map. Finally, this feature map was combined with the positional encoding information and input into the Transformer to obtain a feature sequence, which was passed to Feed-Forward Networks (FFNs) serving as the prediction heads to output the category and location of each predicted object. On the COCO2017 dataset, compared with DETR, CF-DETR reduced the number of model hyperparameters by 2×10^6, improved the average detection precision on small objects by 2.1 percentage points, and improved the average detection precision on medium- and large-sized objects by 2.3 percentage points. Experimental results show that the optimized feature extraction structure can effectively improve the detection precision of DETR while reducing the number of model hyperparameters.

Key words: object detection, small target, DEtection TRansformer (DETR) algorithm, feature extraction, Cross Stage Partial (CSP) network, Feature Pyramid Network (FPN), Transformer
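
The multi-scale fusion step described in the abstract can be illustrated at the level of tensor shapes. The following is a minimal pure-Python sketch, not the authors' implementation: the 416×416 input size, the backbone strides (4, 8, 16, 32), the use of average pooling and nearest-neighbour upsampling, and all function names are assumptions chosen for illustration. Only the number of scales (4) and the fused 52×52 size come from the abstract.

```python
# Shape-level sketch of fusing 4 feature-map scales into one 52x52 map.
# Everything except "4 scales" and "52x52" is an illustrative assumption.

INPUT_SIZE = 416            # assumed square input resolution
STRIDES = (4, 8, 16, 32)    # assumed backbone strides for the 4 scales
TARGET = 52                 # fused feature-map size stated in the abstract


def scale_sizes(input_size, strides):
    """Spatial side length of each backbone feature map."""
    return [input_size // s for s in strides]


def downsample(fmap, factor):
    """Average-pool a square 2D map (list of lists) by an integer factor."""
    n = len(fmap) // factor
    return [
        [
            sum(fmap[i * factor + di][j * factor + dj]
                for di in range(factor) for dj in range(factor)) / factor ** 2
            for j in range(n)
        ]
        for i in range(n)
    ]


def upsample(fmap, factor):
    """Nearest-neighbour upsampling of a square 2D map by an integer factor."""
    n = len(fmap) * factor
    return [[fmap[i // factor][j // factor] for j in range(n)] for i in range(n)]


def resample_to(fmap, target):
    """Bring a square map to target x target by pooling or upsampling."""
    size = len(fmap)
    if size > target:
        return downsample(fmap, size // target)
    if size < target:
        return upsample(fmap, target // size)
    return fmap


def fuse(fmaps, target):
    """Resample every scale to the target size and stack them as channels."""
    return [resample_to(f, target) for f in fmaps]


sizes = scale_sizes(INPUT_SIZE, STRIDES)
# One constant-valued, single-channel map per scale stands in for real features.
fmaps = [[[float(k)] * s for _ in range(s)] for k, s in enumerate(sizes)]
fused = fuse(fmaps, TARGET)
# Flattening the fused map yields one token per spatial position.
seq_len = TARGET * TARGET
```

Under these assumptions the four scales have side lengths 104, 52, 26 and 13, so the largest map is pooled by 2 while the two smallest are upsampled by 2 and 4 before stacking. A real network would operate on multi-channel tensors and would typically follow the concatenation with a learned convolution.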

摘要:

针对DETR对小目标的检测精度低的问题,基于DETR提出一种优化特征提取结构的目标检测算法——CF-DETR。首先通过结合了优化跨阶段部分(CSP)网络的CSP-Darknet53对原始图进行特征提取并输出4种尺度的特征图;其次利用特征金字塔网络(FPN)对4种尺度特征图进行下采样和上采样后进行拼接融合,并输出52×52尺寸的特征图;最后将该特征图与位置编码信息结合输入Transformer后得到特征序列,输入到作为预测头的前向反馈网络后输出预测目标的类别与位置信息。在COCO2017数据集上,与DETR相比,CF-DETR的模型的超参数量减少了2×10^6,在小目标上的平均检测精度提高2.1个百分点,在中、大尺寸目标上的平均检测精度提高了2.3个百分点。实验结果表明,优化特征提取结构能够在降低模型超参数量的同时有效提高DETR的检测精度。

关键词: 目标检测, 小目标, DETR算法, 特征提取, 跨阶段部分网络, 特征金字塔网络, Transformer

CLC Number: