Journal of Computer Applications ›› 2022, Vol. 42 ›› Issue (7): 2201-2209. DOI: 10.11772/j.issn.1001-9081.2021050734

• Multimedia computing and computer simulation •

Lightweight object detection algorithm based on improved YOLOv4

Zhifeng ZHONG, Yifan XIA, Dongping ZHOU, Yangtian YAN

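  1. School of Computer Science and Information Engineering, Hubei University, Wuhan 430062
  • Received: 2021-05-10 Revised: 2021-09-22 Accepted: 2021-09-24 Online: 2021-09-22 Published: 2022-07-10
  • Corresponding author: Yifan XIA
  • About the authors: ZHONG Zhifeng (born 1971), male, from Huanggang, Hubei, professor, Ph.D.; main research interests: artificial intelligence, signal processing, system integration.
    ZHOU Dongping (born 1997), male, from Suizhou, Hubei, M.S. candidate; main research interests: recommender systems, knowledge graphs.
    YAN Yangtian (born 1997), male, from Xiaogan, Hubei, M.S. candidate; main research interests: deep learning, natural language processing.
  • Supported by:
    Hubei Province Technological Innovation Special Project (2018ACA13)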

Lightweight object detection algorithm based on improved YOLOv4

Zhifeng ZHONG, Yifan XIA, Dongping ZHOU, Yangtian YAN

  1. School of Computer Science and Information Engineering, Hubei University, Wuhan Hubei 430062, China
  • Received: 2021-05-10 Revised: 2021-09-22 Accepted: 2021-09-24 Online: 2021-09-22 Published: 2022-07-10
  • Contact: Yifan XIA
  • About the authors: ZHONG Zhifeng, born in 1971, Ph.D., professor. His research interests include artificial intelligence, signal processing, and system integration.
    ZHOU Dongping, born in 1997, M.S. candidate. His research interests include recommender systems and knowledge graphs.
    YAN Yangtian, born in 1997, M.S. candidate. His research interests include deep learning and natural language processing.
  • Supported by:
    Hubei Province Technological Innovation Special Project (2018ACA13)

Abstract:

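To address the complex structure, large parameter count, high hardware requirements for training, and low Frames Per Second (FPS) in real-time detection of the current YOLOv4 object detection network, a lightweight object detection algorithm based on YOLOv4, named ML-YOLO, is proposed. First, the backbone feature extraction network of YOLOv4 is replaced with the MobileNetv3 structure, so that the depthwise separable convolutions in MobileNetv3 greatly reduce the number of backbone parameters. Then, the feature fusion network of YOLOv4 is replaced with a simplified weighted Bi-directional Feature Pyramid Network (Bi-FPN) structure, so that the attention mechanism in Bi-FPN improves detection accuracy. Finally, the final prediction boxes are generated by the YOLOv4 decoding algorithm, which completes object detection. Experimental results on the VOC2007 dataset show that ML-YOLO reaches a mean Average Precision (mAP) of 80.22%, which is 3.42 percentage points lower than YOLOv4 and 2.82 percentage points higher than YOLOv5m; the ML-YOLO model size is only 44.75 MB, which is 199.54 MB smaller than YOLOv4 and only 2.85 MB larger than YOLOv5m. These results show that the proposed ML-YOLO model is far smaller than the YOLOv4 model while maintaining relatively high detection accuracy, indicating that the algorithm can meet the lightweight and accuracy requirements of object detection on mobile or embedded devices.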
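The parameter saving attributed to depthwise separable convolution can be illustrated with a simple weight count. The sketch below is not taken from the paper: the helper names and the layer shape (256 input channels, 512 output channels, 3 x 3 kernel) are assumptions chosen only for illustration, and bias terms are ignored.

```python
def standard_conv_params(c_in: int, c_out: int, k: int) -> int:
    """Weights in a k x k standard convolution (bias omitted)."""
    return k * k * c_in * c_out


def depthwise_separable_params(c_in: int, c_out: int, k: int) -> int:
    """Weights in a k x k depthwise convolution plus a 1 x 1 pointwise convolution."""
    depthwise = k * k * c_in   # one k x k filter per input channel
    pointwise = c_in * c_out   # 1 x 1 convolution that mixes channels
    return depthwise + pointwise


# Hypothetical layer shape, chosen only for illustration.
c_in, c_out, k = 256, 512, 3
std = standard_conv_params(c_in, c_out, k)
sep = depthwise_separable_params(c_in, c_out, k)
ratio = sep / std  # equals 1 / c_out + 1 / k**2
print(f"standard: {std}, separable: {sep}, ratio: {ratio:.4f}")
```

For a 3 x 3 kernel the ratio approaches 1/9 as the number of output channels grows, which is why swapping a standard-convolution backbone for a MobileNet-style one shrinks the model substantially.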

Key words: object detection, lightweight network, YOLOv4, MobileNetv3, weighted Bi-directional Feature Pyramid Network

Abstract:

The YOLOv4 (You Only Look Once version 4) object detection network has a complex structure, many parameters, high hardware requirements for training, and low Frames Per Second (FPS) in real-time detection. To solve these problems, a lightweight object detection algorithm based on YOLOv4, named ML-YOLO (MobileNetv3Lite-YOLO), was proposed. Firstly, MobileNetv3 was used to replace the backbone feature extraction network of YOLOv4, so that the depthwise separable convolution in MobileNetv3 greatly reduced the number of backbone network parameters. Then, a simplified weighted Bi-directional Feature Pyramid Network (Bi-FPN) structure was used to replace the feature fusion network of YOLOv4, so that the attention mechanism in Bi-FPN improved the object detection accuracy. Finally, the final prediction boxes were generated by the YOLOv4 decoding algorithm, completing object detection. Experimental results on the VOC (Visual Object Classes) 2007 dataset show that the mean Average Precision (mAP) of the ML-YOLO algorithm reaches 80.22%, which is 3.42 percentage points lower than that of the YOLOv4 algorithm and 2.82 percentage points higher than that of the YOLOv5m algorithm. At the same time, the model size of the ML-YOLO algorithm is only 44.75 MB, which is 199.54 MB smaller than that of the YOLOv4 algorithm and only 2.85 MB larger than that of the YOLOv5m algorithm. These results show that the proposed ML-YOLO model is much smaller than the YOLOv4 model while maintaining relatively high detection accuracy, indicating that the proposed algorithm can meet the lightweight and accuracy requirements of object detection on mobile or embedded devices.
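The weighted fusion in Bi-FPN can be sketched as follows. This is a minimal illustration of the fast normalized fusion rule from the original Bi-FPN design, in which each input feature map gets a learnable non-negative weight and the weights are normalized by their sum. The abstract does not specify how the simplified Bi-FPN in ML-YOLO is built, so the function name, the flattened 1-D feature lists, and the weight values here are assumptions for illustration only.

```python
def fast_normalized_fusion(features, weights, eps=1e-4):
    """Fuse same-length feature vectors with non-negative, normalized weights."""
    if len(features) != len(weights):
        raise ValueError("need exactly one weight per feature map")
    clipped = [max(w, 0.0) for w in weights]  # ReLU keeps every weight non-negative
    total = sum(clipped) + eps                # eps avoids division by zero
    size = len(features[0])
    return [
        sum(w * f[i] for w, f in zip(clipped, features)) / total
        for i in range(size)
    ]


# Hypothetical flattened feature maps from two pyramid paths.
p_top_down = [1.0, 2.0, 3.0]
p_lateral = [3.0, 2.0, 1.0]
fused = fast_normalized_fusion([p_top_down, p_lateral], [1.0, 1.0])
print(fused)
```

In a trained network the weights would be learned parameters, so a path that carries more useful information for a given scale ends up contributing more to the fused feature.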

Key words: object detection, lightweight network, YOLOv4 (You Only Look Once version 4), MobileNetv3, Bi-FPN (weighted Bi-directional Feature Pyramid Network)
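The abstract reports ML-YOLO's results as differences from the two baselines, so the baseline figures can be recovered with simple arithmetic. The sketch below only rearranges numbers stated in the abstract; the derived baseline values are implied rather than quoted, and the variable names are illustrative.

```python
# Figures stated in the abstract.
ml_yolo_map = 80.22        # mAP in percent
ml_yolo_size_mb = 44.75    # model size in MB

# Baselines implied by the reported differences.
yolov4_map = round(ml_yolo_map + 3.42, 2)             # ML-YOLO is 3.42 points lower
yolov5m_map = round(ml_yolo_map - 2.82, 2)            # ML-YOLO is 2.82 points higher
yolov4_size_mb = round(ml_yolo_size_mb + 199.54, 2)   # ML-YOLO is 199.54 MB smaller
yolov5m_size_mb = round(ml_yolo_size_mb - 2.85, 2)    # ML-YOLO is 2.85 MB larger

# Relative size reduction with respect to YOLOv4, in percent.
size_reduction_pct = round(100 * 199.54 / yolov4_size_mb, 1)

print(yolov4_map, yolov5m_map, yolov4_size_mb, yolov5m_size_mb, size_reduction_pct)
```

Read this way, the trade-off is a model roughly one fifth the size of YOLOv4 at a cost of 3.42 percentage points of mAP.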

CLC Number: