Journal of Computer Applications ›› 2024, Vol. 44 ›› Issue (5): 1619-1628. DOI: 10.11772/j.issn.1001-9081.2023050675

• Multimedia Computing and Computer Simulation •

YOLOv5 multi-attribute classification based on separable label collaborative learning

Xin LI, Qiao MENG, Junyi HUANGFU, Lingchen MENG

  1. Department of Computer Technology and Applications, Qinghai University, Xining, Qinghai 810016, China
  • Received: 2023-06-01 Revised: 2023-09-17 Accepted: 2023-10-11 Online: 2023-10-17 Published: 2024-05-10
  • Contact: Qiao MENG
  • About author: LI Xin, born in 1995, M.S. candidate. His research interests include intelligent transportation, computer vision.
    HUANGFU Junyi, born in 1998, M.S. His research interests include image processing, video analysis.
    MENG Lingchen, born in 1999, M.S. candidate. His research interests include intelligent transportation.
    Corresponding author: MENG Qiao, born in 1983, Ph.D., lecturer, CCF member. Her research interests include intelligent transportation, information system engineering.
  • Supported by:
    Natural Science Foundation of Qinghai Province (2023-ZJ-989Q)



Abstract:

A Multi-YOLOv5 method for vehicle multi-attribute classification was proposed on the basis of YOLOv5 to address two challenges in image classification tasks: the insufficient ability of convolutional networks to extract fine-grained image features, and their inability to recognize dependencies between multiple attributes. A collaborative working mechanism of Multi-head Non-Maximum Suppression (Multi-NMS) and a separable label loss (Separate-Loss) function was designed to complete the multi-attribute classification task for vehicles. Additionally, the YOLOv5 detection model was reconstructed with the Convolutional Block Attention Module (CBAM), Shuffle Attention (SA), and CoordConv to improve multi-attribute feature extraction, strengthen the correlations between different attributes, and enhance the network's perception of positional information, thereby improving the accuracy of the model in multi-attribute classification of objects. Finally, training and testing were conducted on the VeRi and other datasets. Experimental results demonstrate that Multi-YOLOv5 achieves better recognition results in multi-attribute classification of objects than network architectures such as GoogLeNet, Residual Network (ResNet), EfficientNet, and Vision Transformer (ViT). The mean Average Precision (mAP) of Multi-YOLOv5 reaches 87.37% on the VeRi dataset, an improvement of 4.47 percentage points over the best-performing method mentioned above. Moreover, Multi-YOLOv5 exhibits better robustness than the original YOLOv5 model, providing reliable data for traffic object perception in dense environments.
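As a rough, framework-free sketch of the separable-label idea described in the abstract (not the paper's actual Separate-Loss implementation), a multi-attribute head can split one flat logit vector into per-attribute groups and sum a cross-entropy term per group; the function names, grouping, and example class counts below are illustrative assumptions:

```python
import math

def _softmax(xs):
    # Numerically stable softmax over one attribute's logit group.
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

def separate_label_loss(logits, targets, group_sizes):
    """Sum of per-attribute cross-entropies over separated label groups.

    logits: flat list covering all attribute groups back to back.
    targets: one ground-truth class index per attribute group.
    group_sizes: number of classes in each attribute group.
    """
    loss, start = 0.0, 0
    for size, target in zip(group_sizes, targets):
        probs = _softmax(logits[start:start + size])
        loss -= math.log(probs[target])
        start += size
    return loss

# Example: 5 hypothetical color classes followed by 3 vehicle-type classes.
uniform = separate_label_loss([0.0] * 8, [2, 0], [5, 3])        # maximally unsure
confident = separate_label_loss([0, 0, 10, 0, 0, 10, 0, 0],     # correct classes
                                [2, 0], [5, 3])                 # strongly favored
```

With uniform logits the loss is ln 5 + ln 3 ≈ 2.71, and it shrinks as the per-attribute predictions become confident and correct — the behavior a separated per-attribute loss needs so that each attribute is supervised independently while sharing one backbone.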

Key words: multi-attribute classification, deep learning, multi-feature fusion, attention, YOLOv5
