Journal of Computer Applications, 2023, Vol. 43, Issue (8): 2564-2571. DOI: 10.11772/j.issn.1001-9081.2023050586

Special Topic: Multimedia Computing and Computer Simulation

Robotic grasp detection in low-light environment by incorporating visual feature enhancement mechanism

Gan LI1, Mingdi NIU1,2, Lu CHEN1,2,3, Jing YANG4, Tao YAN1,2, Bin CHEN5,6

  1. School of Computer and Information Technology, Shanxi University, Taiyuan Shanxi 030006, China
    2. Institute of Big Data Science and Industry, Shanxi University, Taiyuan Shanxi 030006, China
    3. Technology Department, Taiyuan Satellite Launch Center, Taiyuan Shanxi 030027, China
    4. School of Automation and Software Engineering, Shanxi University, Taiyuan Shanxi 030031, China
    5. Chongqing Research Institute, Harbin Institute of Technology, Chongqing 401151, China
    6. International Institute of Artificial Intelligence, Harbin Institute of Technology (Shenzhen), Shenzhen Guangdong 518055, China
  • Received: 2023-05-16  Revised: 2023-06-12  Accepted: 2023-06-16  Online: 2023-08-07  Published: 2023-08-10
  • Contact: Lu CHEN
  • About author: LI Gan, born in 2001 in Lüliang, Shanxi. His research interests include grasp detection and deep learning.
    NIU Mingdi, born in 2000 in Pingdingshan, Henan, M.S. candidate. His research interests include grasp detection and image enhancement.
    YANG Jing, born in 1990 in Taiyuan, Shanxi, Ph.D., lecturer. Her research interests include machine learning and image processing.
    YAN Tao, born in 1987 in Dingxiang, Shanxi, Ph.D., associate professor, CCF member. His research interests include 3D reconstruction.
    CHEN Bin, born in 1970 in Guanghan, Sichuan, Ph.D., professor. His research interests include machine vision.
  • Supported by:
    National Natural Science Foundation of China (62003200); Fundamental Research Program of Shanxi Province (202203021222010); Science and Technology Major Program of Shanxi Province (202201020101006)

Abstract:

Existing robotic grasping operations are usually performed under well-illuminated conditions, where object details are clear and regional contrast is high. In low-light environments such as nighttime or occlusion, however, the visual features of objects become weak, and the detection accuracy of existing robotic grasp detection models drops dramatically. To improve the representation of sparse, weak grasp features in low-light scenarios, a grasp detection model incorporating a visual feature enhancement mechanism was proposed, in which feature enhancement constraints were imposed on grasp detection through a visual enhancement sub-task. In the grasp detection module, a U-Net-like encoder-decoder structure was adopted to achieve efficient feature fusion. In the low-light enhancement module, texture information was extracted at the local level and color information at the global level, so that feature enhancement balances object details against overall visual effect. In addition, two new low-light grasp benchmark datasets, the low-light Cornell dataset and the low-light Jacquard dataset, were constructed, and comparative experiments were conducted on them. Experimental results show that the proposed low-light grasp detection model achieves accuracies of 95.5% and 87.4% on the two benchmarks respectively; compared with existing grasp detection models such as the Generative Grasping Convolutional Neural Network (GG-CNN) and the Generative Residual Convolutional Neural Network (GR-ConvNet), its accuracy is 11.1 and 1.2 percentage points higher on the low-light Cornell dataset and 5.5 and 5.0 percentage points higher on the low-light Jacquard dataset respectively, indicating good grasp detection performance.
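
To make the two-branch design described above concrete, below is a minimal PyTorch-style sketch of how a shared U-Net-like encoder-decoder could feed both grasp detection heads and a low-light enhancement head. It assumes GG-CNN/GR-ConvNet-style grasp outputs (pixel-wise quality, angle, and width maps) and splits the enhancement branch into a per-pixel local-texture head and an image-level global-color head; all module names, channel widths, and loss weights are illustrative assumptions, not the authors' implementation.

import torch
import torch.nn as nn
import torch.nn.functional as F


def conv_block(c_in, c_out):
    # Two 3x3 convolutions: the basic building block of the U-Net-like backbone.
    return nn.Sequential(
        nn.Conv2d(c_in, c_out, 3, padding=1), nn.BatchNorm2d(c_out), nn.ReLU(inplace=True),
        nn.Conv2d(c_out, c_out, 3, padding=1), nn.BatchNorm2d(c_out), nn.ReLU(inplace=True),
    )


class LowLightGraspNet(nn.Module):
    def __init__(self):
        super().__init__()
        # Encoder-decoder with skip connections, shared by both sub-tasks.
        self.enc1 = conv_block(3, 32)
        self.enc2 = conv_block(32, 64)
        self.bottleneck = conv_block(64, 128)
        self.dec2 = conv_block(128 + 64, 64)
        self.dec1 = conv_block(64 + 32, 32)
        # Grasp heads in the GG-CNN/GR-ConvNet style (assumption): pixel-wise
        # grasp quality, angle (encoded as sin/cos of 2*theta), and gripper width.
        self.q_head = nn.Conv2d(32, 1, 1)
        self.ang_head = nn.Conv2d(32, 2, 1)
        self.w_head = nn.Conv2d(32, 1, 1)
        # Enhancement branch: local texture detail from full-resolution features,
        # global color gains from the pooled bottleneck.
        self.texture_head = nn.Conv2d(32, 3, 3, padding=1)
        self.color_head = nn.Linear(128, 3)

    def forward(self, x):
        e1 = self.enc1(x)
        e2 = self.enc2(F.max_pool2d(e1, 2))
        b = self.bottleneck(F.max_pool2d(e2, 2))
        d2 = self.dec2(torch.cat([F.interpolate(b, scale_factor=2.0), e2], dim=1))
        d1 = self.dec1(torch.cat([F.interpolate(d2, scale_factor=2.0), e1], dim=1))
        quality, angle, width = self.q_head(d1), self.ang_head(d1), self.w_head(d1)
        # Enhanced image = local texture map modulated by per-image color gains.
        gains = torch.sigmoid(self.color_head(b.mean(dim=(2, 3))))  # shape (N, 3)
        enhanced = torch.sigmoid(self.texture_head(d1)) * gains[:, :, None, None]
        return quality, angle, width, enhanced


if __name__ == "__main__":
    model = LowLightGraspNet()
    x = torch.rand(2, 3, 224, 224)  # a batch of low-light RGB inputs
    q, ang, w, enh = model(x)
    # Joint objective: the enhancement term constrains the shared features.
    # Random tensors stand in for ground-truth grasp maps and normal-light
    # reference images; the 0.5 weight is an arbitrary placeholder.
    loss = (F.mse_loss(q, torch.rand_like(q))
            + 0.5 * F.l1_loss(enh, torch.rand_like(enh)))
    loss.backward()
    print(q.shape, enh.shape)  # torch.Size([2, 1, 224, 224]) torch.Size([2, 3, 224, 224])

In this sketch the enhancement loss back-propagates through the shared encoder-decoder, which is one plausible way to realize the feature enhancement constraint described in the abstract; the paper itself may couple the two modules differently.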

Key words: robot, grasp detection, low-light imaging, deep neural network, visual enhancement
