Journal of Computer Applications ›› 2023, Vol. 43 ›› Issue (8): 2564-2571. DOI: 10.11772/j.issn.1001-9081.2023050586
Special topic: Multimedia computing and computer simulation
Gan LI1, Mingdi NIU1,2, Lu CHEN1,2,3, Jing YANG4, Tao YAN1,2, Bin CHEN5,6
Received: 2023-05-16; Revised: 2023-06-12; Accepted: 2023-06-16; Online: 2023-08-07; Published: 2023-08-10
Contact: Lu CHEN
About author: LI Gan (born 2001, Lüliang, Shanxi, China). His research interests include grasp detection and deep learning.
Abstract: Existing robotic grasping is usually performed under good illumination, where target details are clear and regional contrast is high. In low-light environments such as nighttime or occlusion, however, the visual features of targets are weak, so the detection accuracy of existing grasp detection models drops sharply. To improve the representation of sparse, weak grasp features in low-light scenes, a grasp detection model incorporating a visual feature enhancement mechanism was proposed, in which a visual enhancement sub-task imposes feature enhancement constraints on grasp detection. For the grasp detection module, an encoder-decoder structure modeled on the U-Net framework was adopted to fuse features efficiently; for the low-light enhancement module, texture and color information were extracted at the local and global levels respectively, achieving feature enhancement that accounts for both target details and overall visual quality. In addition, two new low-light grasping benchmarks, the low-light Cornell dataset and the low-light Jacquard dataset, were constructed, and comparative experiments were conducted on them. Experimental results show that the proposed low-light grasp detection model reaches accuracies of 95.5% and 87.4% on the two benchmarks. Compared with existing grasp detection models such as the Generative Grasping Convolutional Neural Network (GG-CNN) and the Generative Residual Convolutional Neural Network (GR-ConvNet), the accuracy is improved by 11.1 and 1.2 percentage points on the low-light Cornell dataset and by 5.5 and 5.0 percentage points on the low-light Jacquard dataset, demonstrating good grasp detection performance.
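The abstract couples an encoder-decoder grasp detector with a low-light enhancement sub-task that constrains the shared features. As a rough, non-authoritative illustration of that multi-task idea only (the paper's actual layers, heads, channel widths and loss weights are not given in this excerpt, so every name and number below is an assumption), a PyTorch-style sketch might look like this:

```python
# Minimal sketch (not the authors' exact architecture): a U-Net-style shared
# encoder-decoder with a grasp head and an enhancement head, where the
# enhancement loss constrains the features used for grasp detection.
import torch
import torch.nn as nn
import torch.nn.functional as F


def conv_block(in_ch, out_ch):
    return nn.Sequential(
        nn.Conv2d(in_ch, out_ch, 3, padding=1), nn.BatchNorm2d(out_ch), nn.ReLU(inplace=True),
        nn.Conv2d(out_ch, out_ch, 3, padding=1), nn.BatchNorm2d(out_ch), nn.ReLU(inplace=True),
    )


class GraspEnhanceNet(nn.Module):
    """Shared encoder-decoder with a grasp head (quality/angle/width maps,
    GG-CNN-style outputs) and an enhancement head (restored RGB image)."""

    def __init__(self):
        super().__init__()
        self.enc1, self.enc2 = conv_block(3, 32), conv_block(32, 64)
        self.bottleneck = conv_block(64, 128)
        self.up2 = nn.ConvTranspose2d(128, 64, 2, stride=2)
        self.dec2 = conv_block(128, 64)
        self.up1 = nn.ConvTranspose2d(64, 32, 2, stride=2)
        self.dec1 = conv_block(64, 32)
        self.grasp_head = nn.Conv2d(32, 4, 1)    # quality, cos(2θ), sin(2θ), width
        self.enhance_head = nn.Conv2d(32, 3, 1)  # enhanced RGB image

    def forward(self, x):
        e1 = self.enc1(x)
        e2 = self.enc2(F.max_pool2d(e1, 2))
        b = self.bottleneck(F.max_pool2d(e2, 2))
        d2 = self.dec2(torch.cat([self.up2(b), e2], dim=1))
        d1 = self.dec1(torch.cat([self.up1(d2), e1], dim=1))
        return self.grasp_head(d1), torch.sigmoid(self.enhance_head(d1))


def joint_loss(grasp_pred, grasp_gt, enhanced_pred, bright_gt, lam=0.5):
    # Grasp regression loss plus an enhancement constraint on the shared features.
    return F.smooth_l1_loss(grasp_pred, grasp_gt) + lam * F.l1_loss(enhanced_pred, bright_gt)
```

In such a design the enhancement head mainly acts as a training-time regularizer on the shared features; at inference only the grasp maps need to be decoded.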
Gan LI, Mingdi NIU, Lu CHEN, Jing YANG, Tao YAN, Bin CHEN. Robotic grasp detection in low-light environment by incorporating visual feature enhancement mechanism[J]. Journal of Computer Applications, 2023, 43(8): 2564-2571.
Fig. 6 Comparison of low-light Cornell dataset and low-light Jacquard dataset after adjusting different Gamma values and adding different noises
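The low-light benchmarks behind Fig. 6 and Tab. 3 are built by darkening well-lit images with a Gamma curve and injecting noise. The exact synthesis code is not given in this excerpt, so the following Python sketch only shows one plausible recipe (function names and defaults are assumptions); scikit-image's noise modes map onto the noise types in Tab. 3, with 'speckle' corresponding to multiplicative noise.

```python
# Sketch of generating a low-light variant of a dataset image:
# gamma darkening followed by one of several noise types.
import numpy as np
from skimage import exposure, util


def to_low_light(rgb, gamma=1.5, noise="gaussian"):
    """rgb: float image in [0, 1]. gamma > 1 darkens; noise is one of
    'gaussian', 's&p' (salt-and-pepper), 'poisson', 'speckle' (multiplicative)."""
    dark = exposure.adjust_gamma(rgb, gamma=gamma)        # g > 1 suppresses brightness
    noisy = util.random_noise(dark, mode=noise)           # inject the chosen noise
    return np.clip(noisy, 0.0, 1.0)


# Example: a mid-gray test image darkened with g = 1.5 plus Gaussian noise.
img = np.full((224, 224, 3), 0.5, dtype=np.float64)
low = to_low_light(img, gamma=1.5, noise="gaussian")
```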
Tab. 1 Comparison of detection accuracy of different models on low-light Cornell dataset (g=1.5, white Gaussian noise)   unit: %

| Model | Accuracy | Model | Accuracy |
|---|---|---|---|
| GG-CNN | 84.0 | ResNet-50 | 90.7 |
| AlexNet | 81.0 | GR-ConvNet | 94.3 |
| SqueezeNet | 89.3 | Proposed model | 95.5 |
Tab. 2 Comparison of detection accuracy of different models on low-light Jacquard dataset (g=1.5, white Gaussian noise)   unit: %

| Model | Accuracy | Model | Accuracy |
|---|---|---|---|
| GG-CNN | 81.9 | GR-ConvNet | 82.4 |
| TF-Grasp | 85.8 | Proposed model | 87.4 |
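The accuracies in Tab. 1 and Tab. 2 are presumably the rectangle metric conventionally used on the Cornell and Jacquard benchmarks: a predicted grasp counts as correct when its orientation differs from some ground-truth rectangle by at most 30° and their Jaccard index exceeds 0.25. The excerpt itself does not define the metric, so the thresholds in this sketch reflect that conventional assumption, not a statement from the paper.

```python
# Sketch of the common grasp-rectangle evaluation: angle within 30 degrees of a
# ground-truth rectangle and IoU (Jaccard index) above 0.25.
import numpy as np
from shapely.geometry import Polygon


def rect_corners(cx, cy, w, h, angle):
    """Corner points of an oriented grasp rectangle (angle in radians)."""
    dx = np.array([np.cos(angle), np.sin(angle)])
    dy = np.array([-np.sin(angle), np.cos(angle)])
    c = np.array([cx, cy])
    return [tuple(c + 0.5 * (sx * w * dx + sy * h * dy))
            for sx, sy in ((-1, -1), (1, -1), (1, 1), (-1, 1))]


def grasp_correct(pred, gts, iou_thresh=0.25, angle_thresh=np.deg2rad(30)):
    """pred, gts: (cx, cy, w, h, angle) tuples; gts is a list of ground-truth grasps."""
    p = Polygon(rect_corners(*pred))
    for gt in gts:
        # Angle difference is taken modulo pi because a grasp rectangle is symmetric.
        d_angle = abs((pred[4] - gt[4] + np.pi / 2) % np.pi - np.pi / 2)
        g = Polygon(rect_corners(*gt))
        iou = p.intersection(g).area / p.union(g).area
        if d_angle <= angle_thresh and iou > iou_thresh:
            return True
    return False
```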
Tab. 3 Comparison of grasp detection results of the proposed model under different Gamma values and noise types (low-light Cornell dataset)

| g value | Noise type | Accuracy/% |
|---|---|---|
| 1.2 | Salt-and-pepper noise | 96.6 |
| 1.2 | Gaussian noise | 94.3 |
| 1.2 | White Gaussian noise | 96.6 |
| 1.2 | Poisson noise | 95.5 |
| 1.2 | Multiplicative noise | 97.7 |
| 1.5 | Salt-and-pepper noise | 96.6 |
| 1.5 | Gaussian noise | 97.7 |
| 1.5 | White Gaussian noise | 95.5 |
| 1.5 | Poisson noise | 94.4 |
| 1.5 | Multiplicative noise | 94.4 |
| 2.0 | Salt-and-pepper noise | 92.1 |
| 2.0 | Gaussian noise | 96.6 |
| 2.0 | White Gaussian noise | 92.1 |
| 2.0 | Poisson noise | 92.1 |
| 2.0 | Multiplicative noise | 92.1 |
Tab. 4 Ablation experimental results on low-light Cornell dataset

| GDM | Local | Global | Accuracy/% |
|---|---|---|---|
| √ |  |  | 91.0 |
| √ | √ |  | 93.2 |
| √ |  | √ | 94.4 |
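Tab. 4 ablates the local (texture) and global (color) branches of the enhancement module described in the abstract. The paper's actual branch design is not shown in this excerpt, so the sketch below only illustrates one common way to realize such a split: a full-resolution convolutional branch for texture detail and a globally pooled branch that predicts per-channel color/illumination gains; all names and sizes are assumptions.

```python
# Illustrative local/global enhancement split (not the paper's module).
import torch
import torch.nn as nn


class LocalGlobalEnhancer(nn.Module):
    def __init__(self, ch=32):
        super().__init__()
        # Local branch: full-resolution convolutions preserve texture detail.
        self.local = nn.Sequential(
            nn.Conv2d(3, ch, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(ch, 3, 3, padding=1),
        )
        # Global branch: image-level statistics -> per-channel color gains.
        self.global_ = nn.Sequential(
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(3, ch), nn.ReLU(inplace=True),
            nn.Linear(ch, 3), nn.Sigmoid(),
        )

    def forward(self, x):
        texture = self.local(x)                    # local texture residual
        gain = self.global_(x).view(-1, 3, 1, 1)   # global color/illumination gain
        return torch.clamp(x * (1.0 + gain) + texture, 0.0, 1.0)


# Example: enhance a batch of dark 224x224 RGB images in [0, 1].
out = LocalGlobalEnhancer()(torch.rand(2, 3, 224, 224) * 0.2)
```

Dropping either branch in this sketch mirrors the Local/Global ablations reported in Tab. 4.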
References
[1] WANG Y N, JIANG Y M, JIANG J, et al. Key technologies of robot perception and control and its intelligent manufacturing applications[J]. Acta Automatica Sinica, 2023, 49(3): 494-513.
[2] HAN X, YU Y W, DU L Q. Robotic grasping system based on improved single shot multibox detector algorithm[J]. Journal of Computer Applications, 2020, 40(8): 2434-2440.
[3] YAO R H, CHEN W B, CHEN Q L, et al. Construction and application of knowledge graph for home service robot[J]. Journal of Beijing University of Posts and Telecommunications, 2022, 45(5): 72-78.
[4] HAN F, ZHANG D H, ZHAO X G, et al. Design of a bionic soft hand with compound cavity for underwater grasping[J]. Robot, 2023, 45(2): 207-217. 10.13973/j.cnki.robot.210473
[5] REDMON J, ANGELOVA A. Real-time grasp detection using convolutional neural networks[C]// Proceedings of the 2015 IEEE International Conference on Robotics and Automation. Piscataway: IEEE, 2015: 1316-1322. 10.1109/icra.2015.7139361
[6] ASIF U, TANG J B, HARRER S. GraspNet: an efficient convolutional neural network for real-time grasp detection for low-powered devices[C]// Proceedings of the 27th International Joint Conference on Artificial Intelligence. California: ijcai.org, 2018: 4875-4882. 10.24963/ijcai.2018/677
[7] WU Y X, ZHANG F H, FU Y L. Real-time robotic multigrasp detection using anchor-free fully convolutional grasp detector[J]. IEEE Transactions on Industrial Electronics, 2022, 69(12): 13171-13181. 10.1109/tie.2021.3135629
[8] YU S, ZHAI D H, XIA Y Q, et al. SE-ResUNet: a novel robotic grasp detection method[J]. IEEE Robotics and Automation Letters, 2022, 7(2): 5238-5245. 10.1109/lra.2022.3145064
[9] KUMRA S, JOSHI S, SAHIN F. Antipodal robotic grasping using generative residual convolutional neural network[C]// Proceedings of the 2020 IEEE/RSJ International Conference on Intelligent Robots and Systems. Piscataway: IEEE, 2020: 9626-9633. 10.1109/iros45743.2020.9340777
[10] IGNATOV A, KOBYSHEV N, TIMOFTE R, et al. DSLR-quality photos on mobile devices with deep convolutional networks[C]// Proceedings of the 2017 IEEE International Conference on Computer Vision. Piscataway: IEEE, 2017: 3297-3305. 10.1109/iccv.2017.355
[11] GOODFELLOW I, POUGET-ABADIE J, MIRZA M, et al. Generative adversarial networks[J]. Communications of the ACM, 2020, 63(11): 139-144. 10.1145/3422622
[12] SHARMA V, DIBA A, NEVEN D, et al. Classification-driven dynamic image enhancement[C]// Proceedings of the 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2018: 4033-4041. 10.1109/cvpr.2018.00424
[13] LIU W Y, REN G F, YU R S, et al. Image-adaptive YOLO for object detection in adverse weather conditions[C]// Proceedings of the 36th AAAI Conference on Artificial Intelligence. Palo Alto, CA: AAAI Press, 2022: 1792-1800. 10.1609/aaai.v36i2.20072
[14] JIANG Y, MOSESON S, SAXENA A. Efficient grasping from RGBD images: learning using a new rectangle representation[C]// Proceedings of the 2011 IEEE International Conference on Robotics and Automation. Piscataway: IEEE, 2011: 3304-3311. 10.1109/icra.2011.5980145
[15] PINTO L, GUPTA A. Supersizing self-supervision: learning to grasp from 50K tries and 700 robot hours[C]// Proceedings of the 2016 IEEE International Conference on Robotics and Automation. Piscataway: IEEE, 2016: 3406-3413. 10.1109/icra.2016.7487517
[16] AINETTER S, FRAUNDORFER F. End-to-end trainable deep neural network for robotic grasp detection and semantic segmentation from RGB[C]// Proceedings of the 2021 IEEE International Conference on Robotics and Automation. Piscataway: IEEE, 2021: 13452-13458. 10.1109/icra48506.2021.9561398
[17] SATISH V, MAHLER J, GOLDBERG K. On-policy dataset synthesis for learning robot grasping policies using fully convolutional deep networks[J]. IEEE Robotics and Automation Letters, 2019, 4(2): 1357-1364. 10.1109/lra.2019.2895878
[18] CAO H, CHEN G, LI Z J, et al. Lightweight convolutional neural network with Gaussian-based grasping representation for robotic grasping detection[EB/OL]. (2021-01-25) [2023-06-07]..
[19] SONG Y X, WEN J, LIU D F, et al. Deep robotic grasping prediction with hierarchical RGB-D fusion[J]. International Journal of Control, Automation and Systems, 2022, 20(1): 243-254. 10.1007/s12555-020-0197-z
[20] SHUKLA P, PRAMANIK N, MEHTA D, et al. Generative model based robotic grasp pose prediction with limited dataset[J]. Applied Intelligence, 2022, 52(9): 9952-9966. 10.1007/s10489-021-03011-z
[21] WEI C, WANG W J, YANG W H, et al. Deep Retinex decomposition for low-light enhancement[C]// Proceedings of the 2018 British Machine Vision Conference. Durham: BMVA Press, 2018: No.451. 10.48550/arXiv.1808.04560
[22] WANG Y, CAO Y, ZHA Z J, et al. Progressive Retinex: mutually reinforced illumination-noise perception network for low-light image enhancement[C]// Proceedings of the 27th ACM International Conference on Multimedia. New York: ACM, 2019: 2015-2023. 10.1145/3343031.3350983
[23] LIU R S, MA L, ZHANG J A, et al. Retinex-inspired unrolling with cooperative prior architecture search for low-light image enhancement[C]// Proceedings of the 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2021: 10556-10565. 10.1109/cvpr46437.2021.01042
[24] GUO C L, LI C Y, GUO J C, et al. Zero-reference deep curve estimation for low-light image enhancement[C]// Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2020: 1777-1786. 10.1109/cvpr42600.2020.00185
[25] JIANG Y F, GONG X Y, LIU D, et al. EnlightenGAN: deep light enhancement without paired supervision[J]. IEEE Transactions on Image Processing, 2021, 30: 2340-2349. 10.1109/tip.2021.3051462
[26] CUI Z T, LI K C, GU L, et al. You only need 90k parameters to adapt light: a light weight transformer for image enhancement and exposure correction[C]// Proceedings of the 2022 British Machine Vision Conference. Durham: BMVA Press, 2022: No.238.
[27] MORRISON D, CORKE P, LEITNER J. Learning robust, real-time, reactive robotic grasping[J]. The International Journal of Robotics Research, 2020, 39(2/3): 183-201. 10.1177/0278364919859066
[28] RONNEBERGER O, FISCHER P, BROX T. U-Net: convolutional networks for biomedical image segmentation[C]// Proceedings of the 2015 Medical Image Computing and Computer-Assisted Intervention, LNCS 9351. Cham: Springer, 2015: 234-241.
[29] DEPIERRE A, DELLANDRÉA E, CHEN L M. Jacquard: a large scale dataset for robotic grasp detection[C]// Proceedings of the 2018 IEEE/RSJ International Conference on Intelligent Robots and Systems. Piscataway: IEEE, 2018: 3511-3516. 10.1109/iros.2018.8593950
[30] IANDOLA F N, HAN S, MOSKEWICZ M W, et al. SqueezeNet: AlexNet-level accuracy with 50x fewer parameters and <0.5 MB model size[EB/OL]. (2016-11-04) [2023-06-07]..
[31] HE K M, ZHANG X Y, REN S Q, et al. Deep residual learning for image recognition[C]// Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2016: 770-778. 10.1109/cvpr.2016.90
[32] WANG S C, ZHOU Z L, KAN Z. When transformer meets robotic grasping: exploits context for efficient grasp detection[J]. IEEE Robotics and Automation Letters, 2022, 7(3): 8170-8177. 10.1109/lra.2022.3187261