Journal of Computer Applications (《计算机应用》) ›› 2025, Vol. 45 ›› Issue (5): 1686-1693. DOI: 10.11772/j.issn.1001-9081.2024111686

• Multimedia Computing and Computer Simulation •

Robotic grasp detection with feature fusion of spatial-Fourier domain information under low-light environments

Lu CHEN1,2, Huaiyao WANG1,2, Jingyang LIU1,2, Tao YAN1,2(), Bin CHEN3,4,5

  1. Institute of Big Data Science and Industry, Shanxi University, Taiyuan Shanxi 030006, China
    2. School of Computer and Information Technology, Shanxi University, Taiyuan Shanxi 030006, China
    3. University of Chinese Academy of Sciences, Beijing 100049, China
    4. International Institute for Artificial Intelligence, Harbin Institute of Technology (Shenzhen), Shenzhen Guangdong 518055, China
    5. Chongqing Research Institute, Harbin Institute of Technology, Chongqing 401151, China
  • Received: 2024-12-02  Revised: 2025-01-27  Accepted: 2025-02-12  Online: 2025-02-14  Published: 2025-05-10
  • Corresponding author: Tao YAN
  • About author: CHEN Lu (1991—), male, born in Liaocheng, Shandong, associate professor, Ph.D., CCF member. His research interests include robotic grasping and image enhancement.
    WANG Huaiyao (2000—), female, born in Lüliang, Shanxi, M.S. candidate. Her research interests include grasp detection and low-light image enhancement.
    LIU Jingyang (1999—), male, born in Datong, Shanxi, M.S. candidate, CCF member. His research interests include 6D pose estimation and 6D grasp detection.
    YAN Tao (1987—), male, born in Dingxiang, Shanxi, associate professor, Ph.D., CCF member. His research interests include 3D reconstruction.
    CHEN Bin (1970—), male, born in Guanghan, Sichuan, research fellow, Ph.D. His research interests include machine vision, industrial inspection, and deep learning.
  • Supported by:
    National Natural Science Foundation of China (62373233); Fundamental Research Program of Shanxi Province (202203021222010); Science and Technology Major Project of Shanxi Province (202201020101006)

Robotic grasp detection with feature fusion of spatial-Fourier domain information under low-light environments

Lu CHEN1,2, Huaiyao WANG1,2, Jingyang LIU1,2, Tao YAN1,2(), Bin CHEN3,4,5   

  1. Institute of Big Data Science and Industry, Shanxi University, Taiyuan Shanxi 030006, China
    2. School of Computer and Information Technology, Shanxi University, Taiyuan Shanxi 030006, China
    3. University of Chinese Academy of Sciences, Beijing 100049, China
    4. International Institute for Artificial Intelligence, Harbin Institute of Technology (Shenzhen), Shenzhen Guangdong 518055, China
    5. Chongqing Research Institute, Harbin Institute of Technology, Chongqing 401151, China
  • Received:2024-12-02 Revised:2025-01-27 Accepted:2025-02-12 Online:2025-02-14 Published:2025-05-10
  • Contact: Tao YAN
  • About author: CHEN Lu, born in 1991, Ph.D., associate professor. His research interests include robotic grasping and image enhancement.
    WANG Huaiyao, born in 2000, M.S. candidate. Her research interests include grasp detection and low-light image enhancement.
    LIU Jingyang, born in 1999, M.S. candidate. His research interests include 6D pose estimation and 6D grasp detection.
    YAN Tao, born in 1987, Ph.D., associate professor. His research interests include 3D reconstruction.
    CHEN Bin, born in 1970, Ph.D., research fellow. His research interests include machine vision, industrial inspection, and deep learning.
  • Supported by:
    National Natural Science Foundation of China(62373233);Fundamental Research Program of Shanxi Province(202203021222010);Science and Technology Major Project of Shanxi Province(202201020101006)

Abstract:

To address the problem that existing grasp detection methods cannot effectively perceive sparse and weak features, which degrades robotic grasp detection performance in low-light environments, a robotic grasp detection method fusing spatial-Fourier domain information was proposed for low-light environments. Firstly, the backbone network of the method adopted an encoder-decoder structure and performed spatial-domain and Fourier-domain feature extraction during the fusion of deep and shallow features. Specifically, in the spatial domain, strip convolutions along the horizontal and vertical directions captured global contextual information and extracted features sensitive to the grasp detection task; in the Fourier domain, the amplitude and phase were adjusted separately to restore image details and texture features. Secondly, an R-CoA (Row-Column Attention) module was introduced to balance global and local image information, and relative position encoding of image rows and columns was applied to strengthen position information relevant to the grasping task. Finally, validation on the low-light Cornell, low-light Jacquard, and the constructed low-light C-Cornell datasets showed that the proposed method achieved the highest accuracies of 96.62%, 92.01%, and 95.50%, respectively. On the low-light Cornell dataset (Gaussian noise with γ=1.5), compared with GR-ConvNetv2 (Generative Residual Convolutional Neural Network v2) and SE-ResUNet (Squeeze-and-Excitation ResUNet), the accuracy of the proposed method was improved by 2.24 and 1.12 percentage points, respectively. The proposed method can effectively improve the robustness and accuracy of grasp detection in low-light environments, providing support for robotic grasping tasks under low-illumination conditions.

Key words: robot, grasp detection, spatial-Fourier domain, attention mechanism, deep neural network

Abstract:

To address the problem that existing grasp detection methods cannot effectively perceive sparse and weak features, which leads to performance degradation of robotic grasp detection under low-light environments, a robotic grasp detection method that integrated spatial-Fourier domain information was proposed for low-light environments. Firstly, the proposed model utilized an encoder-decoder architecture as its backbone and performed spatial-Fourier domain feature extraction during the fusion of deep and shallow features within the network. Specifically, in the spatial domain, global contextual information was captured using strip convolutions applied in horizontal and vertical directions, enabling the extraction of features critical to the grasp detection task; in the Fourier domain, image details and texture features were restored by independently modulating amplitude and phase components. Furthermore, an R-CoA (Row-Column Attention) module was incorporated to effectively balance global and local image information, while the relative positional relationships of image rows and columns were encoded to emphasize positional information pertinent to grasping tasks. Finally, validation on the low-light Cornell, low-light Jacquard, and the constructed low-light C-Cornell datasets demonstrated that the proposed method achieved the highest accuracies of 96.62%, 92.01%, and 95.50%, respectively. Specifically, on the low-light Cornell dataset (Gaussian noise and γ=1.5), the proposed method outperformed GR-ConvNetv2 (Generative Residual Convolutional Neural Network v2) and SE-ResUNet (Squeeze-and-Excitation ResUNet) in accuracy by 2.24 and 1.12 percentage points, respectively. The proposed method can effectively improve the robustness and accuracy of grasp detection in low-light environments, providing support for robotic grasping tasks under insufficient illumination conditions.
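The two feature-extraction branches described in the abstract can be illustrated with a minimal NumPy sketch. This is not the paper's actual network: the function names, the uniform strip kernel, and the global amplitude gain and phase shift are illustrative assumptions (in the proposed model these adjustments would be learned, per-channel operations inside the encoder-decoder).

```python
import numpy as np

def fourier_adjust(feat, amp_gain=1.0, phase_shift=0.0):
    """Adjust amplitude and phase of a 2-D feature map separately in the
    Fourier domain, then transform back to the spatial domain.
    Amplitude relates to energy/illumination; phase to structure/texture."""
    spec = np.fft.fft2(feat)
    amp = np.abs(spec) * amp_gain
    phase = np.angle(spec) + phase_shift
    return np.real(np.fft.ifft2(amp * np.exp(1j * phase)))

def strip_conv(feat, k=5):
    """Aggregate context with 1xk (horizontal) and kx1 (vertical) uniform
    strip filters, summing the two directional responses."""
    kern = np.ones(k) / k
    horiz = np.apply_along_axis(np.convolve, 1, feat, kern, mode="same")
    vert = np.apply_along_axis(np.convolve, 0, feat, kern, mode="same")
    return horiz + vert
```

With `amp_gain=1.0` and `phase_shift=0.0` the Fourier branch is an identity, which makes the amplitude/phase decomposition easy to verify; the strip filters show how a row-wise and a column-wise 1-D kernel jointly cover a cross-shaped receptive field far cheaper than a k×k convolution.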

Key words: robot, grasp detection, spatial-Fourier domain, attention mechanism, deep neural network
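The row-column attention idea (R-CoA) mentioned in the abstract can likewise be sketched roughly in NumPy. The softmax gating along rows and columns and the simple linear positional bias below are illustrative stand-ins for the paper's learned attention weights and row/column relative position encoding, not the actual module.

```python
import numpy as np

def softmax(x, axis):
    """Numerically stable softmax along the given axis."""
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def row_column_attention(feat, row_pos_scale=0.1, col_pos_scale=0.1):
    """Re-weight a 2-D feature map with attention computed along each row
    and each column, adding a linear positional bias as a toy substitute
    for learned row/column relative position encodings."""
    h, w = feat.shape
    row_bias = row_pos_scale * np.arange(w)            # position within a row
    col_bias = col_pos_scale * np.arange(h)[:, None]   # position within a column
    row_attn = softmax(feat + row_bias, axis=1)        # weights sum to 1 per row
    col_attn = softmax(feat + col_bias, axis=0)        # weights sum to 1 per column
    return feat * row_attn + feat * col_attn
```

Factoring attention into a row pass and a column pass keeps the cost linear in each image dimension while still letting every position be influenced by its entire row and column, which matches the abstract's goal of balancing global and local information.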

CLC number: