Journal of Computer Applications (《计算机应用》) ›› 2025, Vol. 45 ›› Issue (5): 1686-1693. DOI: 10.11772/j.issn.1001-9081.2024111686

• Multimedia Computing and Computer Simulation •

Robotic grasp detection with feature fusion of spatial-Fourier domain information under low-light environments

Lu CHEN1,2, Huaiyao WANG1,2, Jingyang LIU1,2, Tao YAN1,2(), Bin CHEN3,4,5

  1. Institute of Big Data Science and Industry, Shanxi University, Taiyuan Shanxi 030006, China
    2. School of Computer and Information Technology, Shanxi University, Taiyuan Shanxi 030006, China
    3. University of Chinese Academy of Sciences, Beijing 100049, China
    4. International Institute for Artificial Intelligence, Harbin Institute of Technology (Shenzhen), Shenzhen Guangdong 518055, China
    5. Chongqing Research Institute, Harbin Institute of Technology, Chongqing 401151, China
  • Received: 2024-12-02  Revised: 2025-01-27  Accepted: 2025-02-12  Online: 2025-02-14  Published: 2025-05-10
  • Corresponding author: Tao YAN
  • About author: CHEN Lu (1991—), male, born in Liaocheng, Shandong, associate professor, Ph.D., CCF member. His research interests include robotic grasping and image enhancement.
    WANG Huaiyao (2000—), female, born in Lüliang, Shanxi, M.S. candidate. Her research interests include grasp detection and low-light image enhancement.
    LIU Jingyang (1999—), male, born in Datong, Shanxi, M.S. candidate, CCF member. His research interests include 6D pose estimation and 6D grasp detection.
    YAN Tao (1987—), male, born in Dingxiang, Shanxi, associate professor, Ph.D., CCF member. His research interests include 3D reconstruction.
    CHEN Bin (1970—), male, born in Guanghan, Sichuan, research fellow, Ph.D. His research interests include machine vision, industrial inspection, and deep learning.
  • Supported by:
    National Natural Science Foundation of China (62373233); Fundamental Research Program of Shanxi Province (202203021222010); Science and Technology Major Project of Shanxi Province (202201020101006)

Robotic grasp detection with feature fusion of spatial-Fourier domain information under low-light environments

Lu CHEN1,2, Huaiyao WANG1,2, Jingyang LIU1,2, Tao YAN1,2(), Bin CHEN3,4,5   

  1. Institute of Big Data Science and Industry, Shanxi University, Taiyuan Shanxi 030006, China
    2. School of Computer and Information Technology, Shanxi University, Taiyuan Shanxi 030006, China
    3. University of Chinese Academy of Sciences, Beijing 100049, China
    4. International Institute for Artificial Intelligence, Harbin Institute of Technology (Shenzhen), Shenzhen Guangdong 518055, China
    5. Chongqing Research Institute, Harbin Institute of Technology, Chongqing 401151, China
  • Received:2024-12-02 Revised:2025-01-27 Accepted:2025-02-12 Online:2025-02-14 Published:2025-05-10
  • Contact: Tao YAN
  • About author: CHEN Lu, born in 1991, Ph.D., associate professor. His research interests include robotic grasping and image enhancement.
    WANG Huaiyao, born in 2000, M.S. candidate. Her research interests include grasp detection and low-light image enhancement.
    LIU Jingyang, born in 1999, M.S. candidate. His research interests include 6D pose estimation and 6D grasp detection.
    YAN Tao, born in 1987, Ph.D., associate professor. His research interests include 3D reconstruction.
    CHEN Bin, born in 1970, Ph.D., research fellow. His research interests include machine vision, industrial inspection, and deep learning.
  • Supported by:
    National Natural Science Foundation of China(62373233);Fundamental Research Program of Shanxi Province(202203021222010);Science and Technology Major Project of Shanxi Province(202201020101006)

Abstract:

To address the problem that existing grasp detection methods cannot effectively perceive sparse and weak features, which degrades robotic grasp detection performance in low-light environments, a robotic grasp detection method fusing spatial-Fourier domain information was proposed for low-light environments. Firstly, the backbone network of the method adopted an encoder-decoder structure and performed spatial-domain and Fourier-domain feature extraction during the fusion of deep and shallow features. Specifically, in the spatial domain, strip convolutions along the horizontal and vertical directions captured global contextual information and extracted features sensitive to the grasp detection task; in the Fourier domain, the amplitude and phase were adjusted separately to restore image details and texture features. Secondly, an R-CoA (Row-Column Attention) module was introduced to balance global and local image information, and relative position encoding of image rows and columns was applied to strengthen position information relevant to the grasping task. Finally, validation on the low-light Cornell, low-light Jacquard, and the constructed low-light C-Cornell datasets showed that the proposed method achieved the highest accuracies of 96.62%, 92.01%, and 95.50%, respectively. On the low-light Cornell dataset (Gaussian noise with γ=1.5), compared with GR-ConvNetv2 (Generative Residual Convolutional Neural Network v2) and SE-ResUNet (Squeeze-and-Excitation ResUNet), the accuracy of the proposed method was improved by 2.24 and 1.12 percentage points, respectively. The proposed method can effectively improve the robustness and accuracy of grasp detection in low-light environments, providing support for robotic grasping tasks under low-illumination conditions.

Key words: robot, grasp detection, spatial-Fourier domain, attention mechanism, deep neural network

Abstract:

To address the problem that existing grasp detection methods cannot effectively perceive sparse and weak features, which leads to performance degradation of robotic grasp detection under low-light environments, a robotic grasp detection method that integrated spatial-Fourier domain information was proposed for low-light environments. Firstly, the proposed model utilized an encoder-decoder architecture as its backbone and performed spatial-Fourier domain feature extraction during the fusion of deep and shallow features within the network. Specifically, in the spatial domain, global contextual information was captured using strip convolutions applied in horizontal and vertical directions, enabling the extraction of features critical to the grasp detection task; in the Fourier domain, image details and texture features were restored by independently modulating amplitude and phase components. Furthermore, an R-CoA (Row-Column Attention) module was incorporated to effectively balance global and local image information, while the relative positional relationships of image rows and columns were encoded to emphasize positional information pertinent to grasping tasks. Finally, validation on the low-light Cornell, low-light Jacquard, and the constructed low-light C-Cornell datasets demonstrated that the proposed method achieved the highest accuracies of 96.62%, 92.01%, and 95.50%, respectively. Specifically, on the low-light Cornell dataset (Gaussian noise and γ=1.5), the proposed method outperformed GR-ConvNetv2 (Generative Residual Convolutional Neural Network v2) and SE-ResUNet (Squeeze-and-Excitation ResUNet) in accuracy by 2.24 and 1.12 percentage points, respectively. The proposed method can effectively improve the robustness and accuracy of grasp detection in low-light environments, providing support for robotic grasping tasks under insufficient illumination conditions.
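The two feature-extraction branches described in the abstract can be illustrated with a minimal NumPy sketch. This is not the paper's actual network: the function names, the uniform strip kernel, and the global amplitude gain and phase shift are illustrative assumptions (in the proposed model these adjustments would be learned, per-channel operations inside the encoder-decoder).

```python
import numpy as np

def fourier_adjust(feat, amp_gain=1.0, phase_shift=0.0):
    """Adjust amplitude and phase of a 2-D feature map separately in the
    Fourier domain, then transform back to the spatial domain.
    Amplitude relates to energy/illumination; phase to structure/texture."""
    spec = np.fft.fft2(feat)
    amp = np.abs(spec) * amp_gain
    phase = np.angle(spec) + phase_shift
    return np.real(np.fft.ifft2(amp * np.exp(1j * phase)))

def strip_conv(feat, k=5):
    """Aggregate context with 1xk (horizontal) and kx1 (vertical) uniform
    strip filters, summing the two directional responses."""
    kern = np.ones(k) / k
    horiz = np.apply_along_axis(np.convolve, 1, feat, kern, mode="same")
    vert = np.apply_along_axis(np.convolve, 0, feat, kern, mode="same")
    return horiz + vert
```

With `amp_gain=1.0` and `phase_shift=0.0` the Fourier branch is an identity, which makes the amplitude/phase decomposition easy to verify; the strip filters show how a row-wise and a column-wise 1-D kernel jointly cover a cross-shaped receptive field far cheaper than a k×k convolution.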

Key words: robot, grasp detection, spatial-Fourier domain, attention mechanism, deep neural network
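The row-column attention idea (R-CoA) mentioned in the abstract can likewise be sketched roughly in NumPy. The softmax gating along rows and columns and the simple linear positional bias below are illustrative stand-ins for the paper's learned attention weights and row/column relative position encoding, not the actual module.

```python
import numpy as np

def softmax(x, axis):
    """Numerically stable softmax along the given axis."""
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def row_column_attention(feat, row_pos_scale=0.1, col_pos_scale=0.1):
    """Re-weight a 2-D feature map with attention computed along each row
    and each column, adding a linear positional bias as a toy substitute
    for learned row/column relative position encodings."""
    h, w = feat.shape
    row_bias = row_pos_scale * np.arange(w)            # position within a row
    col_bias = col_pos_scale * np.arange(h)[:, None]   # position within a column
    row_attn = softmax(feat + row_bias, axis=1)        # weights sum to 1 per row
    col_attn = softmax(feat + col_bias, axis=0)        # weights sum to 1 per column
    return feat * row_attn + feat * col_attn
```

Factoring attention into a row pass and a column pass keeps the cost linear in each image dimension while still letting every position be influenced by its entire row and column, which matches the abstract's goal of balancing global and local information.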

CLC number: