Journal of Computer Applications ›› 2023, Vol. 43 ›› Issue (4): 1255-1260. DOI: 10.11772/j.issn.1001-9081.2022020262

• Multimedia computing and computer simulation •

Real-time reconstruction method of visual information for manipulator operation

Qingyu JIA1, Liang CHANG1, Xianyi YANG2, Baohua QIANG2, Shihao ZHANG1, Wu XIE2, Minghao YANG2

  1. School of Computer Science and Information Security, Guilin University of Electronic Technology, Guilin Guangxi 541004, China
  2. Guangxi Key Laboratory of Image and Graphic Intelligent Processing (Guilin University of Electronic Technology), Guilin Guangxi 541004, China
  • Received: 2022-03-08 Revised: 2022-05-24 Accepted: 2022-05-26 Online: 2022-08-16 Published: 2023-04-10
  • Contact: Xianyi YANG
  • About the authors: JIA Qingyu, born in 1995, M. S. candidate. Her research interests include machine learning and artificial intelligence.
    CHANG Liang, born in 1980, Ph. D., professor. His research interests include data and knowledge engineering, formal methods, and trusted software.
    QIANG Baohua, born in 1972, Ph. D., professor. His research interests include big data analysis and image processing.
    ZHANG Shihao, born in 1991, Ph. D. candidate. His research interests include human skeleton key point detection and image processing.
    XIE Wu, born in 1979, Ph. D., associate professor. His research interests include data mining and information processing.
    YANG Minghao, born in 1977, Ph. D., associate research fellow. His research interests include multimodal information fusion and human-machine collaboration.
  • Supported by:
    Natural Science Foundation of Guangxi (2019GXNSFDA185006); Guangxi Science and Technology Base and Talent Project (Guike AD19110137)

面向机械臂操作的视觉信息实时重建方法

贾清玉1, 常亮1, 杨先一2, 强保华2, 张世豪1, 谢武2, 杨明浩2

  1. 桂林电子科技大学 计算机与信息安全学院, 广西 桂林 541004
  2. 广西图像图形与智能处理重点实验室(桂林电子科技大学), 广西 桂林 541004
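  • Corresponding author: Xianyi YANG
  • About the authors: JIA Qingyu (born 1995), female, from Datong, Shanxi, M. S. candidate. Main research interests: machine learning, artificial intelligence.
    CHANG Liang (born 1980), male, from Bijie, Guizhou, professor, Ph. D., CCF senior member. Main research interests: data and knowledge engineering, formal methods, trusted software.
    QIANG Baohua (born 1972), male, from Nanyang, Henan, professor, Ph. D., CCF member. Main research interests: big data analysis, image processing.
    ZHANG Shihao (born 1991), male, from Xuchang, Henan, Ph. D. candidate. Main research interests: human skeleton key point detection, image processing.
    XIE Wu (born 1979), male, from Yichun, Jiangxi, associate professor, Ph. D., CCF member. Main research interests: data mining, information processing.
    YANG Minghao (born 1977), male, from Dazhou, Sichuan, associate research fellow, Ph. D., CCF member. Main research interests: multimodal information fusion, human-machine collaboration.
  • Supported by:
    Natural Science Foundation of Guangxi (2019GXNSFDA185006); Guangxi Science and Technology Base and Talent Project (Guike AD19110137)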

Abstract:

Current skill teaching methods for manipulators mainly use three-dimensional reconstruction technology to build a virtual space in which the manipulator is simulated and trained. However, because humans and manipulators view the scene from different angles, traditional visual information reconstruction methods suffer from large reconstruction errors and long reconstruction times, and they require strict experimental environments and many sensors. As a result, the skills learned by the manipulator in the virtual space cannot be transferred well to the real environment. To solve these problems, a real-time visual information reconstruction method for manipulator operation was proposed. Firstly, information was extracted from real-time RGB images by Mask-Region Convolutional Neural Network (Mask-RCNN). Then, the extracted RGB images and other visual information were jointly encoded, and the visual information was mapped to three-dimensional position information in the manipulator operation space by Residual Neural Network-18 (ResNet-18). Finally, a Cluster Center DIStance constrained (CC-DIS) outlier adjustment method was proposed to reduce the reconstruction error, and the adjusted position information was visualized with Open Graphics Library (OpenGL), completing the real-time three-dimensional reconstruction of the manipulator operation space. Experimental results show that the proposed method achieves high reconstruction speed and accuracy: one three-dimensional reconstruction takes only 62.92 ms, the reconstruction speed reaches 16 frames per second, and the relative reconstruction error is about 5.23%. Therefore, the method can be effectively applied to manipulator skill teaching tasks.
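The abstract does not give the details of the CC-DIS outlier adjustment, so the following is only a minimal sketch of one plausible reading of "cluster center distance constrained": position estimates that lie farther than a threshold from the center of their cluster are pulled back to that distance. The function name `adjust_outliers`, the use of the per-axis median as the cluster center, the threshold `max_dist`, and the sample coordinates are all assumptions made for illustration and are not taken from the paper.

```python
import numpy as np


def adjust_outliers(points, max_dist):
    """Clamp points that lie farther than max_dist from the cluster center.

    points:   (N, 3) array of estimated 3D positions assumed to form one cluster.
    max_dist: hypothetical distance threshold, in the same unit as points.
    Returns the adjusted points and a boolean mask marking the adjusted rows.
    """
    points = np.asarray(points, dtype=float)
    # Assumption: the per-axis median serves as a robust cluster center.
    center = np.median(points, axis=0)
    offsets = points - center
    dists = np.linalg.norm(offsets, axis=1)

    # Inliers keep a scale of 1; outliers are scaled back to max_dist.
    outliers = dists > max_dist
    scale = np.ones_like(dists)
    scale[outliers] = max_dist / dists[outliers]
    return center + offsets * scale[:, None], outliers


# Illustrative data: three consistent estimates and one spurious estimate.
estimates = np.array([
    [0.10, 0.20, 0.30],
    [0.11, 0.19, 0.31],
    [0.09, 0.21, 0.29],
    [0.60, 0.20, 0.30],
])
adjusted, flagged = adjust_outliers(estimates, max_dist=0.05)
print(flagged)
print(adjusted)
```

In this sketch the inliers are left unchanged and only the flagged estimate is moved, which keeps the adjustment local. The adjustment method and threshold selection actually used by the authors may differ.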

Key words: skill teaching, Mask-Region Convolutional Neural Network (Mask-RCNN), Residual Neural Network-18 (ResNet-18), three-dimensional real-time reconstruction, manipulator

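Abstract:

At present, skill teaching methods for manipulators mainly build a virtual space for simulated training through three-dimensional real-time reconstruction technology. However, humans and manipulators have different viewing angles, and traditional visual information reconstruction methods have large reconstruction errors, take a long time, require strict experimental environments, and need many sensors, so the skills that a manipulator acquires in the virtual space cannot be transferred well to the real environment. To address these problems, a real-time visual information reconstruction method for manipulator operation is proposed. First, information is extracted from RGB images captured in real time by Mask-RCNN (Mask-Region Convolutional Neural Network). Then, the extracted RGB images and other visual information are jointly encoded, and ResNet-18 maps the visual information to three-dimensional position information in the manipulator operation space. Finally, to reduce the reconstruction error, a cluster center distance constrained outlier adjustment method (CC-DIS) is proposed, and OpenGL (Open Graphics Library) is used to visualize the adjusted position information, completing the real-time three-dimensional reconstruction of the manipulator operation space. Experimental results show that the proposed real-time reconstruction method has a fast reconstruction speed and high reconstruction accuracy: one three-dimensional reconstruction takes only 62.92 ms, the reconstruction speed reaches 16 frames per second, and the relative reconstruction error is about 5.23%, so the method can be used effectively for manipulator skill teaching tasks.

Key words: skill teaching, Mask-RCNN, ResNet-18, three-dimensional real-time reconstruction, manipulator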
