Journal of Computer Applications (《计算机应用》) ›› 2022, Vol. 42 ›› Issue (3): 804-809. DOI: 10.11772/j.issn.1001-9081.2021040912

• 2021 CCF Conference on Artificial Intelligence (CCFAI 2021)

Infrared monocular ranging algorithm based on multiscale feature fusion

Bin LIU1,2, Gangqing LI1,2, Chengquan AN1, Shuigen WANG2, Jiansheng WANG2

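  1. College of Information and Communication Engineering, Harbin Engineering University, Harbin 150001
  2. IRay Technology Company Limited, Yantai, Shandong 264000
  • Received: 2021-06-02 Revised: 2021-07-14 Accepted: 2021-07-15 Online: 2022-04-09 Published: 2022-03-10
  • Corresponding author: LI Gangqing
  • About the authors: LIU Bin (born 1998), male, from Jingzhou, Hubei, M.S. candidate, CCF member. Main research interests: artificial intelligence, computer vision.
    AN Chengquan (born 1974), male, from Shuangcheng, Heilongjiang, associate professor, Ph.D. Main research interests: wireless communication, spread spectrum communication, signal processing.
    WANG Shuigen (born 1990), male, from Ruijin, Jiangxi, Ph.D. Main research interests: artificial intelligence, deep learning, feature learning, computer vision.
    WANG Jiansheng (born 1972), male, from Chongqing, M.S. Main research interests: image processing, deep learning, computer vision.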

Infrared monocular ranging algorithm based on multiscale feature fusion

Bin LIU1,2, Gangqing LI1,2, Chengquan AN1, Shuigen WANG2, Jiansheng WANG2

  1. College of Information and Communication Engineering, Harbin Engineering University, Harbin Heilongjiang 150001, China
  2. IRay Technology Company Limited, Yantai Shandong 264000, China
  • Received: 2021-06-02 Revised: 2021-07-14 Accepted: 2021-07-15 Online: 2022-04-09 Published: 2022-03-10
  • Contact: Gangqing LI
  • About author:LIU Bin, born in 1998, M. S. candidate. His research interests include artificial intelligence, computer vision.
    AN Chengquan, born in 1974, Ph. D., associate professor. His research interests include wireless communication, spread spectrum communication, signal processing.
    WANG Shuigen, born in 1990, Ph. D. His research interests include artificial intelligence, deep learning, feature learning, computer vision.
    WANG Jiansheng, born in 1972, M. S. His research interests include image processing, deep learning, computer vision.

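Abstract:

Since MonoDepth2 was proposed, unsupervised monocular ranging has advanced considerably in the visible-light domain. However, visible light is unsuitable in some scenes, such as at night and in low-visibility environments, whereas infrared thermal imaging can capture clear target images under those conditions, so depth estimation for infrared images is particularly necessary. Because visible and infrared images have different characteristics, directly transferring existing visible-light monocular depth estimation algorithms to infrared images is unreasonable. To address this problem, the MonoDepth2 algorithm was improved and an infrared monocular ranging algorithm based on multiscale feature fusion was proposed. A new loss function, the edge loss, was designed for the low-texture nature of infrared images, with the aim of reducing pixel mismatches during image reprojection. Previous unsupervised monocular ranging methods simply upsample the depth maps at four scales to the original image resolution to compute the projection error, ignoring both the correlation between scales and the differing contributions of the scales; here, the weighted Bi-directional Feature Pyramid Network (BiFPN) was applied to the feature fusion of multiscale depth maps, which resolved the blurring of depth map edges. In addition, the Residual Network (ResNet) structure was replaced with a Cross Stage Partial Network (CSPNet) to reduce network complexity and increase computation speed. Experimental results show that the edge loss is better suited to infrared image ranging and yields higher-quality depth maps; after adding the BiFPN structure, the edges of the depth images are sharper; after replacing ResNet with CSPNet, inference speed improved by about 20 percentage points. The algorithm can accurately estimate the depth of infrared images, solving the difficulty of depth estimation in low-light night scenes and some low-visibility scenes; its application can also reduce the cost of automotive driver assistance to some extent.

Keywords: unsupervised, monocular ranging, infrared image, Bi-directional Feature Pyramid Network (BiFPN), Cross Stage Partial Network (CSPNet)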

Abstract:

Since the introduction of MonoDepth2, unsupervised monocular ranging has made great progress in the visible-light domain. However, visible light is not applicable in some scenes, such as at night and in low-visibility environments. Infrared thermal imaging can obtain clear target images under these conditions, so estimating the depth of infrared images is necessary. Because visible and infrared images have different characteristics, it is unreasonable to migrate existing visible-light monocular depth estimation algorithms directly to infrared images. To address this problem, the MonoDepth2 algorithm was improved and an infrared monocular ranging algorithm based on multiscale feature fusion was proposed. A new loss function, the edge loss, was designed for the low-texture characteristic of infrared images to reduce pixel mismatches during image reprojection. Previous unsupervised monocular ranging methods simply upsample the four-scale depth maps to the original image resolution to calculate projection errors, ignoring the correlation between scales and the differing contributions of the scales. Instead, a weighted Bi-directional Feature Pyramid Network (BiFPN) was applied to the feature fusion of multiscale depth maps, which resolved the blurring of depth map edges. In addition, the Residual Network (ResNet) structure was replaced by a Cross Stage Partial Network (CSPNet) to reduce network complexity and increase operation speed. The experimental results show that the edge loss is more suitable for infrared image ranging and yields better depth map quality. After the BiFPN structure is added, the edges of the depth images are clearer. After ResNet is replaced with CSPNet, the inference speed is improved by about 20 percentage points. The proposed algorithm can accurately estimate the depth of infrared images, solving the problem of depth estimation in low-light night scenes and some low-visibility scenes, and its application can also reduce the cost of assisted driving to a certain extent.
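
The weighted multiscale fusion mentioned in the abstract can be illustrated with a minimal Python sketch. It assumes the fast normalized fusion rule commonly associated with BiFPN, O = sum(w_i * I_i) / (eps + sum(w_j)) with non-negative weights. Whether the paper uses exactly this rule is an assumption, and the function names, toy maps, and weights below are hypothetical. Nearest-neighbour upsampling stands in for whatever interpolation the authors actually use.

```python
def upsample_nearest(grid, factor):
    """Repeat every cell `factor` times along both axes (nearest-neighbour upsampling)."""
    out = []
    for row in grid:
        wide = [v for v in row for _ in range(factor)]
        out.extend([list(wide) for _ in range(factor)])
    return out


def fast_normalized_fusion(maps, weights, eps=1e-4):
    """Fuse same-sized 2-D maps as sum(w_i * map_i) / (eps + sum(w_j))."""
    # Clip at zero (ReLU) so that no scale contributes with a negative weight.
    w = [max(0.0, x) for x in weights]
    total = eps + sum(w)
    rows, cols = len(maps[0]), len(maps[0][0])
    return [
        [sum(wi * m[r][c] for wi, m in zip(w, maps)) / total for c in range(cols)]
        for r in range(rows)
    ]


# Hypothetical toy inputs: a 4x4 "fine" map and a 2x2 "coarse" map.
fine = [[1.0] * 4 for _ in range(4)]
coarse = [[3.0, 3.0], [3.0, 3.0]]

# Bring the coarse map to the fine resolution, then fuse with equal weights.
fused = fast_normalized_fusion([fine, upsample_nearest(coarse, 2)], weights=[1.0, 1.0])
```

With equal weights the fused map is close to the plain average of the two inputs. In a trained network the weights would be learned, so each scale would contribute according to its usefulness rather than equally.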

Key words: unsupervised, monocular ranging, infrared image, Bi-directional Feature Pyramid Network (BiFPN), Cross Stage Partial Network (CSPNet)

CLC Number: