Journal of Computer Applications, 2024, Vol. 44, Issue (8): 2595-2603. DOI: 10.11772/j.issn.1001-9081.2023081122

• Multimedia computing and computer simulation •

Industrial defect detection method with improved masked autoencoder

Kaili DENG, Weibo WEI, Zhenkuan PAN

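  1. College of Computer Science & Technology, Qingdao University, Qingdao Shandong 266071, China
  • Received: 2023-08-22 Revised: 2023-10-30 Accepted: 2023-11-03 Online: 2024-08-22 Published: 2024-08-10
  • Contact: Weibo WEI
  • About author: DENG Kaili, born in 1992 in Changle, Shandong, engineer, M. S. candidate. Her research interests include computer vision, defect detection.
    WEI Weibo, born in 1981 in Linqu, Shandong, Ph. D., associate professor, CCF member. His research interests include image processing, object recognition and tracking. njustwwb@163.com
    PAN Zhenkuan, born in 1966 in Changyi, Shandong, Ph. D., professor. His research interests include dynamics and control of multibody systems, virtual reality, computer vision.
  • Supported by:
    Natural Science Foundation of Shandong Province (ZR2020QF033)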

Industrial defect detection method with improved masked autoencoder

Kaili DENG, Weibo WEI, Zhenkuan PAN

  1. College of Computer Science & Technology, Qingdao University, Qingdao Shandong 266071, China
  • Received: 2023-08-22 Revised: 2023-10-30 Accepted: 2023-11-03 Online: 2024-08-22 Published: 2024-08-10
  • Contact: Weibo WEI
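  • About author: DENG Kaili, born in 1992, M. S. candidate, engineer. Her research interests include computer vision, defect detection.
    WEI Weibo, born in 1981, Ph. D., associate professor, CCF member. His research interests include image processing, object recognition and tracking. njustwwb@163.com
    PAN Zhenkuan, born in 1966, Ph. D., professor. His research interests include dynamics and control of multibody systems, virtual reality, computer vision.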
  • Supported by:
    Natural Science Foundation of Shandong Province (ZR2020QF033)

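Abstract:

To address the missed detection and over-detection that affect current defect detection methods requiring only normal samples, a method combining an improved masked autoencoder with an improved Unet is constructed to achieve pixel-level defect detection. First, a defect fitting module generates a defect mask image and the defect image corresponding to each normal image. Second, the defect image is randomly masked to remove most of its defect information, which encourages the Transformer-based autoencoder to learn representations from the unmasked normal regions and to repair the defect image from context; a new loss function is designed to improve the model's ability to repair details. Finally, the defect image and the repaired image are concatenated and fed into a Unet with a channel-wise cross-fusion Transformer structure to achieve pixel-level defect detection. Experimental results show that, on the MVTec AD dataset, the average image-based and pixel-based Area Under the Receiver Operating Characteristic Curve (ROC AUC) of the proposed method reach 0.984 and 0.982, respectively. These values are 2.9 and 3.2 percentage points higher than those of DRAEM (Discriminatively trained Reconstruction Anomaly Embedding Model), and 3.1 and 0.8 percentage points higher than those of CFLOW-AD (Anomaly Detection via Conditional normalizing FLOWs), demonstrating that the proposed method has a high recognition rate and detection accuracy.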
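The two reported metrics differ only in what is scored: whole images or individual pixels. The sketch below is not taken from the paper; it computes ROC AUC with the rank-sum (Mann-Whitney) formulation in plain Python. The helper names `roc_auc` and `image_and_pixel_auc` are hypothetical, and using the maximum of the anomaly map as the image-level score is an assumption, since the abstract does not say how image scores are derived.

```python
from typing import List, Sequence, Tuple


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """ROC AUC from the rank-sum statistic, using average ranks for ties."""
    pairs = sorted(zip(scores, labels))
    n = len(pairs)
    ranks = [0.0] * n
    i = 0
    while i < n:
        # Find the run of equal scores starting at position i.
        j = i
        while j + 1 < n and pairs[j + 1][0] == pairs[i][0]:
            j += 1
        average_rank = (i + j) / 2.0 + 1.0  # ranks are 1-based
        for k in range(i, j + 1):
            ranks[k] = average_rank
        i = j + 1
    n_pos = sum(1 for _, label in pairs if label == 1)
    n_neg = n - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("ROC AUC needs both positive and negative samples")
    rank_sum_pos = sum(r for r, (_, label) in zip(ranks, pairs) if label == 1)
    return (rank_sum_pos - n_pos * (n_pos + 1) / 2.0) / (n_pos * n_neg)


Image = List[List[float]]


def image_and_pixel_auc(anomaly_maps: List[Image],
                        gt_masks: List[List[List[int]]]) -> Tuple[float, float]:
    """Return (image-level AUC, pixel-level AUC) for a set of anomaly maps."""
    # Assumption: an image's score is the largest pixel score in its map.
    image_scores = [max(max(row) for row in m) for m in anomaly_maps]
    # An image is defective if any ground-truth pixel is marked as defective.
    image_labels = [int(any(any(row) for row in g)) for g in gt_masks]
    # Pixel-level AUC pools every pixel of every image.
    pixel_scores = [v for m in anomaly_maps for row in m for v in row]
    pixel_labels = [int(v) for g in gt_masks for row in g for v in row]
    return roc_auc(image_labels, image_scores), roc_auc(pixel_labels, pixel_scores)
```

A difference in "percentage points" is then simply the difference between two such AUC values multiplied by 100.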

Key words: defect detection, image inpainting, masked autoencoder, gradient loss function, Transformer, Unet

Abstract:

To address the missed detection and over-detection of existing defect detection methods that need only normal samples, a method combining an improved masked autoencoder with an improved Unet was constructed to achieve pixel-level defect detection. Firstly, a defect fitting module was used to generate the defect mask image and the defect image corresponding to the normal image. Secondly, the defect image was randomly masked to remove most of its defect information, so that the autoencoder with Transformer structure was encouraged to learn representations from the unmasked normal regions and to repair the defect image based on context. To improve the model's ability to repair image details, a new loss function was designed. Finally, to achieve pixel-level defect detection, the defect image and the repaired image were concatenated and input into the Unet with the channel cross-fusion Transformer structure. Experimental results on the MVTec AD dataset show that the average image-based and pixel-based Area Under the Receiver Operating Characteristic Curve (ROC AUC) of the proposed method reached 0.984 and 0.982, respectively. Compared with DRAEM (Discriminatively trained Reconstruction Anomaly Embedding Model), these values were higher by 2.9 and 3.2 percentage points; compared with CFLOW-AD (Anomaly Detection via Conditional normalizing FLOWs), they were higher by 3.1 and 0.8 percentage points. These results indicate that the proposed method has a high recognition rate and detection accuracy.
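The random masking step can be illustrated with a minimal sketch, assuming the image is split into a regular grid of square patches. The function names, the patch size, and the mask ratio of 0.75 used in the test are assumptions for illustration; the abstract does not state the values used by the authors.

```python
import random
from typing import List, Tuple


def random_patch_mask(num_patches: int, mask_ratio: float,
                      seed: int = 0) -> Tuple[List[int], List[int]]:
    """Split patch indices into (visible, masked) sets chosen uniformly at random."""
    if not 0.0 <= mask_ratio < 1.0:
        raise ValueError("mask_ratio must be in [0, 1)")
    rng = random.Random(seed)
    order = list(range(num_patches))
    rng.shuffle(order)
    num_masked = int(num_patches * mask_ratio)
    masked = sorted(order[:num_masked])
    visible = sorted(order[num_masked:])
    return visible, masked


def apply_patch_mask(image: List[List[float]], patch: int,
                     masked: List[int]) -> List[List[float]]:
    """Zero out the masked patches of a 2D image (row-major patch indexing)."""
    height, width = len(image), len(image[0])
    if height % patch or width % patch:
        raise ValueError("image size must be divisible by the patch size")
    cols = width // patch
    out = [row[:] for row in image]  # copy so the input is left untouched
    for idx in masked:
        r0, c0 = (idx // cols) * patch, (idx % cols) * patch
        for r in range(r0, r0 + patch):
            for c in range(c0, c0 + patch):
                out[r][c] = 0.0
    return out
```

Zeroing the masked patches is only a way to visualize the effect. In a typical masked autoencoder the masked patches are dropped from the encoder input and reconstructed by the decoder, so any synthetic defect that falls inside a masked patch has to be rebuilt from the surrounding normal context.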

Key words: defect detection, image inpainting, masked autoencoder, gradient loss function, Transformer, Unet
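The key words name a gradient loss function, but this page does not give its formula. The sketch below therefore shows only a generic form, assumed for illustration: a mean squared pixel error plus an L1 penalty on the difference between finite-difference image gradients. The names `gradient_loss` and `repair_loss` and the `weight` parameter are hypothetical and should not be read as the authors' formulation.

```python
from typing import List, Tuple

Image = List[List[float]]


def finite_differences(img: Image) -> Tuple[Image, Image]:
    """Forward differences along x (columns) and y (rows)."""
    h, w = len(img), len(img[0])
    gx = [[img[r][c + 1] - img[r][c] for c in range(w - 1)] for r in range(h)]
    gy = [[img[r + 1][c] - img[r][c] for c in range(w)] for r in range(h - 1)]
    return gx, gy


def mean_abs_diff(a: Image, b: Image) -> float:
    """Mean absolute difference; returns 0.0 for empty inputs."""
    total, count = 0.0, 0
    for row_a, row_b in zip(a, b):
        for va, vb in zip(row_a, row_b):
            total += abs(va - vb)
            count += 1
    return total / count if count else 0.0


def gradient_loss(pred: Image, target: Image) -> float:
    """L1 distance between the image gradients of prediction and target."""
    pgx, pgy = finite_differences(pred)
    tgx, tgy = finite_differences(target)
    return mean_abs_diff(pgx, tgx) + mean_abs_diff(pgy, tgy)


def repair_loss(pred: Image, target: Image, weight: float = 1.0) -> float:
    """Assumed combination: pixel-wise MSE plus a weighted gradient term."""
    squared = [(p - t) ** 2 for rp, rt in zip(pred, target) for p, t in zip(rp, rt)]
    mse = sum(squared) / len(squared)
    return mse + weight * gradient_loss(pred, target)
```

A gradient term of this kind is zero for a uniform brightness shift and nonzero when an edge is blurred, which is why such terms are commonly used to preserve fine detail in repaired images.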

CLC Number: