Unsupervised point cloud anomaly detection based on multi-representation fusion

doi:10.11772/j.issn.1001-9081.2024050652

Abstract

With the growing demand for industrial automation, 3D point cloud anomaly detection plays an increasingly important role in product quality control. However, existing methods often rely on a single feature representation, which leads to information loss and reduced accuracy. To address these issues, an unsupervised point cloud anomaly detection method based on multi-representation fusion, called MRF (Multi-Representation Fusion), is proposed. MRF uses multi-angle rotation and several coloring schemes to render point clouds into multi-modal images, and employs a pre-trained 2D convolutional neural network to extract rich semantic features. In parallel, a pre-trained Point Transformer extracts 3D structural features. By fusing the 2D image semantic features with the 3D structural features, MRF captures point cloud information more comprehensively. In the anomaly detection stage, anomalous point clouds are identified using memory banks built from positive (normal) samples and nearest neighbor search. Experimental results on the MVTec 3D-AD dataset show that MRF achieves a point cloud-level AUROC (Area Under the Receiver Operating Characteristic curve) of 0.972 and a point-level AUPRO (Area Under the Per-Region Overlap curve) of 0.948, significantly outperforming the compared methods. These results indicate that the effectiveness and robustness of MRF make it a promising solution for industrial applications.

Key words: computer vision, point cloud, unsupervised anomaly detection, feature embedding, memory bank
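
As a rough illustration of the detection stage described in the abstract, the Python sketch below fuses per-point 2D and 3D features and scores each point by its distance to the nearest entry of a memory bank built from normal samples. The fusion rule (L2-normalize each modality, then concatenate), the cloud-level score (maximum over point scores), and all function names are assumptions made for illustration, since the abstract does not specify them. The random arrays are placeholders for features that MRF would obtain from the pre-trained 2D CNN and Point Transformer.

```python
import numpy as np


def l2_normalize(x, eps=1e-12):
    """Scale each row to unit length so both modalities contribute comparably."""
    return x / (np.linalg.norm(x, axis=1, keepdims=True) + eps)


def fuse(feat_2d, feat_3d):
    """Fuse per-point features by concatenation (assumed rule, not from the paper)."""
    return np.concatenate([l2_normalize(feat_2d), l2_normalize(feat_3d)], axis=1)


def nn_distances(queries, memory_bank):
    """Euclidean distance from each query row to its nearest memory-bank row."""
    # Pairwise squared distances via ||q||^2 - 2 q.m + ||m||^2.
    d2 = (
        (queries ** 2).sum(axis=1)[:, None]
        - 2.0 * (queries @ memory_bank.T)
        + (memory_bank ** 2).sum(axis=1)[None, :]
    )
    # Clip tiny negative values caused by rounding before taking the root.
    return np.sqrt(np.maximum(d2, 0.0).min(axis=1))


def score_point_cloud(feat_2d, feat_3d, memory_bank):
    """Return per-point scores and a cloud-level score (max over points, assumed)."""
    point_scores = nn_distances(fuse(feat_2d, feat_3d), memory_bank)
    return point_scores, float(point_scores.max())


# Placeholder features: 200 normal points with 8-dim "2D" and 4-dim "3D" features.
rng = np.random.default_rng(0)
normal_2d = np.abs(rng.normal(size=(200, 8)))
normal_3d = rng.normal(size=(200, 4))
memory_bank = fuse(normal_2d, normal_3d)

# Test cloud: 50 points seen during training, with point 0 made anomalous.
test_2d, test_3d = normal_2d[:50].copy(), normal_3d[:50].copy()
test_2d[0] = -test_2d[0]
point_scores, cloud_score = score_point_cloud(test_2d, test_3d, memory_bank)
print("most anomalous point:", int(point_scores.argmax()))
```

A brute-force distance matrix is adequate for a small bank like this one. For a large memory bank, an approximate nearest neighbor index or a subsampled bank would be the usual substitutes.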

CLC Number: TP391.4

Zihe CHEN, Bin CHEN. Unsupervised point cloud anomaly detection based on multi-representation fusion[J]. Journal of Computer Applications, 2025, 45(5): 1677-1685.

References 29

1	LIU Y J, CHEN B. Pixel-level unsupervised industrial anomaly detection based on multi-scale memory bank [J]. Journal of Computer Applications, 2024, 44(11): 3587-3594. (in Chinese)
2	LIU J, XIE G, WANG J, et al. Deep industrial image anomaly detection: a survey [J]. Machine Intelligence Research, 2024, 21(1): 104-135.
3	SCHÖLKOPF B, PLATT J C, SHAWE-TAYLOR J, et al. Estimating the support of a high-dimensional distribution [J]. Neural Computation, 2001, 13(7): 1443-1471.
4	BERGMANN P, JIN X, SATTLEGGER D, et al. The MVTec 3D-AD dataset for unsupervised 3D anomaly detection and localization [C]// Proceedings of the 17th International Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications - Volume 5. Setúbal: SciTePress, 2022: 202-213.
5	SUN J, ZHANG Q, KAILKHURA B, et al. ModelNet40-C: a robustness benchmark for 3D point cloud recognition under corruption [EB/OL]. [2024-08-26].
6	YI L, KIM V G, CEYLAN D, et al. A scalable active framework for region annotation in 3D shape collections [J]. ACM Transactions on Graphics, 2016, 35(6): No.210.
7	DENG J, DONG W, SOCHER R, et al. ImageNet: a large-scale hierarchical image database [C]// Proceedings of the 2009 IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2009: 248-255.
8	ROTH K, PEMULA L, ZEPEDA J, et al. Towards total recall in industrial anomaly detection [C]// Proceedings of the 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2022: 14298-14308.
9	HORWITZ E, HOSHEN Y. Back to the feature: classical 3D features are (almost) all you need for 3D anomaly detection [C]// Proceedings of the 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops. Piscataway: IEEE, 2023: 2968-2977.
10	LOWE D G. Distinctive image features from scale-invariant keypoints [J]. International Journal of Computer Vision, 2004, 60(2): 91-110.
11	RUSU R B, BLODOW N, BEETZ M. Fast Point Feature Histograms (FPFH) for 3D registration [C]// Proceedings of the 2009 IEEE International Conference on Robotics and Automation. Piscataway: IEEE, 2009: 3212-3217.
12	CAO Y, XU X, SHEN W. Complementary pseudo multimodal feature for point cloud anomaly detection [J]. Pattern Recognition, 2024, 156: No.110761.
13	SIMARRO VIANA J, DE LA ROSA E, VANDE VYVERE T, et al. Unsupervised 3D brain anomaly detection [C]// Proceedings of the 2020 International MICCAI Brainlesion Workshop, LNCS 12658. Cham: Springer, 2021: 133-142.
14	BENGS M, BEHRENDT F, KRÜGER J, et al. Three-dimensional deep learning with spatial erasing for unsupervised anomaly segmentation in brain MRI [J]. International Journal of Computer Assisted Radiology and Surgery, 2021, 16(9): 1413-1423.
15	RUDOLPH M, WEHRBEIN T, ROSENHAHN B, et al. Asymmetric student-teacher networks for industrial anomaly detection [C]// Proceedings of the 2023 IEEE/CVF Winter Conference on Applications of Computer Vision. Piscataway: IEEE, 2023: 2591-2601.
16	PAPAMAKARIOS G, NALISNICK E, REZENDE D J, et al. Normalizing flows for probabilistic modeling and inference [J]. Journal of Machine Learning Research, 2021, 22: 1-64.
17	ZHAO H, JIANG L, JIA J, et al. Point Transformer [C]// Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision. Piscataway: IEEE, 2021: 16239-16248.
18	HE K, ZHANG X, REN S, et al. Deep residual learning for image recognition [C]// Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2016: 770-778.
19	REISS T, COHEN N, BERGMAN L, et al. PANDA: adapting pretrained features for anomaly detection and segmentation [C]// Proceedings of the 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2021: 2805-2813.
20	QI C R, SU H, MO K, et al. PointNet: deep learning on point sets for 3D classification and segmentation [C]// Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2017: 77-85.
21	QI C R, YI L, SU H, et al. PointNet++: deep hierarchical feature learning on point sets in a metric space [C]// Proceedings of the 31st International Conference on Neural Information Processing Systems. Red Hook: Curran Associates Inc., 2017: 5105-5114.
22	PANG Y, WANG W, TAY F E H, et al. Masked autoencoders for point cloud self-supervised learning [C]// Proceedings of the 2022 European Conference on Computer Vision, LNCS 13662. Cham: Springer, 2022: 604-621.
23	AO S, HU Q, YANG B, et al. SpinNet: learning a general surface descriptor for 3D point cloud registration [C]// Proceedings of the 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2021: 11748-11757.
24	BERGMANN P, FAUSER M, SATTLEGGER D, et al. MVTec AD - a comprehensive real-world dataset for unsupervised anomaly detection [C]// Proceedings of the 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2019: 9584-9592.
25	ZHOU Q Y, PARK J, KOLTUN V. Open3D: a modern library for 3D data processing [EB/OL]. [2024-11-26].
26	FISCHLER M A, BOLLES R C. Random sample consensus: a paradigm for model fitting with applications to image analysis and automated cartography [J]. Communications of the ACM, 1981, 24(6): 381-395.
27	CHANG A X, FUNKHOUSER T, GUIBAS L, et al. ShapeNet: an information-rich 3D model repository [EB/OL]. [2024-08-22].
28	YU X, TANG L, RAO Y, et al. Point-BERT: pre-training 3D point cloud Transformers with masked point modeling [C]// Proceedings of the 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2022: 19291-19300.
29	BERGMANN P, SATTLEGGER D. Anomaly detection in 3D point clouds using deep geometric descriptors [C]// Proceedings of the 2023 IEEE/CVF Winter Conference on Applications of Computer Vision. Piscataway: IEEE, 2023: 2612-2622.

Point cloud-level AUROC (%) per category on MVTec 3D-AD
Category	Voxel GAN [13]	Voxel AE [15]	Voxel VM [4]	PatchCore [8]	Depth SIFT [10]	PC FPFH [11]	Spin-Net [23]	3D-ST [29]	MRF
Mean	53.8	57.2	69.9	63.7	71.4	75.3	52.4	74.8	97.2
bagel	38.3	69.3	75.0	62.4	69.6	82.0	53.5	86.2	99.0
cable gland	62.3	42.5	74.7	68.3	55.3	53.3	41.3	48.4	95.6
carrot	47.4	51.5	61.3	67.6	82.4	87.7	56.8	83.2	99.2
cookie	63.9	79.0	73.8	83.8	69.6	76.9	66.2	89.4	98.9
dowel	56.4	49.4	82.3	60.8	79.5	71.8	47.2	84.8	95.7
foam	40.9	55.8	69.3	55.8	77.3	57.4	48.0	66.3	90.1
peach	61.7	53.7	67.9	56.7	57.3	77.4	36.7	76.3	99.5
potato	42.7	48.4	65.2	49.6	74.6	89.5	49.4	68.7	98.2
rope	66.3	63.9	60.9	69.9	93.6	99.0	72.2	95.8	98.2
tire	57.7	58.3	69.0	61.9	55.3	58.2	52.7	48.6	97.9
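
The MRF mean of 97.2 in the table above matches the point cloud-level AUROC of 0.972 reported in the abstract, so the entries read as AUROC in percent. As a reminder of what that number measures, here is a minimal sketch of AUROC using its pairwise definition: the probability that a randomly chosen anomalous sample receives a higher score than a randomly chosen normal one, with ties counted as one half. The function name and toy inputs are illustrative assumptions and are not taken from the paper's evaluation code.

```python
import numpy as np


def auroc(labels, scores):
    """AUROC via pairwise comparison of anomalous (1) and normal (0) scores."""
    labels = np.asarray(labels, dtype=bool)
    scores = np.asarray(scores, dtype=float)
    pos, neg = scores[labels], scores[~labels]
    if pos.size == 0 or neg.size == 0:
        raise ValueError("need at least one normal and one anomalous sample")
    # Count anomalous/normal pairs that are ranked correctly; ties count 0.5.
    greater = (pos[:, None] > neg[None, :]).sum()
    ties = (pos[:, None] == neg[None, :]).sum()
    return float((greater + 0.5 * ties) / (pos.size * neg.size))


# Toy example: two normal and two anomalous point clouds with cloud-level scores.
toy_labels = [0, 0, 1, 1]
toy_scores = [0.1, 0.4, 0.35, 0.8]
print(f"AUROC = {100 * auroc(toy_labels, toy_scores):.1f}%")
```

The pairwise form needs memory proportional to the number of anomalous samples times the number of normal samples. That is fine at the scale of a per-category test set, while a rank-based formulation is preferable for very large sets.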

Point-level AUPRO (%) per category on MVTec 3D-AD
Category	Voxel GAN [13]	Voxel AE [15]	Voxel VM [4]	PatchCore [8]	Depth SIFT [10]	PC FPFH [11]	Spin-Net [23]	3D-ST [29]	MRF
Mean	58.3	34.8	49.2	58.6	86.6	92.8	65.4	83.3	94.8
bagel	44.0	26.0	45.3	70.1	89.4	97.2	63.5	95.0	96.6
cable gland	45.3	34.1	34.3	54.4	72.2	84.9	31.6	48.3	95.9
carrot	82.5	58.1	52.1	79.1	96.3	98.1	92.2	98.6	98.1
cookie	75.5	35.1	69.7	83.5	87.1	93.9	78.0	92.1	88.0
dowel	78.2	50.2	68.0	53.1	92.6	96.3	87.0	90.5	90.5
foam	37.8	23.4	28.4	10.0	61.3	69.3	38.0	63.2	88.5
peach	39.2	35.1	34.9	80.0	87.0	97.5	58.5	94.5	98.2
potato	63.9	65.8	63.4	54.9	97.3	98.1	69.9	98.8	98.2
rope	77.5	1.5	61.6	82.7	95.8	98.0	95.5	97.6	96.1
tire	38.9	18.5	34.6	18.5	87.3	94.9	40.0	54.2	97.9

Ablation: point cloud-level AUROC (%) per category for each feature and coloring configuration
Category	3D features only	2D features only	Depth coloring	Normal coloring	Uniform coloring	MRF
Mean	82.1	95.3	92.4	94.3	93.7	97.2
bagel	94.0	97.8	92.2	97.4	98.3	99.0
cable gland	64.0	90.2	89.9	90.4	88.8	95.6
carrot	92.2	99.2	98.5	97.5	97.8	99.2
cookie	97.1	99.3	97.6	98.9	98.7	98.9
dowel	71.2	95.8	91.2	90.4	91.7	95.7
foam	79.9	85.9	86.1	85.0	84.2	90.1
peach	76.9	99.4	97.7	97.8	98.5	99.5
potato	84.2	97.9	98.0	96.1	94.8	98.2
rope	86.4	94.7	85.6	96.0	96.1	98.2
tire	75.4	93.0	87.3	94.3	88.3	97.9