Journal of Computer Applications, 2025, Vol. 45, Issue (9): 2755-2763. DOI: 10.11772/j.issn.1001-9081.2024081232

• Artificial intelligence •

Dual imputation based incomplete multi-view metric learning

Penghuan QU1, Wei WEI1,2, Jing YAN1, Feng WANG1,2

  1. School of Computer and Information Technology, Shanxi University, Taiyuan, Shanxi 030006, China
  2. Key Laboratory of Computational Intelligence and Chinese Information Processing of Ministry of Education (Shanxi University), Taiyuan, Shanxi 030006, China
  • Received:2024-09-02 Revised:2024-11-06 Accepted:2024-11-15 Online:2024-12-03 Published:2025-09-10
  • Contact: Wei WEI
  • About the authors: QU Penghuan, born in 2000, M.S. candidate. Her research interests include deep learning.
    YAN Jing, born in 1994, Ph.D. candidate. His research interests include data mining and machine learning.
    WANG Feng, born in 1984, Ph.D., associate professor. Her research interests include data mining and machine learning.
  • Supported by:
    National Natural Science Foundation of China (62276160, 62276158); Natural Science Foundation of Shanxi Province (202203021211291); 1331 Engineering Project of Shanxi Province

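基于双补全的不完整多视图度量学习 (Dual imputation based incomplete multi-view metric learning)

QU Penghuan (曲鹏欢)1, WEI Wei (魏巍)1,2, YAN Jing (闫京)1, WANG Feng (王锋)1,2

  1. School of Computer and Information Technology, Shanxi University, Taiyuan 030006
  2. Key Laboratory of Computational Intelligence and Chinese Information Processing of Ministry of Education (Shanxi University), Taiyuan 030006
  • Corresponding author: WEI Wei
  • About the authors: QU Penghuan (born 2000), female, from Yuncheng, Shanxi, M.S. candidate. Main research interest: deep learning.
    YAN Jing (born 1994), male, from Xinzhou, Shanxi, Ph.D. candidate. Main research interests: data mining, machine learning.
    WANG Feng (born 1984), female, from Licheng, Shanxi, associate professor, Ph.D., CCF member. Main research interests: data mining, machine learning.
  • Funding:
    National Natural Science Foundation of China (62276160); National Natural Science Foundation of China (62276158); Natural Science Foundation of Shanxi Province (202203021211291); 1331 Engineering Project of Shanxi Province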

Abstract:

Multi-view metric learning has become an effective approach to handling multi-view data in practical applications. However, multi-view data are often incomplete, which poses significant challenges for multi-view metric learning. Although some methods have attempted to address the incomplete multi-view problem, they have two shortcomings: 1) most existing methods rely on the k-Nearest Neighbors (kNN) of the existing samples to fill in missing data, and tend to overlook the unique characteristics of individual samples or views; 2) they use only the existing sample representations to compute neighbors, and therefore cannot fully express the neighbor relationships between samples. To address these issues, a Dual imputation based Incomplete Multi-View Metric Learning method (DIMVML) was proposed. Firstly, the latent features of each view were extracted using a deep autoencoder, and missing samples were then filled in by combining the distribution information of the samples with the difference information between views. Secondly, the imputation results were fused according to the quality of the completed samples to obtain higher-quality completions. Finally, intra-view and inter-view relationships were optimized through a loss function. Experimental results show that, in clustering experiments, the proposed method achieves higher accuracy and F1 score on the HandWritten, Caltech101-7, Leaves, and YouTubeFace10 datasets than advanced multi-view methods such as Subgraph Propagation and Contrastive Calibration (SPCC) and Latent Heterogeneous Graph Network (LHGN); in classification experiments, the proposed method significantly outperforms other multi-view methods in accuracy on the CUB, ORL, and HandWritten datasets.

Key words: incomplete multi-view, metric learning, representation learning, difference, consistency
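
For readers unfamiliar with the kNN-based imputation that the abstract names as the common baseline, the following is a minimal illustrative sketch. It is not DIMVML and is not taken from the paper: the function name `knn_impute_view`, the two-view setup, the assumption that one view is fully observed, and the toy data are all hypothetical choices made for illustration.

```python
import numpy as np

def knn_impute_view(view_a, view_b, observed_b, k=3):
    """Baseline kNN imputation (illustrative assumption, not the paper's method).

    Missing rows of view_b are filled with the mean of view_b over the k
    nearest donors, where neighbours are searched in view_a, which is
    assumed here to be fully observed.
    """
    observed_b = np.asarray(observed_b, dtype=bool)
    filled = np.asarray(view_b, dtype=float).copy()
    donors = np.flatnonzero(observed_b)        # samples that have view-B features
    for i in np.flatnonzero(~observed_b):      # samples whose view B is missing
        # Euclidean distance in view A from sample i to every donor
        dists = np.linalg.norm(view_a[donors] - view_a[i], axis=1)
        nearest = donors[np.argsort(dists)[:k]]
        filled[i] = filled[nearest].mean(axis=0)
    return filled

# Toy data: five samples, view B is missing for samples 2 and 4.
view_a = np.array([[0.0], [0.1], [0.2], [5.0], [5.1]])
view_b = np.array([[1.0, 1.0], [1.2, 0.8], [0.0, 0.0], [9.0, 9.0], [0.0, 0.0]])
observed_b = np.array([True, True, False, True, False])

print(knn_impute_view(view_a, view_b, observed_b, k=2))
```

In this toy case, sample 4 has only one close donor in view A, so with `k=2` its imputed features are averaged with those of a distant sample. This is one simple way in which neighbor averaging can blur sample-specific characteristics, the kind of limitation the abstract attributes to kNN-based filling.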

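Abstract (translated from Chinese):

In practical applications, multi-view metric learning has become an effective way to process multi-view data. However, the incompleteness of multi-view data poses a major challenge to multi-view metric learning. Although several methods have tried to solve the incomplete multi-view problem, they still have the following shortcomings: 1) most existing methods rely on the k-nearest neighbors (kNN) of the existing samples to complete missing data, and tend to overlook the unique characteristics of samples or views; 2) they compute neighbors using only the existing sample representations, and therefore cannot fully express the neighbor relationships among samples. Therefore, a Dual imputation based Incomplete Multi-View Metric Learning method (DIMVML) is proposed. First, a deep autoencoder is used to extract the latent features of each view, and missing samples are completed by combining the distribution information of the samples with the difference information between views. Second, the results are fused according to the quality of the completed samples to obtain higher-quality completions. Finally, intra-view and inter-view relationships are optimized through a loss function. Experimental results show that, in clustering experiments, the proposed method outperforms advanced multi-view methods such as SPCC (Subgraph Propagation and Contrastive Calibration) and LHGN (Latent Heterogeneous Graph Network) in accuracy and F1 score on the HandWritten, Caltech101-7, Leaves, and YouTubeFace10 datasets; in classification experiments, the accuracy of the proposed method significantly exceeds that of other multi-view methods on the CUB, ORL, and HandWritten datasets.

Key words (translated from Chinese): incomplete multi-view, metric learning, representation learning, difference, consistency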
