Journal of Computer Applications ›› 2023, Vol. 43 ›› Issue (9): 2657-2664. DOI: 10.11772/j.issn.1001-9081.2022091404

• 2022 10th CCF Conference on Big Data

Adaptive learning-based multi-view unsupervised feature selection method

Tian HE1, Zongxin SHEN1, Qianqian HUANG2, Yanyong HUANG1

  1. School of Statistics, Southwestern University of Finance and Economics, Chengdu Sichuan 611130, China
  2. School of Computing and Artificial Intelligence, Southwest Jiaotong University, Chengdu Sichuan 611756, China
  • Received: 2022-09-20 Revised: 2022-10-27 Accepted: 2022-11-03 Online: 2023-09-10 Published: 2023-09-10
  • Contact: Yanyong HUANG
  • About the authors: HE Tian, born in 1995, M. S. His research interests include data mining.
    SHEN Zongxin, born in 1996, Ph. D. candidate. His research interests include data mining and machine learning.
    HUANG Qianqian, born in 1991, Ph. D. candidate. Her research interests include data mining and knowledge discovery.
  • Supported by:
    Youth Foundation of Humanities and Social Sciences of Ministry of Education (21YJCZH045); Fundamental Research Funds for the Central Universities (JBK2304037)


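何添 (HE Tian)1, 沈宗鑫 (SHEN Zongxin)1, 黄倩倩 (HUANG Qianqian)2, 黄雁勇 (HUANG Yanyong)1

  1. School of Statistics, Southwestern University of Finance and Economics, Chengdu 611130
  2. School of Computing and Artificial Intelligence, Southwest Jiaotong University, Chengdu 611756
  • Corresponding author: HUANG Yanyong (黄雁勇)
  • About the author: HE Tian (何添), born in 1995, male, from Chongqing, M. S. His main research interest is data mining.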


Most existing multi-view unsupervised feature selection methods share a common problem: the sample similarity matrix, the view weight matrix, and the feature weight matrix are usually predefined, so they cannot effectively describe the real intrinsic structure of the data or reflect the importance of different views and features, and useful features fail to be selected as a result. To address this issue, firstly, view weights and feature weights were learned adaptively on the basis of multi-view fuzzy C-means clustering, so that feature selection was achieved while clustering performance was guaranteed. Then, under a Laplacian rank constraint, the sample similarity matrix was learned adaptively, and an Adaptive Learning-based Multi-view Unsupervised Feature Selection (ALMUFS) method was constructed. Finally, an alternating iterative optimization algorithm was designed to solve the objective function, and the proposed method was compared with six unsupervised feature selection baseline methods on eight real datasets. Experimental results show that ALMUFS outperforms the other methods in terms of clustering accuracy and F-measure. Specifically, ALMUFS improves the average clustering accuracy and F-measure by 8.99 and 11.87 percentage points, respectively, compared with Adaptive Collaborative Similarity Learning (ACSL), and by 11.09 and 13.21 percentage points, respectively, compared with Adaptive Similarity and View Weight (ASVM), which demonstrates the feasibility and effectiveness of the proposed method.

Key words: multi-view unsupervised feature selection, adaptive learning, similarity matrix, view weight, feature weight
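
The Laplacian rank constraint mentioned in the abstract can be illustrated with a standard fact from spectral graph theory: for a nonnegative symmetric similarity matrix over n samples, the multiplicity of the zero eigenvalue of the graph Laplacian equals the number of connected components, so requiring the Laplacian to have rank n - c forces the learned graph to have exactly c components. The following is a minimal NumPy sketch of that fact only, not the authors' implementation; the function names and the toy matrix are hypothetical and chosen for illustration.

```python
import numpy as np


def laplacian(S):
    """Unnormalized graph Laplacian L = D - W of a nonnegative similarity matrix S."""
    W = (S + S.T) / 2.0  # symmetrize, in case S is not exactly symmetric
    return np.diag(W.sum(axis=1)) - W


def num_components(S, tol=1e-8):
    """Count connected components as the number of (near-)zero Laplacian eigenvalues."""
    eigvals = np.linalg.eigvalsh(laplacian(S))
    return int(np.sum(eigvals < tol))


# Hypothetical toy similarity matrix: samples {0, 1, 2} and {3, 4} form two groups.
S = np.zeros((5, 5))
S[:3, :3] = 0.5
S[3:, 3:] = 0.5
np.fill_diagonal(S, 0.0)

n = S.shape[0]
c = num_components(S)   # number of connected components of the graph
rank_L = n - c          # rank of the Laplacian implied by c components

# Adding a single weak edge between the two groups merges them into one component.
S_linked = S.copy()
S_linked[2, 3] = S_linked[3, 2] = 0.1
c_linked = num_components(S_linked)
```

In this toy case the graph has two components until the cross-group edge is added, which is why a rank constraint on the Laplacian can be used to encourage a similarity matrix whose structure matches a target number of clusters.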


