Journal of Computer Applications ›› 2023, Vol. 43 ›› Issue (5): 1323-1329. DOI: 10.11772/j.issn.1001-9081.2022030419

• China Conference on Data Mining 2022 (CCDM 2022) •

Discriminative multidimensional scaling for feature learning

Haitao TANG1,2, Hongjun WANG1,2, Tianrui LI1,2

  1. School of Computing and Artificial Intelligence, Southwest Jiaotong University, Chengdu Sichuan 611756, China
  2. National Engineering Laboratory of Integrated Transportation Big Data Application Technology (Southwest Jiaotong University), Chengdu Sichuan 611756, China
  • Received: 2022-04-01 Revised: 2022-05-16 Accepted: 2022-05-19 Online: 2023-05-08 Published: 2023-05-10
  • Contact: Hongjun WANG
  • About the authors: TANG Haitao, born in 1999, M. S. candidate. His research interests include feature learning and clustering.
    WANG Hongjun, born in 1977, Ph. D., associate research fellow. His research interests include machine learning, ensemble learning and data mining.
    LI Tianrui, born in 1969, Ph. D., professor. His research interests include rough sets, granular computing, cloud computing and data mining.
  • Supported by:
    National Key Research and Development Program of China (2020AAA0105101); National Natural Science Foundation of China (61773324)

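判别多维标度特征学习 (Discriminative multidimensional scaling feature learning)

唐海涛 (TANG Haitao)1,2, 王红军 (WANG Hongjun)1,2, 李天瑞 (LI Tianrui)1,2

  1. School of Computing and Artificial Intelligence, Southwest Jiaotong University, Chengdu 611756
  2. National Engineering Laboratory of Integrated Transportation Big Data Application Technology (Southwest Jiaotong University), Chengdu 611756
  • Corresponding author: WANG Hongjun
  • About the authors: TANG Haitao (born 1999), male, from Nanchong, Sichuan, M. S. candidate, CCF member. Research interests: feature learning, clustering.
    WANG Hongjun (born 1977), male, from Guang'an, Sichuan, associate research fellow, Ph. D., CCF senior member. Research interests: machine learning, ensemble learning, data mining. wanghongjun@swjtu.edu.cn
    LI Tianrui (born 1969), male, from Putian, Fujian, professor, Ph. D., CCF distinguished member. Research interests: rough sets, granular computing, cloud computing, data mining.
  • Supported by:
    National Key Research and Development Program of China (2020AAA0105101); National Natural Science Foundation of China (61773324)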

Abstract:

Traditional multidimensional scaling (MDS) learns a low-dimensional embedding that preserves the topological structure of the data points but ignores the discriminability of the embedding itself. To address this, an unsupervised discriminative feature learning method based on MDS, named the Discriminative MultiDimensional Scaling model (DMDS), was proposed to discover the cluster structure while learning the low-dimensional data representation. DMDS pulls the low-dimensional embeddings of points in the same cluster closer together, which makes the learned data representation more discriminative. Firstly, a new objective function for DMDS was designed so that the learned data representation preserves the topology and enhances discriminability simultaneously. Secondly, the objective function was derived and solved, and a corresponding iterative optimization algorithm was designed from the derivation. Finally, comparison experiments were carried out on twelve public datasets in terms of average clustering accuracy and average clustering purity. Experimental results show that, under a comprehensive evaluation based on Friedman statistics, DMDS outperforms both the original data representation and the traditional multidimensional scaling model, and that the low-dimensional embeddings learned by DMDS are more discriminative.

Key words: discriminative feature learning, multidimensional scaling, dimensionality reduction, fuzzy clustering, iterative optimization algorithm
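
The abstract describes the DMDS objective only in words: one part preserves topology, the other pulls same-cluster embeddings together. As a rough illustration of how two such terms can be combined, the sketch below adds the raw stress of multidimensional scaling to a fuzzy c-means compactness cost. This is an assumed form, not the objective from the paper: the function names, the trade-off weight `lam`, the fuzzifier `m`, and the choice of fuzzy c-means (suggested only by the "fuzzy clustering" key word) are all hypothetical.

```python
import numpy as np


def pairwise_distances(points):
    """Euclidean distance matrix between the rows of `points`."""
    diff = points[:, None, :] - points[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))


def combined_objective(dist_high, embedding, memberships, centers, lam=1.0, m=2.0):
    """Hypothetical objective: MDS raw stress + lam * fuzzy c-means cost.

    dist_high   : (n, n) distances between the original data points
    embedding   : (n, d) low-dimensional coordinates
    memberships : (n, k) fuzzy membership of each point in each cluster
    centers     : (k, d) cluster centers in the embedding space
    lam, m      : assumed trade-off weight and fuzzifier (not from the paper)
    """
    # Topology term: squared mismatch between original and embedded
    # distances, counted once per unordered pair (upper triangle).
    dist_low = pairwise_distances(embedding)
    upper = np.triu_indices_from(dist_high, k=1)
    stress = ((dist_high[upper] - dist_low[upper]) ** 2).sum()

    # Discriminability term: membership-weighted squared distance of each
    # embedded point to each cluster center.
    sq_to_centers = ((embedding[:, None, :] - centers[None, :, :]) ** 2).sum(axis=-1)
    compactness = ((memberships ** m) * sq_to_centers).sum()

    return stress + lam * compactness
```

An iterative scheme of the kind the abstract mentions would typically alternate updates of the embedding, the memberships, and the centers so that this value decreases. The actual update rules are derived in the paper and are not reproduced here.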

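Abstract (Chinese):

The low-dimensional embedding learned by the traditional multidimensional scaling method preserves the topological structure of the data points but ignores the discriminability between classes in the embedded data. To address this, an unsupervised discriminative feature learning method based on multidimensional scaling, the Discriminative MultiDimensional Scaling model (DMDS), is proposed. The model discovers the cluster structure while learning the low-dimensional data representation, and makes the learned representation more discriminative by bringing the low-dimensional embeddings of the same cluster closer together. Firstly, the objective function of DMDS is designed so that the learned features preserve topology while enhancing discriminability. Secondly, the objective function is derived and solved, and the corresponding iterative optimization algorithm is designed from the derivation. Finally, comparison experiments on average clustering accuracy and average purity are carried out on 12 public datasets. Experimental results show that, under a comprehensive evaluation with the Friedman statistic, DMDS performs better on the 12 datasets than the original data representation and the data representation of the traditional multidimensional scaling model, and its low-dimensional embeddings are more discriminative.

Key words (Chinese): discriminative feature learning, multidimensional scaling, dimensionality reduction, fuzzy clustering, iterative optimization algorithm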
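
The abstract reports clustering purity as one of its two evaluation measures. For readers unfamiliar with it, here is a minimal sketch of the standard purity definition: each cluster is credited with its most frequent ground-truth class, and the credited points are divided by the total number of points. The function name is illustrative, and the paper's exact evaluation protocol (for example, how runs are averaged) is not shown on this page.

```python
from collections import Counter


def clustering_purity(true_labels, cluster_labels):
    """Standard clustering purity: fraction of points that belong to the
    majority ground-truth class of their assigned cluster."""
    if len(true_labels) != len(cluster_labels):
        raise ValueError("label sequences must have the same length")
    if len(true_labels) == 0:
        raise ValueError("label sequences must not be empty")

    # Count ground-truth classes inside every predicted cluster.
    class_counts = {}
    for truth, cluster in zip(true_labels, cluster_labels):
        class_counts.setdefault(cluster, Counter())[truth] += 1

    # Each cluster contributes the size of its largest class.
    majority_total = sum(max(counts.values()) for counts in class_counts.values())
    return majority_total / len(true_labels)
```

Purity does not depend on how cluster identifiers are numbered, so a clustering that matches the ground truth up to a relabeling scores 1.0. It also rises as the number of clusters grows, which is one reason it is usually reported alongside a second measure such as accuracy.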

CLC Number: