Journal of Computer Applications, 2012, Vol. 32, Issue (09): 2516-2519. DOI: 10.3724/SP.J.1087.2012.02516

• Artificial Intelligence •

Evaluation model based on neighborhood preservation for manifold learning algorithms

SHI Lu-kui (石陆魁)*, ZHANG Jun (张军), GONG Xiao-teng (宫晓腾)

  1. School of Computer Science and Engineering, Hebei University of Technology, Tianjin 300401, China
  • Received: 2012-02-27 Revised: 2012-03-31 Online: 2012-09-01 Published: 2012-09-01
  • Contact: SHI Lu-kui
  • About the authors: SHI Lu-kui (born 1974), male, from Handan, Hebei; associate professor, Ph.D.; research interests: machine learning, data mining. ZHANG Jun (born 1976), male, from Xuanhua, Hebei; lecturer, Ph.D.; research interests: data mining, network system reliability. GONG Xiao-teng (born 1988), male, from Shenzhou, Hebei; master's student; research interest: machine learning.
  • Supported by:

    Tianjin Research Program of Application Foundation and Advanced Technology (10JCZDJC16000)

Abstract: The stress function and the residual variance are suitable only for evaluating manifold learning algorithms that strictly preserve distances, and the dy-dx representation is only a qualitative measure. The variance of distance ratios can compare and evaluate most manifold learning algorithms, but it requires computing geodesic distances, which leads to high computational complexity. To address this, a quantitative evaluation model for manifold learning algorithms based on neighborhood preservation is proposed. The model only needs to determine the k nearest neighbors of each object in the two spaces (the original high-dimensional space and the low-dimensional embedding) and to compute how well each point's neighborhood is preserved in the low-dimensional space; no geodesic distances are computed. Theoretical analysis shows that the computational complexity of the neighborhood preservation model is far lower than that of the variance of distance ratios. The two models were compared on three data sets. The experimental results show that the neighborhood preservation model can evaluate the embeddings produced by the same algorithm under different neighborhood parameters and can also compare different manifold learning algorithms, and that it evaluates manifold learning algorithms better than the variance of distance ratios.

Key words: manifold learning, stress function, residual variance, variance of distance ratios, dy-dx representation
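
The neighborhood preservation idea described in the abstract (find the k nearest neighbors of each object in both spaces, then measure how many neighbors survive in the low-dimensional space) can be illustrated with a short sketch. The abstract does not state the exact formula, so the score used here, the mean fraction of shared neighbors per point, is an assumption made for illustration and may differ from the paper's definition. The function names `knn_indices` and `neighborhood_preservation` are hypothetical, and brute-force Euclidean neighbor search is used only for simplicity.

```python
import numpy as np


def knn_indices(points, k):
    """Return, for every row, the indices of its k nearest rows (Euclidean).

    Brute-force search over all pairs; the point itself is excluded.
    """
    diff = points[:, None, :] - points[None, :, :]
    sq_dist = np.sum(diff ** 2, axis=-1)
    np.fill_diagonal(sq_dist, np.inf)  # a point is not its own neighbor
    return np.argsort(sq_dist, axis=1)[:, :k]


def neighborhood_preservation(high, low, k):
    """Assumed score: fraction of each point's k neighbors in `high`
    that are also among its k neighbors in `low`.

    Returns (mean score over all points, per-point scores).
    No geodesic distances are computed.
    """
    n = high.shape[0]
    assert low.shape[0] == n, "both spaces must contain the same objects"
    assert 1 <= k < n, "k must be between 1 and n - 1"
    nn_high = knn_indices(high, k)
    nn_low = knn_indices(low, k)
    per_point = np.array(
        [len(set(a) & set(b)) / k for a, b in zip(nn_high, nn_low)]
    )
    return per_point.mean(), per_point


if __name__ == "__main__":
    # Toy data: a straight line in 3-D and its exact 1-D unfolding.
    high = np.array([[float(t), 0.0, 0.0] for t in range(10)])
    low = high[:, :1].copy()
    score, _ = neighborhood_preservation(high, low, k=2)
    print("preservation score:", score)
```

Under this assumed definition a score near 1 means the embedding keeps almost every local neighborhood intact, so scores from different algorithms, or from the same algorithm with different neighborhood parameters, can be compared directly.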

CLC Number: