Journal of Computer Applications (计算机应用) ›› 2015, Vol. 35 ›› Issue (2): 470-475. DOI: 10.11772/j.issn.1001-9081.2015.02.0470

• Artificial Intelligence •

基于判别式扩散映射分析的非线性特征提取

张成1, 刘亚东2, 李元2   

  1. 沈阳化工大学 数理系, 沈阳 110142;
  2. 沈阳化工大学 信息工程学院, 沈阳 110142
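  • Received: 2014-09-05; Revised: 2014-11-15; Online: 2015-02-10; Published: 2015-02-12
  • Corresponding author: LI Yuan
  • About the authors: ZHANG Cheng (1979-), male, born in Shenyang, Liaoning, lecturer, Ph.D. candidate; main research interests: system monitoring and fault detection. LIU Yadong (1989-), male, born in Shijiazhuang, Hebei, M.S. candidate; main research interest: fault diagnosis. LI Yuan (1964-), female, born in Shenyang, Liaoning, professor, Ph.D.; main research interests: process control and fault diagnosis.
  • Supported by:

    National Natural Science Foundation of China (60774070, 61174119); Key Program of the National Natural Science Foundation of China (61034006).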

Nonlinear feature extraction based on discriminant diffusion map analysis

ZHANG Cheng1, LIU Yadong2, LI Yuan2   

  1. Department of Science, Shenyang University of Chemical Technology, Shenyang, Liaoning 110142, China;
  2. College of Information Engineering, Shenyang University of Chemical Technology, Shenyang, Liaoning 110142, China
  • Received: 2014-09-05; Revised: 2014-11-15; Online: 2015-02-10; Published: 2015-02-12

摘要:

针对高维数据难以被人们直观理解,且难以被机器学习和数据挖掘算法有效地处理的问题,提出一种新的非线性降维方法,即判别式扩散映射分析(DDMA)。该方法将判别核方案应用到扩散映射框架中,依据样本类别标签在类内窗宽和类间窗宽中判别选取高斯核窗宽,使核函数能够有效提取数据的关联特性,准确描述数据空间的结构特征。通过在人工合成Swiss-roll测试和青霉素发酵过程中的仿真应用,与主成分分析(PCA)、线性判别分析(LDA)、核主成分分析(KPCA)、拉普拉斯特征映射(LE)算法和扩散映射(DM)进行比较,实验结果表明DDMA方法在低维空间中代表高维数据的同时成功保留了数据的原始特性,且通过该方法在低维空间中产生的数据结构特性优于其他方法,在数据降维与特征提取性能上验证了该方案的有效性。

关键词: 扩散映射, 非线性降维, 判别核方案, 类别标签, 核函数, 流形学习

Abstract:

High-dimensional data are hard to understand intuitively and cannot be processed effectively by traditional machine learning and data mining techniques. To address this problem, a new nonlinear dimensionality reduction method called Discriminant Diffusion Maps Analysis (DDMA) was proposed. The method applies a discriminant kernel scheme to the diffusion maps framework: according to the class labels of the samples, the Gaussian kernel window width is chosen between a within-class width and a between-class width, so that the kernel function effectively extracts the correlation features of the data and accurately describes the structure of the data space. DDMA was applied to a synthetic Swiss-roll test and to a simulated penicillin fermentation process, and compared with Principal Component Analysis (PCA), Linear Discriminant Analysis (LDA), Kernel Principal Component Analysis (KPCA), Laplacian Eigenmaps (LE) and Diffusion Maps (DM). The results show that DDMA represents the high-dimensional data in a low-dimensional space while retaining the original characteristics of the data, and that the low-dimensional data structure produced by DDMA is superior to that produced by the comparison methods. This performance in dimensionality reduction and feature extraction verifies the effectiveness of the proposed scheme.

Key words: Diffusion Maps (DM), nonlinear dimensionality reduction, discriminant kernel scheme, category label, kernel function, manifold learning
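
The abstract describes the discriminant kernel scheme only in words, so the following Python sketch is an illustration under stated assumptions, not the authors' implementation. It assumes a Gaussian kernel of the form exp(-||xi - xj||^2 / (2 * sigma^2)), where sigma is a within-class width for pairs with the same label and a between-class width otherwise, followed by a standard diffusion-map embedding. The function name `ddma_embed` and the parameters `sigma_within`, `sigma_between`, `n_components` and `t` are hypothetical.

```python
import numpy as np


def ddma_embed(X, y, sigma_within, sigma_between, n_components=2, t=1):
    """Label-aware diffusion-map sketch (assumed form, not the paper's equations)."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)

    # Pairwise squared Euclidean distances, clipped at zero against round-off.
    sq_norms = np.sum(X ** 2, axis=1)
    d2 = sq_norms[:, None] + sq_norms[None, :] - 2.0 * (X @ X.T)
    d2 = np.maximum(d2, 0.0)

    # Assumption: same-label pairs use sigma_within, all other pairs sigma_between.
    same_label = y[:, None] == y[None, :]
    sigma = np.where(same_label, sigma_within, sigma_between)
    K = np.exp(-d2 / (2.0 * sigma ** 2))

    # S = D^{-1/2} K D^{-1/2} is symmetric and has the same eigenvalues as the
    # Markov matrix P = D^{-1} K, so a symmetric eigensolver can be used.
    degree = K.sum(axis=1)
    d_inv_sqrt = 1.0 / np.sqrt(degree)
    S = K * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]

    eigvals, eigvecs = np.linalg.eigh(S)
    order = np.argsort(eigvals)[::-1]  # largest eigenvalue first
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]

    # Right eigenvectors of P are D^{-1/2} times the eigenvectors of S.
    psi = eigvecs * d_inv_sqrt[:, None]

    # Skip the leading eigenpair (eigenvalue 1, constant vector) and scale the
    # next n_components coordinates by eigenvalue**t (t = diffusion time).
    idx = slice(1, n_components + 1)
    return psi[:, idx] * eigvals[idx] ** t
```

A call such as `ddma_embed(X, y, sigma_within=2.0, sigma_between=0.5)` returns an array with one row per sample and `n_components` columns. The two widths shown are arbitrary illustrative values; how they are selected in DDMA is not stated in the abstract.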

CLC number: