Nonlinear feature extraction based on discriminant diffusion map analysis

Journal of Computer Applications, 2015, Vol. 35, Issue (2): 470-475. DOI: 10.11772/j.issn.1001-9081.2015.02.0470

ZHANG Cheng¹, LIU Yadong², LI Yuan²

1. Department of Science, Shenyang University of Chemical Technology, Shenyang Liaoning 110142, China;
2. College of Information Engineering, Shenyang University of Chemical Technology, Shenyang Liaoning 110142, China

Received: 2014-09-05 Revised: 2014-11-15 Online: 2015-02-12 Published: 2015-02-10
Corresponding author: LI Yuan
About the authors: ZHANG Cheng (born 1979), male, from Shenyang, Liaoning, lecturer, Ph.D. candidate; main research interests: system monitoring and fault detection. LIU Yadong (born 1989), male, from Shijiazhuang, Hebei, M.S. candidate; main research interest: fault diagnosis. LI Yuan (born 1964), female, from Shenyang, Liaoning, professor, Ph.D.; main research interests: process control and fault diagnosis.
Funding:
National Natural Science Foundation of China (60774070, 61174119); Key Program of the National Natural Science Foundation of China (61034006).

Abstract:

High-dimensional data is hard to understand intuitively and cannot be processed effectively by traditional machine learning and data mining techniques. To address this problem, a new nonlinear dimensionality reduction method called Discriminant Diffusion Maps Analysis (DDMA) was proposed. DDMA applies a discriminant kernel scheme to the diffusion maps framework: the Gaussian kernel window width is selected from a within-class width and a between-class width according to the sample category labels, so that the kernel function effectively extracts the correlation features of the data and accurately describes the structure of the data space. DDMA was applied to a synthetic Swiss-roll test and to a simulation of the penicillin fermentation process, and compared with Principal Component Analysis (PCA), Linear Discriminant Analysis (LDA), Kernel Principal Component Analysis (KPCA), Laplacian Eigenmaps (LE) and Diffusion Maps (DM). The results show that DDMA represents the high-dimensional data in a low-dimensional space while retaining the original characteristics of the data, and that the data structure it produces in the low-dimensional space is superior to that produced by the comparison methods, which verifies the effectiveness of the proposed scheme for dimensionality reduction and feature extraction.

Key words: Diffusion Maps (DM), nonlinear dimensionality reduction, discriminant kernel scheme, category label, kernel function, manifold learning
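The discriminant kernel scheme summarized in the abstract can be illustrated with a minimal sketch. The abstract gives no formulas, so everything specific here is an assumption rather than the authors' implementation: the kernel form exp(-||x_i - x_j||² / (2σ²)), the plain row normalization of the kernel into a Markov matrix (no density correction), and the hypothetical names `discriminant_diffusion_map`, `sigma_within` and `sigma_between`.

```python
import numpy as np


def discriminant_diffusion_map(X, labels, sigma_within, sigma_between,
                               n_components=2, t=1):
    """Diffusion-map embedding with a label-dependent Gaussian bandwidth.

    Assumed scheme (not taken from the paper): pairs of samples with the
    same label use sigma_within, all other pairs use sigma_between.
    Returns the embedding and all eigenvalues in descending order.
    """
    X = np.asarray(X, dtype=float)
    labels = np.asarray(labels)

    # Pairwise squared Euclidean distances, clipped against round-off.
    sq_norms = np.sum(X ** 2, axis=1)
    sq_dists = sq_norms[:, None] + sq_norms[None, :] - 2.0 * (X @ X.T)
    sq_dists = np.maximum(sq_dists, 0.0)

    # Discriminant kernel: the bandwidth of each pair depends on the labels.
    same_class = labels[:, None] == labels[None, :]
    sigma = np.where(same_class, sigma_within, sigma_between)
    K = np.exp(-sq_dists / (2.0 * sigma ** 2))

    # P = D^-1 K is the Markov matrix. Its symmetric conjugate
    # S = D^-1/2 K D^-1/2 has the same eigenvalues and can be passed to eigh.
    d = K.sum(axis=1)
    d_inv_sqrt = 1.0 / np.sqrt(d)
    S = K * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]

    eigvals, eigvecs = np.linalg.eigh(S)
    order = np.argsort(eigvals)[::-1]
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]

    # Right eigenvectors of P; the first one is constant and is skipped.
    psi = eigvecs * d_inv_sqrt[:, None]
    idx = slice(1, n_components + 1)
    return psi[:, idx] * eigvals[idx] ** t, eigvals


# Toy usage on two labelled Gaussian blobs (synthetic data for illustration).
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0.0, 0.5, size=(30, 2)),
               rng.normal(4.0, 0.5, size=(30, 2))])
y = np.repeat([0, 1], 30)
embedding, eigvals = discriminant_diffusion_map(
    X, y, sigma_within=3.0, sigma_between=2.0)
print(embedding.shape)  # (60, 2)
```

Setting `sigma_within` equal to `sigma_between` reduces this sketch to an ordinary diffusion map with a single Gaussian bandwidth, which is a convenient baseline when experimenting with the two widths.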

CLC Number: TP311

ZHANG Cheng, LIU Yadong, LI Yuan. Nonlinear feature extraction based on discriminant diffusion map analysis[J]. Journal of Computer Applications, 2015, 35(2): 470-475.

