Parameter independent weighted local mean-based pseudo nearest neighbor classification algorithm

doi:10.11772/j.issn.1001-9081.2020091370

Journal of Computer Applications, 2021, Vol. 41, Issue 6: 1694-1700. DOI: 10.11772/j.issn.1001-9081.2020091370

Special Issue: Data Science and Technology



CAI Ruiguang, ZHANG Desheng, XIAO Yanting

Faculty of Science, Xi'an University of Technology, Xi'an Shaanxi 710054, China

Received: 2020-09-07 Revised: 2020-10-24 Online: 2021-06-10 Published: 2020-11-10
Supported by:
This work is partially supported by the Youth Program of the National Natural Science Foundation of China (11801438).

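Corresponding author: CAI Ruiguang
About the authors: CAI Ruiguang, born in 1996 in Baoji, Shaanxi, M.S. candidate. Her research interests include data mining and classification analysis. ZHANG Desheng, born in 1964 in Yongshou, Shaanxi, Ph.D., professor. His research interests include probability theory and mathematical statistics. XIAO Yanting, born in 1981 in Huxian, Shaanxi, Ph.D., associate professor. Her research interests include probability theory and mathematical statistics.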

Abstract

Abstract: The Local Mean-based Pseudo Nearest Neighbor (LMPNN) algorithm is sensitive to the value of k and ignores the different influence that different attributes have on the classification results. To address these problems, a Parameter Independent Weighted Local Mean-based Pseudo Nearest Neighbor classification (PIW-LMPNN) algorithm was proposed. Firstly, the Success-History based parameter Adaptation for Differential Evolution (SHADE) algorithm, the latest variant of the differential evolution algorithm, was applied to the training set samples to obtain the best k value and a set of best class-related weights. Secondly, when calculating the distance between samples, a different weight was assigned to each attribute of each class, and the test set samples were classified. Finally, simulations were performed on 15 real datasets, and the proposed algorithm was compared with eight other classification algorithms. The results show that the proposed algorithm improves classification accuracy and F1 value by up to about 28 percentage points and 23.1 percentage points respectively. In addition, the comparison results of the Wilcoxon signed-rank test, the Friedman rank variance test and the Hollander-Wolfe pairwise comparison show that the proposed algorithm has clear advantages over the other eight classification algorithms in terms of classification accuracy and k value selection.

Key words: Local Mean-based Pseudo Nearest Neighbor (LMPNN) algorithm, feature weighting, optimization model, Success-History based parameter Adaptation for Differential Evolution (SHADE), parameter adaptation
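
The classification step described in the abstract can be illustrated with a minimal sketch. It assumes the usual LMPNN rule (for each class, take the k nearest neighbors of the query, form the local mean of the first i neighbors for i = 1..k, and sum the distances to those means with weights 1/i) and an attribute-weighted Euclidean distance with one weight vector per class. The exact distance and weighting formulas of PIW-LMPNN are not given on this page, so the function names, the data layout and the weighted distance below are illustrative assumptions, and the SHADE search for k and the weights is not shown.

```python
import math


def weighted_distance(x, y, w):
    """Attribute-weighted Euclidean distance (assumed form, not taken from the paper)."""
    return math.sqrt(sum(wj * (a - b) ** 2 for wj, a, b in zip(w, x, y)))


def lmpnn_predict(train, query, k, class_weights=None):
    """Classify `query` with a weighted local mean-based pseudo nearest neighbor rule.

    train: dict mapping class label -> list of feature vectors (hypothetical layout).
    class_weights: dict mapping class label -> per-attribute weights;
                   when omitted, all weights are 1 (plain LMPNN-style rule).
    """
    dim = len(query)
    best_label, best_score = None, math.inf
    for label, samples in train.items():
        w = class_weights[label] if class_weights else [1.0] * dim
        # k nearest neighbors of the query inside this class
        neighbours = sorted(samples, key=lambda s: weighted_distance(query, s, w))[:k]
        running_sum = [0.0] * dim
        score = 0.0
        for i, nb in enumerate(neighbours, start=1):
            # local mean of the first i neighbors
            running_sum = [r + v for r, v in zip(running_sum, nb)]
            local_mean = [r / i for r in running_sum]
            # closer-ranked local means count more (weight 1/i)
            score += weighted_distance(query, local_mean, w) / i
        if score < best_score:
            best_label, best_score = label, score
    return best_label


if __name__ == "__main__":
    train = {"a": [[0, 0], [0, 1], [1, 0]], "b": [[5, 5], [5, 6], [6, 5]]}
    print(lmpnn_predict(train, [0.5, 0.5], k=2))
```

In this sketch the per-class weights change which attributes matter when a query is compared against each class, which is the role the abstract assigns to the class-related weights. An optimizer such as SHADE would then search over k and the weight vectors using performance on the training set.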


CLC Number: TP301

CAI Ruiguang, ZHANG Desheng, XIAO Yanting. Parameter independent weighted local mean-based pseudo nearest neighbor classification algorithm[J]. Journal of Computer Applications, 2021, 41(6): 1694-1700.


