Journal of Computer Applications, 2023, Vol. 43, Issue (4): 1206-1213. DOI: 10.11772/j.issn.1001-9081.2022030444

• Computer software technology

Feature selection method based on self-adaptive hybrid particle swarm optimization for software defect prediction

Zhenhua YU1, Zhengqi LIU1, Ying LIU2, Cheng GUO3

  1. College of Computer Science and Technology, Xi’an University of Science and Technology, Xi’an, Shaanxi 710054, China
  2. National Key Laboratory for Complex Systems Simulation, Beijing 100101, China
  3. Xi’an Institute of Applied Optics, Xi’an, Shaanxi 710065, China
  • Received: 2022-04-08  Revised: 2022-06-02  Accepted: 2022-06-02  Online: 2023-01-11  Published: 2023-04-10
  • Contact: Ying LIU
  • About author: YU Zhenhua, born in 1977, Ph. D., professor. His research interests include software defect prediction and cyber-physical systems.
    LIU Zhengqi, born in 1997, M. S. candidate. His research interests include software defect prediction.
    GUO Cheng, born in 1977, Ph. D., research fellow. His research interests include software architecture and the overall design of optoelectronic systems.
  • Supported by:
    National Natural Science Foundation of China (61873277)

基于自适应混合粒子群优化的软件缺陷预测特征选择方法

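于振华1, 刘争气1, 刘颖2, 郭城3

  1. College of Computer Science and Technology, Xi’an University of Science and Technology, Xi’an 710054
  2. National Key Laboratory for Complex Systems Simulation, Beijing 100101
  3. Xi’an Institute of Applied Optics, Xi’an 710065
  • Contact: 刘颖 (LIU Ying)
  • About author: 于振华 (YU Zhenhua), born in 1977, male, from Rushan, Shandong, professor, Ph. D. His main research interests include software defect prediction and cyber-physical systems.
    刘争气 (LIU Zhengqi), born in 1997, male, from Heze, Shandong, M. S. candidate. His main research interest is software defect prediction.
    郭城 (GUO Cheng), born in 1977, male, from Ziyang, Shaanxi, research fellow, Ph. D. His main research interests include software architecture and the overall design of optoelectronic systems.
  • Supported by:
    National Natural Science Foundation of China (61873277)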

Abstract:

Feature selection is a key step in data preprocessing for software defect prediction. To address the problems of existing feature selection methods, such as insignificant dimensionality reduction and low classification accuracy of the selected optimal feature subset, a feature selection method for software defect prediction based on Self-adaptive Hybrid Particle Swarm Optimization (SHPSO) was proposed. Firstly, combined with population partition, a self-adaptive weight update strategy based on Q-learning was designed, in which Q-learning adaptively adjusts the inertia weight according to the states of the particles. Secondly, to balance the global search ability in the early stage of the algorithm and the convergence speed in the later stage, time-varying learning factors based on curve adaptivity were proposed. Finally, a hybrid position update strategy was adopted to help particles escape local optima as early as possible and to increase particle diversity. Experiments were carried out on 12 public software defect datasets. The results show that, compared with using all features, commonly used traditional feature selection methods, and mainstream feature selection methods based on intelligent optimization algorithms, the proposed method effectively improves the classification accuracy of the software defect prediction model and reduces the dimension of the feature space. Compared with the Improved Salp Swarm Algorithm (ISSA), the proposed method increases the classification accuracy by about 1.60% on average and reduces the feature subset size by about 63.79% on average. These results indicate that the proposed method can select a small feature subset with high classification accuracy.
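The abstract names three ingredients without giving formulas, so the following is only an illustrative sketch of how they could fit together, not the SHPSO algorithm as published. Everything concrete in it is an assumption: the toy `fitness` function (a stand-in for classifier accuracy on real defect data), the cosine schedule in `learning_factors`, the two-state Q-table keyed on whether a particle sits in the better or worse half of the swarm, the candidate inertia weights, and the use of a sine cosine move (suggested by the key words) with probability `p_sca` as the second half of the hybrid position update.

```python
import math
import random

N_FEATURES = 12
RELEVANT = {0, 3, 5}  # hypothetical "useful" features for the toy fitness

def fitness(mask):
    """Toy stand-in for a wrapper fitness (lower is better).

    A real pipeline would train a classifier on the selected features;
    here we count missed relevant features plus a small size penalty.
    """
    missed = sum(1 for j in RELEVANT if not mask[j])
    return missed + 0.01 * sum(mask)

def to_mask(position):
    """Threshold a continuous position in [0, 1]^d into a 0/1 feature mask."""
    return [1 if x > 0.5 else 0 for x in position]

def learning_factors(t, t_max, c_lo=0.5, c_hi=2.5):
    """Assumed cosine schedule: c1 decays c_hi -> c_lo while c2 grows c_lo -> c_hi."""
    s = 0.5 * (1.0 + math.cos(math.pi * t / t_max))
    return c_lo + (c_hi - c_lo) * s, c_hi - (c_hi - c_lo) * s

def run(n_particles=20, t_max=60, seed=0, p_sca=0.3):
    rng = random.Random(seed)
    weights = [0.4, 0.65, 0.9]          # actions: candidate inertia weights (assumed)
    q = [[0.0] * len(weights) for _ in range(2)]  # states: 0 better half, 1 worse half
    alpha, gamma, eps = 0.1, 0.9, 0.1   # Q-learning rate, discount, exploration rate
    pos = [[rng.random() for _ in range(N_FEATURES)] for _ in range(n_particles)]
    vel = [[0.0] * N_FEATURES for _ in range(n_particles)]
    fit = [fitness(to_mask(p)) for p in pos]
    pbest, pbest_fit = [p[:] for p in pos], fit[:]
    g = min(range(n_particles), key=fit.__getitem__)
    gbest, gbest_fit = pos[g][:], fit[g]

    for t in range(t_max):
        c1, c2 = learning_factors(t, t_max)
        r1 = 2.0 * (1.0 - t / t_max)            # sine cosine amplitude, shrinks over time
        median = sorted(fit)[n_particles // 2]  # population partition threshold
        for i in range(n_particles):
            use_sca = rng.random() < p_sca
            if not use_sca:
                state = 0 if fit[i] < median else 1
                if rng.random() < eps:          # explore a random inertia weight
                    a = rng.randrange(len(weights))
                else:                           # exploit the best-valued weight
                    a = max(range(len(weights)), key=q[state].__getitem__)
            for d in range(N_FEATURES):
                if use_sca:
                    r2, r3 = rng.uniform(0.0, 2.0 * math.pi), rng.uniform(0.0, 2.0)
                    trig = math.sin(r2) if rng.random() < 0.5 else math.cos(r2)
                    x = pos[i][d] + r1 * trig * abs(r3 * gbest[d] - pos[i][d])
                else:
                    vel[i][d] = (weights[a] * vel[i][d]
                                 + c1 * rng.random() * (pbest[i][d] - pos[i][d])
                                 + c2 * rng.random() * (gbest[d] - pos[i][d]))
                    x = pos[i][d] + vel[i][d]
                pos[i][d] = min(1.0, max(0.0, x))  # keep positions inside [0, 1]
            new_fit = fitness(to_mask(pos[i]))
            if not use_sca:
                # Reward the chosen inertia weight only when it drove the move.
                reward = 1.0 if new_fit < fit[i] else -1.0
                nxt = 0 if new_fit < median else 1
                q[state][a] += alpha * (reward + gamma * max(q[nxt]) - q[state][a])
            fit[i] = new_fit
            if new_fit < pbest_fit[i]:
                pbest[i], pbest_fit[i] = pos[i][:], new_fit
            if new_fit < gbest_fit:
                gbest, gbest_fit = pos[i][:], new_fit
    return to_mask(gbest), gbest_fit

if __name__ == "__main__":
    mask, score = run()
    print("selected features:", [j for j, m in enumerate(mask) if m], "fitness:", score)
```

In this sketch the fitness penalizes subset size directly, which mirrors the abstract's twin goals of high accuracy and a small feature subset. Swapping `fitness` for a cross-validated classifier error on a real defect dataset would turn it into a wrapper-style selector, at the cost of one model fit per particle per iteration.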

Key words: feature selection, software defect prediction, Particle Swarm Optimization (PSO) algorithm, Sine Cosine Algorithm (SCA), Q-learning

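Abstract:

Feature selection is a key step of data preprocessing in software defect prediction. To address the insignificant dimensionality reduction and the low classification accuracy of the selected optimal feature subsets in existing feature selection methods, a feature selection method for software defect prediction based on Self-adaptive Hybrid Particle Swarm Optimization (SHPSO) is proposed. First, a Q-learning-based self-adaptive weight update strategy is designed in combination with population partition, in which Q-learning is introduced to adaptively adjust the inertia weight according to the state of each particle. Second, to balance the global search ability in the early stage of the algorithm against the convergence speed in the later stage, time-varying learning factors based on curve adaptivity are proposed. Finally, a hybrid position update strategy is adopted to help particles escape local optima as early as possible and to increase particle diversity. Experiments on 12 public software defect datasets show that, compared with using all features, commonly used traditional feature selection methods, and mainstream feature selection methods based on intelligent optimization algorithms, the proposed method is effective both in improving the classification performance of the software defect prediction model and in reducing the dimension of the feature space. Compared with the Improved Salp Swarm Algorithm (ISSA), the proposed method improves classification accuracy by about 1.60% on average and reduces the feature subset size by about 63.79% on average. The experimental results show that the proposed method can select feature subsets with higher classification accuracy and fewer features.

Key words: feature selection, software defect prediction, particle swarm optimization algorithm, sine cosine algorithm, Q-learning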
