Journal of Computer Applications ›› 2022, Vol. 42 ›› Issue (6): 1852-1861. DOI: 10.11772/j.issn.1001-9081.2021040555

• Artificial intelligence

Improved sine cosine algorithm for optimizing feature selection and data classification

Liang CHEN, Xianfeng TANG

  1. Information Technology Center, Zhejiang University, Hangzhou, Zhejiang 310027, China
  • Received: 2021-04-12 Revised: 2021-07-12 Accepted: 2021-07-20 Online: 2022-06-22 Published: 2022-06-10
  • Contact: Xianfeng TANG
  • About author: CHEN Liang, born in 1980, M. S., engineer. His research interests include artificial intelligence.
  • Supported by:
    National Natural Science Foundation of China (61602141)

改进正余弦算法优化特征选择及数据分类

陈亮, 汤显峰

  1. 浙江大学 信息技术中心,杭州 310027
  • 通讯作者: 汤显峰
  • 作者简介:陈亮(1980—),男,四川遂宁人,工程师,硕士,主要研究方向:人工智能
  • 基金资助:
    国家自然科学基金资助项目(61602141)

Abstract:

To address the tendency of the traditional Sine Cosine Algorithm (SCA) to fall into local optima and converge slowly on complex optimization problems, an improved SCA based on Inertia Weights and Cauchy Chaotic mutation (IWCCSCA) was proposed. Firstly, an update method for the curve-adaptive amplitude adjustment factor, based on an exponential function, was designed to balance global search and local exploitation. Then, an adaptive decreasing inertia weight update mechanism was designed to improve the individual position update and accelerate convergence. Finally, an individual disturbance mechanism based on elite Cauchy chaotic mutation was proposed to enhance population diversity and avoid falling into local optima. Optimization tests on eight benchmark functions verified that IWCCSCA effectively improves convergence speed and optimization accuracy. Furthermore, IWCCSCA was applied to the problem of selecting a feature subset from the original feature set of the data, yielding a feature selection algorithm based on IWCCSCA, namely IWCCSCA-FS. The continuous optimization of the sine cosine functions was converted into the binary optimization of feature selection, establishing a mapping between individual positions and feature subsets, and the quality of candidate solutions was evaluated by a fitness function that considers both the number of selected features and the classification accuracy. Test results on UCI benchmark datasets show that IWCCSCA-FS can effectively select the optimal feature subset, reduce the feature dimension, and improve data classification accuracy.
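
The abstract gives no formulas, so the following is only a rough sketch of the building blocks it names, not the authors' implementation. The position update is the standard SCA rule; `weight` is a hypothetical stand-in for the inertia weight (its adaptive schedule is not given here), `amplitude_factor` uses the standard linear decay rather than the paper's exponential curve, and the sigmoid transfer function and the `alpha`-weighted fitness are common conventions assumed for illustration. The elite Cauchy chaotic mutation is omitted.

```python
import math
import random


def amplitude_factor(t, max_iter, a=2.0):
    """Standard SCA linear decay of r1 (assumption: the paper's
    exponential variant is not specified in the abstract)."""
    return a * (1.0 - t / max_iter)


def sca_step(position, best, r1, rng, weight=1.0):
    """One standard SCA position update for a single individual.

    `weight` scales the current position; weight=1.0 gives plain SCA.
    """
    new_position = []
    for x, p in zip(position, best):
        r2 = rng.uniform(0.0, 2.0 * math.pi)
        r3 = rng.uniform(0.0, 2.0)
        r4 = rng.random()
        trig = math.sin(r2) if r4 < 0.5 else math.cos(r2)
        new_position.append(weight * x + r1 * trig * abs(r3 * p - x))
    return new_position


def to_feature_mask(position, rng):
    """Map a continuous position to a 0/1 feature mask with a sigmoid
    transfer function (assumed; values are clamped to avoid overflow)."""
    mask = []
    for x in position:
        x = max(-60.0, min(60.0, x))
        prob = 1.0 / (1.0 + math.exp(-x))
        mask.append(1 if rng.random() < prob else 0)
    return mask


def fitness(mask, error_rate, alpha=0.99):
    """Weighted sum of classification error and the fraction of features
    kept; lower is better. `alpha` is a hypothetical trade-off weight."""
    return alpha * error_rate + (1.0 - alpha) * sum(mask) / len(mask)
```

In such a scheme, `error_rate` would come from a classifier trained on only the features whose mask bit is 1, so a smaller subset wins only when it does not cost much accuracy.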

Key words: Sine Cosine Algorithm (SCA), inertia weight, Cauchy mutation, chaotic mapping, feature selection

摘要:

针对传统正余弦算法(SCA)处理复杂优化问题时存在易得局部最优和收敛慢的不足,提出一种基于惯性权重与柯西混沌变异的改进正余弦算法IWCCSCA。首先设计了基于指数函数的曲线自适应振幅调整因子更新方法,用于均衡个体的全局搜索与局部开发能力;接着设计了自适应递减惯性权重更新机制,以改进个体位置更新方式,加快算法收敛;还设计了基于精英柯西混沌变异的个体扰动机制,以提升种群多样性,避免局部最优。利用8种基准函数寻优测试验证了IWCCSCA能够有效提升收敛速度和寻优精度。此外,将IWCCSCA应用于数据原始特征集中的特征子集选取问题,提出了基于IWCCSCA的特征选择算法IWCCSCA-FS。通过将正余弦函数的连续优化转换为特征选择的二进制优化,实现了个体位置与特征子集间的映射关系,以同步考虑特征选择量与分类准确率的适应度函数来评估候选解质量。UCI基准数据集的测试结果表明,IWCCSCA-FS算法可以有效选择最优特征子集,降低特征维度,提高数据分类准确率。

关键词: 正余弦算法, 惯性权重, 柯西变异, 混沌映射, 特征选择

CLC Number: