Journal of Computer Applications, 2013, Vol. 33, Issue (05): 1285-1288. DOI: 10.3724/SP.J.1087.2013.01285

• Advanced computing •

New gene selection method based on clustering and particle swarm optimization

YANG Shanxiu, HAN Fei, GUAN Jian

  1. School of Computer Science and Telecommunication Engineering, Jiangsu University, Zhenjiang, Jiangsu 212013, China
  • Received: 2012-12-04 Revised: 2012-12-27 Published: 2013-05-01 Online: 2013-05-08
  • Contact: YANG Shanxiu
  • About the authors: YANG Shanxiu (born 1988), female, from Jianhu, Jiangsu, master's student; main research interests: machine learning, bioinformatics. HAN Fei (born 1976), male, from Qianshan, Anhui, associate professor; main research interests: pattern recognition, evolutionary computation, intelligent information processing. GUAN Jian (born 1987), male, from Qianjiang, Hubei, master's student; main research interests: machine learning, bioinformatics.
  • Supported by:

    National Natural Science Foundation of China (61271385, 60702056); Natural Science Foundation of Jiangsu Province (BK2009197)

Abstract: Traditional gene selection methods tend to select a large number of redundant genes, which lowers sample prediction accuracy. To address this, a gene selection algorithm based on clustering and Particle Swarm Optimization (PSO) is proposed for microarray data. First, a clustering algorithm partitions the genes into a fixed number of clusters. Then, an Extreme Learning Machine (ELM) is used as the classifier to evaluate the classification performance of the feature genes in each cluster, which yields a candidate gene pool. Finally, a wrapper approach based on PSO and ELM selects from the candidate gene pool the gene subset with the highest classification accuracy and the fewest genes. The selected genes show good classification performance. Experimental results on two public microarray data sets show that, compared with some classical methods, the proposed method achieves higher classification performance with fewer genes.

Key words: gene selection, microarray data, clustering, Particle Swarm Optimization (PSO), Extreme Learning Machine (ELM)
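The final wrapper stage described in the abstract can be illustrated with a minimal sketch. The abstract does not specify the PSO variant, the particle encoding, or the fitness function, so everything here is an assumption: the sketch uses a binary PSO with a sigmoid transfer function, a hypothetical `accuracy_fn` callback standing in for the ELM-based evaluation, and a hypothetical `size_penalty` term that favours smaller subsets. The parameter values (`w`, `c1`, `c2`, `vmax`) are common defaults, not values from the paper.

```python
import math
import random


def binary_pso_select(n_genes, accuracy_fn, n_particles=20, n_iters=50,
                      size_penalty=0.01, seed=0):
    """Search the candidate pool for a small, accurate gene subset.

    accuracy_fn(list_of_gene_indices) -> float stands in for the ELM
    classifier evaluation (assumption; any classifier could be plugged in).
    """
    rng = random.Random(seed)

    def fitness(mask):
        chosen = [i for i, bit in enumerate(mask) if bit]
        if not chosen:  # an empty subset cannot classify anything
            return float("-inf")
        # Reward accuracy, penalise subset size (assumed trade-off).
        return accuracy_fn(chosen) - size_penalty * len(chosen)

    # Each particle is a 0/1 mask over the candidate genes.
    pos = [[int(rng.random() < 0.5) for _ in range(n_genes)]
           for _ in range(n_particles)]
    vel = [[0.0] * n_genes for _ in range(n_particles)]
    pbest = [p[:] for p in pos]
    pbest_fit = [fitness(p) for p in pos]
    g = max(range(n_particles), key=pbest_fit.__getitem__)
    gbest, gbest_fit = pbest[g][:], pbest_fit[g]

    w, c1, c2, vmax = 0.7, 1.5, 1.5, 4.0  # assumed, commonly used settings
    for _ in range(n_iters):
        for k in range(n_particles):
            for j in range(n_genes):
                v = (w * vel[k][j]
                     + c1 * rng.random() * (pbest[k][j] - pos[k][j])
                     + c2 * rng.random() * (gbest[j] - pos[k][j]))
                v = max(-vmax, min(vmax, v))  # clamp the velocity
                vel[k][j] = v
                # The sigmoid of the velocity is the probability of bit = 1.
                prob_one = 1.0 / (1.0 + math.exp(-v))
                pos[k][j] = int(rng.random() < prob_one)
            f = fitness(pos[k])
            if f > pbest_fit[k]:
                pbest[k], pbest_fit[k] = pos[k][:], f
                if f > gbest_fit:
                    gbest, gbest_fit = pos[k][:], f
    return [i for i, bit in enumerate(gbest) if bit], gbest_fit


def toy_accuracy(chosen):
    """Toy stand-in for a classifier: only genes 2 and 5 are informative."""
    return 0.5 + 0.25 * (2 in chosen) + 0.25 * (5 in chosen)


subset, score = binary_pso_select(8, toy_accuracy)
print(subset, score)
```

In a real pipeline, `toy_accuracy` would be replaced by a cross-validated classifier score computed on the columns of the expression matrix indexed by `chosen`, and `n_genes` would be the size of the candidate pool produced by the clustering stage.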

CLC Number: