Journal of Computer Applications (计算机应用) ›› 2016, Vol. 36 ›› Issue (11): 2969-2973. DOI: 10.11772/j.issn.1001-9081.2016.11.2969

• Paper from the 16th China Joint Conference on Rough Set and Soft Computing (CRSSC 2016) •

Construction of gene regulatory network based on hybrid particle swarm optimization and genetic algorithm

MENG Jun (孟军), SHI Guanli (史贯丽)

  1. School of Computer Science and Technology, Dalian University of Technology, Dalian Liaoning 116023, China
  • Received: 2016-06-03 Revised: 2016-06-20 Online: 2016-11-10 Published: 2016-11-12
  • Corresponding author: MENG Jun
  • About the authors: MENG Jun (born 1964), female, from Dalian, Liaoning, associate professor, Ph.D., CCF member; research interests: machine learning, data mining. SHI Guanli (born 1990), female, from Handan, Hebei, M.S. candidate; research interests: machine learning, regulatory network construction.
  • Supported by:
    This work is partially supported by the National Natural Science Foundation of China (61472061).

Abstract: MicroRNAs (miRNAs) are endogenous small non-coding ribonucleic acids (RNAs), approximately 21~25 nucleotides (nt) in length, that regulate the expression of protein-coding genes by binding complementarily to the 3'-untranslated region (3'-UTR) of their target mRNAs, leading to mRNA degradation or translational repression. To improve the accuracy of gene regulatory network construction, a rough set based method that hybridizes Particle Swarm Optimization (PSO) and Genetic Algorithm (GA), named PSO-GA-RS, was proposed. Firstly, features were extracted from sequence information. Then, with the rough set dependency degree as the fitness function, a near-optimal feature subset was selected by the hybrid of PSO and GA. Finally, a Support Vector Machine (SVM) was used to build a model for predicting unknown regulatory relationships. Experimental results show that, compared with Feature Selection based on Rough Set and PSO (PSORSFS) and the Rosetta algorithm, PSO-GA-RS improved the accuracy, F-measure and area under the Receiver Operating Characteristic (ROC) curve by up to 5% on the Arabidopsis thaliana dataset and by up to 8% on the Oryza sativa (rice) dataset. These results indicate that the proposed method can predict the regulatory relationships between miRNAs and their target genes fairly accurately.

Key words: gene regulatory network, Particle Swarm Optimization (PSO), Genetic Algorithm (GA), rough set, feature selection

CLC number:
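The abstract names the rough set dependency degree as the fitness function for feature selection but gives no formula. As a minimal sketch, the standard definition is γ_B(D) = |POS_B(D)| / |U|: the fraction of samples whose values on the selected features B determine the decision label D unambiguously. The function name `dependency_degree`, the 0/1 feature mask encoding, and the toy decision table below are illustrative assumptions, not taken from the paper, and the features are assumed to be already discretized. The PSO/GA search and the SVM step are not shown.

```python
from collections import defaultdict
from typing import Hashable, Sequence


def dependency_degree(
    rows: Sequence[Sequence[Hashable]],
    labels: Sequence[Hashable],
    mask: Sequence[int],
) -> float:
    """Rough set dependency degree gamma_B(D) = |POS_B(D)| / |U|.

    rows   : discretized condition attributes, one sequence per sample (assumed encoding)
    labels : decision attribute, e.g. 1 = regulatory pair, 0 = non-pair (assumed encoding)
    mask   : 0/1 flags selecting the feature subset B (hypothetical particle/chromosome)
    """
    if not rows:
        return 0.0
    selected = [i for i, bit in enumerate(mask) if bit]

    # Samples that agree on every selected feature form one equivalence class.
    classes = defaultdict(list)
    for row, label in zip(rows, labels):
        classes[tuple(row[i] for i in selected)].append(label)

    # A class belongs to the positive region when all its samples share one label.
    positive = sum(len(ls) for ls in classes.values() if len(set(ls)) == 1)
    return positive / len(rows)


# Toy decision table (made up for illustration only).
TOY_ROWS = [(0, 1, 0), (0, 1, 1), (1, 0, 0), (1, 0, 1), (1, 1, 0), (1, 1, 1)]
TOY_LABELS = [1, 1, 0, 0, 1, 0]

if __name__ == "__main__":
    for candidate in [(1, 1, 1), (1, 1, 0), (1, 0, 0)]:
        print(candidate, dependency_degree(TOY_ROWS, TOY_LABELS, candidate))
```

In a wrapper of this kind, a search procedure would score each candidate mask with this value, typically preferring smaller subsets among masks that reach the same dependency; how PSO-GA-RS weighs subset size is not stated in the abstract.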