Journal of Computer Applications ›› 2016, Vol. 36 ›› Issue (8): 2144-2149. DOI: 10.11772/j.issn.1001-9081.2016.08.2144

• The 6th China Conference on Data Mining (CCDM 2016) •

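  • Corresponding author: ZHOU Xuezhong
  • About the authors: GAO Panpan (1989-), female, from Xincai, Henan, M.S. candidate; research interest: data mining. WANG Ning (1992-), male, from Datong, Shanxi, M.S. candidate; research interest: data mining. ZHOU Xuezhong (1977-), male, from Yongjia, Zhejiang, professor, Ph.D.; research interests: data mining, machine learning, data warehousing, network medicine. LIU Guangming (1986-), male, from Hengshui, Hebei, Ph.D.; research interests: machine learning, data mining, artificial intelligence. WANG Huixin (1989-), female, from Jiaozuo, Henan, M.S. candidate; research interest: data mining.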

Functional homogeneity analysis on topology module of human interaction network for disease classification

GAO Panpan, WANG Ning, ZHOU Xuezhong, LIU Guangming, WANG Huixin   

  1. School of Computer and Information Technology, Beijing Jiaotong University, Beijing 100044
  • Received: 2016-03-01 Revised: 2016-05-11 Online: 2016-08-10 Published: 2016-08-10

Abstract: Because network medicine has not yet examined the relationship between disease classification and the functional homogeneity of functional protein modules, the following work was carried out. First, a gene relationship network was constructed from data in databases such as Mesh and String9. Second, the network was partitioned into modules with partitioning methods based on modularity optimization (clustering algorithms such as BGLL and Nonnegative Matrix Factorization (NMF)). Third, GO enrichment analysis was performed on the resulting modules; comparing the enrichment results of high-pathogenicity and low-pathogenicity topological modules showed that disease classification and the functional characteristics of protein modules carry important biological implications in aspects such as biological process, cellular component and molecular function. Finally, the functional characteristics of the topological modules for disease classification were analyzed: network topological properties such as average degree, density and average shortest path length yielded functional feature data for each module, further revealing the correlation between disease classification and functional modules.

Key words: network medicine, classification of disease, GO enrichment analysis, protein functional module, topological module, Mesh, String
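
The last two steps of the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `module_topology` and `enrichment_p_value`, the toy gene labels `g1` to `g4`, and the choice of a one-sided hypergeometric test for GO enrichment are all assumptions made for illustration (the abstract does not say which statistical test was used). The sketch uses only the Python standard library (3.8+).

```python
from collections import deque
from math import comb


def module_topology(edges):
    """Return (average degree, density, average shortest path length)
    for one undirected module given as a list of (gene, gene) pairs.
    Hypothetical helper; path length is averaged over reachable pairs only."""
    adj = {}
    for u, v in edges:
        if u == v:
            continue  # ignore self-loops
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)

    n = len(adj)
    m = sum(len(nb) for nb in adj.values()) // 2
    avg_degree = 2 * m / n if n else 0.0
    density = 2 * m / (n * (n - 1)) if n > 1 else 0.0

    # Breadth-first search from every node gives all pairwise distances.
    total = pairs = 0
    for source in adj:
        dist = {source: 0}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for w in adj[u]:
                if w not in dist:
                    dist[w] = dist[u] + 1
                    queue.append(w)
        total += sum(dist.values())
        pairs += len(dist) - 1
    avg_path = total / pairs if pairs else 0.0
    return avg_degree, density, avg_path


def enrichment_p_value(background, annotated, module_size, hits):
    """One-sided hypergeometric p-value P(X >= hits).
    background:  genes in the whole network
    annotated:   background genes carrying the GO term
    module_size: genes in the module
    hits:        module genes carrying the GO term
    Assumed test; the source does not name the one it used."""
    upper = min(annotated, module_size)
    favourable = sum(
        comb(annotated, i) * comb(background - annotated, module_size - i)
        for i in range(hits, upper + 1)
    )
    return favourable / comb(background, module_size)


if __name__ == "__main__":
    # Toy module: a triangle g1-g2-g3 with one extra gene g4 attached to g3.
    toy_edges = [("g1", "g2"), ("g2", "g3"), ("g1", "g3"), ("g3", "g4")]
    print(module_topology(toy_edges))
    print(enrichment_p_value(background=10, annotated=4, module_size=3, hits=2))
```

Comparing these per-module values between high-pathogenicity and low-pathogenicity modules is the kind of contrast the abstract describes. In practice a multiple-testing correction would normally follow the enrichment test, which this sketch omits.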

CLC number: