Auto-clustering algorithm based on compute unified device architecture and gene expression programming

doi:10.11772/j.issn.1001-9081.2013.07.1890

Journal of Computer Applications, 2013, Vol. 33, Issue (07): 1890-1893. DOI: 10.11772/j.issn.1001-9081.2013.07.1890

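DU Xin, LIU Dagang, ZHANG Kaihuo, SHEN Yuan, ZHAO Kang, NI Youcong

Faculty of Software, Fujian Normal University, Fuzhou, Fujian 350108, China

Received: 2013-01-04 Revised: 2013-03-01 Published: 2013-07-01 Online: 2013-07-06
Corresponding author: LIU Dagang
About the authors: DU Xin (born 1979), female, from Shihezi, Xinjiang, associate professor, Ph.D.; research interests: evolutionary computation and distributed computing. LIU Dagang (born 1988), male, from Weifang, Shandong, master's student; research interests: evolutionary computation and distributed computing. ZHANG Kaihuo (born 1990), male, from Fuzhou, Fujian; research interest: distributed computing.
Funding:
Natural Science Foundation of Fujian Province (2011J05146, 2012J01250); Fujian Province Outstanding Young Talent Cultivation Program (Fujian Provincial Department of Education [2011] No. 29); Fujian Normal University Young Backbone Teacher Cultivation Program (fjsdjk2012083); Major Project of the Fujian Provincial Science and Technology Plan (2011H6006); Open Fund of the State Key Laboratory of Software Engineering, Wuhan University (SKLSE2012-09-28); Science and Technology Project of the Fujian Provincial Department of Education (JA12077, JA12080, JB11028, JB11029)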

Abstract

Abstract: GEP-Cluster, an automatic clustering algorithm based on Gene Expression Programming (GEP), has two inefficient key steps: screening and aggregating the clustering centers, and calculating the distance from each data object to every clustering center. To address this, an improved automatic clustering algorithm based on Compute Unified Device Architecture (CUDA) and GEP, named CGEP-Cluster, is proposed. CGEP-Cluster improves the screening and aggregation of clustering centers with the Gene Read & Compute Machine (GRCM) method, and uses CUDA to parallelize the calculation of distances between data objects and clustering centers. Experimental results show that, when the number of data objects is large, CGEP-Cluster achieves a speedup of about eight times over GEP-Cluster. CGEP-Cluster is suitable for automatic clustering when the number of clusters is unknown and the number of data objects is large.

Key words: Compute Unified Device Architecture (CUDA), Gene Expression Programming (GEP), clustering algorithm, GEP-Cluster, evolutionary algorithm
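The distance step named in the abstract can be illustrated with a minimal sequential sketch. This is not the authors' CUDA kernel, which this page does not show. The function name `nearest_centers`, the Euclidean metric, and the sample data are assumptions made for illustration only. The sketch shows that each data object's distances to the centers are computed independently of every other object, which is the property that makes this step a natural candidate for parallelization.

```python
import math


def nearest_centers(objects, centers):
    """For each data object, return (index of nearest center, distance to it).

    Hypothetical helper for illustration; Euclidean distance is an assumption.
    """
    assignments = []
    for obj in objects:  # each iteration is independent of the others
        best_index, best_dist = -1, math.inf
        for k, center in enumerate(centers):
            dist = math.dist(obj, center)  # Euclidean distance
            if dist < best_dist:
                best_index, best_dist = k, dist
        assignments.append((best_index, best_dist))
    return assignments


# Made-up sample data: three 2-D objects and two candidate centers.
objects = [(0.0, 0.0), (3.0, 4.0), (10.0, 10.0)]
centers = [(0.0, 0.0), (9.0, 9.0)]
result = nearest_centers(objects, centers)
```

Because the outer loop carries no dependency between iterations, a parallel version could in principle assign one data object (or one object-center pair) to each worker; how CGEP-Cluster actually maps this work onto CUDA threads is not stated here.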

CLC number: TP301.6

DU Xin, LIU Dagang, ZHANG Kaihuo, SHEN Yuan, ZHAO Kang, NI Youcong. Auto-clustering algorithm based on compute unified device architecture and gene expression programming[J]. Journal of Computer Applications, 2013, 33(07): 1890-1893.

