Journal of Computer Applications ›› 2013, Vol. 33 ›› Issue (02): 308-315.DOI: 10.3724/SP.J.1087.2013.00308

• Advanced computing • Previous Articles     Next Articles

Hierarchical clustering algorithm based on sticker and 2-armed DNA model

BAI Xue,REN Xiaoling,LIU Xiyu   

  1. School of Management Science and Engineering, Shandong Normal University, Jinan Shandong 250014, China
  • Received:2012-08-13 Revised:2012-09-21 Online:2013-02-01 Published:2013-02-25
  • Contact: BAI Xue

基于粘贴和2-臂DNA模型的层次聚类算法

白雪,任晓玲,刘希玉   

  1. 山东师范大学 管理科学与工程学院,济南 250014
  • 通讯作者: 白雪
  • 作者简介:白雪(1989-),女,北京人,硕士研究生,主要研究方向:DNA计算、聚类、信息管理;
    任晓玲(1988-),女,山东威海人,硕士研究生,主要研究方向:DNA计算、聚类;
    刘希玉(1964-),男,山东莱芜人,教授,博士研究生,主要研究方向:信息管理与电子商务、计算智能、计算机辅助创新设计。
  • 基金资助:
    国家自然科学基金资助项目;山东省自然科学基金资助项目;教育部人文社会科学研究项目;山东省社会科学基金资助项目

Abstract: In order to take full advantage of the high parallelism and huge storage capacity of DNA molecules in biological computing, this paper introduced DNA computing into hierarchical clustering to do global research on data set. For realizing the nearest neighbor hierarchical clustering, an algorithm combining sticker model with 2-armed DNA molecules was put forward. Based on the idea of MST (Minimum Spanning Tree), the first thing to do was generating complex DNA strands of all combinations of edges and then screening those containing n-1 edges. Based on the edges, it is needed to set the corresponding vertex stickers and keep those strands covering all the vertices. After that, weight strands constructed by 2-armed molecules would be appended at the end of the complex strands and the shortest ones could be detected by gel electrophoresis. Finally, by fluorescence analysis the clustering result can be got. In computer simulation, this algorithm may take different lengths of edges into account instead of varying the polynomial time complexity and the number of steps to read final results is set as a constant.

Key words: DNA computing, hierarchical clustering, Minimum Spanning Tree (MST), sticker model, 2-armed DNA molecule

摘要: 为了充分利用DNA分子在生物计算中的高度并行性和强大的存储能力,将DNA计算引入层次聚类实现对数据集的全局搜索。提出了粘贴模型与2-臂DNA分子相结合的混合模型求解最近邻层次聚类的DNA算法。针对二维数据空间,算法首先基于最小生成树思想产生图的边的所有组合链;其次筛选含n-1条边的链,基于边附着顶点,并选择包含全部顶点的复合链;再将复合链末尾连接相应边的权值片段,电泳出最短链;最后通过荧光分析法读解,得到最终的聚类结果。与已有文献同类算法对比表明,该算法在保持多项式操作时间下,更充分考虑连接边的长度,并将读解步骤数限定为常数步。

关键词: DNA计算, 层次聚类, 最小生成树, 粘贴模型, 2-臂DNA分子

CLC Number: