计算机应用 ›› 2014, Vol. 34 ›› Issue (7): 1857-1861.DOI: 10.11772/j.issn.1001-9081.2014.07.1857

• 先进计算 • 上一篇    下一篇

融合遗传和蚁群算法并行求解最短公共超串

伍世刚,钟诚   

  1. 广西大学 计算机与电子信息学院,南宁 530004
  • 收稿日期:2013-12-31 修回日期:2014-02-13 出版日期:2014-07-01 发布日期:2014-08-01
  • 通讯作者: 钟诚
  • 作者简介:伍世刚(1988-),男,广西南宁人,硕士研究生,CCF会员,主要研究方向:网络与并行计算; 钟诚(1964-),男,广西桂平人,教授,博士生导师,博士,CCF高级会员,主要研究方向:并行计算、生物信息计算。
  • 基金资助:

    国家自然科学基金资助项目;广西研究生教育创新计划项目;广西教育厅—广西大学博士点建设基金资助项目

Parallel solving shortest common superstring using genetic algorithm and ant colony optimization

WU Shigang,ZHONG Cheng   

  1. School of Computer, Electronics and Information, Guangxi University, Nanning Guangxi 530004, China
  • Received:2013-12-31 Revised:2014-02-13 Online:2014-07-01 Published:2014-08-01
  • Contact: ZHONG Cheng

摘要:

依据各级缓存容量,将CPU主存中种群个体和蚂蚁个体数据划分存储到一级、二级和三级缓存中,以减少并行计算过程中数据在各级存储之间的传输开销,在CPU与GPU之间采取异步传送和不完全传送数据、GPU多个内核函数异步执行多个流的方法,设置GPU block线程数量为16的倍数、GPU共享存储器划分大小为32倍的bank,使用GPU常量存储器存储交叉概率、变异概率等需频繁访问的只读参数,将输入串矩阵和重叠部分长度矩阵只读大数据结构绑定到GPU纹理存储器,设计实现了一种多核CPU和GPU协同求解最短公共超串问题的计算、存储和通信高效的并行算法。求解多种规模的最短公共超串问题的实验结果表明,多核CPU与GPU协同并行算法比串行算法快70倍以上。

Abstract:

According to the capacity of multi-level caches, the population individuality and ant data in CPU main memory were assigned to L3 cache, L2 cache and L1 cache to reduce data transfer overhead among multiple caches during parallel computing. The asynchronous and incomplete transmission was performed between CPU and GPU, and multiple flows were asynchronously executed by multiple GPU kernel functions. The thread number of GPU block was set to the size of 16 times and GPU public memory was divided into bank with the size of 32 times. GPU constant memory was used to store read-only parameters such as cross probability and mutate probability which were read frequently. The read-only big data structure such as string set and overlap matrix were bound to GPU texture memory, and a computation, cache and communication-efficient parallel algorithm for CPU and GPU to coordinate solving shortest common superstring problem was designed and implemented. The experimental results for solving shortest common superstring problem with several sizes show the proposed CPU and GPU parallel algorithm is faster over 70 times than the sequential algorithm.

中图分类号: