Journal of Computer Applications ›› 2011, Vol. 31 ›› Issue (02): 428-431.

• Database and Data Mining •

Parallel PSO combined with K-means clustering algorithm based on MPI

LV Yi-Qing (吕奕清), LIN Jin-Xian (林锦贤)

  1. College of Mathematics and Computer Science, Fuzhou University
  • Received: 2010-07-14 Revised: 2010-09-04 Online: 2011-02-01 Published: 2011-02-01
  • Contact: LV Yi-Qing
  • Supported by:
    Key Special Scientific Research Project for Universities of Fujian Province

Abstract: Traditional serial clustering algorithms often perform poorly when clustering massive data sets. To meet the performance requirements of large-scale cluster analysis and to address the shortcomings of traditional clustering algorithms, this paper proposes a parallel clustering algorithm that combines Particle Swarm Optimization (PSO) with K-means and runs on a Message Passing Interface (MPI) cluster. First, an improved PSO is combined with K-means to strengthen the algorithm's global search capability. A new parallel clustering strategy is then built on this hybrid algorithm, and the result is compared with the K-means clustering algorithm and the PSO clustering algorithm. The experimental results show that the proposed algorithm achieves good global convergence and a high speed-up ratio.

Key words: Message Passing Interface (MPI) cluster, Particle Swarm Optimization (PSO) algorithm, K-means algorithm, parallel clustering
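
The abstract does not state the update rules or the parallel strategy, so the following is only a minimal single-process sketch of one common way to hybridize PSO with K-means, not the authors' method. Each particle encodes k centroids, moves by the standard PSO velocity update, and is then refined by one K-means iteration. All function names, the parameter values (w, c1, c2, particle and chunk counts) and the chunking scheme are assumptions made for illustration. No MPI calls are made: the per-chunk partial sums merely stand in for what a sum-reduction across ranks would compute on a cluster.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def partial_sse(chunk, centroids):
    """Sum of squared errors for one data chunk (one rank's share, by assumption)."""
    return sum(min(sq_dist(p, c) for c in centroids) for p in chunk)


def fitness(chunks, centroids):
    """Total SSE as a sum of per-chunk partial results (stand-in for a sum-reduction)."""
    return sum(partial_sse(ch, centroids) for ch in chunks)


def kmeans_step(points, centroids):
    """One K-means iteration: assign each point to its nearest centroid, then average."""
    k = len(centroids)
    groups = [[] for _ in range(k)]
    for p in points:
        nearest = min(range(k), key=lambda i: sq_dist(p, centroids[i]))
        groups[nearest].append(p)
    updated = []
    for j, members in enumerate(groups):
        if members:
            updated.append([sum(col) / len(members) for col in zip(*members)])
        else:
            updated.append(list(centroids[j]))  # empty cluster keeps its centroid
    return updated


def pso_kmeans(points, k, n_particles=8, iters=30, n_chunks=4,
               w=0.7, c1=1.5, c2=1.5, seed=0):
    """Hypothetical hybrid: PSO velocity update followed by a K-means refinement."""
    rng = random.Random(seed)
    dim = len(points[0])
    chunks = [points[i::n_chunks] for i in range(n_chunks)]
    # Each particle starts from k distinct data points chosen at random.
    pos = [[list(p) for p in rng.sample(points, k)] for _ in range(n_particles)]
    vel = [[[0.0] * dim for _ in range(k)] for _ in range(n_particles)]
    pbest = [[list(c) for c in particle] for particle in pos]
    pbest_fit = [fitness(chunks, particle) for particle in pos]
    g = min(range(n_particles), key=lambda i: pbest_fit[i])
    gbest, gbest_fit = [list(c) for c in pbest[g]], pbest_fit[g]
    for _ in range(iters):
        for i in range(n_particles):
            for j in range(k):
                for d in range(dim):
                    r1, r2 = rng.random(), rng.random()
                    vel[i][j][d] = (w * vel[i][j][d]
                                    + c1 * r1 * (pbest[i][j][d] - pos[i][j][d])
                                    + c2 * r2 * (gbest[j][d] - pos[i][j][d]))
                    pos[i][j][d] += vel[i][j][d]
            # Hybrid step: refine the moved particle with one K-means iteration.
            pos[i] = kmeans_step(points, pos[i])
            f = fitness(chunks, pos[i])
            if f < pbest_fit[i]:
                pbest[i], pbest_fit[i] = [list(c) for c in pos[i]], f
                if f < gbest_fit:
                    gbest, gbest_fit = [list(c) for c in pos[i]], f
    return gbest, gbest_fit
```

Calling `pso_kmeans(points, k=2)` on a list of equal-length numeric tuples returns the best centroid set found and its sum of squared errors. In a real MPI setting, each chunk would presumably live on a separate process and only the partial sums would be communicated, but that mapping is an assumption here rather than something the abstract specifies.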