Journal of Computer Applications, 2011, Vol. 31, Issue (02): 446-449.

• Database and data mining •

XML document clustering method based on quantum genetic algorithm

JIANG Yong (蒋勇)1, TAN Huailiang (谭怀亮)2, LI Guangwen (李光文)1

  1. Hunan Chemical Vocational Technology College
  2. School of Computer and Communication, Hunan University
  • Received: 2010-07-19 Revised: 2010-09-01 Online: 2011-02-01 Published: 2011-02-01
  • Contact: JIANG Yong
  • Supported by:
    Doctoral Program Foundation

Abstract: This paper studies XML clustering by combining kernel methods for pattern analysis with the quantum genetic algorithm, and proposes a new XML document clustering method based on a hybrid of the quantum genetic algorithm and the kernel clustering algorithm. The method first reduces the XML documents and builds the kernel matrix of a vector space kernel from frequent tag sequences. A Gaussian kernel function is then used to obtain the initial clustering and the initial cluster centers. These initial cluster centers are used to construct the initial population of the quantum genetic algorithm, and the globally optimal clustering is obtained by combining the quantum genetic algorithm with the kernel clustering algorithm. The experimental results show that the proposed algorithm has better convergence and stability, and reaches better global optima, than single methods such as the improved kernel clustering algorithm and K-means.

Key words: XML document, Gaussian kernel function, kernel clustering algorithm, quantum genetic algorithm, XML clustering
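
The kernel stage described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the document reduction rules, the frequent tag sequence mining, or the quantum genetic search, so plain tag counts stand in for frequent tag sequences, and the names `tag_vector`, `gaussian_kernel`, `kernel_matrix`, `kernel_kmeans` and the value `sigma=2.0` are all hypothetical. The sketch only shows how a Gaussian kernel matrix over tag vectors can drive a kernel clustering step from a given starting assignment.

```python
import math
from collections import Counter


def tag_vector(tag_sequence, vocabulary):
    """Count how often each vocabulary tag occurs in one document.

    Assumption: raw tag counts replace the paper's frequent tag sequences.
    """
    counts = Counter(tag_sequence)
    return [float(counts[tag]) for tag in vocabulary]


def gaussian_kernel(x, y, sigma=1.0):
    """Gaussian (RBF) kernel: exp(-||x - y||^2 / (2 * sigma^2))."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-squared_distance / (2.0 * sigma ** 2))


def kernel_matrix(vectors, sigma=1.0):
    """Pairwise kernel values for all documents."""
    return [[gaussian_kernel(u, v, sigma) for v in vectors] for u in vectors]


def kernel_kmeans(K, initial_labels, k, max_iter=100):
    """Kernel k-means on a precomputed kernel matrix K.

    The squared feature-space distance from point i to the center of
    cluster c is
        K[i][i] - (2/|c|) * sum_j K[i][j] + (1/|c|^2) * sum_{j,l} K[j][l]
    with j and l ranging over the members of c.
    """
    n = len(K)
    labels = list(initial_labels)
    for _ in range(max_iter):
        members = [[i for i in range(n) if labels[i] == c] for c in range(k)]
        # Third term of the distance, shared by all points for a cluster.
        within = [
            sum(K[j][l] for j in m for l in m) / len(m) ** 2 if m else 0.0
            for m in members
        ]
        new_labels = []
        for i in range(n):
            best_cluster, best_distance = labels[i], float("inf")
            for c, m in enumerate(members):
                if not m:
                    continue  # skip empty clusters
                distance = (K[i][i]
                            - 2.0 * sum(K[i][j] for j in m) / len(m)
                            + within[c])
                if distance < best_distance:
                    best_cluster, best_distance = c, distance
            new_labels.append(best_cluster)
        if new_labels == labels:
            break  # assignments are stable
        labels = new_labels
    return labels


# Toy tag sequences (hypothetical data, not from the paper's experiments).
docs = [
    ["book", "title", "author"],
    ["book", "title", "author", "author"],
    ["order", "item", "price"],
    ["order", "item", "item", "price"],
]
vocabulary = sorted({tag for doc in docs for tag in doc})
vectors = [tag_vector(doc, vocabulary) for doc in docs]
K = kernel_matrix(vectors, sigma=2.0)
labels = kernel_kmeans(K, initial_labels=[0, 0, 0, 1], k=2)
print(labels)
```

Kernel k-means of this kind converges only to a local optimum that depends on the starting assignment. In the method the abstract describes, the quantum genetic algorithm is combined with the kernel clustering step to search for the globally optimal clustering; that part is not sketched here.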