Journal of Computer Applications ›› 2010, Vol. 30 ›› Issue (07): 1930-1932.

• Database Technology •

Clustering algorithm based on complex attributes similarity and its applications

Peng Ang1, Wang Rulong2, Chen Quanquan3, Zhang Jin4

  1. School of Software, Hunan University
  2. Professor, Hunan University; Researcher, Hunan Institute of Computing Technology
  3. Hunan University
  4. Hunan University; Zhejiang University
  • Received:2009-12-11 Revised:2010-03-07 Online:2010-07-01 Published:2010-07-01
  • Contact: Peng Ang
  • Supported by:
    National Natural Science Foundation of China; 863 Program Key Project; National Key Technology R&D Program

Abstract: To segment telecom customers effectively, this paper proposes a clustering algorithm for complex attributes based on the idea of attribute similarity measurement. The algorithm measures the similarity between objects with a distribution similarity function over complex attributes, builds a graph model from these similarities, and then partitions the graph to obtain clusters. Compared with traditional clustering algorithms based on dimension selection or dimensionality reduction, the proposed algorithm handles high-dimensional data and complex attributes effectively. In addition, adjusting its parameters does not require re-scanning the original data, and it also reduces manual intervention. Simulation experiments on real telecom customer data show that the algorithm performs well and can effectively solve the customer segmentation problem.

Key words: high-dimensional clustering, complex (mixed) attributes, customer segmentation, graph model
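
The three stages named in the abstract (similarity measurement, graph construction, graph partitioning) can be illustrated with a minimal Python sketch. The abstract does not give the paper's actual distribution similarity function or partitioning method, so everything below is an assumption made for illustration: a Gower-style similarity stands in for the similarity function, thresholded connected components stand in for the graph partition, and the customer fields (`monthly_fee`, `plan`, `region`) are hypothetical. The sketch does show one property the abstract claims: once the similarity matrix is built, the threshold can be changed without touching the original records.

```python
from itertools import combinations


def mixed_similarity(a, b, numeric_ranges):
    """Gower-style similarity in [0, 1] for mixed records (an assumed stand-in,
    not the paper's distribution similarity function)."""
    total = 0.0
    for key, va in a.items():
        vb = b[key]
        if key in numeric_ranges:
            span = numeric_ranges[key]
            # Numeric attribute: 1 minus the range-normalized distance.
            total += (1.0 - abs(va - vb) / span) if span else 1.0
        else:
            # Categorical attribute: exact match or nothing.
            total += 1.0 if va == vb else 0.0
    return total / len(a)


def similarity_matrix(records, numeric_ranges):
    """Pairwise similarities; this is the only step that reads the raw records."""
    n = len(records)
    sim = [[1.0] * n for _ in range(n)]
    for i, j in combinations(range(n), 2):
        sim[i][j] = sim[j][i] = mixed_similarity(records[i], records[j], numeric_ranges)
    return sim


def partition(sim, threshold):
    """Label connected components of the graph whose edges are the pairs with
    similarity >= threshold (an assumed, very simple graph partition)."""
    n = len(sim)
    labels = [-1] * n
    cluster = 0
    for start in range(n):
        if labels[start] != -1:
            continue
        labels[start] = cluster
        stack = [start]
        while stack:
            u = stack.pop()
            for v in range(n):
                if labels[v] == -1 and sim[u][v] >= threshold:
                    labels[v] = cluster
                    stack.append(v)
        cluster += 1
    return labels


# Hypothetical customer records mixing numeric and categorical attributes.
customers = [
    {"monthly_fee": 30.0, "plan": "prepaid", "region": "north"},
    {"monthly_fee": 35.0, "plan": "prepaid", "region": "north"},
    {"monthly_fee": 120.0, "plan": "contract", "region": "south"},
    {"monthly_fee": 110.0, "plan": "contract", "region": "south"},
]
ranges = {"monthly_fee": 90.0}  # max - min of the numeric attribute

sim = similarity_matrix(customers, ranges)
print(partition(sim, 0.8))   # [0, 0, 1, 1]
print(partition(sim, 0.99))  # [0, 1, 2, 3]
```

Both calls to `partition` reuse the same `sim` matrix, so tuning the threshold costs one pass over the graph rather than a pass over the customer data. A real segmentation would likely need a more robust partition than connected components, which can chain dissimilar customers together through intermediate neighbors.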