计算机应用 ›› 2018, Vol. 38 ›› Issue (1): 1-5.DOI: 10.11772/j.issn.1001-9081.2017071967

• 2017年全国开放式分布与并行计算学术年会(DPCS 2017)论文 •    下一篇

大图结构特征对划分效果的影响

罗晓霞, 司丰玮, 罗香玉   

  1. 西安科技大学 计算机科学与技术学院, 西安 710054
  • 收稿日期:2017-08-11 修回日期:2017-08-23 出版日期:2018-01-10 发布日期:2018-01-22
  • 通讯作者: 罗香玉
  • 作者简介:罗晓霞(1964-),女,陕西扶风人,教授,主要研究方向:大数据与云计算、软件工程、分布式计算;司丰玮(1992-),男,甘肃兰州人,硕士研究生,CCF会员,主要研究方向:分布式存储、图的划分算法;罗香玉(1984-),女,河北宁晋人,讲师,博士,CCF会员,主要研究方向:分布式存储、大数据。
  • 基金资助:
    国家自然科学基金资助项目(41472234);陕西省教育厅专项科学研究计划项目(15JK1468);西安科技大学培育基金资助项目(201633)。

Effects of large-scale graph structural feature on partitioning quality

LUO Xiaoxia, SI Fengwei, LUO Xiangyu   

  1. College of Computer Science and Technology, Xi'an University of Science and Technology, Xi'an Shaanxi 710054, China
  • Received:2017-08-11 Revised:2017-08-23 Online:2018-01-10 Published:2018-01-22
  • Supported by:
    This work is partially supported by the National Natural Science Foundation of China (41472234), the Scientific Research Program of Shaanxi Provincial Education Department (15JK1468), the Xi'an University of Science and Technology Foundation for Fostering Talents (201633).

摘要: 针对大图结构特征如何影响划分效果这一问题,提出一种通过顶点度分布特征来描述大图结构特征的方法。首先,基于真实的图数据产生若干顶点数和边数相同、但结构特征不同的仿真数据集,通过实验计算真实图与仿真图之间的相似度,证明该方法对描述真实大图结构特征的有效性。然后,通过Hash和点对交换划分算法,验证图结构特征与划分效果之间的关系。当点对交换划分算法执行到5万次时,划分一个有6301个顶点和20777条边的真实图其交叉边数比Hash划分算法降低了54.32%,划分仿真图数据集中结构特征差异明显的两个图时,交叉边数分别为6233和316。实验结果表明,点对交换划分算法能够减少交叉边数,图的顶点度分布差异越大,划分后交叉边数越少,划分效果越好,因此大图结构特征影响其划分效果,这为建立图的结构特征与划分效果之间的关系模型研究奠定了基础。

关键词: 大图分布式处理, 大图划分, 图结构特征, 负载均衡, 交叉边

Abstract: Focusing on how the large-scale graphs' structural features affect the partitioning quality, through the structural features of vertex degree, a method of describing large-scale graphs' structural features was proposed. Firstly, based on the real graph data, a several simulation datasets with the same numbers of vertices and edges but different structural features were generated. Through the similarity between the real graph and the simulation graph calculated by the proposed algorithm, the validity of the method for describing the structural features of the real large-scale graphs was verified. Secondly, the relationship between the structural features of the graph and the quality of partitioning was verified by the Hash algorithm and point-to-point exchange algorithm. When the point-to-point algorithm was performed for 50000 times, the number of crossed edges of a real graph with 6301 vertexes and 20777 edges was reduced by 54.32% compared with the Hash partitioning algorithm. When the two graphs with entirely different structural features in the simulation data were partitioned, the number of crossed edges were 6233 and 316 respectively. The experimental results show that the point-to-point algorithm can reduce the number of crossed edges. The larger the difference of the vertex degree distribution and the smaller the number of crossed edges are, the better the partitioning quality is. Therefore, the structural features of large graphs affect the partitioning effect, which lays the foundation for the model investigation of the relationship between structural features and partitioning quality.

Key words: distributed processing of large-scale graph, partitioning of large-scale graph, graph structural feature, load balance, crossed edge

中图分类号: