
DPCS2017+5+大图结构特征对划分效果的影响研究

Xiaoxia LUO1,2, Fengwei SI3, Xiangyu LUO2

  1. 1.
    2. Xi'an University of Science and Technology
    3. Graduate School, Xi'an University of Science and Technology
  • 收稿日期:2017-08-11 修回日期:2017-08-23 发布日期:2017-08-23
  • 通讯作者: 司丰玮

DPCS2017+5+Research on effects of large-scale graphs' structural features on partitioning quality

  • Received:2017-08-11 Revised:2017-08-23 Online:2017-08-23
  • Contact: Fengwei SI

摘要: 针对大图结构特征如何影响划分效果这一问题,首先提出一种通过顶点度分布特征来描述大图结构特征的方法。然后,基于真实的图数据产生若干顶点数和边数相同,但结构特征不同的仿真数据集,通过实验计算真实图与仿真图之间的相似度,证明该方法对描述真实大图结构特征的有效性。最后,通过Hash和点对交换划分算法,验证图结构特征与划分效果之间的关系。当点对交换划分算法执行到5万次时,划分真实图G(6301,20777)其交叉边数比Hash划分算法降低了54.32%,划分仿真图数据集中结构特征差异明显的两个图时,交叉边数分别为6233和316。实验结果表明点对交换划分算法能够减少交叉边数,图的顶点度分布差异越大,划分后交叉边数越少,划分效果越好。因此大图结构特征影响其划分效果,这为建立图的结构特征与划分效果之间的关系模型研究奠定了基础。

关键词: 大图分布式处理, 大图划分, 图结构特征, 负载均衡, 交叉边

Abstract: This study addresses how the structural features of large-scale graphs affect partitioning quality. First, a method is proposed that describes the structural features of a large-scale graph through its vertex degree distribution. Then, starting from real graph data, several simulated data sets are generated that have the same numbers of vertices and edges as the real graph but different structural features; the similarity between the real graph and each simulated graph is computed experimentally, which verifies that the method is effective for describing the structural features of real large-scale graphs. Finally, the relationship between graph structural features and partitioning quality is verified using the Hash partitioning algorithm and the point-to-point exchange partitioning algorithm. After 50,000 iterations of the point-to-point exchange algorithm, the number of cross edges for the real graph G(6301, 20777) is 54.32% lower than with Hash partitioning. When two graphs with clearly different structural features from the simulated data sets are partitioned, the numbers of cross edges are 6233 and 316, respectively. The experimental results show that the point-to-point exchange algorithm reduces the number of cross edges, and that the greater the differences in a graph's vertex degree distribution, the fewer cross edges remain after partitioning and the better the partitioning quality. The structural features of a large-scale graph therefore affect its partitioning quality, which lays a foundation for research on a model relating graph structural features to partitioning quality.

Key words: distributed processing of large-scale graphs, large-scale graph partitioning, graph structural features, load balancing, cross edges
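
The abstract compares Hash partitioning with point-to-point exchange partitioning by counting cross edges, but it does not give either algorithm in detail. The following is only a minimal sketch under stated assumptions: Hash partitioning is taken to mean "vertex id modulo the number of partitions", and the exchange step is taken to be a greedy swap of two vertices in different partitions that is kept only if it does not increase the number of cross edges. All function names are hypothetical and are not taken from the paper.

```python
import random
from collections import defaultdict


def cross_edges(edges, part):
    """Count edges whose two endpoints lie in different partitions."""
    return sum(1 for u, v in edges if part[u] != part[v])


def hash_partition(vertices, k):
    """Assumed Hash scheme: assign each integer vertex id to partition id % k."""
    return {v: v % k for v in vertices}


def incident_cut(adj, part, v):
    """Count cross edges that touch vertex v."""
    return sum(1 for w in adj[v] if part[w] != part[v])


def pair_exchange(edges, part, iterations, seed=0):
    """Assumed exchange scheme: repeatedly pick two vertices in different
    partitions, swap their partitions, and undo the swap if it increased
    the number of cross edges. Swapping keeps partition sizes unchanged,
    so the load balance of the starting partition is preserved."""
    rng = random.Random(seed)
    adj = defaultdict(list)
    for u, v in edges:
        adj[u].append(v)
        adj[v].append(u)
    part = dict(part)  # work on a copy
    vertices = list(part)
    for _ in range(iterations):
        u, v = rng.sample(vertices, 2)
        if part[u] == part[v]:
            continue
        # Only edges touching u or v can change status, so a local count suffices.
        before = incident_cut(adj, part, u) + incident_cut(adj, part, v)
        part[u], part[v] = part[v], part[u]
        after = incident_cut(adj, part, u) + incident_cut(adj, part, v)
        if after > before:
            part[u], part[v] = part[v], part[u]  # revert a harmful swap
    return part


# Toy graph (not from the paper): two 4-vertex cliques joined by one edge.
clique_a = [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3)]
clique_b = [(4, 5), (4, 6), (4, 7), (5, 6), (5, 7), (6, 7)]
toy_edges = clique_a + clique_b + [(3, 4)]

initial = hash_partition(range(8), 2)
improved = pair_exchange(toy_edges, initial, iterations=1000)
print(cross_edges(toy_edges, initial), cross_edges(toy_edges, improved))
```

Because a swap is only kept when the local cross-edge count does not rise, the cross-edge count of the result can never exceed that of the starting partition. How far it drops depends on the graph's structure, which is the relationship the abstract investigates.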