Journal of Computer Applications ›› 2018, Vol. 38 ›› Issue (12): 3425-3432.DOI: 10.11772/j.issn.1001-9081.2018040934

Previous Articles     Next Articles

Parallel multi-layer graph partitioning method for solving maximum clique problems

GU Junhua1, HUO Shijie1, WU Junyan1, YIN Jun1, ZHANG Suqi2   

  1. 1. School of Artificial Intelligence, Hebei University of Technology, Tianjin 300401, China;
    2. School of Information Engineering, Tianjin University of Commerce, Tianjin 300134, China
  • Received:2018-05-07 Revised:2018-07-10 Online:2018-12-10 Published:2018-12-15
  • Contact: 张素琪
  • Supported by:
    This work is partially supported by the Science and Technology Project of Tianjin (16ZXHLSF0023), the Basic Research Project of Tianjin (17JCTPJC55400), the Science and Technology Project of Hebei (17210305D), the Natural Science Foundation of Hebei (F2016202144).

求解最大团问题的并行多层图划分方法

顾军华1, 霍士杰1, 武君艳1, 尹君1, 张素琪2   

  1. 1. 河北工业大学 人工智能与数据科学学院, 天津 300401;
    2. 天津商业大学 信息工程学院, 天津 300134
  • 通讯作者: 张素琪
  • 作者简介:顾军华(1966-),男,河北赵县人,教授,博士,CCF会员,主要研究方向:智能信息处理、数据挖掘;霍士杰(1993-),男,河北石家庄人,硕士研究生,CCF会员,主要研究方向:数据挖掘、计算机仿真;武君艳(1994-),女,山西晋中人,硕士研究生,主要研究方向:数据挖掘、计算机仿真;尹君(1990-),女,陕西铜川人,硕士研究生,主要研究方向:数据挖掘;张素琪(1980-),女,河北邢台人,讲师,博士,CCF会员,主要研究方向:智能信息处理、数据挖掘。
  • 基金资助:
    天津市科技计划项目(16ZXHLSF0023);天津市基础研究计划项目(17JCTPJC55400);河北省科技计划项目(17210305D);河北省自然科学基金资助项目(F2016202144)。

Abstract: In big data environment, the mass of nodes in graph and the complexity of analysis bring forward higher requirement for the speed and accuracy of maximum clique problems. In order to solve the problems, a Parallel Multi-layer Graph Partitioning method for Solving Maximum Clique (PMGP_SMC) was proposed. Firstly, a new Multi-layer Graph Partitioning method (MGP) was proposed, the large-scale graph partitioning was executed to generate subgraphs while the clique structure of the original graph was maintained and not destroyed. Large-scale subgraphs were divided into multi-level graphs to further reduce the size of subgraphs. The graph computing framework of GraphX was used to achieve MGP to form a Parallel Multi-layer Graph Partitioning (PMGP) method. Then, according to the size of partitioned subgraph, the iteration number of Local Search algorithm Based on Penalty value (PBLS) was reduced, and the PBLS based on Speed optimization (SPBLS) was proposed to solve the maximum clique of each subgraph. Finally, PMGP method and SPBLS were combined to form PMGP_SMC. The large-scale dataset of Stanford was used for running test. The experimental results show that, the proposed PMGP reduces the maximum subgraph size by more than 100 times and the average subgraph size by 2 times compared with Parallel Single Graph Partitioning method (PSGP). Compared with PSGP for Solving Maximum Clique (PSGP_SMC), the proposed PMGP_SMC reduces the overall time by about 100 times, and its accuracy is consistent with that of Parallel Multi-layer Graph Partitioning for solving maximum clique based on Maximal Clique Enumeration (PMGP_MCE) in solving the maximum clique. The proposed PMGP_SMC can solve the maximum clique of large-scale graph quickly and accurately.

Key words: big data, maximum clique, Spark, Multi-layer Graph Partitioning method (MGP), fast local search algorithm

摘要: 在当今大数据环境下,针对图中节点的海量性和分析的复杂性对最大团问题的研究在速度和精度上都提出了更高要求的问题,提出求解最大团问题的并行多层图划分方法(PMGP_SMC)。首先,提出一种新的多层图划分(MGP)方法,在保持原有图的团结构不被破坏的情况下对大规模图例划分产生子图,并对规模较大的子图进行多层图划分,进一步缩小子图规模,并且应用GraphX图计算框架实现MGP,形成并行MGP(PMGP)方法;然后,依据划分后的子图规模,减少了惩罚值局部搜索算法(PBLS)的迭代次数,提出基于速度优化的PBLS(SPBLS)来求解划分后的各个子图的最大团;最后,将PMGP和SPBLS相结合形成PMGP_SMC。采用Stanford大规模数据集运行测试,实验结果表明,PMGP相比并行单层图划分方法(PSGP),求得的最大子图规模能缩小至原来的1/100,平均子图规模能缩小至原来的1/2;PMGP_SMC相比求解最大团问题的PSGP(PSGP_SMC),总体时间缩短至原来的1/100,并且PMGP_SMC求解最大团的精度和基于极大团枚举求解最大团问题的并行多层图划分方法(PMGP_MCE)一致。PMGP_SMC能够快速精准地求解大规模图例的最大团。

关键词: 大数据, 最大团, Spark, 多层图划分方法, 快速局部搜索算法

CLC Number: