Journal of Computer Applications

• Data Mining

Mining Web community based on improved maximum flow algorithm

Jin-Zeng ZHANG, Ming FAN

  1. Zhengzhou University
  • Received: 2008-07-16  Online: 2009-01-01  Published: 2009-01-01
  • Corresponding author: Jin-Zeng ZHANG

Abstract: The original maximum flow algorithm assigns the same constant capacity to every edge, which degrades community quality and yields communities of improper size. To address this, the paper proposes an improved algorithm for mining Web communities that accounts for differences in edge importance. The page importance scores computed by the weighted PageRank algorithm are converted into transition probabilities that measure the importance of the edges between pages, and these probabilities are assigned to the corresponding edges as their capacities. Experimental results show that the improved algorithm effectively raises the quality of the mined Web communities.

Key words: Web community, Web graph, maximum flow algorithm, weighted PageRank
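
The idea summarized in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation, and several details are assumptions because the abstract does not give them: plain PageRank stands in for weighted PageRank, the capacity of an edge u→v is taken to be the PageRank of v divided by the total PageRank of all pages u links to, and the network uses a common max-flow community construction (a virtual source joined to the seed pages, and every non-seed page joined to a virtual sink with a hypothetical constant capacity `sink_cap`). All function names and the `demo_graph` data are illustrative only.

```python
from collections import defaultdict, deque

def pagerank(graph, damping=0.85, iters=50):
    """Plain power-iteration PageRank over {page: [out-links]}.
    Stand-in for the weighted PageRank named in the abstract."""
    nodes = set(graph) | {v for outs in graph.values() for v in outs}
    rank = {n: 1.0 / len(nodes) for n in nodes}
    for _ in range(iters):
        nxt = {n: (1.0 - damping) / len(nodes) for n in nodes}
        for u in nodes:
            outs = graph.get(u, [])
            if outs:
                for v in outs:
                    nxt[v] += damping * rank[u] / len(outs)
            else:  # dangling page: spread its rank uniformly
                for v in nodes:
                    nxt[v] += damping * rank[u] / len(nodes)
        rank = nxt
    return rank

def edge_capacities(graph, rank):
    """Assumed rule: capacity(u, v) = rank[v] / sum of rank over u's out-links."""
    cap = defaultdict(dict)
    for u, outs in graph.items():
        total = sum(rank[v] for v in outs)
        for v in outs:
            cap[u][v] = rank[v] / total
    return cap

def min_cut_source_side(cap, source, sink, eps=1e-12):
    """Edmonds-Karp max flow; returns the nodes on the source side of the cut."""
    res = defaultdict(lambda: defaultdict(float))  # residual capacities
    for u in cap:
        for v, c in cap[u].items():
            res[u][v] += c
            res[v][u] += 0.0  # make sure the reverse edge exists
    while True:
        parent = {source: None}
        queue = deque([source])
        while queue and sink not in parent:  # BFS for a shortest augmenting path
            u = queue.popleft()
            for v, c in res[u].items():
                if c > eps and v not in parent:
                    parent[v] = u
                    queue.append(v)
        if sink not in parent:  # no augmenting path left: flow is maximal
            return set(parent) - {source}
        path, v = [], sink
        while parent[v] is not None:
            path.append((parent[v], v))
            v = parent[v]
        flow = min(res[u][v] for u, v in path)  # bottleneck capacity
        for u, v in path:
            res[u][v] -= flow
            res[v][u] += flow

def find_community(graph, seeds, sink_cap=0.2):
    """Community = pages that stay connected to the seeds after the min cut."""
    rank = pagerank(graph)
    cap = edge_capacities(graph, rank)
    for s in seeds:
        cap["_SOURCE"][s] = float("inf")  # virtual source feeds the seeds
    for n in rank:
        if n not in seeds:
            cap[n]["_SINK"] = sink_cap  # assumed constant drain to the sink
    return min_cut_source_side(cap, "_SOURCE", "_SINK")

# Two fully linked triangles joined by a single pair of links c <-> x.
demo_graph = {
    "a": ["b", "c"], "b": ["a", "c"], "c": ["a", "b", "x"],
    "x": ["c", "y", "z"], "y": ["x", "z"], "z": ["x", "y"],
}
print(sorted(find_community(demo_graph, {"a"})))  # ['a', 'b', 'c']
```

In this toy graph the cut separates the seed's triangle from the other one. The result depends on `sink_cap`: a value that is too large collapses the community to the seeds, and one that is too small lets it absorb the whole graph, so any real use would need to tune or derive it.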