HITS-based topic sensitive crawling method
Zongli JIANG Xueke XU Shuai LI
Journal of Computer Applications
The topic crawler is a new and practical application in the field of information retrieval. Its main idea is to selectively collect Web pages on a predefined topic and to avoid downloading irrelevant Web pages, so that the user finds more accurate and useful information. Several key issues of topic crawlers are discussed and corresponding new approaches are proposed. A topic crawler system was then designed and implemented, employing topic-sensitive Hyperlink-Induced Topic Search (HITS) to predict the priority of fetched Web pages. The experiments show that the system performs well.
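For readers unfamiliar with HITS, the following is a minimal sketch of the standard hub/authority iteration with an optional per-page relevance weight. It is not the authors' method: the abstract does not specify how topic sensitivity is computed, so the `relevance` parameter, the way it scales authority scores, and the function name `hits` are all assumptions made for illustration.

```python
import math


def hits(graph, relevance=None, iterations=50):
    """Run a basic HITS iteration on a link graph.

    graph: dict mapping a page to the list of pages it links to.
    relevance: hypothetical dict of topic-relevance weights per page
               (assumption; pages not listed default to 1.0).
    Returns (hub, authority) dicts with L2-normalized scores.
    """
    nodes = set(graph)
    for targets in graph.values():
        nodes.update(targets)

    relevance = relevance or {}
    weight = {n: relevance.get(n, 1.0) for n in nodes}
    hub = {n: 1.0 for n in nodes}
    auth = {n: 1.0 for n in nodes}

    for _ in range(iterations):
        # Authority update: sum the hub scores of in-linking pages,
        # scaled by the target's (assumed) topic relevance.
        new_auth = {n: 0.0 for n in nodes}
        for src, targets in graph.items():
            for dst in targets:
                new_auth[dst] += hub[src] * weight[dst]
        norm = math.sqrt(sum(v * v for v in new_auth.values())) or 1.0
        auth = {n: v / norm for n, v in new_auth.items()}

        # Hub update: sum the authority scores of the pages linked to.
        new_hub = {n: 0.0 for n in nodes}
        for src, targets in graph.items():
            for dst in targets:
                new_hub[src] += auth[dst]
        norm = math.sqrt(sum(v * v for v in new_hub.values())) or 1.0
        hub = {n: v / norm for n, v in new_hub.items()}

    return hub, auth


if __name__ == "__main__":
    # Toy graph: pages "a" and "b" both link to "c".
    hub, auth = hits({"a": ["c"], "b": ["c"], "c": []})
    # A crawler could order its queue by authority score, highest first.
    print(sorted(auth, key=auth.get, reverse=True)[0])
```

In a crawler, scores like these could be used to order the download queue; how the actual system maps them to priorities is not stated in the abstract.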