Journal of Computer Applications ›› 2009, Vol. 29 ›› Issue (11): 3100-3102.
• Database and data mining •
Qi GAO,Yong-ping ZHANG
高琪, 张永平
Abstract: The Hyperlink-Induced Topic Search (HITS) algorithm is a classic hyperlink-based ranking algorithm. Because it relies purely on link structure, it ignores the text of the linked pages and does not distinguish between hyperlinks of differing importance. As a result, theme-drift (topic drift) often occurs when HITS is used. The improved algorithm builds on HITS and uses the classic tf-idf measure to compute a relevance weight between each linked page and the query, and uses that weight to distinguish the importance of links. The improved algorithm brings search engine ranking results more in line with the query, and the corresponding precision is greatly improved.
Key words: theme-drift, page ranking, search engine
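The abstract does not state the exact weighting formula, so the following is only a minimal sketch of the general idea it describes: score each page against the query with tf-idf, then let that score scale each link's contribution in the HITS iteration. The function names (`tfidf_relevance`, `weighted_hits`), the smoothed idf, the choice to weight a link by the relevance of its target page, and the toy data are all assumptions made for illustration, not the authors' method.

```python
import math
from collections import Counter


def tfidf_relevance(docs, query):
    """Score each page against the query with a plain tf-idf sum (assumed scoring rule)."""
    n = len(docs)
    tokens = {page: text.lower().split() for page, text in docs.items()}
    scores = {}
    for page, words in tokens.items():
        counts = Counter(words)
        score = 0.0
        for term in query.lower().split():
            df = sum(1 for w in tokens.values() if term in w)
            if df == 0 or not words:
                continue
            tf = counts[term] / len(words)
            idf = math.log(n / df) + 1.0  # assumed smoothing: +1 keeps very common terms
            score += tf * idf
        scores[page] = score
    return scores


def _normalize(scores):
    """Scale scores to unit Euclidean length; leave an all-zero vector unchanged."""
    norm = math.sqrt(sum(v * v for v in scores.values()))
    return {p: (v / norm if norm else 0.0) for p, v in scores.items()}


def weighted_hits(links, relevance, iterations=50):
    """HITS in which each link is scaled by the relevance of the page it points to."""
    pages = sorted(relevance)
    auth = {p: 1.0 for p in pages}
    hub = {p: 1.0 for p in pages}
    for _ in range(iterations):
        # Authority update: weighted sum of hub scores over incoming links.
        new_auth = {p: 0.0 for p in pages}
        for src, targets in links.items():
            for dst in targets:
                new_auth[dst] += relevance[dst] * hub[src]
        # Hub update: weighted sum of the new authority scores over outgoing links.
        new_hub = {p: 0.0 for p in pages}
        for src, targets in links.items():
            for dst in targets:
                new_hub[src] += relevance[dst] * new_auth[dst]
        auth, hub = _normalize(new_auth), _normalize(new_hub)
    return auth, hub


# Hypothetical toy data: page "a" links to an on-topic page "b" and an off-topic page "c".
docs = {
    "a": "python tutorial links",
    "b": "python tutorial for beginners python",
    "c": "cheap car rental deals",
}
links = {"a": ["b", "c"]}

relevance = tfidf_relevance(docs, "python tutorial")
auth, hub = weighted_hits(links, relevance)
print(auth)
```

In unweighted HITS, pages "b" and "c" would receive the same authority score because each has one incoming link from "a". In this sketch the off-topic page "c" has zero relevance to the query, so its link contributes nothing, which is the kind of drift suppression the abstract describes.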
Qi GAO, Yong-ping ZHANG. Study on theme-drift of hyperlink-induced topic search algorithm[J]. Journal of Computer Applications, 2009, 29(11): 3100-3102.
URL: http://www.joca.cn/EN/
http://www.joca.cn/EN/Y2009/V29/I11/3100