Journal of Computer Applications ›› 2015, Vol. 35 ›› Issue (7): 1975-1978. DOI: 10.11772/j.issn.1001-9081.2015.07.1975

• Artificial Intelligence •

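A method for discovering users' real-time demands in literature search systems

XU Hao, CHEN Xue, HU Xiaofeng

  1. School of Computer Engineering and Science, Shanghai University, Shanghai 200444, China
  • Received: 2015-01-27 Revised: 2015-03-31 Online: 2015-07-10 Published: 2015-07-17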
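  • Corresponding author: CHEN Xue (1981-), female, born in Xinyang, Henan; associate professor, Ph.D., CCF member. Her research interests include the Semantic Web, peer-to-peer networks, and parallel architectures. E-mail: xuechen@shu.edu.cn
  • About the authors: XU Hao (1990-), male, born in Nantong, Jiangsu; M.S. candidate. His research interests include massive Web information mining. HU Xiaofeng (1991-), male, born in Xiangtan, Hunan; M.S. candidate. His research interests include massive Web information mining.
  • Funding:

    Innovation Program of the Shanghai Municipal Education Commission (B.10-0108-14-202).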

Finding method of users' real-time demands for literature search systems

XU Hao, CHEN Xue, HU Xiaofeng   

  1. School of Computer Engineering and Science, Shanghai University, Shanghai 200444, China
  • Received: 2015-01-27 Revised: 2015-03-31 Online: 2015-07-10 Published: 2015-07-17

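Abstract:

To address the problem that current literature search systems cannot understand users' real-time demands, a method for discovering users' real-time demands in literature search systems is proposed. First, users' personalized search behaviors, such as browsing and downloading, are analyzed. Second, a user real-time requirement document (RD) is constructed according to the relationship between search behaviors and user demands. Then, a keyword network of user demands is extracted from the requirement document. Finally, a random walk is used to extract the core nodes of the keyword network, which form the user demand graph. Experimental results show that, in an environment with simulated user demands, the demand-graph extraction method improves the F-measure by 2.5% on average over the K-medoids algorithm; with real users searching for literature, it improves the F-measure by 5.3% on average over the DBSCAN algorithm. Therefore, in literature search where user demands are relatively stable, the method can capture user demands and thus improve the user experience.

Keywords: user behavior analysis, real-time demand, literature search system, personalization, keyword network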

Abstract:

Because current literature search systems fail to comprehend users' real-time demands, a method for finding users' real-time demands in literature search systems was proposed. Firstly, the method analyzed users' personalized search behaviors, such as browsing and downloading. Secondly, it built users' real-time Requirement Documents (RD) based on the relationship between users' search behaviors and their requirements. Then, it extracted a keyword network from the requirement documents. Finally, it obtained users' demand graphs, formed by the core nodes extracted from the keyword network by means of a random walk. The experimental results show that, when users' demands are simulated, the demand-graph method increases the F-measure by 2.5% on average compared with the K-medoids algorithm. When real users search for papers, it increases the F-measure by 5.3% on average compared with the DBSCAN (Density-Based Spatial Clustering of Applications with Noise) algorithm. Therefore, in literature search systems where users' requirements are stable, the method can capture users' demands and improve their search experience.

Key words: user behavior analysis, real-time demand, literature search system, personalization, keyword network
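
The pipeline outlined in the abstract (search behaviors, requirement document, keyword network, random walk, demand graph) can be illustrated with a minimal Python sketch. This is not the authors' implementation. The behavior weights, the co-occurrence definition of edges, the PageRank-style random walk with a damping factor of 0.85, and the `top_k` cutoff are all assumptions made for illustration, since the abstract does not specify them.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical behavior weights; the abstract gives no actual values.
BEHAVIOR_WEIGHT = {"browse": 1.0, "download": 2.0}


def build_keyword_network(requirement_doc):
    """Build a weighted, undirected keyword co-occurrence network.

    requirement_doc: list of (behavior, keywords) records, one per paper
    the user interacted with. Returns {keyword: {neighbor: weight}}.
    Keywords that never co-occur with another keyword are not added.
    """
    graph = defaultdict(lambda: defaultdict(float))
    for behavior, keywords in requirement_doc:
        weight = BEHAVIOR_WEIGHT[behavior]
        for a, b in combinations(sorted(set(keywords)), 2):
            graph[a][b] += weight
            graph[b][a] += weight
    return {node: dict(nbrs) for node, nbrs in graph.items()}


def random_walk_scores(graph, damping=0.85, iterations=100):
    """Score nodes by a random walk with uniform restarts.

    Assumed variant: PageRank-style power iteration, where the walker
    follows an edge with probability proportional to its weight, or
    jumps to a uniformly random node with probability 1 - damping.
    """
    nodes = list(graph)
    n = len(nodes)
    scores = {node: 1.0 / n for node in nodes}
    out_weight = {node: sum(graph[node].values()) for node in nodes}
    for _ in range(iterations):
        new_scores = {node: (1.0 - damping) / n for node in nodes}
        for node in nodes:
            for nbr, w in graph[node].items():
                new_scores[nbr] += damping * scores[node] * w / out_weight[node]
        scores = new_scores
    return scores


def extract_demand_graph(graph, top_k=3):
    """Keep the top_k highest-scoring keywords and the edges among them."""
    scores = random_walk_scores(graph)
    core = sorted(scores, key=lambda node: (-scores[node], node))[:top_k]
    core_set = set(core)
    edges = {
        (a, b): w
        for a in core
        for b, w in graph[a].items()
        if b in core_set and a < b
    }
    return core, edges


# Made-up requirement document for one search session.
requirement_doc = [
    ("browse", ["semantic web", "ontology", "search"]),
    ("download", ["semantic web", "ontology"]),
    ("download", ["semantic web", "search", "ranking"]),
    ("browse", ["clustering", "ranking"]),
]
network = build_keyword_network(requirement_doc)
core_nodes, demand_edges = extract_demand_graph(network, top_k=3)
print(core_nodes)
print(demand_edges)
```

In this sketch a downloaded paper contributes more edge weight than a browsed one, so keywords from downloaded papers tend to rank higher. Whether the original method weights behaviors this way is not stated in the abstract.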

CLC Number: