Journal of Computer Applications, 2018, Vol. 38, Issue (11): 3100-3104. DOI: 10.11772/j.issn.1001-9081.2018041355

• The 7th China Conference on Data Mining (CCDM 2018)

Web information timeliness evaluation based on clue characteristics

XU Jing¹, YANG Xiaoping²

  1. Computer Department, China Women's University, Beijing 100101, China;
  2. School of Information, Renmin University of China, Beijing 100872, China
  • Received: 2018-04-30 Revised: 2018-06-21 Online: 2018-11-10 Published: 2018-11-10
  • Corresponding author: XU Jing
  • About the authors: XU Jing (born 1980), female, from Hefei, Anhui, lecturer, Ph.D.; main research interests: Web usability evaluation and semantic analysis. YANG Xiaoping (born 1956), male, from Fuzhou, Fujian, professor, Ph.D.; main research interests: information systems engineering and e-government.
  • Supported by:
    This work is partially supported by the National Natural Science Foundation of China (71271209), the Natural Science Foundation of the Higher Education Institutions of Anhui Province (KJ2016A603), and the Foundation of China Women's University (KY201703004).

Abstract: With the rapid development of the Internet, online news media have become an important source of information. Whether the information published on a Web site reflects current topics of interest, and whether it is updated promptly with the latest progress of an event, has a significant impact on the usability of the site. The proposed method first identifies topic clue sentences with a Conditional Random Field (CRF) model and, from these sentences, derives the clue development trend of the topic to which a piece of Web information belongs. The trend is then used to infer the time interval of each topic clue, from which the effective range of the Web information is estimated. On this basis, the timeliness of Web information is evaluated by combining its two components: publication timeliness and content freshness. Experimental results show that the proposed method performs well in evaluating the timeliness of Web information.

Key words: topic clue, timeliness, publication timeliness, content freshness, effective range
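
The abstract gives the pipeline only in outline, so the following Python sketch is an illustration under stated assumptions and not the authors' formulas. It assumes the clue timestamps have already been extracted (the CRF step is not shown), that the effective range runs from the earliest clue to the latest clue plus a grace period, that publication timeliness decays linearly across that range, that content freshness is a simple in-range check, and that the two are combined by a weighted sum. All function names, the `grace` parameter, and the `weight` parameter are hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class EffectiveRange:
    start: datetime
    end: datetime


def effective_range(clue_times, grace=timedelta(days=1)):
    """Assumed rule: span from the earliest clue to the latest clue plus a grace period."""
    if not clue_times:
        raise ValueError("at least one clue time is required")
    return EffectiveRange(min(clue_times), max(clue_times) + grace)


def publication_timeliness(published, rng):
    """Assumed score: 1.0 at the start of the range, falling linearly to 0.0 at its end."""
    span = (rng.end - rng.start).total_seconds()
    if span <= 0:
        # Degenerate range: only a publication at or before the start counts as timely.
        return 1.0 if published <= rng.start else 0.0
    delay = (published - rng.start).total_seconds()
    return max(0.0, min(1.0, 1.0 - delay / span))


def content_freshness(evaluated_at, rng):
    """Assumed score: 1.0 while the evaluation time lies inside the range, else 0.0."""
    return 1.0 if rng.start <= evaluated_at <= rng.end else 0.0


def timeliness(published, evaluated_at, clue_times, weight=0.5):
    """Weighted sum of the two components; the weighting scheme is an assumption."""
    rng = effective_range(clue_times)
    return (weight * publication_timeliness(published, rng)
            + (1.0 - weight) * content_freshness(evaluated_at, rng))


if __name__ == "__main__":
    clues = [datetime(2018, 1, 1), datetime(2018, 1, 4)]
    score = timeliness(datetime(2018, 1, 2), datetime(2018, 1, 3), clues)
    print(f"timeliness score: {score:.3f}")
```

Keeping the two components as separate functions mirrors the abstract's split of timeliness into publication timeliness and content freshness, so either scoring rule can be replaced without touching the other.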

CLC Number: