Journal of Computer Applications ›› 2014, Vol. 34 ›› Issue (11): 3273-3278. DOI: 10.11772/j.issn.1001-9081.2014.11.3273

• Artificial Intelligence •

Credible-quality assessment method for scientific papers based on matching between title and body text

YU Xuanxuan (余玄璇) 1,2, ZENG Guosun (曾国荪) 1,2, DING Chunling (丁春玲) 3

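  1. School of Computer Science and Technology, Shandong Institute of Business and Technology, Yantai, Shandong 264005
  2. Department of Computer Science and Technology, Tongji University, Shanghai 200092
  3. Department of Chemistry, Tongji University, Shanghai 200092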
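  • Received: 2014-06-10 Revised: 2014-07-04 Online: 2014-11-01 Published: 2014-12-01
  • Corresponding author: YU Xuanxuan
  • About the authors: YU Xuanxuan (b. 1989), female, from Chizhou, Anhui, master's student; main research interests: content trustworthiness, information security. ZENG Guosun (b. 1964), male, from Xinyu, Jiangxi, professor and doctoral supervisor; main research interests: parallel computing, trusted software, information security. DING Chunling (b. 1965), female, from Yichun, Jiangxi, senior engineer, master's degree; main research interest: modeling and analysis.
  • Supported by:

    National 863 Program project; National Natural Science Foundation of China project; Shanghai Outstanding Academic Leaders Program project; special research project on fast sharing of scientific papers in the network era, funded by the Science and Technology Development Center of the Ministry of Education; Huawei Innovation Research Program project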

Trust assessment method for scientific papers based on matching between title and its content

YU Xuanxuan1,2, ZENG Guosun1,2, DING Chunling3

  1. Department of Computer Science and Technology, Tongji University, Shanghai 200092, China;
  2. School of Computer Science and Technology, Shandong Institute of Business and Technology, Yantai, Shandong 264005, China;
  3. Department of Chemistry, Tongji University, Shanghai 200092, China
  • Received: 2014-06-10 Revised: 2014-07-04 Online: 2014-11-01 Published: 2014-12-01
  • Contact: YU Xuanxuan

Abstract (translated from the Chinese):

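To find the valuable literature one needs, accurately and efficiently, among the massive volume of online scientific literature of uneven quality, a credible-quality assessment method for scientific papers is proposed, based on the idea of consistency matching between a title and its body text. The method first models the title and the body text as feature vectors. Using word similarity, it computes the similarity between the feature words of the title vector and those of the body vector, and treats word pairs whose similarity exceeds a given threshold as successfully matched feature-word pairs. Next, it counts all successfully matched pairs together with their word weights to compute the credibility of a single title. Finally, using the hierarchical tree structure of the paper's headings and a depth-first tree traversal, it computes the degree of similarity matching between every heading and its corresponding text, and from this assesses the credibility of the whole paper. A case analysis with 《知网》 shows that the method achieves credible-quality assessment of scientific papers, so that readers can pick out papers that are credible or of practical reference value without reading large numbers of papers, which reduces information search cost and improves decision-making efficiency.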

Abstract:

Finding valuable scientific papers accurately and efficiently on the Internet is increasingly difficult. To address this problem, a new paper evaluation method based on the consistency between title and text was proposed. First, the title and the text were each modeled as feature vectors. Then, word similarity was used to calculate the matching degree between the feature words of the title vector and those of the text vector; a feature word pair was considered successfully matched if its matching degree was greater than a certain threshold. Next, all matched pairs and their word weights were counted to calculate the credibility of the title. Finally, based on the hierarchical tree structure of the paper's headings, the similarity matching degree between every heading and its corresponding text was calculated by the Depth First Traversal (DFT) algorithm, and the credibility of the whole paper was evaluated. The case study results show that the proposed method achieves credible quality assessment of scientific papers and makes paper selection more efficient for readers.
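
The pipeline outlined in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not give the similarity measure, the threshold, or the aggregation formulas, so everything concrete here is an assumption. `Section`, `word_similarity`, `title_credibility`, and `paper_credibility` are hypothetical names; the word similarity is a character-level stand-in based on `difflib.SequenceMatcher`; a title's credibility is assumed to be the weight share of its matched feature words; and the paper's credibility is assumed to be the plain mean over all headings visited depth-first.

```python
from dataclasses import dataclass, field
from difflib import SequenceMatcher


@dataclass
class Section:
    """One heading in the paper's title tree (hypothetical structure)."""
    title_words: dict[str, float]   # feature word -> weight, from the heading
    body_words: dict[str, float]    # feature word -> weight, from its text
    children: list["Section"] = field(default_factory=list)


def word_similarity(a: str, b: str) -> float:
    # ASSUMPTION: character-level stand-in for the (unspecified)
    # word-similarity technique used in the paper.
    return SequenceMatcher(None, a, b).ratio()


def title_credibility(section: Section, threshold: float = 0.8) -> float:
    # ASSUMPTION: credibility = weight of matched title words / total weight.
    total = sum(section.title_words.values())
    if total == 0:
        return 0.0
    matched = 0.0
    for word, weight in section.title_words.items():
        # Best similarity between this title word and any body word.
        best = max(
            (word_similarity(word, other) for other in section.body_words),
            default=0.0,
        )
        if best > threshold:        # the pair counts as successfully matched
            matched += weight
    return matched / total


def paper_credibility(root: Section, threshold: float = 0.8) -> float:
    # Depth-first traversal of the heading tree.
    # ASSUMPTION: headings are combined by an unweighted mean.
    scores: list[float] = []

    def visit(node: Section) -> None:
        scores.append(title_credibility(node, threshold))
        for child in node.children:
            visit(child)

    visit(root)
    return sum(scores) / len(scores)


# Toy data, invented purely for illustration.
child = Section({"similarity": 1.0}, {"similarity": 1.0, "threshold": 0.3})
root = Section(
    {"trust": 1.0, "assessment": 1.0},
    {"trust": 0.5, "evaluation": 0.5},
    children=[child],
)
print(title_credibility(root))   # 0.5: only "trust" finds a match
print(paper_credibility(root))   # 0.75: mean of 0.5 (root) and 1.0 (child)
```

A semantic similarity measure would replace `word_similarity` in practice, since character overlap cannot recognize synonyms such as "assessment" and "evaluation" in the toy data.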

CLC Number: