Trust assessment method for scientific papers based on matching between title and its content

Journal of Computer Applications, 2014, Vol. 34, Issue (11): 3273-3278. DOI: 10.11772/j.issn.1001-9081.2014.11.3273

YU Xuanxuan¹,², ZENG Guosun¹,², DING Chunling³

1. Department of Computer Science and Technology, Tongji University, Shanghai 200092, China
2. School of Computer Science and Technology, Shandong Institute of Business and Technology, Yantai, Shandong 264005, China
3. Department of Chemistry, Tongji University, Shanghai 200092, China

Received: 2014-06-10 Revised: 2014-07-04 Online: 2014-11-01 Published: 2014-12-01
Corresponding author: YU Xuanxuan
About the authors: YU Xuanxuan (born 1989), female, from Chizhou, Anhui, M.S. candidate; research interests: content trustworthiness, information security. ZENG Guosun (born 1964), male, from Xinyu, Jiangxi, professor and doctoral supervisor; research interests: parallel computing, trusted software, information security. DING Chunling (born 1965), female, from Yichun, Jiangxi, senior engineer, M.S.; research interests: modeling and analysis.
Supported by:
National High Technology Research and Development Program of China (863 Program); National Natural Science Foundation of China; Shanghai Excellent Academic Leaders Program; Special Research Project on Rapid Sharing of Scientific Papers in the Network Era, Science and Technology Development Center of the Ministry of Education; Huawei Innovation Research Program

Abstract

Finding valuable scientific papers accurately and efficiently among the vast body of online literature, whose quality is highly uneven, is increasingly difficult. To address this problem, a trust assessment method for scientific papers was proposed, based on the idea of checking the consistency between each title and its text. First, the title and the text were each modeled as a feature vector. Word similarity was then computed between every feature word in the title vector and every feature word in the text vector, and a word pair whose similarity exceeded a given threshold was taken as a successfully matched pair. Next, the number of matched pairs and their word weights were aggregated to compute the credibility of a single title. Finally, using the hierarchical tree structure of the paper's headings, a depth-first traversal of the tree computed the matching degree between every heading and its corresponding text, from which the credibility of the whole paper was evaluated. A case analysis using 《知网》 shows that the method achieves trust assessment of scientific papers: readers can pick out papers that are credible or of practical reference value without reading a large number of them, which reduces the cost of information search and improves decision-making efficiency.
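The pipeline described in the abstract can be sketched roughly as follows. This is only an illustrative sketch, not the authors' implementation: the abstract does not give the actual formulas, so every name here (`Heading`, `word_similarity`, `title_credibility`, `paper_credibility`) is hypothetical. The similarity measure is a standard-library stand-in, the threshold of 0.8 is arbitrary, the per-title score is assumed to be the weighted share of title feature words that find a match in the text, the paper score is assumed to be the plain average over all headings, and text-side word weights are omitted for brevity.

```python
from dataclasses import dataclass, field
from difflib import SequenceMatcher
from typing import Dict, List


def word_similarity(a: str, b: str) -> float:
    # Assumption: a character-level ratio stands in for the
    # word-similarity measure used in the paper.
    return SequenceMatcher(None, a, b).ratio()


@dataclass
class Heading:
    # Hypothetical node of the heading tree.
    title_words: Dict[str, float]   # title feature word -> weight
    text_words: List[str]           # feature words of the text under this heading
    children: List["Heading"] = field(default_factory=list)


def title_credibility(node: Heading, threshold: float = 0.8) -> float:
    # Assumed formula: weight of matched title words / total title weight.
    total = sum(node.title_words.values())
    if total == 0:
        return 0.0
    matched = 0.0
    for word, weight in node.title_words.items():
        # A title word counts as matched if any text word exceeds the threshold.
        if any(word_similarity(word, t) > threshold for t in node.text_words):
            matched += weight
    return matched / total


def paper_credibility(root: Heading, threshold: float = 0.8) -> float:
    # Iterative depth-first traversal of the heading tree;
    # the paper score is assumed to be the mean of the per-heading scores.
    scores = []
    stack = [root]
    while stack:
        node = stack.pop()
        scores.append(title_credibility(node, threshold))
        stack.extend(reversed(node.children))  # visit children left to right
    return sum(scores) / len(scores)


if __name__ == "__main__":
    paper = Heading(
        {"trust": 2.0, "paper": 1.0},
        ["trust", "method"],
        children=[
            Heading({"similarity": 1.0}, ["similarity"]),
            Heading({"tree": 1.0}, ["zzz"]),
        ],
    )
    print(paper_credibility(paper))
```

A real system would replace `word_similarity` with a semantic measure and would likely weight headings by depth or section length rather than averaging them uniformly.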

CLC Number:

TP391

Cite this article:

YU Xuanxuan, ZENG Guosun, DING Chunling. Trust assessment method for scientific papers based on matching between title and its content[J]. Journal of Computer Applications, 2014, 34(11): 3273-3278.

