Journal of Computer Applications, 2018, Vol. 38, Issue (10): 2996-3001. DOI: 10.11772/j.issn.1001-9081.2018020302

• Computer Software Technology •

Automatic tracing method from Chinese document to source code based on version control

SHEN Li¹, LIU Hongxing¹,², LI Yonghua¹,²

  1. College of Computer Science and Technology, Wuhan University of Technology, Wuhan, Hubei 430063, China;
  2. Hubei Key Laboratory of Transportation Internet of Things, Wuhan University of Technology, Wuhan, Hubei 430070, China
  • Received: 2018-02-01 Revised: 2018-03-29 Online: 2018-10-10 Published: 2018-10-13
  • Corresponding author: LI Yonghua
  • About the authors: SHEN Li, born in 1993 in Zhongxiang, Hubei, M.S. candidate. His research interests include requirements engineering and database systems. LIU Hongxing, born in 1963 in Honghu, Hubei, Ph.D., professor. His research interests include database systems and information system integration. LI Yonghua, born in 1977 in Wuhan, Hubei, Ph.D., associate professor. His research interests include requirements engineering.
  • Supported by:
    This work is partially supported by the Fundamental Research Funds for the Central Universities (2016III028).

Abstract: Information Retrieval (IR) techniques are widely used in research on traceability between software documents and source code. However, because Chinese documents and source code are written in different languages, automatic tracing with traditional IR techniques yields low precision. To address this problem, an automatic tracing method from Chinese documents to source code based on version control is proposed. First, the similarity score between a document and the source code is calculated by an IR method combined with text-to-source-code heuristic rules. Then, the score is adjusted using the update information committed to the version control software during software development and maintenance. Finally, the tracing relationships between the Chinese documents and the source code are determined according to a preset threshold. Experimental results show that the proposed method achieves some improvement in both precision and recall over the traditional IR method, and that it recovers tracing relationships that the traditional IR method misses.

Key words: traceability, version control, automatic tracing, Information Retrieval (IR), software engineering
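
The three steps named in the abstract (IR similarity, adjustment from version control history, thresholding) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the abstract gives no formulas, so the plain cosine similarity, the additive adjustment, the `weight` and `threshold` values, and all function and file names below are assumptions chosen for illustration. The sketch also assumes that the Chinese document and the commit messages have already been segmented and mapped to the same vocabulary as the source code identifiers, and it omits the heuristic rules.

```python
import math
from collections import Counter


def cosine(a, b):
    """Cosine similarity between two bags of terms (lists of strings)."""
    ca, cb = Counter(a), Counter(b)
    dot = sum(ca[t] * cb[t] for t in ca.keys() & cb.keys())
    norm_a = math.sqrt(sum(v * v for v in ca.values()))
    norm_b = math.sqrt(sum(v * v for v in cb.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def trace_links(doc_terms, source_terms, commits, weight=0.5, threshold=0.3):
    """Return {source file: score} for files traced to one document.

    doc_terms    -- terms of one document (assumed already mapped to code vocabulary)
    source_terms -- {file path: list of terms extracted from that file}
    commits      -- list of (commit message terms, changed file paths)
    weight, threshold -- hypothetical tuning parameters, not values from the paper
    """
    # Step 1: IR similarity between the document and every source file.
    scores = {path: cosine(doc_terms, terms) for path, terms in source_terms.items()}

    # Step 2: adjust scores with version control history. A file changed in a
    # commit whose message resembles the document gets a boost (assumed rule).
    for message_terms, changed_files in commits:
        relevance = cosine(doc_terms, message_terms)
        for path in changed_files:
            if path in scores:
                scores[path] = min(1.0, scores[path] + weight * relevance)

    # Step 3: keep only the links whose score reaches the threshold.
    return {path: s for path, s in scores.items() if s >= threshold}


# Hypothetical example data.
doc = ["user", "login", "password"]
sources = {
    "LoginService.java": ["user", "login", "check"],
    "SessionStore.java": ["session", "cache"],
    "Report.java": ["report", "print"],
}
history = [(["fix", "user", "login"], ["LoginService.java", "SessionStore.java"])]

links_ir_only = trace_links(doc, sources, commits=[])
links_with_history = trace_links(doc, sources, commits=history)
```

In this example, `SessionStore.java` shares no terms with the document, so it is missing from `links_ir_only`. It appears in `links_with_history` only because it was changed in the same commit as `LoginService.java` under a message that resembles the document. This is the kind of missed relationship the abstract says the version control information helps recover.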

CLC number: