Journal of Computer Applications ›› 2016, Vol. 36 ›› Issue (12): 3461-3467. DOI: 10.11772/j.issn.1001-9081.2016.12.3461

• Computer Software Technology •

Clone genealogy extraction method based on software code evolution information
(基于软件代码演化信息的克隆谱系提取方法)

CHEN Zhuo (陈桌), ZHANG Liping (张丽萍), WANG Chunhui (王春晖)

  1. College of Computer and Information Engineering, Inner Mongolia Normal University, Hohhot, Nei Mongol 010022, China
  • Received: 2016-06-07 Revised: 2016-07-22 Online: 2016-12-10 Published: 2016-12-08
  • Corresponding author: ZHANG Liping
  • About the authors: CHEN Zhuo (born 1989), male, from Heze, Shandong, is a master's student whose research interests include software engineering and software analysis. ZHANG Liping (born 1974), female, from Hohhot, Inner Mongolia, is a professor with a master's degree and a CCF member; her research interests include software engineering and software analysis. WANG Chunhui (born 1979), female (Mongolian ethnicity), from Tongliao, Inner Mongolia, is a lecturer with a master's degree and a CCF member; her research interests include software analysis, multimedia, and computer-aided instruction.
  • Supported by:
    This work is partially supported by the National Natural Science Foundation of China (61462071, 61363017), the Natural Science Foundation of Inner Mongolia (2014MS0613), and the Foundation Project of the Inner Mongolia Education Department (NJZY16045).

Abstract: Existing classifications of clone evolution patterns are unclear, and clone genealogy extraction tools are few and inefficient. To address these problems, a method was proposed that automatically constructs clone genealogies from code clone mapping relationships and evolution information. Firstly, clones between versions were mapped in stages by combining word frequency vector calculation, code line distance, and clone attributes. Then, evolution patterns were attached to clone groups and clone fragments according to the mapping results. Finally, the clone genealogy was constructed by linking the clone mapping relationships and evolution patterns across all versions. Experiments were carried out on four open-source software systems, and the results were verified manually. The experimental results show that the clone genealogy extraction tool, Extract Clone Genealogy (ECG), is feasible and efficient. In addition, the extraction results show that about 42% of the cloned code did not change during evolution, while about 3.48% underwent inconsistent changes; such clones may introduce potential bugs and deserve particular attention. The proposed method will provide reference and data support for the quality assessment and management of code clones.

Key words: clone code, clone mapping, evolution pattern, clone genealogy, evolution analysis
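
The three steps named in the abstract (mapping clones between versions, attaching evolution patterns, and linking versions into a genealogy) can be illustrated with a minimal Python sketch. This is not the ECG implementation: it covers only the word frequency vector stage and omits code line distance and clone attributes. The function names, the whitespace tokenizer, the 0.8 similarity threshold, and the three simplified pattern labels ("unchanged", "changed", "removed") are all assumptions made for illustration and are not taken from the paper.

```python
from collections import Counter
from math import sqrt


def term_vector(code: str) -> Counter:
    """Count whitespace-separated tokens (assumption: a real tool would use a lexer)."""
    return Counter(code.split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two token-frequency vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def map_fragments(old: dict, new: dict, threshold: float = 0.8) -> dict:
    """Map each old fragment id to its most similar new fragment id, or None.

    The threshold is a hypothetical cut-off, not a value from the paper.
    """
    mapping = {}
    for old_id, old_code in old.items():
        best_id, best_sim = None, threshold
        for new_id, new_code in new.items():
            sim = cosine(term_vector(old_code), term_vector(new_code))
            if sim >= best_sim:
                best_id, best_sim = new_id, sim
        mapping[old_id] = best_id
    return mapping


def label_patterns(old: dict, new: dict, mapping: dict) -> dict:
    """Attach a simplified, hypothetical pattern label to each old fragment."""
    labels = {}
    for old_id, new_id in mapping.items():
        if new_id is None:
            labels[old_id] = "removed"
        elif old[old_id] == new[new_id]:
            labels[old_id] = "unchanged"
        else:
            labels[old_id] = "changed"
    return labels


def build_genealogy(versions: list) -> list:
    """Link the mappings and labels of consecutive versions into a list of steps."""
    steps = []
    for old, new in zip(versions, versions[1:]):
        mapping = map_fragments(old, new)
        steps.append({"mapping": mapping, "patterns": label_patterns(old, new, mapping)})
    return steps


# Toy data: three versions, each a dict of fragment id -> tokenized code.
v1 = {"f1": "int a = b + c ;", "f2": "return x * y ;"}
v2 = {"g1": "int a = b + c ;", "g2": "return x * y * z ;"}
v3 = {"h1": "int a = b + c ;"}
genealogy = build_genealogy([v1, v2, v3])
```

In this toy run, `f1` maps to an identical fragment, `f2` maps to a modified one, and `g2` has no counterpart in the last version. A staged approach like the one the abstract describes would presumably fall back on line distance or clone attributes when vector similarity alone is ambiguous; that part is left out here.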

CLC Number: