Coupling related code smell detection method based on deep learning

Journal of Computer Applications, 2022, Vol. 42, Issue (6): 1702-1707. DOI: 10.11772/j.issn.1001-9081.2021061403

Special topic: Papers from the 2021 National Annual Conference on Open Distributed and Parallel Computing (DPCS 2021)

Shan SU, Yang ZHANG, Dongwen ZHANG

School of Information Science and Engineering, Hebei University of Science and Technology, Shijiazhuang, Hebei 050018, China

Received: 2021-08-05 Revised: 2021-09-08 Accepted: 2021-10-20 Online: 2022-01-10 Published: 2022-06-10
Contact: Yang ZHANG
About the authors: SU Shan, born in 1995 in Shijiazhuang, Hebei, M.S. candidate. Her research interests include software refactoring.
ZHANG Dongwen, born in 1964 in Shijiazhuang, Hebei, Ph.D., professor, CCF member. Her research interests include intelligent software and software refactoring.
Supported by:
National Natural Science Foundation of China (61440012); Key Basic Research Project of Hebei Fundamental Research Plan (18960106D)

Abstract:

Code smell detection methods based on heuristics and machine learning have been shown to have limitations, and most existing methods focus on the more common code smells. To address these problems, a deep learning method was proposed to detect three coupling-related code smells that detection work rarely targets: Intensive Coupling, Dispersed Coupling and Shotgun Surgery. First, the metrics required by the three code smells were extracted and the resulting data were processed. Then, a deep learning model combining a Convolutional Neural Network (CNN) with an attention mechanism was constructed, in which the attention mechanism assigns weights to the input metric features. The datasets were extracted from 21 open source projects, and the detection method was validated on 10 open source projects and compared with a CNN model. Experimental results show that the proposed model achieves better results for Intensive Coupling and Dispersed Coupling, with code smell precisions of 93.61% and 99.76% respectively, whereas the CNN model achieves better results for Shotgun Surgery, with a code smell precision of 98.59%.

Key words: code smell, coupling, deep learning, Convolutional Neural Network (CNN), attention mechanism

CLC Number: TP311

Cite this article: Shan SU, Yang ZHANG, Dongwen ZHANG. Coupling related code smell detection method based on deep learning[J]. Journal of Computer Applications, 2022, 42(6): 1702-1707.

Figures and tables

Fig. 1 Framework of code smell detection

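Tab. 1 Metrics of code smell

Metric	Definition
CINT	Number of methods in other classes that the method under detection calls
CDISP	Indicator of how dispersed the coupling of a method is
CC	Number of methods in other classes that call the method under detection
CM	Number of classes related to the method under detection
MAXNESTING	Maximum nesting depth of the method under detection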
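
The two outgoing-coupling metrics in Tab. 1 can be illustrated with a small sketch. It assumes that the calls made by a method are available as (class, method) pairs, which is a hypothetical representation, and that CDISP is the number of distinct provider classes divided by CINT, a common definition in the metrics literature that this page does not spell out.

```python
from typing import Iterable, Tuple

# Hypothetical representation of one outgoing call: (callee class, callee method).
Call = Tuple[str, str]


def coupling_metrics(own_class: str, calls: Iterable[Call]) -> Tuple[int, float]:
    """Return (CINT, CDISP) for one method, given its outgoing calls."""
    # Keep only distinct methods defined in classes other than the caller's own.
    foreign = {(cls, method) for cls, method in calls if cls != own_class}
    cint = len(foreign)
    if cint == 0:
        return 0, 0.0
    # Assumed definition: distinct provider classes divided by CINT.
    provider_classes = {cls for cls, _ in foreign}
    return cint, len(provider_classes) / cint


calls = [
    ("Logger", "info"),
    ("Logger", "warn"),
    ("Repo", "save"),
    ("Order", "total"),  # call inside the same class, ignored
    ("Logger", "info"),  # repeated call, counted once
]
cint, cdisp = coupling_metrics("Order", calls)
print(cint, round(cdisp, 2))
```

Under this definition a low CDISP with a high CINT means many calls concentrated in few classes, while a CDISP close to 1 means the calls are spread over many classes.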

Fig. 2 Attention-CNN model

Fig. 3 Attention mechanism layer
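
The abstract describes an attention mechanism that assigns weights to the input metric features, combined with a CNN. The following is only a rough forward-pass sketch of that idea in plain Python. The page gives no layer sizes and does not say where the attention layer sits, so the ordering (attention, then convolution, then pooling), every parameter value, and every function name here are assumptions for illustration, not the authors' architecture.

```python
import math
from typing import List


def softmax(scores: List[float]) -> List[float]:
    """Numerically stable softmax."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(features: List[float], scores: List[float]) -> List[float]:
    """Reweight each metric feature by its softmax attention weight."""
    weights = softmax(scores)
    return [w * x for w, x in zip(weights, features)]


def conv1d(xs: List[float], kernel: List[float], bias: float = 0.0) -> List[float]:
    """Valid 1-D convolution (cross-correlation) followed by ReLU."""
    k = len(kernel)
    out = []
    for i in range(len(xs) - k + 1):
        s = bias + sum(kernel[j] * xs[i + j] for j in range(k))
        out.append(max(0.0, s))
    return out


def predict(features, scores, kernel, w_out, b_out) -> float:
    """Attention -> convolution -> global max pooling -> sigmoid output."""
    hidden = conv1d(attend(features, scores), kernel)
    pooled = max(hidden)
    return 1.0 / (1.0 + math.exp(-(w_out * pooled + b_out)))


# Hypothetical, untrained parameters for three metric features
# (CINT, CDISP, MAXNESTING); a real model would learn these values.
features = [16.0, 0.38, 3.0]
scores = [1.2, 0.1, 0.4]
p = predict(features, scores, kernel=[0.5, 0.5], w_out=1.0, b_out=-2.0)
print(round(p, 3))
```

In a trained model the attention scores would themselves be computed from the input by learned layers; here they are fixed numbers so that the weighting step stays visible.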

Tab. 2 Projects for training set

Project	Domain	NOC	NOM	LOC
argouml	UML diagramming	1 953	17 466	160 354
axion	Gradle management plugin	35	313	1 096
Emmagee	Performance testing tool	9	58	907
fullsync	File synchronization tool	139	806	3 900
heritrix3	Web crawler toolkit	555	4 722	41 972
hsqldb	Database	659	12 915	227 069
ipscan	IP port scanner	184	933	6 584
javacc	Lexical analyzer	180	1 487	20 861
jGroups	Group communication toolkit	251	1 935	13 199
jparsec	jQuery parsing	237	1 274	7 387
jspwiki	Wiki system	30	2 243	21 296
keystore	Certificate tool	272	1 593	21 215
marauroa	Server-side framework	231	1 866	19 044
picocontainer	Micro-kernel container	1 005	6 862	48 468
quartz	Distributed framework	465	4 585	41 749
QuickServer	Server-side component	165	1 699	16 633
roller	Blog server	549	5 040	47 848
squirrel	Database tool	192	1 428	8 922
xalan	XSLT processor	964	10 359	188 637
xerces	XML parser	838	10 717	142 249
you-jextractor	Download tool	76	646	2 711

Tab. 3 Training set of Intensive Coupling

Method	CDISP	CINT	MAXNESTING	INTENSIVE
exit	0.38	16	3	1
setTool	1	1	2	0
setTime	1	1	2	0
unlock	0.18	11	3	1
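
The page does not say how the INTENSIVE labels in Tab. 3 were produced. As a hedged illustration only, the sketch below applies a threshold rule in the style of metric-based detection strategies, reading each row as a CINT call count, a CDISP ratio in [0, 1], and a MAXNESTING depth. The threshold constants and the function name `looks_intensive` are assumptions chosen for the example, not values reported here.

```python
def looks_intensive(cint: int, cdisp: float, maxnesting: int) -> bool:
    """Heuristic check: many calls concentrated in few classes, with nesting."""
    # Illustrative thresholds (assumptions): 7 calls, half / quarter dispersion.
    many_calls_few_classes = cint > 7 and cdisp < 0.5
    some_calls_very_few_classes = cint > 3 and cdisp < 0.25
    return (many_calls_few_classes or some_calls_very_few_classes) and maxnesting > 1


# Rows of Tab. 3 as (method, CINT, CDISP, MAXNESTING, INTENSIVE label).
rows = [
    ("exit", 16, 0.38, 3, 1),
    ("setTool", 1, 1.0, 2, 0),
    ("setTime", 1, 1.0, 2, 0),
    ("unlock", 11, 0.18, 3, 1),
]
for name, cint, cdisp, nesting, label in rows:
    print(name, looks_intensive(cint, cdisp, nesting), bool(label))
```

A learned model replaces such fixed thresholds with weights fitted to labelled rows like these, which is the motivation the abstract gives for moving beyond heuristic detection.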

Tab. 4 Projects for test set

Project	Domain	NOC	NOM	LOC
ArtOfIllusion	3D animation	492	6 766	103 586
FreePlane	Mind mapping	787	6 938	124 937
jasperreports	Reporting tool	2 890	23 055	202 308
Jdeodorant	Code structure analysis	391	4 265	84 726
jEdit	Text editor	584	7 418	104 771
jfreechart	Chart drawing library	713	6 953	70 227
pmd	Code checking tool	2 194	10 105	51 296
elasticsearch	Search server	3 356	31 160	198 869
netty	Java open-source framework	2 458	27 183	212 343
omega	Computer-assisted translation tool	631	4 543	39 153

Tab. 5 Results of detection of Intensive Coupling (%)

Model	Label	Precision	Recall	F₁
CNN	0	100.00	99.90	99.95
CNN	1	89.44	99.73	94.30
Attention-CNN	0	100.00	99.94	99.97
Attention-CNN	1	93.61	100.00	96.70
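
The F₁ column is the harmonic mean of precision and recall. A minimal sketch of that calculation, applied to the label-1 rows of Tab. 5, is given below; because the published precision and recall are already rounded, the recomputed values agree with the table only to within rounding.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (same unit as the inputs)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Label-1 rows of Tab. 5: (model, precision %, recall %).
for model, p, r in [("CNN", 89.44, 99.73), ("Attention-CNN", 93.61, 100.00)]:
    print(model, round(f1_score(p, r), 2))
```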

Tab. 6 Results of detection of Dispersed Coupling (%)

Model	Label	Precision	Recall	F₁
CNN	0	100.00	99.99	99.99
CNN	1	99.04	100.00	99.52
Attention-CNN	0	100.00	100.00	100.00
Attention-CNN	1	99.76	99.76	99.76

Tab. 7 Results of detection of Shotgun Surgery (%)

Model	Label	Precision	Recall	F₁
CNN	0	100.00	99.99	99.99
CNN	1	98.59	100.00	99.45
Attention-CNN	0	100.00	99.92	99.96
Attention-CNN	1	95.66	100.00	97.78

Tab. 8 Time consumption of three code smells (s)

Model	Intensive Coupling	Dispersed Coupling	Shotgun Surgery
CNN	130.49	150.85	61.21
Attention-CNN	131.68	157.60	69.58
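
To put the times in Tab. 8 on a common scale, the extra cost of Attention-CNN can be expressed relative to the CNN baseline. The sketch below does this arithmetic on the tabulated values; the helper name `relative_overhead` is introduced here for illustration.

```python
def relative_overhead(baseline_s: float, other_s: float) -> float:
    """Extra time of `other_s` over `baseline_s`, as a percentage."""
    return (other_s - baseline_s) / baseline_s * 100.0


# (CNN seconds, Attention-CNN seconds) per code smell, from Tab. 8.
times = {
    "Intensive Coupling": (130.49, 131.68),
    "Dispersed Coupling": (150.85, 157.60),
    "Shotgun Surgery": (61.21, 69.58),
}
for smell, (cnn, attention_cnn) in times.items():
    print(f"{smell}: {relative_overhead(cnn, attention_cnn):.2f}% slower")
```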
