Journal of Computer Applications ›› 2022, Vol. 42 ›› Issue (3): 783-790. DOI: 10.11772/j.issn.1001-9081.2021040759
Special Issue: Artificial Intelligence; 2021 CCF Conference on Artificial Intelligence (CCFAI 2021)
• 2021 CCF Conference on Artificial Intelligence (CCFAI 2021) •
Yimin CAO, Lei CAI, Jingyang GAO
Received: 2021-05-12
Revised: 2021-06-03
Accepted: 2021-06-09
Online: 2021-11-09
Published: 2022-03-10
Contact: Jingyang GAO
About author: CAO Yimin, born in 1997, a native of Xinyang, Henan, M. S. candidate. His research interests include bioinformatics, deep learning, and data mining.
Yimin CAO, Lei CAI, Jingyang GAO. Gene data generation method based on generative adversarial network[J]. Journal of Computer Applications, 2022, 42(3): 783-790.
URL: https://www.joca.cn/EN/10.11772/j.issn.1001-9081.2021040759
Tab. 1 Significance of four pixel colors in gene image
Pixel color | Match mode | Indicates deletion |
---|---|---|
Red | Deletion | Yes |
Blue | Soft clip | No |
Black | Insertion | No |
Green | Normal | No |
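As a rough illustration of the encoding in Tab. 1, the sketch below maps each match mode to a pixel color. The four color names and their meanings come from the table; the exact RGB triples, the `MatchMode` enum, and the helper names are assumptions made for this example and are not taken from the authors' implementation.

```python
from enum import Enum


class MatchMode(Enum):
    """Match modes listed in Tab. 1 (enum names are illustrative)."""
    DELETION = "deletion"
    INSERTION = "insertion"
    SOFT_CLIP = "soft clip"
    NORMAL = "normal"


# Color names follow Tab. 1; the RGB values are assumed pure colors.
PIXEL_COLOR = {
    MatchMode.DELETION: (255, 0, 0),   # red
    MatchMode.SOFT_CLIP: (0, 0, 255),  # blue
    MatchMode.INSERTION: (0, 0, 0),    # black
    MatchMode.NORMAL: (0, 255, 0),     # green
}


def indicates_deletion(mode: MatchMode) -> bool:
    """Per Tab. 1, only the red (deletion) pixels mark a deletion."""
    return mode is MatchMode.DELETION


def render_row(modes: list[MatchMode]) -> list[tuple[int, int, int]]:
    """Turn one row of per-position match modes into RGB pixels."""
    return [PIXEL_COLOR[m] for m in modes]
```

For example, `render_row([MatchMode.NORMAL, MatchMode.DELETION])` yields one green pixel followed by one red pixel.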
Tab. 2 Network structure parameters
Network | Learning rate | Optimizer | Batch_size |
---|---|---|---|
GeneGAN | 0.0001 | Adam | 64 |
CNN | 1E-8 | SGD | 64 |
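The settings in Tab. 2 can be written down as a small configuration object. The sketch below is only one way to record them: the learning rates, optimizer names, and batch sizes are copied from the table, while the `TrainConfig` class and the `steps_per_epoch` helper are hypothetical names introduced for this example.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class TrainConfig:
    """Training settings for one network (field names are illustrative)."""
    learning_rate: float
    optimizer: str
    batch_size: int


# Values copied from Tab. 2.
CONFIGS = {
    "GeneGAN": TrainConfig(learning_rate=1e-4, optimizer="Adam", batch_size=64),
    "CNN": TrainConfig(learning_rate=1e-8, optimizer="SGD", batch_size=64),
}


def steps_per_epoch(num_samples: int, cfg: TrainConfig) -> int:
    """Number of mini-batches needed to see every sample once."""
    return math.ceil(num_samples / cfg.batch_size)
```

With a batch size of 64, a hypothetical dataset of 130 samples would need three steps per epoch, the last batch being partial.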
Tab. 3 Experimental results of raw data with different proportions of positive and negative samples
Positive-to-negative sample ratio | Precision/% | Recall/% | F1/% |
---|---|---|---|
1:100 | 46.70 | 61.28 | 53.01 |
1:50 | 47.31 | 65.73 | 55.02 |
1:25 | 49.17 | 69.13 | 57.46 |
Tab. 4 Experimental results of traditional amplification data with different proportions of positive and negative samples
Positive-to-negative sample ratio | Precision/% | Recall/% | F1/% |
---|---|---|---|
1:15 | 49.91 | 70.26 | 58.36 |
1:1 | 50.43 | 72.44 | 59.46 |
Tab. 5 Experimental results of original GAN extended data with different proportions of positive and negative samples
Positive-to-negative sample ratio | Precision/% | Recall/% | F1/% |
---|---|---|---|
1:15 | 50.73 | 71.62 | 59.39 |
1:1 | 53.17 | 78.31 | 63.33 |
Tab. 6 Experimental results of four kinds of GAN amplification data with different proportions of positive and negative samples
Data | Positive-to-negative sample ratio | Precision/% | Recall/% | F1/% | Time/min |
---|---|---|---|---|---|
Raw data | 1:25 | 49.17 | 69.13 | 57.46 | 152.1 |
GAN-augmented data | 1:15 | 50.73 | 71.62 | 59.39 | 154.4 |
DCGAN-augmented data | 1:15 | 51.44 | 73.69 | 60.58 | 150.8 |
WGAN-GP-augmented data | 1:15 | 51.06 | 72.14 | 59.79 | 151.7 |
GeneGAN-augmented data | 1:15 | 51.84 | 75.81 | 61.57 | 152.4 |
GAN-augmented data | 1:1 | 53.17 | 78.31 | 63.34 | 147.8 |
DCGAN-augmented data | 1:1 | 53.91 | 79.82 | 64.35 | 142.1 |
WGAN-GP-augmented data | 1:1 | 53.62 | 79.91 | 64.18 | 143.5 |
GeneGAN-augmented data | 1:1 | 55.28 | 82.78 | 66.29 | 144.5 |
Tab. 7 Experimental results comparison of different feature extraction methods
Method | Precision/% | Recall/% | F1/% |
---|---|---|---|
SVIM | 49.20 | 81.79 | 61.44 |
Sniffles | 54.39 | 77.86 | 64.05 |
PBHoney | 59.18 | 41.56 | 48.83 |
GeneGAN | 55.28 | 82.78 | 66.29 |
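The F1 column in the tables above can be cross-checked against the precision and recall columns. The sketch below assumes the standard definition of F1 as the harmonic mean of precision and recall, which the page itself does not state; the function and variable names are illustrative. The rows are copied from Tab. 7.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (same unit as the inputs)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (precision %, recall %, reported F1 %) copied from Tab. 7.
TAB7 = {
    "SVIM": (49.20, 81.79, 61.44),
    "Sniffles": (54.39, 77.86, 64.05),
    "PBHoney": (59.18, 41.56, 48.83),
    "GeneGAN": (55.28, 82.78, 66.29),
}


def max_deviation(rows: dict[str, tuple[float, float, float]]) -> float:
    """Largest gap between a recomputed F1 and the reported one."""
    return max(abs(f1_score(p, r) - f1) for p, r, f1 in rows.values())
```

Under this assumption, every recomputed value agrees with the reported F1 to within about 0.01 percentage points, which is consistent with rounding.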