Journal of Computer Applications, 2022, Vol. 42, Issue (3): 783-790. DOI: 10.11772/j.issn.1001-9081.2021040759
2021 CCF Conference on Artificial Intelligence (CCFAI 2021)
Yimin CAO, Lei CAI, Jingyang GAO
Received: 2021-05-12
Revised: 2021-06-03
Accepted: 2021-06-09
Online: 2021-11-09
Published: 2022-03-10
Contact: Jingyang GAO
About author: CAO Yimin, born in 1997 in Xinyang, Henan, M.S. candidate. His research interests include bioinformatics, deep learning, and data mining.
Supported by:
CLC Number:
Yimin CAO, Lei CAI, Jingyang GAO. Gene data generation method based on generative adversarial network[J]. Journal of Computer Applications, 2022, 42(3): 783-790.
URL: http://www.joca.cn/EN/10.11772/j.issn.1001-9081.2021040759
Pixel color | Alignment pattern | Deletion |
---|---|---|
Red | Deletion | Yes |
Black | Insertion | No |
Blue | Soft clip | No |
Green | Normal | No |
Tab. 1 Significance of four pixel colors in gene image
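The color coding in Tab. 1 amounts to a small lookup table. The sketch below is illustrative only: the names `PATTERN_TO_PIXEL` and `is_deletion` are hypothetical, colors are kept as plain strings because this page gives no RGB values, and how the gene image itself is rendered is not described here.

```python
# Hypothetical encoding of Tab. 1:
# alignment pattern -> (pixel color, whether the pixel marks a deletion)
PATTERN_TO_PIXEL = {
    "deletion": ("red", True),
    "insertion": ("black", False),
    "soft_clip": ("blue", False),
    "normal": ("green", False),
}


def is_deletion(color: str) -> bool:
    """Return True if a pixel of this color marks a deletion according to Tab. 1."""
    for pixel_color, deleted in PATTERN_TO_PIXEL.values():
        if pixel_color == color:
            return deleted
    raise ValueError(f"color not listed in Tab. 1: {color!r}")


if __name__ == "__main__":
    for pattern, (color, deleted) in PATTERN_TO_PIXEL.items():
        print(f"{pattern:10s} {color:6s} deletion={deleted}")
```

Only red pixels count as deletions under this mapping; the other three colors mark reads that are inserted, soft-clipped, or normally aligned.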
Network | Learning rate | Optimizer | Batch size |
---|---|---|---|
GeneGAN | 0.0001 | Adam | 64 |
CNN | 1E-8 | SGD | 64 |
Tab. 2 Network structure parameters
Positive-to-negative sample ratio | Precision/% | Recall/% | F1/% |
---|---|---|---|
1:100 | 46.70 | 61.28 | 53.01 |
1:50 | 47.31 | 65.73 | 55.02 |
1:25 | 49.17 | 69.13 | 57.46 |
Tab. 3 Experimental results of raw data with different proportions of positive and negative samples
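The F1 column is the harmonic mean of precision and recall. As a quick consistency check, the sketch below recomputes it from the rounded precision and recall values in Tab. 3; the names `f1_score` and `TAB3` are illustrative, and the 0.02 tolerance is an assumption that allows for the published figures being rounded to two decimals.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (same units as the inputs)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Rows of Tab. 3: ratio -> (precision %, recall %, reported F1 %)
TAB3 = {
    "1:100": (46.70, 61.28, 53.01),
    "1:50": (47.31, 65.73, 55.02),
    "1:25": (49.17, 69.13, 57.46),
}

if __name__ == "__main__":
    for ratio, (p, r, reported) in TAB3.items():
        recomputed = f1_score(p, r)
        # Small differences are expected because P and R are already rounded.
        print(f"{ratio:6s} recomputed={recomputed:.2f} reported={reported:.2f}")
```

The recomputed values agree with the reported ones to within rounding, and F1 rises as the class imbalance shrinks from 1:100 to 1:25.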
Positive-to-negative sample ratio | Precision/% | Recall/% | F1/% |
---|---|---|---|
1:15 | 49.91 | 70.26 | 58.36 |
1:1 | 50.43 | 72.44 | 59.46 |
Tab. 4 Experimental results of traditional amplification data with different proportions of positive and negative samples
Positive-to-negative sample ratio | Precision/% | Recall/% | F1/% |
---|---|---|---|
1:15 | 50.73 | 71.62 | 59.39 |
1:1 | 53.17 | 78.31 | 63.33 |
Tab. 5 Experimental results of original GAN extended data with different proportions of positive and negative samples
Data | Positive-to-negative sample ratio | Precision/% | Recall/% | F1/% | Time/min |
---|---|---|---|---|---|
Raw data | 1:25 | 49.17 | 69.13 | 57.46 | 152.1 |
GAN-augmented data | 1:15 | 50.73 | 71.62 | 59.39 | 154.4 |
DCGAN-augmented data | 1:15 | 51.44 | 73.69 | 60.58 | 150.8 |
WGAN-GP-augmented data | 1:15 | 51.06 | 72.14 | 59.79 | 151.7 |
GeneGAN-augmented data | 1:15 | 51.84 | 75.81 | 61.57 | 152.4 |
GAN-augmented data | 1:1 | 53.17 | 78.31 | 63.34 | 147.8 |
DCGAN-augmented data | 1:1 | 53.91 | 79.82 | 64.35 | 142.1 |
WGAN-GP-augmented data | 1:1 | 53.62 | 79.91 | 64.18 | 143.5 |
GeneGAN-augmented data | 1:1 | 55.28 | 82.78 | 66.29 | 144.5 |
Tab. 6 Experimental results of four kinds of GAN amplification data with different proportions of positive and negative samples
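One way to read Tab. 6 is as an F1 gain over the raw-data baseline. The sketch below copies the F1 values for the raw data (1:25) and the four GAN variants at 1:1 and computes the differences in percentage points; the names `F1_BY_DATA` and `f1_gain` are hypothetical, and taking the raw 1:25 row as the baseline is an assumption made for illustration.

```python
# (data source, positive-to-negative ratio) -> F1 % as listed in Tab. 6
F1_BY_DATA = {
    ("raw", "1:25"): 57.46,
    ("GAN", "1:1"): 63.34,
    ("DCGAN", "1:1"): 64.35,
    ("WGAN-GP", "1:1"): 64.18,
    ("GeneGAN", "1:1"): 66.29,
}


def f1_gain(method: str, ratio: str = "1:1", baseline=("raw", "1:25")) -> float:
    """F1 difference, in percentage points, between a method and the baseline row."""
    return round(F1_BY_DATA[(method, ratio)] - F1_BY_DATA[baseline], 2)


if __name__ == "__main__":
    for method in ("GAN", "DCGAN", "WGAN-GP", "GeneGAN"):
        print(f"{method:8s} gain over raw data: {f1_gain(method):.2f} points")
```

Under these assumptions GeneGAN shows the largest gain among the four variants at the 1:1 ratio.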
Method | Precision/% | Recall/% | F1/% |
---|---|---|---|
SVIM | 49.20 | 81.79 | 61.44 |
Sniffles | 54.39 | 77.86 | 64.05 |
PBHoney | 59.18 | 41.56 | 48.83 |
GeneGAN | 55.28 | 82.78 | 66.29 |
Tab. 7 Experimental results comparison of different feature extraction methods