Journal of Computer Applications, 2025, Vol. 45, Issue (11): 3658-3665. DOI: 10.11772/j.issn.1001-9081.2024111664
• Advanced computing •
Jiahao ZHANG, Qi WANG, Mingming LIU, Xiaofeng WANG, Biao HUANG, Pan LIU, Zhi YE
Received: 2024-11-27
Revised: 2025-01-07
Accepted: 2025-01-24
Online: 2025-02-14
Published: 2025-11-10
Contact: Mingming LIU
About author: ZHANG Jiahao, born in 1998 in Shangqiu, Henan, M. S. candidate. His research interests include machine learning and biometric identification.
Abstract:
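Identifying Drug-Target Interactions (DTIs) is an indispensable step in drug repurposing and novel drug discovery, and many sequence-based computational methods have been widely applied to DTI prediction. However, in previous sequence-based studies, feature extraction usually focused only on the sequences themselves and ignored heterogeneous information networks such as drug-drug interaction networks and drug-target interaction networks. Therefore, a new DTI prediction method based on sequences and multi-view networks, SMN-DTI (prediction of Drug-Target Interactions based on Sequence and Multi-view Networks), was proposed. In this method, a Variational AutoEncoder (VAE) was used to learn embedding matrices for drug SMILES (Simplified Molecular-Input Line-Entry System) strings and target amino acid sequences. Then, a Heterogeneous graph Attention Network (HAN) with a two-level attention mechanism was used to aggregate information from the different neighbors of each drug or target across two views of the network, the node level and the semantic level, to obtain the final embeddings. SMN-DTI and the baseline methods were evaluated on two benchmark datasets widely used for DTI prediction, Hetero-Seq-A and Hetero-Seq-B. The results show that SMN-DTI achieves the best Area Under the receiver operating characteristic Curve (AUC) and Area Under the Precision-Recall curve (AUPR) under all three positive-to-negative sample ratios, which indicates that SMN-DTI outperforms the current mainstream prediction methods.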
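The semantic-level step described in the abstract can be illustrated with a small sketch. The code below is not the authors' implementation. It assumes the general form of semantic-level attention used in heterogeneous graph attention networks, where each meta-path view receives one score averaged over all nodes and the views are then combined with softmax weights. The function name `semantic_attention` and the parameters `W`, `b`, and `q` are hypothetical, and plain Python lists stand in for tensors.

```python
import math


def softmax(values):
    """Numerically stable softmax over a list of floats."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def semantic_attention(views, W, b, q):
    """Fuse per-meta-path node embeddings into one embedding per node.

    views: list of P matrices, each N x d (one matrix per meta-path view)
    W:     h x d projection matrix (hypothetical learned parameter)
    b:     length-h bias vector (hypothetical learned parameter)
    q:     length-h semantic attention vector (hypothetical learned parameter)
    Returns (beta, fused): P view weights and the N x d fused embeddings.
    """
    scores = []
    for Z in views:
        total = 0.0
        for z in Z:
            # hidden = tanh(W z + b), then project onto q
            hidden = [
                math.tanh(sum(W[k][j] * z[j] for j in range(len(z))) + b[k])
                for k in range(len(b))
            ]
            total += sum(q[k] * hidden[k] for k in range(len(q)))
        # One importance score per view, averaged over all nodes
        scores.append(total / len(Z))

    beta = softmax(scores)
    n_nodes, dim = len(views[0]), len(views[0][0])
    fused = [
        [sum(beta[p] * views[p][i][j] for p in range(len(views))) for j in range(dim)]
        for i in range(n_nodes)
    ]
    return beta, fused
```

When all views are identical, the fused embedding equals that view whatever the weights are, which makes a convenient sanity check for an implementation of this step.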
Jiahao ZHANG, Qi WANG, Mingming LIU, Xiaofeng WANG, Biao HUANG, Pan LIU, Zhi YE. Prediction of drug-target interactions based on sequence and multi-view networks[J]. Journal of Computer Applications, 2025, 45(11): 3658-3665.
| Dataset | Drugs | Targets | Drug-target interactions | Drug-drug interactions | Drug-drug structural similarity matrix | Target-target interactions | Target-target sequence similarity matrix |
|---|---|---|---|---|---|---|---|
| Hetero-Seq-A | 708 | 1 512 | 1 923 | 10 036 | 708×708 | 7 363 | 1 512×1 512 |
| Hetero-Seq-B | 1 094 | 1 556 | 11 819 | 108 206 | 1 094×1 094 | 138 486 | 1 556×1 556 |
Tab. 1 Statistics of experimental datasets
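The experiments compare methods at positive-to-negative sample ratios of 1:1, 1:5, and 1:10. This page does not say how the negative pairs are drawn. A common practice, assumed here purely for illustration, is to sample uniformly from the drug-target pairs that have no recorded interaction. The helper `sample_negatives` and its parameters are hypothetical.

```python
import random


def sample_negatives(positives, n_drugs, n_targets, ratio, seed=0):
    """Draw `ratio` unobserved (drug, target) pairs per known interaction.

    positives: iterable of (drug_index, target_index) known interactions
    ratio:     negatives per positive, e.g. 1, 5, or 10
    Assumption: every unobserved pair is treated as a candidate negative.
    """
    rng = random.Random(seed)
    known = set(positives)
    wanted = ratio * len(known)
    available = n_drugs * n_targets - len(known)
    if wanted > available:
        raise ValueError("not enough unobserved pairs for this ratio")

    negatives = set()
    # Rejection sampling: cheap when known interactions are sparse
    while len(negatives) < wanted:
        pair = (rng.randrange(n_drugs), rng.randrange(n_targets))
        if pair not in known:
            negatives.add(pair)
    return sorted(negatives)
```

With the Hetero-Seq-A counts in the table (708 drugs, 1 512 targets, 1 923 interactions), a 1:10 ratio would require 19 230 negatives out of roughly 1.07 million unobserved pairs, so rejection sampling rarely has to retry.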
| Method | AUC (1:1) | AUPR (1:1) | AUC (1:5) | AUPR (1:5) | AUC (1:10) | AUPR (1:10) |
|---|---|---|---|---|---|---|
| DeepDTA | 0.903 9 | 0.932 6 | 0.816 7 | 0.925 0 | 0.754 5 | |
| DeepConv-DTI | 0.906 3 | 0.905 0 | 0.924 2 | 0.791 3 | 0.924 0 | 0.712 8 |
| GraphDTA | 0.906 0 | 0.882 9 | 0.920 4 | 0.801 8 | 0.914 6 | 0.753 1 |
| Co-VAE | 0.907 5 | 0.907 8 | 0.931 2 | 0.822 5 | 0.932 8 | 0.773 9 |
| HyperAttentionDTI | 0.899 0 | 0.899 0 | 0.923 8 | 0.786 9 | 0.929 0 | 0.728 5 |
| MFR-DTA | 0.904 7 | 0.901 2 | 0.923 0 | 0.819 3 | 0.915 7 | 0.745 8 |
| IMAEN | 0.899 9 | 0.898 4 | 0.926 0 | 0.805 6 | 0.927 5 | 0.752 3 |
| SMN-DTI(CNN/CNN) | 0.889 9 | 0.900 2 | 0.924 1 | 0.789 8 | 0.924 9 | 0.718 5 |
| SMN-DTI(CNN/VAE) | 0.905 9 | 0.926 7 | 0.826 5 | 0.933 1 | 0.785 1 | |
| SMN-DTI(VAE/CNN) | 0.903 7 | 0.916 7 | | | | |
| SMN-DTI | 0.916 5 | 0.934 6 | 0.936 0 | 0.861 1 | 0.940 6 | 0.813 1 |
Tab. 2 Comparison of AUC and AUPR of various methods on Hetero-Seq-A dataset under different positive-and-negative sample ratios
| Method | AUC (1:1) | AUPR (1:1) | AUC (1:5) | AUPR (1:5) | AUC (1:10) | AUPR (1:10) |
|---|---|---|---|---|---|---|
| DeepDTA | 0.949 2 | 0.940 4 | 0.956 7 | 0.861 4 | 0.945 2 | 0.802 1 |
| DeepConv-DTI | 0.938 8 | 0.931 6 | 0.951 4 | 0.829 3 | 0.949 6 | 0.745 2 |
| GraphDTA | 0.930 1 | 0.914 0 | 0.940 5 | 0.832 6 | 0.933 4 | 0.764 6 |
| Co-VAE | 0.950 6 | 0.941 8 | 0.955 6 | 0.858 8 | 0.951 3 | 0.799 8 |
| HyperAttentionDTI | 0.952 9 | 0.949 6 | 0.956 2 | 0.887 7 | 0.949 9 | 0.802 5 |
| MFR-DTA | 0.949 8 | 0.937 2 | 0.954 3 | 0.864 9 | 0.947 2 | 0.801 7 |
| IMAEN | 0.951 1 | 0.948 0 | 0.958 0 | 0.874 5 | 0.948 7 | 0.796 9 |
| SMN-DTI(CNN/CNN) | 0.947 4 | 0.943 9 | 0.960 0 | 0.875 4 | 0.954 3 | 0.805 3 |
| SMN-DTI(CNN/VAE) | 0.953 5 | 0.962 8 | 0.887 2 | 0.826 2 | | |
| SMN-DTI(VAE/CNN) | 0.954 2 | 0.959 2 | | | | |
| SMN-DTI | 0.962 4 | 0.963 6 | 0.966 9 | 0.901 9 | 0.963 1 | 0.844 3 |
Tab. 3 Comparison of AUC and AUPR of various methods on Hetero-Seq-B dataset under different positive-and-negative sample ratios
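In both comparison tables, AUC changes little as the share of negatives grows, while AUPR drops noticeably. The sketch below shows one way to compute the two metrics from binary labels and predicted scores and reproduces that pattern on toy data. It is an illustration under assumptions, not the evaluation code used in the paper: AUC is computed as the pairwise ranking probability, and AUPR as step-wise average precision (the paper may use a different interpolation). The function names are hypothetical.

```python
def roc_auc(labels, scores):
    """AUC as the chance that a random positive outranks a random negative.

    Ties between a positive and a negative count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def average_precision(labels, scores):
    """Step-wise area under the precision-recall curve.

    Samples are ranked by descending score; tied scores keep input order.
    """
    order = sorted(range(len(scores)), key=lambda i: -scores[i])
    n_pos = sum(labels)
    true_pos = 0
    total = 0.0
    for rank, i in enumerate(order, start=1):
        if labels[i] == 1:
            true_pos += 1
            total += true_pos / rank  # precision at each recalled positive
    return total / n_pos


# Toy data: 2 positives with 4 negatives, then the same negatives doubled
labels_a = [1, 1, 0, 0, 0, 0]
scores_a = [0.9, 0.4, 0.8, 0.3, 0.2, 0.1]
labels_b = [1, 1] + [0] * 8
scores_b = [0.9, 0.4] + [0.8, 0.3, 0.2, 0.1] * 2
```

On this toy data `roc_auc` returns 0.875 for both sets, while `average_precision` falls from about 0.833 to 0.75 once the negatives are doubled. Adding negatives with the same score distribution leaves the ranking probability unchanged but lowers precision at every recall level, which is consistent with the trend across the 1:1, 1:5, and 1:10 columns.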
| Drug meta-paths | Target meta-paths | AUC | AUPR |
|---|---|---|---|
| | | 0.953 0 | 0.953 2 |
| | | 0.927 7 | 0.919 3 |
| | | 0.958 8 | 0.960 7 |
| | | 0.962 4 | 0.963 6 |
Tab. 4 Experimental results of different meta-paths