Journal of Computer Applications, 2025, Vol. 45, Issue 11: 3658-3665. DOI: 10.11772/j.issn.1001-9081.2024111664

• Advanced computing

Prediction of drug-target interactions based on sequence and multi-view networks

Jiahao ZHANG, Qi WANG, Mingming LIU, Xiaofeng WANG, Biao HUANG, Pan LIU, Zhi YE

  1. College of Software, Nankai University, Tianjin 300350, China
  • Received: 2024-11-27 Revised: 2025-01-07 Accepted: 2025-01-24 Online: 2025-02-14 Published: 2025-11-10
  • Contact: Mingming LIU
  • About author: ZHANG Jiahao, born in 1998, M. S. candidate. His research interests include machine learning and biometric identification.
    WANG Qi, born in 1999, M. S. candidate. His research interests include biomedicine and link prediction.
    WANG Xiaofeng, born in 1999, M. S. candidate. His research interests include cancer type prediction and deep learning.
    HUANG Biao, born in 2001, M. S. candidate. His research interests include machine learning and bioinformatics.
    LIU Pan, born in 2001, M. S. candidate. Her research interests include machine learning and biometric identification.
    YE Zhi, born in 2000, M. S. candidate. His research interests include machine learning and biometric identification.
  • Supported by:
    Natural Science Foundation of Tianjin (22JCYBJC01020)

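Prediction of drug-target interactions based on sequence and multi-view networks (original Chinese title: 基于序列和多视角网络的药物-靶标相互作用预测)

ZHANG Jiahao (张家豪), WANG Qi (王琪), LIU Mingming (刘明铭), WANG Xiaofeng (王晓峰), HUANG Biao (黄彪), LIU Pan (刘盼), YE Zhi (叶至)

  1. College of Software, Nankai University, Tianjin 300350, China
  • Corresponding author: LIU Mingming (刘明铭)
  • About the authors: ZHANG Jiahao (born 1998), male, from Shangqiu, Henan, M. S. candidate. Main research interests: machine learning, biometric identification.
    WANG Qi (born 1999), male, from Pingliang, Gansu, M. S. candidate. Main research interests: biomedicine, link prediction.
    WANG Xiaofeng (born 1999), male, from Linfen, Shanxi, M. S. candidate. Main research interests: cancer type prediction, deep learning.
    HUANG Biao (born 2001), male, from Tianmen, Hubei, M. S. candidate. Main research interests: machine learning, bioinformatics.
    LIU Pan (born 2001), female, from Shaoyang, Hunan, M. S. candidate. Main research interests: machine learning, biometric identification.
    YE Zhi (born 2000), male, from Hezhou, Guangxi, M. S. candidate. Main research interests: machine learning, biometric identification.
  • Supported by:
    Natural Science Foundation of Tianjin (22JCYBJC01020)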

Abstract:

Identifying Drug-Target Interactions (DTI) is a crucial step in drug repurposing and novel drug discovery, and many sequence-based computational methods are now widely used for DTI prediction. However, previous sequence-based studies typically extract features from the sequence alone and neglect heterogeneous information networks such as drug-drug interaction networks and drug-target interaction networks. Therefore, a novel DTI prediction method based on sequence and multi-view networks was proposed, namely SMN-DTI (prediction of Drug-Target Interactions based on Sequence and Multi-view Networks). In this method, a Variational AutoEncoder (VAE) was used to learn embedding matrices for drug SMILES (Simplified Molecular-Input Line-Entry System) strings and target amino acid sequences. Subsequently, a Heterogeneous graph Attention Network (HAN) with a two-level attention mechanism was used to aggregate information from the different neighbors of drugs or targets in the networks, at both the node level and the semantic level, to obtain the final embeddings. SMN-DTI and the baseline methods were evaluated on two benchmark datasets widely used for DTI prediction, Hetero-seq-A and Hetero-seq-B. The results show that SMN-DTI achieves the best Area Under the receiver operating Characteristic curve (AUC) and Area Under the Precision-Recall curve (AUPR) under three different positive-to-negative sample ratios, indicating that SMN-DTI outperforms current mainstream prediction methods.
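The two-level attention mentioned in the abstract can be illustrated with a minimal pure-Python sketch. This is not the authors' implementation: it follows the commonly described HAN formulation (node-level attention within each view, then semantic-level attention across views), and every concrete value is an assumption made for illustration. The random vectors in `h` stand in for VAE-derived drug embeddings, the two adjacency dictionaries in `views_adj` are hypothetical views of a drug network, and the parameters `a_per_view`, `W`, `b`, and `q` are random toy values that would normally be learned. The output nonlinearity and multi-head attention are omitted for brevity.

```python
import math
import random


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def dot(u, v):
    return sum(x * y for x, y in zip(u, v))


def leaky_relu(x, slope=0.2):
    return x if x > 0 else slope * x


def node_level_attention(h, neighbors, a):
    """Aggregate each node's neighbors within ONE view.

    h: list of N feature vectors of length d.
    neighbors: dict mapping node index -> neighbor indices (self included).
    a: attention vector of length 2*d (toy parameter, normally learned).
    """
    out = []
    for i, h_i in enumerate(h):
        nbrs = neighbors[i]
        # Score each neighbor from the concatenation [h_i || h_j].
        scores = [leaky_relu(dot(a, h_i + h[j])) for j in nbrs]
        alpha = softmax(scores)  # weights over this node's neighbors
        out.append([sum(w * h[j][k] for w, j in zip(alpha, nbrs))
                    for k in range(len(h_i))])
    return out


def semantic_level_attention(views, W, b, q):
    """Fuse per-view embeddings using one learned weight per view.

    views: list of P embedding tables, each a list of N vectors of length d.
    W (d_att x d), b (d_att), q (d_att): toy parameters, normally learned.
    Returns (beta, fused) where beta holds the P view weights.
    """
    importance = []
    for z in views:
        total = 0.0
        for z_i in z:
            hidden = [math.tanh(dot(row, z_i) + b_k) for row, b_k in zip(W, b)]
            total += dot(q, hidden)
        importance.append(total / len(z))  # average over nodes
    beta = softmax(importance)
    n, d = len(views[0]), len(views[0][0])
    fused = [[sum(beta[p] * views[p][i][k] for p in range(len(views)))
              for k in range(d)] for i in range(n)]
    return beta, fused


random.seed(0)
d, d_att = 4, 3


def rand_vec(n):
    return [random.gauss(0.0, 1.0) for _ in range(n)]


# Three drugs with toy embeddings (stand-ins for VAE outputs).
h = [rand_vec(d) for _ in range(3)]
# Two hypothetical views of the drug network, self-loops included.
views_adj = [
    {0: [0, 1], 1: [1, 0, 2], 2: [2, 1]},
    {0: [0, 2], 1: [1], 2: [2, 0]},
]
a_per_view = [rand_vec(2 * d) for _ in views_adj]
W = [rand_vec(d) for _ in range(d_att)]
b = [0.0] * d_att
q = rand_vec(d_att)

per_view = [node_level_attention(h, adj, a)
            for adj, a in zip(views_adj, a_per_view)]
beta, fused = semantic_level_attention(per_view, W, b, q)
print("view weights:", beta)
print("fused embedding of drug 0:", fused[0])
```

In this sketch the node-level step decides how much each neighbor contributes within a single view, while the semantic-level step decides how much each view contributes to the final embedding. A node with no neighbors other than itself (drug 1 in the second view) simply keeps its input embedding in that view.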

Key words: Drug-Target Interaction (DTI) prediction, Variational AutoEncoder (VAE), Heterogeneous graph Attention Network (HAN), multi-view network, attention mechanism

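Abstract (translated from the Chinese version):

Identifying Drug-Target Interactions (DTI) is an indispensable step in drug repurposing and innovative drug discovery, and many sequence-based computational methods are already widely applied to DTI prediction. However, in previous sequence-based studies, feature extraction usually considers only the sequence itself and overlooks heterogeneous information networks such as drug-drug interaction networks and drug-target interaction networks. Therefore, a new method for DTI prediction based on sequence and multi-view networks is proposed, namely SMN-DTI (prediction of Drug-Target Interactions based on Sequence and Multi-view Networks). The method uses a Variational AutoEncoder (VAE) to learn embedding matrices for drug SMILES (Simplified Molecular-Input Line-Entry System) strings and target amino acid sequences. A Heterogeneous graph Attention Network (HAN) with a two-level attention mechanism then aggregates information from the different neighbors of drugs or targets in the networks from two perspectives, node and semantic, to obtain the final embeddings. Evaluation of SMN-DTI and the baseline methods on two benchmark datasets widely used for DTI prediction, Hetero-seq-A and Hetero-seq-B, shows that SMN-DTI achieves the best Area Under the receiver operating Characteristic curve (AUC) and Area Under the Precision-Recall curve (AUPR) under all three positive-to-negative sample ratios. SMN-DTI therefore performs better than current mainstream prediction methods.

Key words (translated from the Chinese version): drug-target interaction prediction, variational autoencoder, heterogeneous graph attention network, multi-view network, attention mechanism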
