DeepsORF: coding sORFs prediction method based on graph coding with improved flow attention

Journal of Computer Applications, 2025, Vol. 45, Issue (2): 546-555. DOI: 10.11772/j.issn.1001-9081.2024020177

• Advanced computing •

Dongmei XIE¹, Xinye BIAN¹, Lianfei YU¹, Wenbo LIU¹, Ziling WANG¹, Zhijian QU¹, Jiafeng YU²

¹ School of Computer Science and Technology, Shandong University of Technology, Zibo, Shandong 255049, China
² Institute of Biophysics, Dezhou University (Shandong Key Laboratory of Biophysics), Dezhou, Shandong 253023, China

Received: 2024-02-26 Revised: 2024-04-14 Accepted: 2024-04-16 Online: 2024-06-04 Published: 2025-02-10
Contact: Zhijian QU
About author: XIE Dongmei, born in 1998 in Zibo, Shandong, M.S. candidate, CCF member. Her research interests include deep learning and bioinformatics.
BIAN Xinye, born in 1998 in Zibo, Shandong, M.S. candidate. Her research interests include deep learning and proteomics.
YU Lianfei, born in 1998 in Zhoukou, Henan, M.S. candidate. His research interests include deep learning and big data analysis.
LIU Wenbo, born in 1998 in Heze, Shandong, M.S. candidate, CCF member. His research interests include deep learning and big data analysis.
WANG Ziling, born in 1996 in Binzhou, Shandong, M.S. candidate. Her research interests include deep learning and bioinformatics.
YU Jiafeng, born in 1979 in Zibo, Shandong, Ph.D., professor. His research interests include biological sequence analysis and bioinformatics.
Supported by:
Youth Innovation Team Development Program of Shandong Province Higher Education Institutions (2019KJN048)

Abstract:

Small Open Reading Frames (sORFs) play a critical role in various biological processes, and accurately distinguishing coding from non-coding sORFs is an important and challenging task in genomics. Most existing algorithms for predicting coding sORFs rely heavily on handcrafted features derived from prior biological knowledge and lack generality, and the variable lengths of raw sORF sequences prevent them from being fed directly into prediction models. To address these problems, DeepsORF, an end-to-end deep learning framework based on the sORF-Graph graph encoding method, was proposed for predicting coding sORFs. Firstly, all sORF sequences were encoded into corresponding graphs through sORF-Graph, with sequence information encoded as graph element features, thereby standardizing the input sequences. Then, a convolution- and residual-based flow attention mechanism was introduced to capture long-range interactions between bases within sORFs, so as to represent sORF features more effectively and improve the model's prediction accuracy. Experimental results show that DeepsORF improves performance on all six independent test sets. Compared with the csORF-finder method, DeepsORF achieves increases of 9.97, 19.49, and 13.07 percentage points in accuracy, Matthews Correlation Coefficient (MCC), and precision, respectively, on the D.melanogaster nonCDS-sORFs test set, which verifies the effectiveness and good generalization ability of the DeepsORF model in identifying coding and non-coding sORFs.

Key words: small Open Reading Frames (sORFs), coding sORFs, end-to-end, graph encoding, flow attention
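The abstract does not spell out how sORF-Graph builds its graphs. Purely as an illustration of how variable-length nucleotide sequences can be mapped to a fixed-size graph, the sketch below uses a k-mer transition graph: the nodes are all 4^k possible k-mers and each edge counts how often one k-mer is immediately followed by another. The construction itself, the choice k = 3, and the function name `kmer_graph` are assumptions made for this example and should not be read as the authors' encoding.

```python
from itertools import product

BASES = "ACGT"


def kmer_graph(seq, k=3):
    """Map a nucleotide sequence to a fixed-size k-mer transition graph.

    Hypothetical illustration only; not the sORF-Graph construction.
    Returns (node_features, adjacency): relative k-mer frequencies and a
    4**k x 4**k matrix counting how often one k-mer directly follows another.
    """
    kmers = ["".join(p) for p in product(BASES, repeat=k)]
    index = {kmer: i for i, kmer in enumerate(kmers)}
    n = len(kmers)
    node_counts = [0] * n
    adjacency = [[0] * n for _ in range(n)]

    seq = seq.upper()
    prev = None
    for start in range(len(seq) - k + 1):
        cur = index.get(seq[start:start + k])
        if cur is None:
            # k-mer contains a symbol outside ACGT: skip it and break the chain
            prev = None
            continue
        node_counts[cur] += 1
        if prev is not None:
            adjacency[prev][cur] += 1
        prev = cur

    total = sum(node_counts)
    node_features = [c / total for c in node_counts] if total else [0.0] * n
    return node_features, adjacency


# Two made-up sequences of different lengths yield graphs of identical size.
short_nodes, short_adj = kmer_graph("ATGGCCTAA")
long_nodes, long_adj = kmer_graph("ATGGCCAAAGGGTTTTGA")
print(len(short_nodes), len(long_nodes))
```

With k = 3 both example sequences are represented by 64 node features and a 64 x 64 adjacency matrix, which is the kind of length standardization the abstract attributes to graph encoding.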

CLC number: TP183

Dongmei XIE, Xinye BIAN, Lianfei YU, Wenbo LIU, Ziling WANG, Zhijian QU, Jiafeng YU. DeepsORF: coding sORFs prediction method based on graph coding with improved flow attention[J]. Journal of Computer Applications, 2025, 45(2): 546-555.



Dataset	Subset	Number of coding sORFs	Number of non-coding sORFs
H.sapiens CDS-sORFs	Training set	7 232	7 232
H.sapiens CDS-sORFs	Test set	1 808	1 808
H.sapiens nonCDS-sORFs	Training set	7 750	7 750
H.sapiens nonCDS-sORFs	Test set	1 937	1 937
M.musculus CDS-sORFs	Training set	2 615	2 615
M.musculus CDS-sORFs	Test set	654	654
M.musculus nonCDS-sORFs	Training set	3 066	3 066
M.musculus nonCDS-sORFs	Test set	767	767
D.melanogaster CDS-sORFs	Training set	682	682
D.melanogaster CDS-sORFs	Test set	171	171
D.melanogaster nonCDS-sORFs	Training set	6 978	6 978
D.melanogaster nonCDS-sORFs	Test set	1 745	1 745

Dataset	Metric	CPPred	MiPepid	RNAsamba	DeepCPP	PsORFs	csORF-finder	codingCapacity	ABLNCPP	DeepsORF
H.sapiens CDS-sORFs	SN	0.612 3	0.954 6	0.682 5	0.588 5	0.782 6	0.846 2	0.787 6	0.815 8	0.857 9
	SP	0.675 3	0.068 0	0.709 1	0.679 2	0.827 4	0.820 2	0.829 6	0.822 5	0.870 0
	ACC	0.643 8	0.511 3	0.695 8	0.633 8	0.805 0	0.833 2	0.806 6	0.819 1	0.863 9
	MCC	0.288 2	0.049 0	0.391 7	0.268 8	0.610 7	0.666 7	0.617 8	0.638 3	0.727 9
	Precision	0.653 5	0.506 0	0.701 1	0.647 2	0.819 3	0.824 8	0.822 2	0.821 3	0.868 4
H.sapiens nonCDS-sORFs	SN	0.594 7	0.642 2	0.680 9	0.490 4	0.759 4	0.842 5	0.767 7	0.788 1	0.862 7
	SP	0.733 6	0.698 5	0.805 9	0.820 9	0.817 2	0.811 0	0.850 3	0.866 3	0.916 4
	ACC	0.664 2	0.670 4	0.743 4	0.655 7	0.788 3	0.826 8	0.809 0	0.827 2	0.889 5
	MCC	0.331 6	0.341 3	0.490 7	0.329 8	0.577 6	0.653 9	0.620 1	0.656 4	0.780 2
	Precision	0.690 6	0.680 5	0.778 2	0.732 5	0.806 0	0.816 8	0.836 8	0.854 8	0.911 6
M.musculus CDS-sORFs	SN	0.712 5	0.919 0	0.639 1	0.587 2	0.874 6	0.922 0	0.859 3	0.761 5	0.889 9
	SP	0.675 8	0.097 9	0.776 8	0.691 1	0.839 4	0.816 5	0.830 3	0.692 7	0.879 2
	ACC	0.694 2	0.508 4	0.708 0	0.639 1	0.857 0	0.869 3	0.844 8	0.727 0	0.884 6
	MCC	0.388 6	0.029 5	0.419 9	0.279 8	0.714 5	0.742 7	0.689 9	0.455 2	0.769 2
	Precision	0.687 3	0.504 6	0.741 1	0.655 3	0.844 9	0.834 0	0.835 1	0.836 1	0.880 5
M.musculus nonCDS-sORFs	SN	0.719 7	0.721 0	0.760 1	0.653 2	0.773 1	0.860 5	0.809 6	0.818 3	0.890 5
	SP	0.702 7	0.585 4	0.796 6	0.794 0	0.857 9	0.854 0	0.873 5	0.901 7	0.945 2
	ACC	0.711 2	0.653 2	0.778 4	0.723 6	0.815 5	0.857 2	0.841 6	0.859 9	0.917 9
	MCC	0.422 5	0.309 2	0.557 1	0.451 7	0.633 3	0.714 5	0.684 6	0.722 5	0.837 0
	Precision	0.707 7	0.634 9	0.788 9	0.760 2	0.844 7	0.854 9	0.864 9	0.893 0	0.942 1
D.melanogaster CDS-sORFs	SN	0.643 3	0.970 8	0.707 6	0.584 8	0.842 1	0.812 9	0.842 1	0.733 7	0.853 8
	SP	0.725 1	0.017 5	0.701 8	0.672 5	0.888 9	0.742 7	0.871 3	0.736 5	0.883 0
	ACC	0.684 2	0.494 2	0.704 7	0.628 7	0.865 5	0.777 8	0.856 7	0.735 1	0.868 4
	MCC	0.369 7	-0.038 7	0.409 4	0.258 3	0.731 8	0.556 9	0.713 8	0.470 2	0.737 2
	Precision	0.700 6	0.497 0	0.703 5	0.641 0	0.883 4	0.759 6	0.867 5	0.738 1	0.879 5
D.melanogaster nonCDS-sORFs	SN	0.111 2	0.506	0.598 3	0.180 5	0.695 1	0.803 4	0.710 6	0.781 7	0.830 4
	SP	0.876 2	0.622 3	0.522 1	0.780 5	0.703 2	0.665 3	0.746 1	0.726 9	0.837 8
	ACC	0.493 7	0.564 2	0.560 2	0.480 5	0.699 1	0.734 4	0.728 4	0.754 3	0.834 1
	MCC	-0.019 6	0.129 2	0.120 7	-0.048 7	0.398 3	0.473 3	0.457 0	0.509 3	0.668 2
	Precision	0.473 2	0.572 6	0.555 9	0.451 3	0.700 8	0.705 9	0.736 8	0.741 3	0.836 6
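The five metrics in the table follow the standard confusion-matrix definitions. As a worked check, the sketch below recomputes them for DeepsORF on the H.sapiens CDS-sORFs test set (1 808 coding and 1 808 non-coding sORFs) and then reproduces the percentage-point gains over csORF-finder quoted in the abstract. The confusion counts are not reported on this page; they are an assumption, reconstructed by rounding SN x 1 808 and SP x 1 808, and the function name `binary_metrics` is likewise illustrative.

```python
import math


def binary_metrics(tp, fn, tn, fp):
    """Standard binary-classification metrics from confusion-matrix counts."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return {
        "SN": tp / (tp + fn),                   # sensitivity (recall)
        "SP": tn / (tn + fp),                   # specificity
        "ACC": (tp + tn) / (tp + fn + tn + fp),  # accuracy
        "MCC": (tp * tn - fp * fn) / denom if denom else 0.0,
        "Precision": tp / (tp + fp),
    }


# Assumed counts, reconstructed from the reported SN = 0.8579 and SP = 0.8700
# on a balanced test set of 1808 + 1808 sequences (not taken from the paper).
metrics = binary_metrics(tp=1551, fn=257, tn=1573, fp=235)
rounded = {name: round(value, 4) for name, value in metrics.items()}
print(rounded)

# Percentage-point gains on the D.melanogaster nonCDS-sORFs test set,
# using the csORF-finder and DeepsORF columns of the table.
csorf_finder = {"ACC": 0.7344, "MCC": 0.4733, "Precision": 0.7059}
deepsorf = {"ACC": 0.8341, "MCC": 0.6682, "Precision": 0.8366}
gains = {
    name: round((deepsorf[name] - csorf_finder[name]) * 100, 2)
    for name in deepsorf
}
print(gains)
```

Under these assumed counts the recomputed values agree with the DeepsORF column for H.sapiens CDS-sORFs to four decimal places, and the gains come out as 9.97, 19.49, and 13.07 percentage points, matching the abstract.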

Dataset	Method	SN	SP	ACC	MCC	Precision
H.sapiens CDS-sORFs	Flow attention	0.844 6	0.850 7	0.847 6	0.695 3	0.849 7
H.sapiens CDS-sORFs	CR-Flow attention	0.857 9	0.870 0	0.863 9	0.727 9	0.868 4
H.sapiens nonCDS-sORFs	Flow attention	0.850 8	0.921 0	0.885 9	0.773 7	0.915 0
H.sapiens nonCDS-sORFs	CR-Flow attention	0.862 7	0.916 4	0.889 5	0.780 2	0.911 6
M.musculus CDS-sORFs	Flow attention	0.854 7	0.880 7	0.867 7	0.735 7	0.877 6
M.musculus CDS-sORFs	CR-Flow attention	0.889 9	0.879 2	0.884 6	0.769 2	0.880 5
M.musculus nonCDS-sORFs	Flow attention	0.900 9	0.936 1	0.918 5	0.837 5	0.933 8
M.musculus nonCDS-sORFs	CR-Flow attention	0.890 5	0.945 2	0.917 9	0.837 0	0.942 1
D.melanogaster CDS-sORFs	Flow attention	0.836 3	0.777 8	0.807 0	0.615 1	0.790 1
D.melanogaster CDS-sORFs	CR-Flow attention	0.853 8	0.883 0	0.868 4	0.737 2	0.879 5
D.melanogaster nonCDS-sORFs	Flow attention	0.813 2	0.825 2	0.819 2	0.638 4	0.823 1
D.melanogaster nonCDS-sORFs	CR-Flow attention	0.830 4	0.837 8	0.834 1	0.668 2	0.836 6
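To read the ablation rows at a glance, the short sketch below computes the change in ACC and MCC, in percentage points, when moving from Flow attention to CR-Flow attention on each test set. The values are copied from the table above; the dictionary layout and variable names are only illustrative.

```python
# (ACC, MCC) per test set, copied from the ablation table.
flow = {
    "H.sapiens CDS-sORFs": (0.8476, 0.6953),
    "H.sapiens nonCDS-sORFs": (0.8859, 0.7737),
    "M.musculus CDS-sORFs": (0.8677, 0.7357),
    "M.musculus nonCDS-sORFs": (0.9185, 0.8375),
    "D.melanogaster CDS-sORFs": (0.8070, 0.6151),
    "D.melanogaster nonCDS-sORFs": (0.8192, 0.6384),
}
cr_flow = {
    "H.sapiens CDS-sORFs": (0.8639, 0.7279),
    "H.sapiens nonCDS-sORFs": (0.8895, 0.7802),
    "M.musculus CDS-sORFs": (0.8846, 0.7692),
    "M.musculus nonCDS-sORFs": (0.9179, 0.8370),
    "D.melanogaster CDS-sORFs": (0.8684, 0.7372),
    "D.melanogaster nonCDS-sORFs": (0.8341, 0.6682),
}

# Change in percentage points: (CR-Flow attention) minus (Flow attention).
delta = {
    name: tuple(round((new - old) * 100, 2)
                for new, old in zip(cr_flow[name], flow[name]))
    for name in flow
}

for name, (d_acc, d_mcc) in delta.items():
    print(f"{name}: ACC {d_acc:+.2f}, MCC {d_mcc:+.2f}")
```

On these numbers the largest change is on D.melanogaster CDS-sORFs (ACC up 6.14 points), while M.musculus nonCDS-sORFs is essentially unchanged (ACC and MCC each lower by less than 0.1 point).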

