Journal of Computer Applications ›› 2010, Vol. 30 ›› Issue (11): 2959-2961.

• Database and Data Mining •

Prediction of O-glycosylation sites in protein sequence by kernel Fisher discriminant analysis

YANG XueMei1, LI Shipeng2

  1. Xianyang Normal University
  2. School of Mathematics and Information Science, Xianyang Normal University
  • Received: 2010-06-01 Revised: 2010-07-24 Online: 2010-11-05 Published: 2010-11-01
  • Contact: YANG XueMei
  • Supported by:
    Scientific Research Program of the Education Department of Shaanxi Province

Abstract: To predict O-glycosylation sites in protein sequences, kernel Fisher discriminant analysis (KFDA) was applied to protein sequence samples of various window sizes, with each sample encoded by sparse coding. The samples were first mapped into a feature space by a nonlinear mapping implicitly defined by a kernel function, and then classified into two classes in that feature space by Fisher discriminant analysis. Furthermore, a majority-vote scheme was used to combine the classifiers trained under the different window sizes, so as to pool the strengths of the individual windows. The experimental results indicate that the KFDA ensemble predicts better than FDA, PCA, and a single KFDA classifier, with a prediction accuracy of about 86.5%.

Key words: prediction, glycosylation, protein, kernel Fisher discriminant analysis (KFDA), feature
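
The abstract gives no implementation details, so the following is only a minimal sketch of two of the steps it names: sparse coding of a sequence window and majority voting over per-window classifiers. It does not implement KFDA itself. The 20-letter alphabet and its order, the `X` padding symbol, the all-zero code for unknown residues, the tie-breaking rule, and all function names are assumptions made for illustration, not the authors' choices.

```python
from collections import Counter

# Assumption: the 20 standard amino acids in a fixed (alphabetical) order.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def extract_window(sequence, center, size, pad="X"):
    """Return the window of odd length `size` centred on index `center`.

    The sequence is padded at both ends so that windows near the termini
    keep the same length (the padding symbol is an assumption).
    """
    half = size // 2
    padded = pad * half + sequence + pad * half
    return padded[center:center + size]


def sparse_encode(window):
    """One-hot encode a window into a flat 0/1 list of length 20 * len(window)."""
    bits = []
    for residue in window:
        one_hot = [0] * len(AMINO_ACIDS)
        idx = AMINO_ACIDS.find(residue)
        if idx >= 0:  # unknown or padding symbols stay all-zero (assumption)
            one_hot[idx] = 1
        bits.extend(one_hot)
    return bits


def majority_vote(labels):
    """Combine per-window predictions (+1 / -1); ties go to -1 (assumption)."""
    counts = Counter(labels)
    return 1 if counts[1] > counts[-1] else -1
```

In such a setup, each candidate serine or threonine residue would be encoded once per window size, each encoding would be passed to the classifier trained for that window size, and the resulting labels would be passed to `majority_vote`.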