Cis-regulatory motif finding algorithm in chromatin immunoprecipitation sequencing datasets
FENG Yanxia, ZHANG Zhihong, ZHANG Shaoqiang
Journal of Computer Applications
2018, 38(6): 1826-1830.
DOI: 10.11772/j.issn.1001-9081.2017112749
To address the motif finding problem in Chromatin Immunoprecipitation Sequencing (ChIP-Seq) datasets produced by Next-Generation Sequencing (NGS), a new motif finding algorithm based on Fisher's exact test, called FisherNet, was proposed. First, Fisher's exact test was used to calculate the P values of all k-mers, some of which were selected as motif seeds. Second, the position weight matrix of the initial motif was constructed. Finally, the position weight matrix was used to scan all k-mers to obtain the final motif. The algorithm was verified on ChIP-Seq datasets from mouse Embryonic Stem Cells (mESC), mouse erythrocytes, human lymphoblastic cell lines, and the ENCODE database. The results show that the proposed algorithm is more accurate and faster than other common motif finding algorithms, and that it finds more than 80% of the core motifs of known transcription factors and their co-factors. The proposed algorithm can therefore be applied to large-scale sequencing datasets while maintaining high accuracy.
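
The three steps described in the abstract (P values for k-mers, a position weight matrix built from seed sites, and scanning k-mers with that matrix) can be illustrated with a minimal Python sketch. This is not the FisherNet implementation. Every function name here is hypothetical, and several choices are assumptions not stated in the abstract: a one-sided test for enrichment, counting each k-mer at most once per sequence, comparison against a separate background sequence set, a pseudocount of 1, and a uniform base background of 0.25.

```python
from collections import Counter
from math import comb, log2


def kmers(seq, k):
    """Return all overlapping k-mers of a sequence."""
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]


def fisher_enrichment_p(a, b, c, d):
    """One-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    a, b: foreground sequences with / without the k-mer
    c, d: background sequences with / without the k-mer
    Returns P(X >= a) under the hypergeometric null (assumed test direction).
    """
    n_fg, n_with, total = a + b, a + c, a + b + c + d
    upper = min(n_fg, n_with)
    tail = sum(comb(n_with, x) * comb(total - n_with, n_fg - x)
               for x in range(a, upper + 1))
    return tail / comb(total, n_fg)


def seed_p_values(foreground, background, k):
    """P value of every k-mer seen in the foreground (presence per sequence)."""
    fg = Counter(km for s in foreground for km in set(kmers(s, k)))
    bg = Counter(km for s in background for km in set(kmers(s, k)))
    n_fg, n_bg = len(foreground), len(background)
    return {km: fisher_enrichment_p(a, n_fg - a, bg[km], n_bg - bg[km])
            for km, a in fg.items()}


def build_pwm(sites, pseudocount=1.0):
    """Column-wise base probabilities from aligned sites of equal length."""
    total = len(sites) + 4 * pseudocount
    pwm = []
    for i in range(len(sites[0])):
        column = Counter(site[i] for site in sites)
        pwm.append({base: (column[base] + pseudocount) / total
                    for base in "ACGT"})
    return pwm


def pwm_score(pwm, kmer, base_bg=0.25):
    """Log-odds score of a k-mer against a uniform base background (assumed)."""
    return sum(log2(col[base] / base_bg) for col, base in zip(pwm, kmer))


# Toy data, invented for illustration only.
foreground = ["ACGTAC", "TACGTT", "GGACGT"]
background = ["TTTTTT", "GGGGGG", "CCCCCC"]
p_values = seed_p_values(foreground, background, k=4)
seed = min(p_values, key=p_values.get)      # most enriched k-mer as the seed
pwm = build_pwm(["ACGT", "ACGT", "ACGA"])   # hypothetical aligned seed sites
```

In this sketch, the k-mer with the smallest P value serves as a seed, and `pwm_score` can then rank all k-mers against the matrix. How FisherNet actually selects seeds, groups similar k-mers, and sets score thresholds is not described in the abstract, so those parts are left out.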