Disease sample classification algorithm by Bayesian network with gene association analysis
Zhijie LI, Xuhong LIAO, Yuanxiang LI, Qinglan LI
Journal of Computer Applications    2024, 44 (11): 3449-3458.   DOI: 10.11772/j.issn.1001-9081.2024030398

Gene expression data are a specific type of biological big data. Although the values are ordinary real numbers, their similarity is not based on Euclidean distance but on whether the expression values of genes rise and fall together. Current gene Bayesian networks use gene expression level values as node random variables and do not reflect this kind of subspace pattern similarity. Therefore, a Bayesian network disease Classification algorithm based on Gene Association analysis (BCGA) was proposed to learn Bayesian networks from labeled disease sample-gene expression data and to predict the class of new disease samples. Firstly, disease samples were discretized and filtered to select genes, and the dimensionally reduced gene expression values were sorted and replaced with gene column subscripts. Secondly, the gene column subscript sequence was decomposed into a set of atomic sequences of length 2, and each frequent atomic sequence in this set corresponded to an association between a pair of genes. Finally, causal relationships were measured through gene association entropy for Bayesian network structure learning. In addition, parameter learning in BCGA became simple: the conditional probability distribution of a gene node was obtained by counting the occurrence frequencies of the atomic sequences formed by the gene and its parent node genes. Experimental results on multiple tumor and non-tumor gene expression datasets show that BCGA significantly improves disease classification accuracy and effectively reduces analysis time compared with existing similar algorithms. Moreover, BCGA uses gene association entropy instead of conditional independence, and gene atomic sequences instead of gene expression values, which fits gene expression data better.
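
The sorting and decomposition steps described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The abstract does not say how atomic sequences are formed, so the sketch assumes every ordered pair of subscripts (one gene ranked before another) counts as an atomic sequence, that ties are broken by column index, and that "frequent" means a support ratio at or above a threshold. The function names and the `min_support` parameter are hypothetical.

```python
from collections import Counter
from itertools import combinations


def subscript_sequence(expression_row):
    """Return gene column subscripts ordered by ascending expression value.

    Ties keep their original column order (an assumption of this sketch).
    """
    return sorted(range(len(expression_row)), key=lambda j: expression_row[j])


def atomic_sequences(sequence):
    """Decompose a subscript sequence into ordered pairs (a, b), a before b.

    Assumption: all ordered pairs are used, not only adjacent ones.
    """
    return list(combinations(sequence, 2))


def frequent_atomic_sequences(samples, min_support):
    """Map each atomic sequence to its support if it reaches min_support."""
    counts = Counter()
    for row in samples:
        counts.update(atomic_sequences(subscript_sequence(row)))
    n = len(samples)
    return {pair: c / n for pair, c in counts.items() if c / n >= min_support}


# Toy data: 3 samples (rows) x 3 genes (columns), values are made up.
samples = [
    [0.2, 1.5, 0.9],
    [0.1, 2.0, 1.1],
    [0.8, 0.3, 0.5],
]
frequent = frequent_atomic_sequences(samples, min_support=0.6)
```

In this toy input the first two samples rank the genes in the same order, so the pairs they share reach the support threshold while the pairs from the third sample do not. Samples with very different absolute values can therefore still produce the same atomic sequences, which matches the trend-based notion of similarity the abstract describes.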

Graph embedding method integrated multiscale features
LI Zhijie, LI Changhua, YAO Peng, LIU Xin
Journal of Computer Applications    2014, 34 (10): 2891-2894.   DOI: 10.11772/j.issn.1001-9081.2014.10.2891

In the domain of structural pattern recognition, existing graph embedding methods lack versatility and have high computational complexity. To solve this problem, a new graph embedding method that integrates multiscale features based on space syntax theory was proposed. Global, local and detail features were extracted to construct a feature vector that describes the graph through multiscale histograms. The global features included the vertex number, the edge number, and intelligibility. The local features referred to the node topological feature, the dissimilarity of edge domain features and the dissimilarity of edge topological features. The detail features comprised the numerical and symbolic attributes on vertices and edges. In this way, structural pattern recognition was converted into statistical pattern recognition, so that a Support Vector Machine (SVM) could be applied to graph classification. The experimental results show that the proposed graph embedding method achieves higher classification accuracy on different graph datasets. Compared with other graph embedding methods, the proposed method adequately represents the graph's topology, merges non-topological features according to the graph's domain properties, and has favorable universality and low computational complexity.
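
The general idea of turning a graph into a fixed-length feature vector can be illustrated with a simplified Python sketch. It is not the method of the paper: it keeps only two of the global features named in the abstract (vertex number and edge number) and replaces the local features with a normalized vertex-degree histogram. The space syntax features and the detail attributes are omitted. The function name `embed_graph` and the `max_degree` parameter are hypothetical.

```python
def embed_graph(n_vertices, edges, max_degree=4):
    """Embed an undirected graph as [n_vertices, n_edges, degree histogram].

    The histogram has max_degree + 1 bins. Degrees above max_degree fall
    into the last bin, and bins are normalized by the vertex count.
    """
    degree = [0] * n_vertices
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1

    hist = [0] * (max_degree + 1)
    for d in degree:
        hist[min(d, max_degree)] += 1

    if n_vertices:
        local = [h / n_vertices for h in hist]
    else:
        local = [0.0] * len(hist)

    return [float(n_vertices), float(len(edges))] + local


# Toy graph: a triangle (0, 1, 2) with one pendant vertex 3 attached to 2.
vector = embed_graph(4, [(0, 1), (1, 2), (2, 0), (2, 3)])
```

Because every graph maps to a vector of the same length, such vectors can be passed to a statistical classifier such as an SVM, which is the conversion from structural to statistical pattern recognition that the abstract describes.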
