Cheetoh is a gene prioritization tool that works in two steps: enrichment analysis, followed by inference based on a conditional random field (CRF) model. The CRF model used in Cheetoh encodes gene-centric features (annotations) on the nodes and network information (STRING 9) on the edges. Features are weighted by the Bayes factor generated in the enrichment step, and features with a Bayes factor below 3 are filtered out automatically. A Bayes factor cutoff of 5 is recommended.
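The Bayes factor filtering described above can be illustrated with a minimal Python sketch. This is not Cheetoh's actual code: the function name, the feature identifiers, and the assumption that a user-chosen cutoff can never go below the automatic floor of 3 are all hypothetical and serve only to illustrate the thresholds stated in the text.

```python
from typing import Dict

# Thresholds taken from the text; everything else here is an illustrative assumption.
MIN_BAYES_FACTOR = 3.0    # features below this are filtered automatically
RECOMMENDED_CUTOFF = 5.0  # cutoff recommended for normal use


def filter_features(bayes_factors: Dict[str, float],
                    cutoff: float = RECOMMENDED_CUTOFF) -> Dict[str, float]:
    """Keep features whose Bayes factor reaches the effective cutoff.

    The effective cutoff is never lower than MIN_BAYES_FACTOR (assumption:
    the automatic filter always applies, whatever the user chooses).
    """
    effective = max(cutoff, MIN_BAYES_FACTOR)
    return {name: bf for name, bf in bayes_factors.items() if bf >= effective}


# Hypothetical enrichment output: feature identifier -> Bayes factor.
example = {"GO:0006915": 12.4, "KEGG:hsa04210": 4.1, "Pfam:PF00069": 2.2}

kept_default = filter_features(example)             # recommended cutoff of 5
kept_lenient = filter_features(example, cutoff=3.0)  # only the automatic floor
```

With the recommended cutoff only the first feature survives; with the floor of 3 the first two survive, and the third is always dropped.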
A probability score is computed for every gene in the genome, reflecting its association with the original input genes from the enrichment step, and the top 1000 genes ordered by overall score are displayed as the final results. For each source from the enrichment process, Cheetoh shows the rank of each gene in the result set, as well as the combined rank summarized across all sources. Estimated P-values are based on random permutations of the input and have been corrected with the Bonferroni correction. Note that if the user modifies the original Bayes factors from the enrichment, the P-value estimates may not be accurate.
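As a rough illustration of how a permutation-based P-value and a Bonferroni correction fit together, consider the Python sketch below. It is a generic textbook formulation, not Cheetoh's implementation: the function names, the add-one smoothing, and the example numbers are assumptions, and the null scores are assumed to come from re-scoring a gene with randomly permuted input gene sets.

```python
from typing import List, Sequence


def empirical_p_value(observed: float, null_scores: Sequence[float]) -> float:
    """Estimate a P-value as the share of permutation scores >= the observed score.

    Add-one smoothing (an assumption here) keeps the estimate above zero
    when no permutation reaches the observed score.
    """
    exceed = sum(1 for s in null_scores if s >= observed)
    return (exceed + 1) / (len(null_scores) + 1)


def bonferroni(p_values: Sequence[float]) -> List[float]:
    """Multiply each P-value by the number of tests, capping the result at 1."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]


# Hypothetical scores for one gene: the observed score and four permutation scores.
p_single = empirical_p_value(0.9, [0.1, 0.2, 0.95, 0.3])

# Hypothetical raw P-values for three genes, corrected together.
corrected = bonferroni([0.01, 0.04, 0.5])
```

Because the null distribution is built from the original feature weights, changing the Bayes factors by hand alters the scores without altering the null, which is why the estimates may then be inaccurate.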
The results shown in the table are based on the original input genes if the features have not been modified. The hierarchical features (GO Hierarchy - Molecular Function, GO Hierarchy - Biological Process, GO Hierarchy - Cellular Component) may produce a more general ranking; please use the non-hierarchical features if a more specific ranking is preferred.
*[Reference]: Xie, B., Agam, G., Balasubramanian, S., Xu, J., Gilliam, T. C., Maltsev, N., & Börnigen, D. (2015). Disease Gene Prioritization Using Network and Feature. Journal of Computational Biology, 22(4), 313-323.