Computational assessment of feature combinations for pathogenic variant prediction
BACKGROUND: Although several methods have been proposed for predicting the effects of genetic variants and their role in disease, it remains a challenge to identify and prioritize pathogenic variants within sequencing studies. METHODS: Here, we compare different variant-specific and gene-specific features as well as existing methods, and investigate their best combination to explore potential performance gains. RESULTS: We found that combining the number of "biological process" Gene Ontology annotations of a gene with the methods PON-P2 and PROVEAN significantly improves prediction of pathogenic variants, outperforming all individual methods. A comprehensive analysis of the Gene Ontology feature suggests that it is not a variant-dependent annotation bias but reflects the multifunctional nature of disease genes. Furthermore, we identified a set of difficult variants where different prediction methods fail. CONCLUSION: Existing pathogenicity prediction methods can be further improved.