Identifying Relationships among Genomic Disease Regions: Predicting Genes at Pathogenic SNP Associations and Rare Deletions

Soumya Raychaudhuri, et al., Identifying Relationships among Genomic Disease Regions: Predicting Genes at Pathogenic SNP Associations and Rare Deletions, PLoS Genetics 5(6), 2009

Translating a set of disease regions into insights about pathogenic mechanisms requires not only identifying the key disease genes within them, but also understanding the biological relationships among those genes. Here we describe a statistical method, Gene Relationships Among Implicated Loci (GRAIL), that takes a list of disease regions and automatically assesses the degree of relatedness of implicated genes using 250,000 PubMed abstracts.

The GRAIL statistical framework consists of four steps. First, given a set of disease regions we identify the genes overlapping them; for SNPs we use LD (linkage disequilibrium) characteristics to define the region. Second, for each overlapping gene we score all other human genes by their relatedness to it. Third, for each gene we count the number of independent regions with at least one highly related gene. Fourth, for each disease region we select the single most connected gene as the key gene.
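The four steps above can be sketched in a few lines of Python. This is only a toy illustration, not the authors' implementation: the function name `grail_sketch`, the `threshold` cutoff for "highly related", and the example regions and scores are all assumptions made for this sketch. It also takes step 1 as already done (each region arrives as a list of overlapping genes) and scores only genes inside the input regions rather than all human genes.

```python
def grail_sketch(regions, relatedness, threshold=0.5):
    """Toy version of the four steps (all names here are hypothetical).

    regions:     dict mapping region id -> list of genes overlapping it
                 (step 1 is assumed to have been done already).
    relatedness: function (gene_a, gene_b) -> float score.
    threshold:   assumed cutoff above which two genes count as highly related.
    """
    all_genes = sorted({g for genes in regions.values() for g in genes})

    # Step 2: for each gene, score the other genes and keep the highly related ones.
    highly_related = {
        g: {h for h in all_genes if h != g and relatedness(g, h) >= threshold}
        for g in all_genes
    }

    # Step 3: for each gene, count the other regions that contain
    # at least one gene highly related to it.
    connections = {}
    for region_id, genes in regions.items():
        for g in genes:
            connections[g] = sum(
                1
                for other_id, other_genes in regions.items()
                if other_id != region_id and highly_related[g] & set(other_genes)
            )

    # Step 4: in each region, pick the single most connected gene as the key gene.
    key_genes = {
        region_id: max(genes, key=lambda g: connections[g])
        for region_id, genes in regions.items()
    }
    return connections, key_genes


# Made-up example: three regions and a hand-written symmetric score table.
toy_regions = {"r1": ["A", "B"], "r2": ["C"], "r3": ["D", "E"]}
toy_scores = {
    frozenset(["A", "C"]): 0.9,
    frozenset(["A", "D"]): 0.8,
    frozenset(["B", "E"]): 0.1,
}


def toy_relatedness(a, b):
    # Pairs missing from the table are treated as unrelated.
    return toy_scores.get(frozenset([a, b]), 0.0)


connections, key_genes = grail_sketch(toy_regions, toy_relatedness)
print(connections)
print(key_genes)
```

In this toy input, gene A is highly related to genes in both other regions, so it is chosen as the key gene for region r1 over B. A real analysis would replace the fixed threshold and raw counts with the statistical assessment described in the paper.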

The most critical technical difference between GRAIL and other strategies is that it does not rely on strict definitions of gene function or interaction. Instead, it uses a relatedness metric that allows considerable freedom in how genes are connected.
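To make the contrast with strict annotations concrete, here is a minimal sketch of a continuous, text-based relatedness score. The summary above does not spell out the metric, so the choice of cosine similarity over raw word counts is an assumption made for illustration; the helper names `word_counts` and `cosine_relatedness` and the example sentences are hypothetical as well.

```python
import math
from collections import Counter


def word_counts(abstracts):
    """Pool the words of all abstracts mentioning a gene into one count vector."""
    counts = Counter()
    for text in abstracts:
        counts.update(text.lower().split())
    return counts


def cosine_relatedness(counts_a, counts_b):
    """Cosine similarity between two word-count vectors (0.0 if either is empty)."""
    shared = set(counts_a) & set(counts_b)
    dot = sum(counts_a[w] * counts_b[w] for w in shared)
    norm_a = math.sqrt(sum(v * v for v in counts_a.values()))
    norm_b = math.sqrt(sum(v * v for v in counts_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Made-up abstracts for three hypothetical genes.
gene_x = word_counts(["regulates t cell activation", "t cell receptor signaling"])
gene_y = word_counts(["t cell activation in autoimmunity"])
gene_z = word_counts(["bone mineral density"])

score_xy = cosine_relatedness(gene_x, gene_y)
score_xz = cosine_relatedness(gene_x, gene_z)
print(score_xy, score_xz)
```

Because the score is graded rather than a yes/no annotation match, two genes can be linked by shared vocabulary even when no curated pathway or interaction record connects them.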
