Mark B. Gerstein, et al., Architecture of the human regulatory network derived from ENCODE data, Nature 489, 2012
To examine the principles of the human transcriptional regulatory network, we determined the genomic binding information of 119 transcription-related factors in over 450 distinct experiments.
Human transcription factors co-associate in a combinatorial and context-specific fashion: different combinations of factors bind near different targets, and the binding of one factor often affects the preferred binding partners of others. Moreover, transcription factors often show different co-association patterns in gene-proximal and distal regions.
Different parts of the hierarchical transcription factor network exhibit distinct properties. For instance, the middle level has the most information-flow bottlenecks and, offsetting this, tends to have the most regulatory collaboration between transcription factors. Conversely, higher-level transcription factors have the greatest connectivity with other networks (for example, the phosphorylome).
Feed-forward loops are strongly enriched in the transcription factor network, as are a number of motifs in which two genes co-regulated by a factor are bridged by a protein-protein interaction or a regulating miRNA.
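To make the feed-forward loop motif concrete, here is a minimal Python sketch that counts such loops in a small directed regulatory network. It assumes the network is given as a list of (regulator, target) pairs; the function name `count_feed_forward_loops` and the toy factor names are hypothetical, and this is not the motif-detection procedure used in the paper. Judging enrichment would additionally require comparing the count against randomized networks, which is not shown.

```python
def count_feed_forward_loops(edges):
    """Count ordered triples (a, b, c) with edges a->b, b->c and a->c.

    `edges` is an iterable of (regulator, target) pairs. Self-loops are
    ignored and duplicate edges are counted once.
    """
    # Build a successor map: regulator -> set of direct targets.
    successors = {}
    for regulator, target in edges:
        if regulator != target:
            successors.setdefault(regulator, set()).add(target)

    count = 0
    for a, targets_of_a in successors.items():
        for b in targets_of_a:
            # b must itself regulate some c that a also regulates directly.
            for c in successors.get(b, ()):
                if c != a and c in targets_of_a:
                    count += 1
    return count


# Toy network (hypothetical names): TF_A regulates TF_B, and both regulate GENE_C.
toy_edges = [("TF_A", "TF_B"), ("TF_B", "GENE_C"), ("TF_A", "GENE_C")]
print(count_feed_forward_loops(toy_edges))
```

The count is over ordered triples, so if two intermediate factors regulate each other and share a common upstream regulator, both orientations are counted as separate loops.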
Highly connected network elements (both transcription factors and targets) are under strong evolutionary selection and exhibit stronger allele-specific activity (this is particularly apparent when multiple factors are involved). Surprisingly, however, elements with allelic activity are under weaker selection than non-allelic ones.
The degree of allele-specific behavior of each transcription factor can be quantified by a statistic that we call ‘allelicity’. The allelicity of a transcription factor is defined as the fraction of single nucleotide polymorphisms (SNPs) that exhibit allele-specific binding out of all the SNPs that may potentially exhibit it.
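The definition above is a simple ratio, which can be sketched in Python as follows. The function name `allelicity`, its parameter names, and the example counts are hypothetical illustrations, not values or code from the paper; how a SNP is judged able to "potentially exhibit" allele-specific binding is left to the caller.

```python
def allelicity(n_allele_specific, n_assessable):
    """Fraction of assessable SNPs that show allele-specific binding.

    n_allele_specific: SNPs at which the factor binds in an allele-specific way.
    n_assessable: all SNPs that could potentially exhibit allele-specific binding.
    """
    if n_assessable <= 0:
        raise ValueError("n_assessable must be positive")
    if not 0 <= n_allele_specific <= n_assessable:
        raise ValueError("n_allele_specific must lie between 0 and n_assessable")
    return n_allele_specific / n_assessable


# Hypothetical counts: 20 allele-specific SNPs out of 80 assessable ones.
print(allelicity(20, 80))  # 0.25
```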
We find that transcription factors with higher allelicity tend to have more target genes, indicating that the binding of these factors tends to vary more with sequence. Finally, we find that small insertions and deletions (indels) tend to cause disproportionately more of these allelic events than do SNPs.