Vivian Cheung and colleagues have published a landmark study on polymorphic cis- and trans-regulation of human gene expression. It combines genetic association with transcriptome sequencing (RNA-Seq) to identify roughly 1,000 polymorphic regulators of gene expression.
Two key features distinguish this from other recent studies on genetics of gene expression (GOGE) in humans: first, linkage mapping followed by RNA-Seq allowed association-based fine mapping of regulatory regions with unprecedented resolution. Second, the authors supported their findings with a series of molecular assays, bridging the gap between genetic association and mechanistic validation.
Differential Findings: Model Organisms to Humans
Previous GOGE studies in humans and model organisms seem to disagree on the relative proportion of cis- versus trans-regulators. In humans, most reported regulators have been in cis with their target genes, and the discovery of cis-regulation of disease susceptibility genes (e.g., ORMDL3) reinforces the notion that these play an important role in gene expression variation. However, studies of GOGE in model organisms (yeast, fly, and mouse) identified mostly trans regulators. It seems unlikely that humans would be an exception to that pattern.
It’s a pretty big discrepancy, and one that the authors of this study attribute to sample size, which tends to be large in model organism studies and small in human ones. To address this, they studied gene expression in B-cells of individuals from large families. Linkage scans provided the initial regulatory regions for >1,600 expression phenotypes. The authors applied family-based and association-based analyses to narrow the candidate regions.
There were 107 linkage peaks that were proximal to (nearby) their target genes, and thus likely to act in cis. The authors refined this list with a series of further tests:
- 100 phenotypes had informative genotypes within 50 kb for family-based testing
- 63 were significant in families by the quantitative transmission disequilibrium test (QTDT)
- 47 had significant population-based association in the 86 unrelated individuals tested
For 17 of the resulting phenotypes, the cis-variants explained 30% or more of the variation in gene expression.
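The filtering steps above are easier to compare as fractions of the starting set. The short sketch below uses only the counts quoted in this post; the step labels and the `retention` helper are my own names, not anything from the paper.

```python
# Counts reported above for the cis (proximal) analysis.
funnel = [
    ("proximal linkage peaks", 107),
    ("informative genotypes within 50 kb", 100),
    ("significant by QTDT in families", 63),
    ("significant in 86 unrelated individuals", 47),
]


def retention(steps):
    """Return (label, count, fraction of the starting set) for each step."""
    start = steps[0][1]
    return [(label, n, n / start) for label, n in steps]


for label, n, frac in retention(funnel):
    print(f"{label:42s} {n:4d}  {frac:6.1%}")
```

By this arithmetic, a little under half of the proximal linkage peaks survive all three filters.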
There were 1,611 phenotypes with significant distal linkage peaks using the QTDT. The authors excluded 94 of these whose candidate regulatory regions were >20 Mb in size. That left 1,517 phenotypes to examine. Since these likely acted in trans, there was no obvious nearby gene for testing. One option would be to test all SNPs under the linkage peaks for association, but that would create a nightmarish multiple-testing problem.

Instead, the authors applied RNA-Seq to 41 CEU samples to identify genes expressed in B-cells and test them for association with expression of their target genes. A key advantage of RNA-Seq over, say, microarray-based methods is its sensitivity to detect genes expressed at low levels; often, these include gene expression regulators like transcription factors.

Their analysis revealed a total of 1,036 regulator-target gene pairs (103 with p<0.001, 518 with p<0.01, and 917 with p<0.05). Interestingly, the expression levels of 112 genes in the complete set were regulated by two unlinked trans regulators.
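To get a feel for the multiple-testing burden mentioned above, here is a minimal sketch of a Bonferroni correction, one common (and conservative) way to adjust for many tests. It is not necessarily what the authors used, and the SNP and gene counts are hypothetical numbers chosen purely for illustration.

```python
def bonferroni_threshold(alpha, n_tests):
    """Per-test threshold that keeps the family-wise error rate at alpha."""
    if n_tests <= 0:
        raise ValueError("n_tests must be positive")
    return alpha / n_tests


# Hypothetical counts for a single linkage peak (not taken from the study).
snps_under_peak = 20_000   # every SNP in the candidate region
genes_under_peak = 100     # only genes expressed in B-cells in that region

per_snp = bonferroni_threshold(0.05, snps_under_peak)
per_gene = bonferroni_threshold(0.05, genes_under_peak)

print(f"per-SNP threshold:  {per_snp:.1e}")
print(f"per-gene threshold: {per_gene:.1e}")
```

Under these assumed counts, testing expressed genes rather than every SNP relaxes the per-test threshold by a factor of 200, which is the intuition behind restricting the search to genes detected by RNA-Seq.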
Molecular Validation Assays
The authors applied three types of molecular validation to confirm the regulator-target relationships implicated by their findings. For cis-regulatory variants, they used heterozygous SNPs in the RNA-Seq data to detect differential allelic expression (DAE). Of the 67 genes tested, 43 (64%) showed significant evidence of allele-biased expression. Altogether, ~65% of proximal linkages were cis-regulated; the remainder were either regulated in trans by nearby sequences, or did not achieve sufficient RNA-Seq coverage to detect allelic bias.
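As a rough illustration of how differential allelic expression can be detected at a heterozygous SNP, the sketch below runs an exact two-sided binomial test on the reads carrying each allele. This is a generic approach and an assumption on my part; the post does not say which test the authors used. The read counts in the example are made up, and `allelic_bias_p` is a hypothetical helper name.

```python
from math import exp, lgamma, log


def allelic_bias_p(ref_reads, alt_reads, p=0.5):
    """Exact two-sided binomial p-value for allele-biased expression.

    Under the null hypothesis both alleles are expressed equally, so each
    read covering the SNP carries the reference allele with probability p.
    """
    n = ref_reads + alt_reads
    if n == 0:
        raise ValueError("need at least one read covering the SNP")

    def pmf(k):
        # Binomial probability computed in log space to avoid overflow.
        log_coeff = lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)
        return exp(log_coeff + k * log(p) + (n - k) * log(1 - p))

    observed = pmf(ref_reads)
    # Sum the probabilities of all outcomes no more likely than the observed
    # one; the small tolerance guards against floating-point ties.
    total = sum(q for q in map(pmf, range(n + 1)) if q <= observed * (1 + 1e-7))
    return min(1.0, total)


# Hypothetical read counts at one heterozygous SNP.
print(allelic_bias_p(30, 10))  # skewed toward the reference allele
print(allelic_bias_p(21, 19))  # close to balanced
```

A test like this depends heavily on read depth: with only a handful of reads covering a SNP, even a strong allelic skew cannot reach significance, which is consistent with the coverage caveat above.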
For trans-regulatory pairs, they performed gene knockdowns with siRNA, showing that the expression level of 72% of the target genes tested changed significantly (usually by 10-60%) when their regulator was knocked down. The authors also examined physical interactions between their regulators and target genes using chromosome conformation capture data generated by high-throughput sequencing (“Hi-C”) in another study.
Some 75 of the 1,036 regulator-target pairs were supported by Hi-C data. These data support the genetic findings, and suggest that regulators and their target genes may be co-transcribed in physically associated “transcription factories”.
This study has shed new light on the complexity of gene regulation in humans. Importantly, the majority of regulators act in trans to their target genes. Yet only 34% of the identified trans regulators are transcription or signaling factors, suggesting that many other types of genes can influence gene expression. Many of the regulators, however, were in the same functional pathways as their target genes.
So how do polymorphisms in regulator genes act in trans? One possibility is that polymorphisms near a gene affect its expression, which in turn modulates the expression of its target gene. Another explanation is that sequence variants in the regulators alter the translation, stability, or structure of the regulator, which in turn influences how it regulates the target gene. Alternatively, regulator-target gene pairs may associate physically with one another in so-called “transcription factories” in which polymorphisms in the regulator affect co-transcription of both genes. All of these are intriguing possibilities, and unraveling them mechanistically with functional studies will undoubtedly improve our understanding of gene expression regulation in humans.
Cheung VG, Nayak RR, Wang IX, Elwyn S, Cousins SM, Morley M, & Spielman RS (2010). Polymorphic cis- and trans-regulation of human gene expression. PLoS Biology, 8(9). PMID: 20856902