You might have thought that the 1000 Genomes Project would render the International HapMap Project obsolete. But just yesterday I heard a talk about how some groups are still leveraging the HapMap resource in numerous ways to better understand the relationship between genotype and phenotype. The speaker was Wei Zhang, a postdoc at the University of Chicago who’s published an astonishing 25 papers in the last 2 years.
One key advantage of the HapMap samples is the availability of transformed cell lines for all samples at Coriell. This allows researchers to assess various phenotypes with cell-based assays (e.g. gene expression, drug toxicity) and then mine the rich HapMap genotype dataset to perform genotype-phenotype associations. In a collaboration with Affymetrix, Zhang and his colleagues measured gene expression in 87 CEU samples and 89 YRI samples using the Human Exon 1.0 ST array, which captures ~1.4 million annotated exons from ~18,000 transcript clusters in the human genome. The data are available in the SCAN Database hosted at the University of Chicago.
Differentially Expressed Genes and SNP Association
The researchers found ~9,100 expressed genes in the CEU and YRI samples, including 383 that were differentially expressed between the populations (247 had higher expression in YRI than CEU, 136 had higher expression in CEU than YRI). Next, they used sample-level data in each population to correlate expression of those 383 genes with SNP genotypes. They successfully identified 75 genes with significant expression-genotype correlations, 11 of which were in cis (same chromosome within 2.5 Mb) and 64 of which were in trans.
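To make the cis/trans distinction concrete, here is a minimal sketch of how an association could be labeled using the 2.5 Mb cutoff mentioned in the talk. The function name, argument names, and the choice to measure distance from the gene boundaries are my assumptions, not details from the presentation.

```python
CIS_WINDOW_BP = 2_500_000  # 2.5 Mb, the cutoff quoted in the talk


def classify_association(snp_chrom, snp_pos, gene_chrom, gene_start, gene_end,
                         window=CIS_WINDOW_BP):
    """Label a SNP-gene pair 'cis' or 'trans'.

    Assumption: distance is measured from the SNP to the nearest gene
    boundary, and is 0 when the SNP lies inside the gene.
    """
    if snp_chrom != gene_chrom:
        return "trans"
    if snp_pos < gene_start:
        distance = gene_start - snp_pos
    elif snp_pos > gene_end:
        distance = snp_pos - gene_end
    else:
        distance = 0
    return "cis" if distance <= window else "trans"
```

Under this definition, a SNP on the same chromosome but more than 2.5 Mb from the gene counts as trans, just like a SNP on a different chromosome.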
Isoform Variation
Isoform variation was also detectable in the exon array data – by examining expressed genes with 3 or more exons, the researchers could compare probe intensities for each exon to see if any were differentially expressed. They identified a number of genes with differential isoform expression between YRI and CEU populations, and when they performed GO analysis, the most enriched gene category was, interestingly, genes that encode splicing factors.
SNPs, Gene Expression, and Pharmacogenetics
The Chicago group also performed cell-based assays on the HapMap samples to measure toxicity induced by several anti-cancer drugs. In this case their phenotype was IC50, the drug concentration at which cell growth is inhibited by 50%. Such a drug study seems ideal for the HapMap samples since they happen to be transformed (i.e. continuously proliferating) cells. They measured IC50 for several types of anti-cancer agents (6 total), including DNA antimetabolites, platinating agents, and topoisomerase II (TopoII) inhibitors.
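For readers who haven't worked with IC50 before, here is a rough sketch of how it could be read off a dose-response series by interpolating on a log-concentration scale. The talk did not describe the curve-fitting method the group used, so the function and its inputs are purely illustrative.

```python
import math


def estimate_ic50(concentrations, percent_growth):
    """Interpolate the concentration at which growth is 50% of control.

    concentrations: increasing drug concentrations (all > 0)
    percent_growth: growth as a percentage of untreated control
    Returns None if the measured curve never crosses 50%.
    """
    points = list(zip(concentrations, percent_growth))
    for (c_lo, g_lo), (c_hi, g_hi) in zip(points, points[1:]):
        if g_lo >= 50 >= g_hi and g_lo != g_hi:
            # Linear interpolation between the two bracketing points,
            # done on log10(concentration).
            frac = (g_lo - 50) / (g_lo - g_hi)
            log_ic50 = math.log10(c_lo) + frac * (
                math.log10(c_hi) - math.log10(c_lo))
            return 10 ** log_ic50
    return None
```

In practice a fitted sigmoid model is more common than piecewise interpolation, but the idea is the same: find the dose where growth drops to half of the untreated level.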
First, using the HapMap trio (mother-father-child) information in the CEU panel, Zhang and colleagues determined the “heritability” of IC50, which proved to be high (values in the 0.3-0.4 range) for all of the drugs. This provides more evidence for what seems to be an accepted fact: pharmaceutical response is a phenotype with a significant heritable component.
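The talk didn't say which estimator was used, but one classical way to get heritability from trios is midparent-offspring regression: the slope of the child's phenotype on the average of the two parents' phenotypes estimates narrow-sense heritability. A minimal sketch, with a made-up function name and input layout:

```python
def midparent_offspring_heritability(trios):
    """Estimate heritability as the regression slope of child on midparent.

    trios: sequence of (mother, father, child) phenotype values,
           e.g. IC50 measurements (hypothetical input layout).
    """
    trios = list(trios)
    midparents = [(mother + father) / 2 for mother, father, _ in trios]
    children = [child for _, _, child in trios]
    n = len(trios)
    mean_mp = sum(midparents) / n
    mean_ch = sum(children) / n
    # Least-squares slope = cov(midparent, child) / var(midparent)
    cov = sum((x - mean_mp) * (y - mean_ch)
              for x, y in zip(midparents, children))
    var = sum((x - mean_mp) ** 2 for x in midparents)
    return cov / var
```

With only a few dozen trios per panel, an estimate like this carries wide uncertainty, so it is best read as a rough indicator rather than a precise value.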
What they did next was very interesting: they performed an integrated analysis of HapMap genotypes, gene expression, and drug response to identify predictors of drug-induced toxicity. Zhang described their method as a “triangle approach”: first, SNPs were associated with drug response, then those SNPs were analyzed with the expression data to determine if any were also associated with gene expression. The correlated genes were then compared back to the response data, to see if any were also associated with drug response. As a result, they’re able to identify SNPs that influence gene expression, which in turn influences response to the drug. Genotype-mechanism-phenotype. I like it.
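The three sides of the triangle can be sketched as a chain of filters. This is only my reading of the approach as described in the talk: the dictionaries of p-values, the identifiers, and the single `alpha` threshold are hypothetical, and a real analysis would need a proper multiple-testing correction.

```python
def triangle_candidates(snp_drug_p, snp_expr_p, expr_drug_p, alpha=0.05):
    """Return (snp, gene) pairs supported on all three sides of the triangle.

    snp_drug_p:  {snp: p-value for SNP vs. drug response}
    snp_expr_p:  {(snp, gene): p-value for SNP vs. gene expression}
    expr_drug_p: {gene: p-value for gene expression vs. drug response}
    alpha:       illustrative significance threshold (an assumption)
    """
    # Side 1: SNPs associated with drug response.
    drug_snps = {snp for snp, p in snp_drug_p.items() if p < alpha}

    hits = []
    for (snp, gene), p in snp_expr_p.items():
        # Side 2: of those SNPs, keep the ones associated with expression.
        if snp in drug_snps and p < alpha:
            # Side 3: the gene's expression must itself track drug response.
            if expr_drug_p.get(gene, 1.0) < alpha:
                hits.append((snp, gene))
    return sorted(hits)


# Made-up identifiers and p-values, for illustration only.
example = triangle_candidates(
    snp_drug_p={"snpA": 0.001, "snpB": 0.20, "snpC": 0.01},
    snp_expr_p={("snpA", "GENE1"): 0.002, ("snpB", "GENE1"): 0.001,
                ("snpC", "GENE2"): 0.003},
    expr_drug_p={"GENE1": 0.004, "GENE2": 0.50},
)
```

Here only the pair (snpA, GENE1) survives: snpB fails the drug-response step, and GENE2's expression does not track drug response.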
As an example of their findings, Zhang presented a SNP in GALNTL4 that was associated with response to Cisplatin, which I presume is a platinating agent. SNP genotypes were correlated with expression of GALNTL4, and that in turn was correlated with IC50 to Cisplatin. But here’s what I liked most about this example: the SNP they presented was intronic. It’s another reminder that it’s time to look outside the exons, people!
Future Directions: miRNA and Methylation
Efforts are currently under way at the University of Chicago to measure two more cell phenotypes on the HapMap samples. One is microRNA (miRNA) expression, which they’re assessing with something called the Exiqon miRCURY platform. The other is DNA methylation, as measured by ChIP-chip assays with CpG antibodies. I seem to recall that another group has already identified methylation-associated SNPs using HapMap data, but even so, I look forward to what Zhang and his colleagues will find.
I believe all the 1000 Genomes samples will be made similarly available.
The current 1000 Genomes samples are all part of HapMap 3 already.
Laura,
Yes, you’re correct – the HapMap samples are being sequenced as part of the 1000 Genomes project.