The nomination of Francis Collins for director of NIH is very exciting, particularly for those of us who work in genomics. It not only recognizes his leadership and scientific reputation, but it seems to me a formal recognition of the importance of genetics and genomics for human health in the 21st century. At least three major scientific efforts were conceived and initiated during Collins’ tenure at NHGRI: the Human Genome Project (HGP), the International Haplotype Map Project (HapMap), and most recently, the 1,000 Genomes Project (TGP).
What Will Collins’ Next Big Project Be?
If Collins is confirmed as director of NIH, we can only wonder what his Next Big Project might be. Given his track record, I’d wager that it will have some of the same characteristics as past projects, namely:
- Driven by new technology. Major advances in technology were key enablers of all of these big projects. For the human genome, it was capillary-based sequencing. For the HapMap project, it was high-throughput genotyping. For the 1,000 Genomes Project, it was massively parallel sequencing.
- Unprecedented in scope and scale. There seems to be this one-upmanship that occurs when big projects are initiated. “We can sequence an entire gene,” someone probably told Collins. “Great,” he no doubt replied. “Let’s sequence the entire genome.” Or perhaps when groups were genotyping 50,000 SNPs, he just suggested, “let’s do all of them.”
- A major advance in knowledge. The key to successfully initiating big projects, and certainly to getting funding for them, is a “big picture” deliverable: a huge new body of knowledge that the project will yield, like the complete human genome sequence, or all of the variants in it. These projects also seem to promise major advances that will immediately impact scientific research, and eventually improve human health. Look at what the HapMap did for genome-wide association studies.
Unfortunately for Francis, he’s set the bar pretty high already. I think that few people appreciate the scale of the 1,000 Genomes Project and just how much information it’s already yielding about human genetic variation. TGP is also driving development of new technologies (like capture) at a pace that no one expected. If the project meets its fundamental goal, thousands of genomes will be sampled, many of them completely sequenced to high coverage. I’m not sure that anyone can one-up the TGP, at least in terms of undiseased individuals.
My Guess: Disease-Related Omics at Unprecedented Scale
Granted, I’m biased, but I believe that next-gen or next-next-gen sequencing will be the key new technology that drives Collins’ Big New Project. This time around, however, I doubt that the individuals surveyed will be anonymous and undiseased, with no phenotypic information. Instead, I foresee an ambitious effort to characterize the *functional* variation in genomes of phenotyped individuals. Let’s say that sequencing an entire genome reaches the $1,000 mark. As far as big-project budgets go, that means “free.” So if sequencing had no cost, who would you sequence? Anyone whose sequence could be interesting. What would you sequence? The genome, the transcriptome, and the methylome.
Yet with NIH, there will be a decidedly medical focus, so we’re talking patients. Perhaps 1,000 patients for each of the 100 most prevalent genetics-influenced diseases. Such a project will keep us genome centers busy, certainly. More importantly, it will bring the research and medical communities closer together, because we’ll need geneticists, molecular biologists, and clinicians, working in concert with one another, to fully understand the relationship between genotype and phenotype.
Dan –
Interesting post, and I think your prediction makes a lot of sense. Just curious whether, in your opinion, a large-scale, NIH-led effort to sequence phenotyped individuals would involve de-identification efforts to protect the identity of the participants, even as phenotypic data was associated with genomic (and other) sequence data, or do you think Collins/NIH would ever embrace the Personal Genome Project model of identifiable research subjects?
– Dan Vorhaus