This month in the New England Journal of Medicine, James Lupski and colleagues report sequencing the complete genome of an individual with familial Charcot-Marie-Tooth (CMT) disease. The “individual” is Lupski himself: he not only led the study but also served as patient zero. From conversations with some of my colleagues at Baylor, it’s clear that Dr. Lupski has devoted much of his career to understanding CMT disease; his association with one of the big three genome centers was the driving force behind this project.
The clinical background is rather interesting. Dr. Lupski and three of his siblings were diagnosed with CMT that appeared to be autosomal recessive, since their parents and grandparents were unaffected. Intriguingly, however, their father and paternal grandmother both shared a less severe disorder, axonal neuropathy, that would later prove to arise from haploinsufficiency in the CMT disease-causing gene.
Why Must There Always Be A Problem?
CMT, like many Mendelian disorders, has proven to be a genetically heterogeneous disease. It can segregate in an autosomal dominant, recessive, or X-linked manner. Single nucleotide polymorphisms (SNPs) and/or copy number variants (CNVs) at some 39 loci confer susceptibility to the disease. However, when the Lupski pedigree was tested for mutations in some of the more commonly implicated CMT genes (PMP22, MPZ, PRX, GDAP1, and EGR2), no causative variant was found.
Although his institution (BCM) is a leader in exome capture and sequencing, Lupski and colleagues decided on a whole-genome sequencing approach. It was probable, but not certain, that the disease-causing variant was in a coding region, and it might also have been a copy number change, which a capture approach would not detect. Thus, with about four runs on a Life Technologies SOLiD instrument, the Lupski genome was sequenced to 30x coverage.
Finding the Causative Variants
Sequencing turned up roughly 3.4 million SNPs and small indels, which is right on the money for WGS of a single individual but terribly daunting when you are searching for a single mutation. The authors whittled down the list of suspects in a series of steps: first they isolated intragenic variants (1.17 million), then prioritized nonsynonymous variants (9,069), and finally cross-referenced these with the ~40 genes linked to CMT (54 coding SNPs). Intriguingly, two of the 54 SNPs were in SH3TC2, a gene previously implicated in CMT in eastern European families. One, carried by Dr. Lupski’s mother, was a known nonsense mutation. The other, carried by his father and paternal grandmother, was a novel missense change that segregated with axonal neuropathy in the pedigree.
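To make the filtering funnel concrete, here is a minimal Python sketch of the three steps described above (intragenic, then nonsynonymous, then candidate-gene overlap), followed by a check for genes hit more than once. It is not the authors' pipeline. The `Variant` record, the effect labels, the placeholder genes `GENE_A` and `GENE_B`, and the toy variant list are all assumptions made up for illustration, and `CMT_GENES` holds only the gene names mentioned in this post rather than the full ~40-gene list.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Optional

# Assumption: only the CMT genes named in this post, not the full ~40-gene list.
CMT_GENES = {"PMP22", "MPZ", "PRX", "GDAP1", "EGR2", "SH3TC2"}

# Assumption: nonsense changes are counted as nonsynonymous alongside missense.
NONSYNONYMOUS = {"missense", "nonsense"}


@dataclass(frozen=True)
class Variant:
    label: str
    gene: Optional[str]  # None means the variant is intergenic
    effect: str          # "missense", "nonsense", "synonymous" or "noncoding"


def filter_funnel(variants, candidate_genes):
    """Apply the three filters in order and return each intermediate list."""
    intragenic = [v for v in variants if v.gene is not None]
    nonsynonymous = [v for v in intragenic if v.effect in NONSYNONYMOUS]
    candidates = [v for v in nonsynonymous if v.gene in candidate_genes]
    return intragenic, nonsynonymous, candidates


def genes_with_multiple_hits(candidates):
    """Genes carrying two or more surviving variants (possible compound hets)."""
    counts = Counter(v.gene for v in candidates)
    return sorted(gene for gene, n in counts.items() if n >= 2)


# Toy input, invented for illustration only.
toy_variants = [
    Variant("v1", None, "noncoding"),
    Variant("v2", "GENE_A", "noncoding"),
    Variant("v3", "GENE_A", "synonymous"),
    Variant("v4", "GENE_B", "missense"),
    Variant("v5", "SH3TC2", "nonsense"),
    Variant("v6", "SH3TC2", "missense"),
    Variant("v7", "MPZ", "missense"),
]

intragenic, nonsynonymous, candidates = filter_funnel(toy_variants, CMT_GENES)
print(len(toy_variants), len(intragenic), len(nonsynonymous), len(candidates))
print(genes_with_multiple_hits(candidates))
```

Each step only ever shrinks the list, which is why the order of the filters affects the intermediate counts but not the final set of candidates. Flagging genes with two or more surviving variants is a natural last step when a recessive disease is suspected, since two different hits in one gene may mean one was inherited from each parent.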
Sequencing and Disease Pedigrees
The authors rightly conclude that this study demonstrates the diagnostic power of whole genome sequencing, and that “as a practical matter, the identification of rare, heterogeneous alleles by means of whole-genome sequencing may be the only way to definitively determine genetic contributions to the associated clinical phenotypes.” Another important take-home message from this study, however, is the critical importance of large, well-characterized pedigrees for the study of inherited disease. Indeed, in the absence of exhaustive functional validation, the best way to confirm that you’ve found the disease-causing mutation is to show that it segregates with the phenotype in the rest of the pedigree.
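A segregation check of this kind is simple to express in code. The sketch below is a toy under stated assumptions, not the analysis from the paper. The carrier status of the mother, father, and paternal grandmother follows the description above. The genotypes of the affected siblings (assumed to carry both SH3TC2 alleles, as a recessive model would predict) and the two unaffected siblings are hypothetical, and the function and variable names are invented for illustration.

```python
def segregates(genotypes, affected, is_risk_genotype):
    """True if the risk genotype is present in every affected family member
    and absent from every unaffected one."""
    return all(
        is_risk_genotype(alleles) == (person in affected)
        for person, alleles in genotypes.items()
    )


def is_compound_het(alleles):
    """Risk model 1: carries both the missense and the nonsense allele."""
    return {"missense", "nonsense"} <= alleles


def carries_nonsense(alleles):
    """Risk model 2: carries the nonsense allele, with or without the other."""
    return "nonsense" in alleles


# SH3TC2 alleles per person. The siblings' genotypes are hypothetical.
pedigree = {
    "paternal_grandmother": {"missense"},
    "father": {"missense"},
    "mother": {"nonsense"},
    "proband": {"missense", "nonsense"},
    "affected_sib_1": {"missense", "nonsense"},
    "affected_sib_2": {"missense", "nonsense"},
    "affected_sib_3": {"missense", "nonsense"},
    "unaffected_sib_1": {"nonsense"},  # hypothetical carrier
    "unaffected_sib_2": set(),         # hypothetical non-carrier
}
cmt_affected = {"proband", "affected_sib_1", "affected_sib_2", "affected_sib_3"}

print(segregates(pedigree, cmt_affected, is_compound_het))   # True
print(segregates(pedigree, cmt_affected, carries_nonsense))  # False
```

In this toy pedigree, only the two-allele model matches the CMT phenotype exactly. The nonsense allele alone fails because the mother carries it without having CMT. The more family members you can genotype and phenotype, the harder it is for a variant to pass this test by chance.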
Perhaps the most important aspect of this study is the venue, the New England Journal of Medicine. Like our study of AML2 last year, the paper demonstrates the power of next-generation sequencing to a different audience: the clinicians and medical practitioners who interact directly with patients.
References
Lupski JR, Reid JG, Gonzaga-Jauregui C, Rio Deiros D, Chen DC, Nazareth L, Bainbridge M, Dinh H, Jing C, Wheeler DA, McGuire AL, Zhang F, Stankiewicz P, Halperin JJ, Yang C, Gehman C, Guo D, Irikat RK, Tom W, Fantin NJ, Muzny DM, & Gibbs RA (2010). Whole-Genome Sequencing in a Patient with Charcot-Marie-Tooth Neuropathy. The New England Journal of Medicine. PMID: 20220177
ted choi says
why not just do mismatch detection/snp discovery in the ~40 candidate genes? if there’s a candidate gene filter in the analysis plan, i don’t see how this is a demonstration of the power of whole genome sequencing.