A study published online at Nature reports the identification of four recurrently mutated genes by whole-genome sequencing of four cases of chronic lymphocytic leukemia (CLL). CLL is the most common adult leukemia in western nations, with two major subtypes distinguished by somatic hypermutation of the immunoglobulin heavy chain (IgH) variable region. Led by Xose S. Puente of Universidad de Oviedo in Spain, the authors applied a combination of whole-genome sequencing, exome sequencing, and long-insert library sequencing to tumor samples and matched (normal) controls from two patients of each subtype.
Puente et al identified roughly 1,000 somatic mutations per tumor in unique regions, estimating a mutation rate of less than one per megabase. This is consistent with other leukemias, although (to my disappointment) the authors failed to refer to the first two sequenced leukemia genomes, AML1 (Ley et al, Nature 2008) and AML2 (Mardis et al, NEJM 2009), which I’ve cited below. In these four CLL cases, the most common substitution was G>A / C>T, which usually occurred in a CpG context. Interestingly, the mutation spectrum differed between subtypes; IGHV-mutated cases showed a higher fraction of A>C / T>G substitutions, and the A>C mutations often occurred at adenines preceded by a thymine. The context and pattern of mutations in IGHV-mutated cases were consistent with the action of an error-prone polymerase during the normal process of somatic hypermutation of IGHV genes.
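To make the arithmetic and the substitution notation concrete, here is a minimal sketch. The 3,000 Mb genome size, the function names, and the purine-first labeling convention are assumptions chosen to match the notation above; none of them come from the paper, and the per-megabase figure is only a back-of-the-envelope check.

```python
# Illustrative sketch only: names and the genome size are assumptions,
# not taken from Puente et al.

COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}

# Assumed approximate haploid human genome size, in megabases.
GENOME_MB = 3000


def mutations_per_mb(n_mutations, genome_mb=GENOME_MB):
    """Return the number of somatic mutations per megabase."""
    return n_mutations / genome_mb


def substitution_class(ref, alt):
    """Collapse a substitution with its reverse complement.

    The purine-reference form is written first, which reproduces the
    labels used in the text ("G>A / C>T", "A>C / T>G").
    """
    forward = f"{ref}>{alt}"
    reverse = f"{COMPLEMENT[ref]}>{COMPLEMENT[alt]}"
    if ref in "AG":
        return f"{forward} / {reverse}"
    return f"{reverse} / {forward}"


def in_cpg_context(prev_base, ref, next_base):
    """True if the mutated base is the C or the G of a CpG dinucleotide."""
    return (ref == "C" and next_base == "G") or (ref == "G" and prev_base == "C")


if __name__ == "__main__":
    print(f"{mutations_per_mb(1000):.2f} mutations per Mb")
    print(substitution_class("C", "T"))
    print(in_cpg_context("A", "C", "G"))
```

With roughly 1,000 mutations spread over an assumed 3,000 Mb, the rate works out to about a third of a mutation per megabase, in line with the "less than one per megabase" estimate.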
Somatic Coding Mutations
The authors divided somatic mutations into three classes:
- Class 1 mutations, which include nonsynonymous substitutions and frameshift indels
- Class 2 mutations, which include synonymous and UTR substitutions
- Class 3 mutations, comprising everything else.
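The three classes above can be sketched as a simple lookup. The annotation labels below ("nonsynonymous", "frameshift_indel", and so on) are hypothetical stand-ins for whatever an annotation pipeline would emit; they are not the authors' actual vocabulary.

```python
from collections import Counter

# Hypothetical annotation labels, used for illustration only.
CLASS_1 = {"nonsynonymous", "frameshift_indel"}
CLASS_2 = {"synonymous", "utr"}


def mutation_class(annotation):
    """Map an annotation label to class 1, 2, or 3 as described in the list above."""
    if annotation in CLASS_1:
        return 1
    if annotation in CLASS_2:
        return 2
    return 3  # everything else


def count_by_class(annotations):
    """Tally how many mutations fall into each class."""
    return Counter(mutation_class(a) for a in annotations)


if __name__ == "__main__":
    # Made-up labels for a handful of mutations, not real data.
    example = ["nonsynonymous", "utr", "intronic", "intergenic", "synonymous"]
    print(count_by_class(example))
```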
This classification system is similar to my group’s approach, which we apply separately to SNVs and indels. We classify variants as “tier 1” if they affect coding sequences, “tier 2” if they affect conserved bases or known regulatory elements, “tier 3” if they map to unique noncoding regions, or “tier 4” otherwise. For the present study, however, I dug into supplementary information to build this summary table of somatic coding mutations:
Summarized in the above fashion, these mutation counts are similar to the number observed in AML1 (n=10) and AML2 (n=12). The relatively small number of somatic coding mutations in leukemia is just incredible.
Recurrent Mutations in CLL
Using a pooled sequencing strategy, Puente et al screened for mutations in 26 genes among 169 additional CLL cases. The rate of recurrence is reported for 363 CLL patients; it was unclear how the ~200 additional cases were examined. In any event, four genes proved to harbor recurrent mutations in CLL:
- NOTCH1 (12% of cases), a key signaling molecule involved in developmental processes that controls cell fate decisions. The observed mutations generate a truncated protein lacking the PEST sequence, which was constitutively activated and more stable than the wild-type isoform. NOTCH1-mutated patients had more advanced CLL at presentation.
- MYD88 (2.9% of cases), an effector molecule for IL1 and TLR receptor signaling. In mutated cells, activation of IL-1 or TLR signaling triggered a dramatic over-production of IL1RA, IL6, CCL2, CCL3, and CCL4. The high production of these cytokines is known to recruit macrophages and T-lymphocytes, creating a favorable micro-environment for tumor survival. Indeed, patients with MYD88 mutations were diagnosed at a younger age, and with more advanced tumors.
- XPO1 (1.1% of cases), which encodes exportin 1, a protein implicated in the nuclear export of proteins and mRNAs (including MAP kinases). Notably, all four cases with this mutation were of the IGHV-unmutated subtype and had NOTCH1 mutations, indicating a possible synergistic effect between mutated NOTCH1 and XPO1.
- KLHL6 (0.8% of cases), which plays a role in germinal center formation during B-cell maturation. The three mutated cases harbored multiple point mutations, consistent with somatic hypermutation.
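As a quick consistency check, the reported percentages can be converted back into approximate case counts. This sketch assumes every percentage uses the full 363-patient cohort as its denominator, which the text does not confirm; the function name is my own.

```python
# Assumption: each percentage is a fraction of all 363 patients.
COHORT_SIZE = 363

REPORTED_PERCENT = {
    "NOTCH1": 12.0,
    "MYD88": 2.9,
    "XPO1": 1.1,
    "KLHL6": 0.8,
}


def approx_cases(percent, cohort=COHORT_SIZE):
    """Convert a recurrence percentage into a rounded number of cases."""
    return round(percent / 100 * cohort)


if __name__ == "__main__":
    for gene, pct in REPORTED_PERCENT.items():
        print(f"{gene}: ~{approx_cases(pct)} of {COHORT_SIZE} cases")
```

Under that assumption, XPO1 comes out at about four cases and KLHL6 at about three, which agrees with the case counts mentioned in the list.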
Based on functional and clinical analyses, the authors conclude that mutations in NOTCH1, MYD88, and XPO1 are oncogenic changes that contribute to the clinical evolution of CLL.
Structural Variants
Using paired-end sequence data and a basic analytical approach, the authors identified ten somatic structural variants (SVs), most of which were known events in CLL. Three of four cases harbored a deletion of 13q14; the minimally-deleted region includes several genes and a couple of micro-RNAs. This is a known lesion in CLL, and was not pursued further in the main text. From the copy number data in Figure 1, it is clear that these CLL genomes harbor relatively few genomic rearrangements, which is again consistent with what we’ve seen for acute leukemia.
Variant Detection Sensitivity and Specificity
The authors mention that they employed not just WGS but exome sequencing, though the latter finds no place in the main text. Looking through the supplemental materials, I found that some 42 mutations were identified and validated by exome sequencing. Of these, 37 were found using WGS data, suggesting a sensitivity of ~88% for somatic coding mutations. All mutations were manually reviewed to remove common sequencing- and alignment-related artifacts. Some validation was performed using PCR and Sanger sequencing; among the 86 class 1 / class 2 variants for which PCR and Sanger sequence data were obtained, 83 proved to be valid somatic mutations. An additional 384 random mutations (96 per tumor) underwent validation as well, and 96% of these were validated. This is an impressive specificity, though I would attribute it to the manual review process, which may not scale to genomes with more than 10-20 somatic mutations.
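The figures quoted above reduce to simple proportions. A minimal sketch, with a helper name of my own choosing, using only the counts given in this section:

```python
def proportion(successes, total):
    """Return successes / total, guarding against an empty denominator."""
    if total <= 0:
        raise ValueError("total must be positive")
    return successes / total


# 37 of the 42 exome-validated mutations were also found in the WGS data.
wgs_sensitivity = proportion(37, 42)

# 83 of the 86 class 1 / class 2 variants tested by PCR and Sanger were valid.
class12_validation_rate = proportion(83, 86)

if __name__ == "__main__":
    print(f"WGS sensitivity for coding mutations: {wgs_sensitivity:.1%}")
    print(f"Class 1/2 validation rate: {class12_validation_rate:.1%}")
```

Strictly speaking, the second number is a validation rate (the fraction of called variants that were confirmed), which is what this post loosely refers to as specificity.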
Puente XS, Pinyol M, Quesada V, Conde L, Ordóñez GR, Villamor N, Escaramis G, Jares P, Beà S, González-Díaz M, Bassaganyas L, Baumann T, Juan M, López-Guerra M, Colomer D, Tubío JM, López C, Navarro A, Tornador C, Aymerich M, Rozman M, Hernández JM, Puente DA, Freije JM, Velasco G, Gutiérrez-Fernández A, Costa D, Carrió A, Guijarro S, Enjuanes A, Hernández L, Yagüe J, Nicolás P, Romeo-Casabona CM, Himmelbauer H, Castillo E, Dohm JC, de Sanjosé S, Piris MA, de Alava E, Miguel JS, Royo R, Gelpí JL, Torrents D, Orozco M, Pisano DG, Valencia A, Guigó R, Bayés M, Heath S, Gut M, Klatt P, Marshall J, Raine K, Stebbings LA, Futreal PA, Stratton MR, Campbell PJ, Gut I, López-Guillermo A, Estivill X, Montserrat E, López-Otín C, & Campo E (2011). Whole-genome sequencing identifies recurrent mutations in chronic lymphocytic leukaemia. Nature PMID: 21642962
Mardis, E., Ding, L., Dooling, D., Larson, D., McLellan, M., Chen, K., Koboldt, D., Fulton, R., Delehaunty, K., McGrath, S., Fulton, L., Locke, D., Magrini, V., Abbott, R., Vickery, T., Reed, J., Robinson, J., Wylie, T., Smith, S., Carmichael, L., Eldred, J., Harris, C., Walker, J., Peck, J., Du, F., Dukes, A., Sanderson, G., Brummett, A., Clark, E., McMichael, J., Meyer, R., Schindler, J., Pohl, C., Wallis, J., Shi, X., Lin, L., Schmidt, H., Tang, Y., Haipek, C., Wiechert, M., Ivy, J., Kalicki, J., Elliott, G., Ries, R., Payton, J., Westervelt, P., Tomasson, M., Watson, M., Baty, J., Heath, S., Shannon, W., Nagarajan, R., Link, D., Walter, M., Graubert, T., DiPersio, J., Wilson, R., & Ley, T. (2009). Recurring Mutations Found by Sequencing an Acute Myeloid Leukemia Genome. New England Journal of Medicine, 361 (11), 1058-1066 DOI: 10.1056/NEJMoa0903840
Ley TJ, Mardis ER, Ding L, Fulton B, McLellan MD, Chen K, Dooling D, Dunford-Shore BH, McGrath S, Hickenbotham M, Cook L, Abbott R, Larson DE, Koboldt DC, Pohl C, Smith S, Hawkins A, Abbott S, Locke D, Hillier LW, Miner T, Fulton L, Magrini V, Wylie T, Glasscock J, Conyers J, Sander N, Shi X, Osborne JR, Minx P, Gordon D, Chinwalla A, Zhao Y, Ries RE, Payton JE, Westervelt P, Tomasson MH, Watson M, Baty J, Ivanovich J, Heath S, Shannon WD, Nagarajan R, Walter MJ, Link DC, Graubert TA, DiPersio JF, & Wilson RK (2008). DNA sequencing of a cytogenetically normal acute myeloid leukaemia genome. Nature, 456 (7218), 66-72 PMID: 18987736