Genome sequencing of multiple myeloma

A recent study in Nature reports an initial view of the genome of multiple myeloma (MMY), an incurable cancer of plasma cells (B cells) in the blood. Though it is the second most common hematological malignancy, MMY remains poorly understood. Some 40% of cases harbor structural alterations that place genes in proximity to the IgH locus, leading to their over-expression. Yet these rearrangements seem insufficient to cause MMY alone, as they are also found in its pre-malignant form, called monoclonal gammopathy of undetermined significance (MGUS). Indeed, other genetic events – activation of MYC, KRAS, NRAS, and the NF-KB pathway, in some cases – are required for progression to malignant disease.

Chapman et al. assessed mutations in 38 MMY tumors using a combination of whole-genome (23 cases) and exome (16 cases) sequencing; one tumor was sequenced by both approaches, which is why the counts sum to 39. They found ~35 protein-altering mutations per tumor and estimated a genome-wide mutation rate of 2.9 per megabase for this cancer type. That’s slightly higher than what we observed for AML1 and AML2, though not as high as the mutation rates of solid tumors such as breast and lung carcinomas.

Technical Concerns: Matched “Normal” and Mutation Calling

The findings should be considered in light of some technical issues. First, the matched normal samples used to distinguish germline variation from somatic mutations were blood samples. I’ve heard some concerns about this, since blood undoubtedly contains circulating, cancerous mature B cells. This could reduce the sensitivity of mutation calling, as high-frequency mutations may be misclassified as germline due to their presence in the normal sample. A skin punch, I’m told, would have been a better control.

Second, most of the mutations reported have not been experimentally validated. Instead, the authors hand-selected 100 predicted mutations for validation. They were able to design Sequenom assays for 92 of these, and 87 proved to be valid somatic mutations. From this, they inferred a true-positive rate of 95% and performed no further validation. I’m concerned that such a limited test is the basis of the authors’ claim that “mutation calling was highly accurate” and very concerned that only 87 of approximately 1,330 somatic coding mutations have been experimentally validated.
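To put numbers on that concern, here is a minimal sketch of the arithmetic. The counts (92 assayed, 87 confirmed, roughly 1,330 coding mutations) come from the paragraph above; the Wilson score interval is my own addition, not something the authors report, and it assumes the 92 assayed calls behave like a random sample, which hand-selected calls may not.

```python
import math

def wilson_interval(successes, trials, z=1.96):
    """Wilson score interval for a binomial proportion (z=1.96 gives ~95%)."""
    p = successes / trials
    denom = 1 + z**2 / trials
    center = (p + z**2 / (2 * trials)) / denom
    half = z * math.sqrt(p * (1 - p) / trials + z**2 / (4 * trials**2)) / denom
    return center - half, center + half

assayed = 92          # predicted mutations with a working Sequenom assay
confirmed = 87        # of those, confirmed as somatic
total_coding = 1330   # approximate number of somatic coding mutations reported

true_positive_rate = confirmed / assayed
low, high = wilson_interval(confirmed, assayed)
fraction_validated = confirmed / total_coding

print(f"True-positive rate: {true_positive_rate:.1%}")
print(f"Approximate 95% interval: {low:.1%} to {high:.1%}")
print(f"Fraction of coding calls validated: {fraction_validated:.1%}")
```

The interval is only as good as the sampling assumption behind it. If the 100 candidates were chosen because they looked convincing, the true rate across all calls could sit below the lower bound.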

Further, the comparison of WGS-versus-exome for mutation calling, from a single tumor sequenced by both approaches, is problematic. Overall, there were 24 shared coding mutations, 5 called in exome-only, and 14 called in WGS-only. However, if one considers only the exons targeted by capture reagents, there are just 4 called in WGS-only. From this, the authors infer that exome sensitivity is 29/33 (88%) for targeted exons, 29/43 (67%) for all exons, and that WGS sensitivity is 38/43 (88%). Of course, there’s no validation data backing up either set of unique calls. Further, these estimates all rely on the same algorithm, muTector, which may be under-calling mutations.
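The sensitivity figures follow from treating the union of calls from both platforms as the truth set. The sketch below reproduces that arithmetic from the counts quoted above. The variable and function names are mine, and the union-as-truth convention is my reading of how the fractions were derived rather than a documented method. Under this convention the WGS denominator is the full 43-call union.

```python
shared = 24             # coding mutations called by both exome and WGS
exome_only = 5          # called only in the exome data
wgs_only_all = 14       # called only in WGS, across all exons
wgs_only_targeted = 4   # called only in WGS, within capture-targeted exons

def sensitivity(called, truth_set_size):
    """Fraction of the assumed truth set recovered by one platform."""
    return called / truth_set_size

exome_calls = shared + exome_only                               # 29
wgs_calls = shared + wgs_only_all                               # 38
union_all = shared + exome_only + wgs_only_all                  # 43
union_targeted = shared + exome_only + wgs_only_targeted        # 33

exome_targeted = sensitivity(exome_calls, union_targeted)
exome_all = sensitivity(exome_calls, union_all)
wgs_all = sensitivity(wgs_calls, union_all)

print(f"Exome, targeted exons: {exome_calls}/{union_targeted} = {exome_targeted:.0%}")
print(f"Exome, all exons:      {exome_calls}/{union_all} = {exome_all:.0%}")
print(f"WGS, all exons:        {wgs_calls}/{union_all} = {wgs_all:.0%}")
```

Because the truth set is built from the calls themselves, any mutation missed by both platforms is invisible here, so these are upper bounds on sensitivity at best.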

Finally, the analysis of structural variation is extremely limited. Although the authors failed to validate any putative SVs (those that they attempted couldn’t be confirmed), they make occasional references to them throughout the study. Without orthogonal validation to demonstrate that one’s SV-calling algorithm is accurate, such claims should not be made.

Differences in Genome-Wide Mutation Rate

Even so, with whole-genome sequence data in hand, the authors were able to perform a relatively unbiased, genome-wide analysis of mutation rate. Unsurprisingly, mutations occurred four times more commonly at CpG dinucleotides than at A or T bases. When they compared mutation rates between coding, intronic, and intergenic regions, two patterns were strikingly apparent. First, mutations were less frequent in coding sequences, likely due to negative selection against protein-altering mutations. Second, the mutation rate was lower in intronic sequences (within genes, but non-coding) than for intergenic sequences (outside of genes). The authors propose transcription-coupled repair as a possible explanation for this pattern. A lower mutation rate in genes that are expressed in MMY lends further support to this theory, although, technically speaking, they’d need to show this correlation in the stem cells (not tumor cells). I’m told this can’t be done, and even so, the correlation is there.

The coding/intronic/intergenic mutation rate difference is not truly a novel finding. From my HapMap days, I recall that allele frequencies of SNPs tend to be lower, the farther they get from genes. This observation could be attributed to natural selection – either negative selection against variants in regulatory sequences near genes, or else “hitch-hiking” selection in which variants near genes are affected by selective pressure on their coding-sequence neighbors.

Frequently Mutated Genes

Perhaps the strongest element of this study was the sequencing of many samples, which enabled an unbiased search for recurrently mutated genes. The authors identified 10 significantly mutated genes:

Gene	Mutations	Description
NRAS	12	Neuroblastoma RAS oncogene
KRAS	16	Kirsten rat sarcoma RAS oncogene
FAM46C	8	Family with sequence similarity 46, member C
DIS3	5	RNA exonuclease; homolog of mitotic control gene in yeast
TP53	4	The classic p53 tumor-suppressor
CCND1	3	Cyclin D1, a known oncogene involved in cell cycle control
PNRC1	4	Proline-rich nuclear receptor coactivator 1
ALOX12B	3	Arachidonate 12-lipoxygenase
HLA-A	2	Human leukocyte antigen (MHC class I), alpha
MAGED1	3	Melanoma antigen family D1

Three of these (NRAS, KRAS, and TP53) were known to play a role in MMY, and two more (CCND1, MAGED1) were already linked to human cancers. The observation of two significantly mutated genes (SMGs) involved in translational processes (DIS3 and FAM46C) suggests a role for protein translation and homeostasis in MMY pathogenesis, though I think that more MMY genomes are necessary to strengthen such a finding.

BRAF Mutations and NF-KB Pathway Members

One of the MMY tumors studied here harbored a novel BRAF mutation, motivating the authors to screen 161 additional multiple myelomas for the 12 most common mutations in this gene. Some 4% had BRAF mutations, which has clinical relevance because of the availability of BRAF inhibitors. Again, more genomes are needed, because a finding that might help 4% of patients isn’t quite as exciting to me as it is to the authors.

Gene set analysis highlighted the NF-KB pathway, the members of which harbored 15 alterations (mutations or SVs) affecting 11 different genes (BTRC, CARD11, CYLD, IKBIP, IKBKB, MAP3K1, MAP3K13, RIPK4, TLR4, TNFRSF1A, TRAF3). Notably, MAP3K1 is one of the significantly mutated genes among 50 breast tumors sequenced by the Genome Institute at Washington University. The NF-KB pathway was already known to be activated in MMY, but the current study sheds light on the diverse mechanisms by which this activation can be achieved.

Mutations in Non-coding Regions

Whole-genome sequencing of multiple tumors also enabled an analysis of significantly mutated non-coding sequences, which I found rather interesting. The authors delineated 2.4 million non-coding regions with regulatory potential, averaging 280 bp in size, and subjected them to the same permutation-type analysis as was used for gene significance testing. They identified multiple non-coding regions with mutation frequencies significantly higher than expected by chance. Some were known regions of somatic hypermutation, where the mutation rate is 1,000x higher, as expected for mature B cells. However, there were 18 novel significantly mutated non-coding regions, or “SMNRs” as I’d like to coin them. Four of these were near genes that were also mutated in MMY tumors, notably BCL7A, a putative tumor suppressor. These are intriguing findings that require more work, but they were only made possible by whole-genome sequencing.
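For a rough sense of how few mutations it takes for a 280 bp region to stand out, here is a back-of-the-envelope sketch. It is not the authors' permutation-type analysis: I substitute a simple Poisson model with a uniform background rate of 2.9 mutations per megabase across the 23 whole genomes, and apply a Bonferroni correction over 2.4 million regions. The uniform rate, the correction, and all function names are my assumptions, and the real background rate clearly varies by sequence context.

```python
import math

def poisson_tail(k, lam, max_terms=200):
    """P(X >= k) for X ~ Poisson(lam).

    Terms are summed upward from k to keep precision for very small
    tails. Assumes lam is small relative to max_terms.
    """
    if k <= 0:
        return 1.0
    term = math.exp(-lam) * lam**k / math.factorial(k)
    total = 0.0
    for i in range(k, k + max_terms):
        total += term
        term *= lam / (i + 1)
    return total

rate_per_base = 2.9e-6   # 2.9 mutations per megabase, per tumor genome
region_bp = 280          # average region size quoted in the text
n_genomes = 23           # whole-genome cases
n_regions = 2_400_000    # regions tested, used for Bonferroni correction

# Expected mutations in one region, summed over all genomes
expected = rate_per_base * region_bp * n_genomes

def corrected_p(observed):
    """Bonferroni-corrected tail probability, capped at 1."""
    return min(1.0, poisson_tail(observed, expected) * n_regions)

for observed in range(1, 6):
    print(observed, corrected_p(observed))
```

Under these assumptions the expected count per region is far below one, so a handful of mutations in the same small region across 23 genomes is already unusual. That is also why hypermutated loci dominate such a scan unless they are modeled separately.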

In conclusion, Chapman and colleagues present the first whole-genome sequencing of multiple tumors, bolstered by exome sequencing of additional samples. As they freely admit in the discussion, the analysis presented here is preliminary, and additional MMY genomes will be required to definitively establish the genetic landscape of this disease.

References
Chapman MA, Lawrence MS, Keats JJ, Cibulskis K, Sougnez C, Schinzel AC, Harview CL, Brunet JP, Ahmann GJ, Adli M, Anderson KC, Ardlie KG, Auclair D, Baker A, Bergsagel PL, Bernstein BE, Drier Y, Fonseca R, Gabriel SB, Hofmeister CC, Jagannath S, Jakubowiak AJ, Krishnan A, Levy J, Liefeld T, Lonial S, Mahan S, Mfuko B, Monti S, Perkins LM, Onofrio R, Pugh TJ, Rajkumar SV, Ramos AH, Siegel DS, Sivachenko A, Stewart AK, Trudel S, Vij R, Voet D, Winckler W, Zimmerman T, Carpten J, Trent J, Hahn WC, Garraway LA, Meyerson M, Lander ES, Getz G, & Golub TR (2011). Initial genome sequencing and analysis of multiple myeloma. Nature, 471 (7339), 467-72 PMID: 21430775