October is Breast Cancer Awareness Month, and the timing couldn’t be better. Our friends at the BC Cancer Agency published the whole genome sequencing of a breast cancer this week in a letter to Nature.
Using Illumina paired-end sequencing, Shah et al. generated 141 Gbp of sequence to achieve 43x haploid coverage of a metastatic lobular breast cancer. Some 32 somatic, protein-altering (nonsynonymous) mutations were identified, of which 11 could be detected in the primary tumor sample from 9 years earlier. Deep RNA-Seq data from the metastatic sample also permitted transcriptome analysis, though its presentation was brief. Interestingly, the authors validated two novel RNA-editing events that change the amino acid sequences of SRP9 and COG3.
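As a rough sanity check on the coverage figure, haploid coverage is simply total sequenced bases divided by genome size. The sketch below is illustrative arithmetic only: the 3.1 Gbp genome size is my assumption, not a number from the paper, and the function name is hypothetical.

```python
def haploid_coverage(total_bases: float, genome_size: float) -> float:
    """Average depth: total sequenced bases divided by haploid genome size."""
    return total_bases / genome_size


TOTAL_BASES = 141e9          # 141 Gbp of sequence, as reported
REPORTED_COVERAGE = 43       # 43x haploid coverage, as reported
ASSUMED_GENOME_SIZE = 3.1e9  # ASSUMPTION: approximate haploid human genome size

# Coverage implied by the raw yield under the assumed genome size.
raw_coverage = haploid_coverage(TOTAL_BASES, ASSUMED_GENOME_SIZE)

# Denominator implied by the two reported numbers taken together.
implied_genome_size = TOTAL_BASES / REPORTED_COVERAGE

print(f"raw coverage at assumed genome size: {raw_coverage:.1f}x")
print(f"genome size implied by 141 Gbp / 43x: {implied_genome_size / 1e9:.2f} Gbp")
```

Under this assumption the raw yield works out a little above 43x. A small gap like that would be expected if, for example, not every sequenced base aligned to the reference, but the post does not say how the 43x figure was computed.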
No WGS of Normal or Primary Tumor?
I realize that a letter to Nature must be brief, but even so, what struck me most about this paper is what’s missing. First of all, only the metastatic sample was whole-genome sequenced; the primary tumor and matched normal were not. Instead, the authors identified nonsynonymous coding variants in the WGS data from the metastasis, and validated them by PCR/3730 sequencing in the metastasis, primary tumor, and normal samples. This seems laborious to me, since there were 1,120 nonsynonymous SNVs, of which 437 (39%) were validated and only 32 (<3%) were absent in the normal and therefore somatic. Another regrettable limitation of this approach is that it doesn’t offer a complete picture of the somatic mutations beyond nonsynonymous-coding events.
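To put the validation funnel in numbers, here is a minimal sketch using only the three counts quoted above. The variable names are mine, and the "candidates per somatic mutation" figure assumes one validation assay per candidate SNV, which is an assumption on my part.

```python
def pct(part: int, whole: int) -> float:
    """Percentage of `whole` represented by `part`."""
    return 100.0 * part / whole


candidate_snvs = 1120  # nonsynonymous SNVs called from the metastasis WGS data
validated_snvs = 437   # confirmed by PCR/3730 sequencing
somatic_snvs = 32      # confirmed and absent from the normal sample

validation_rate = pct(validated_snvs, candidate_snvs)
somatic_of_candidates = pct(somatic_snvs, candidate_snvs)
somatic_of_validated = pct(somatic_snvs, validated_snvs)

# ASSUMPTION: one validation assay per candidate SNV.
candidates_per_somatic = candidate_snvs / somatic_snvs

print(f"validated:            {validation_rate:.1f}% of candidates")
print(f"somatic:              {somatic_of_candidates:.1f}% of candidates")
print(f"somatic:              {somatic_of_validated:.1f}% of validated calls")
print(f"candidates per somatic mutation: {candidates_per_somatic:.0f}")
```

In other words, most of the validated calls were germline variants, which is the work that sequencing the matched normal would have filtered out up front.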
My understanding of Nature journals is that there’s no limit on supplementary material that accompanies publications. Thus, I don’t understand why the methods are incomplete. For example, though the authors found and confirmed >60 germline indels, there’s no description anywhere of the indel-calling algorithm. There’s a lot of text describing their internally developed SNVmix algorithm to identify SNVs, but no link to download it that I could find. No mention of dbSNP or Affy SNP array concordance for SNVmix calls was offered, so one cannot evaluate the algorithm. Also, there’s no description of read de-duplication, which is alarming because it suggests that duplicate reads from the same molecule weren’t removed prior to analysis.
The Importance of RNA-Seq
I do like that the authors performed RNA-Seq of the transcriptome, which provides insights into mechanisms like alternative splicing (AS), allele-specific expression (ASE), and RNA editing. Sadly, only the last one received mention in the results section, suggesting that no significant AS or ASE events were found. Interestingly, not only did the authors validate two instances of high-frequency, protein-altering RNA editing (COG3 and SRP9), but they found that the ADAR RNA-editing enzyme was one of the most highly expressed genes in the metastasis. The authors note that “these observations emphasize the importance of integrating RNA-seq data with tumor genomes,” although this claim would have been far better supported if one did not have to dig through a massive, disorganized Excel file for most of the RNA-seq data.
“Evolution” of a Breast Cancer Tumor
Perhaps the most intriguing – and contentious – finding of the paper (as highlighted by GT’s In Sequence magazine and Keith Robison on Omics Omics) was that few of the somatic mutations in the metastasis were detected in the primary tumor sample from 9 years earlier. PCR and deep resequencing of mutation-containing amplicons in the metastasis and primary tumor allowed for a frequency analysis of the 32 somatic mutations. Five of these (in ABCB11, HAUS3, SLC24A4, SNX4, and PALB2) were present at high levels in the primary tumor, while another six (in KIF1C, USP28, MYH8, MORC1, KIAA1468, and RNASEH2A) were detectable at lower frequencies (1-3%). Of the remaining 21 mutations, 19 were not detected at all and 2 could not be determined.
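The breakdown above can be tallied in a few lines. This is just bookkeeping on the gene lists and counts quoted in this post; the category labels are my own shorthand, not the paper's terminology.

```python
# Somatic mutations from the metastasis, grouped by status in the primary tumor.
detected_in_primary = {
    "high_frequency": ["ABCB11", "HAUS3", "SLC24A4", "SNX4", "PALB2"],
    "low_frequency": ["KIF1C", "USP28", "MYH8", "MORC1", "KIAA1468", "RNASEH2A"],
}
not_detected = 19   # no evidence of the mutation in the primary tumor
undetermined = 2    # status in the primary tumor could not be resolved

n_detected = sum(len(genes) for genes in detected_in_primary.values())
n_total = n_detected + not_detected + undetermined
fraction_detected = n_detected / n_total

print(f"detected in primary: {n_detected} of {n_total} ({fraction_detected:.0%})")
```

The tally reproduces the 11 of 32 figure: roughly a third of the somatic mutations seen in the metastasis had any detectable presence in the primary tumor.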
I’m not an oncologist, but I still wonder how surprising it should be that many of the mutations in a metastatic tumor are absent from a primary tumor almost a decade earlier. Are these simply passenger mutations that arose from a surviving subclone from the original tumor, or are they key drivers of metastasis and tumor growth? Or was it the intervening radiation and therapy that caused these mutations? There was zero discussion of the known functions of these genes in this paper, so it’s difficult to say. The authors contrast this result with our sequencing of AML1, though I’m not sure it is an appropriate comparison, since we had data from a relapse 3 years post-diagnosis, whereas theirs was from a metastasis 9 years post-diagnosis. Even so, the findings in the breast cancer study are interesting enough to merit further investigation.
Shah, S., Morin, R., Khattra, J., Prentice, L., Pugh, T., Burleigh, A., Delaney, A., Gelmon, K., Guliany, R., Senz, J., Steidl, C., Holt, R., Jones, S., Sun, M., Leung, G., Moore, R., Severson, T., Taylor, G., Teschendorff, A., Tse, K., Turashvili, G., Varhol, R., Warren, R., Watson, P., Zhao, Y., Caldas, C., Huntsman, D., Hirst, M., Marra, M., & Aparicio, S. (2009). Mutational evolution in a lobular breast tumour profiled at single nucleotide resolution. Nature, 461(7265), 809-813. DOI: 10.1038/nature08489