The Four Dimensions of a Breast Cancer Genome

Published today in the journal Nature is the whole-genome sequencing of a basal-like breast cancer tumor, metastasis, and xenograft. There’s also a News and Views article by Joe Gray of Lawrence Berkeley National Laboratory, as well as a news feature on large-scale cancer projects.


This study is a bit unlike our previous cancer genomes (AML1 and AML2). By my count it is the sixth cancer genome to be sequenced, and the third to come out of the Genome Center at Washington University. Obviously, it’s our first solid tumor. What’s particularly interesting about this study, however, is that we sequenced four DNA samples from a single patient with “triple-negative” breast cancer: the primary tumor, peripheral blood (normal), a brain metastasis, and a mouse xenograft derived from the primary tumor. The xenograft is a success story in itself – we managed to create a human-in-mouse (HIM) transplant of the primary tumor that was >90% pure when harvested 101 days after engraftment.

The genomes of these four samples (tumor, normal, metastasis, and xenograft), examined with the incredible power of Illumina massively parallel sequencing, offer an unprecedented view of the somatic changes that underlie breast cancer development, growth, and metastasis.

Repertoire of Somatic Mutations

We validated a total of 50 somatic sites in at least one of the three cancer genomes, including:

  • 28 missense mutations predicted to alter the sequence of an encoded protein
  • 11 synonymous (silent) mutations in coding sequences
  • 4 small insertions ranging in size from 1 to 6 bp
  • 3 small deletions ranging in size from 1 to 13 bp
  • 2 splice site mutations at intron-exon junctions
  • 1 nonsense mutation predicted to result in a truncated protein
  • 1 RNA mutation in a gene encoding a signal recognition particle (SRP) RNA.

We employed deep Illumina sequencing of PCR amplicons to assess the frequencies of each mutation across all four tissues. Intriguingly, more than half of them exhibited differential frequencies between primary tumor, metastasis, and/or xenograft. Two mutations (a nonsense mutation in MYCBP2 and a missense mutation in TGFBI) were significantly enriched in the primary tumor (88-89% vs 14-44%). Some 26 mutations were significantly enriched in the metastasis and/or xenograft. Perhaps most interesting, however, were two sites (a missense mutation in SNED1 and a silent mutation in FLNC) that appear to be de novo mutations unique to the metastasis.
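To make "differential frequency" concrete, here is a minimal sketch of how variant read counts from two tissues could be compared. It is an illustration only: the read counts are hypothetical (chosen to mimic an 88% vs 14% difference), and Fisher's exact test is a swapped-in, generic choice, not necessarily the statistic used in the paper.

```python
from math import comb


def variant_allele_frequency(variant_reads, total_reads):
    """Fraction of reads at a site that carry the variant allele."""
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    return variant_reads / total_reads


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Rows are tissues, columns are (variant reads, reference reads).
    The p-value sums the probabilities of all tables with the same
    margins that are no more likely than the observed one.
    """
    row1, row2 = a + b, c + d
    col1 = a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x variant reads in the first row.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    lo = max(0, col1 - row2)
    hi = min(row1, col1)
    probs = [prob(x) for x in range(lo, hi + 1)]
    p_obs = prob(a)
    # Small relative tolerance guards against floating-point ties.
    p_value = sum(p for p in probs if p <= p_obs * (1 + 1e-9))
    return min(1.0, p_value)


# Hypothetical read counts at one site (NOT taken from the paper).
primary_variant, primary_total = 176, 200        # 88% variant reads
metastasis_variant, metastasis_total = 28, 200   # 14% variant reads

p = fisher_exact_two_sided(
    primary_variant, primary_total - primary_variant,
    metastasis_variant, metastasis_total - metastasis_variant,
)
print(variant_allele_frequency(primary_variant, primary_total))
print(variant_allele_frequency(metastasis_variant, metastasis_total))
print(p)
```

With deep amplicon coverage, even modest frequency differences become statistically detectable, so in practice the size of the difference matters as much as the p-value.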

Acquired Structural Variation

Using our internally developed tools for structural variant prediction (BreakDancer) and de novo assembly (TIGRA), we predicted 59 deletions and 18 inversions that were putative somatic events. Validation by PCR and 454/3730 sequencing showed that 73/77 (94.8%) were real structural variants, of which 34 (28 deletions and 6 inversions) were somatic alterations not present in the normal genome. Among them was a 46.5 kbp heterozygous deletion affecting FBXW7 (a known cancer gene) and two overlapping 500-kb deletions affecting CTNNA1 and a handful of other genes. The latter was particularly interesting, because loss of CTNNA1 has been shown to result in global loss of cell adhesion in human breast cancer cell lines.
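As a quick sanity check on the numbers above, the counts can be tallied in a few lines. This is just arithmetic on the figures quoted in this post; the variable names are my own.

```python
# Counts quoted in the text above.
predicted = {"deletion": 59, "inversion": 18}   # putative somatic events
validated_real = 73                             # confirmed as real structural variants
somatic = {"deletion": 28, "inversion": 6}      # confirmed real AND absent from normal

total_predicted = sum(predicted.values())
total_somatic = sum(somatic.values())

validation_rate = validated_real / total_predicted
somatic_fraction = total_somatic / validated_real

print(f"validated: {validated_real}/{total_predicted} = {validation_rate:.1%}")
print(f"somatic among validated: {total_somatic}/{validated_real} = {somatic_fraction:.1%}")
```

The second ratio is the useful reminder here: a prediction can validate as a real variant and still turn out to be inherited rather than somatic.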

We also validated seven translocations with a combination of manual review (Pairoscope), assembly, and PCR/3730 sequencing. One translocation that we assembled in all three tumor samples involves a long terminal repeat (LTR) from the ERVL-MaLR family on chromosome 4 and the ABCA2 gene on chromosome 9. Two other validated translocations that assembled in all three tumors are on chromosome 2, and separated only by a 393-bp TcMar-Tigger repeat.

Insights from Comparisons of Tumor, Metastasis, and Xenograft

One of the most intriguing findings from our study was the differential mutation frequencies and structural variation patterns that we observed in the metastasis and xenograft, compared to the primary tumor. More than half of the somatic mutations (26/50) were significantly enriched in the metastasis and xenograft, while observed at relatively low frequencies in the primary tumor. This suggests that a sub-population of tumor cells, not the primary clone, gave rise to the cerebellar metastasis that eventually killed the patient.

Is there a fitness cost to the mutations that enabled metastasis? Can we develop sensitive tests to detect the cells that are likely to spread? Genome sequencing has brought us to a point where we can begin to ask these questions, and answering them brings us one step closer to unraveling the complex, devastating, deadly disease that is cancer.

Li Ding, Matthew J. Ellis, Shunqiang Li, David E. Larson, Ken Chen, John W. Wallis, Christopher C. Harris, Michael D. McLellan, Robert S. Fulton, Lucinda L. Fulton, Rachel M. Abbott, Jeremy Hoog, David J. Dooling, Daniel C. Koboldt, Heather Schmidt, Joell (2010). Genome remodelling in a basal-like breast cancer metastasis and xenograft. Nature, 464 (15), 999-1005. DOI: 10.1038/nature08989

Michael T.

Fine tuning for you, Keith:

"the claim that cancer genomics has yielded a paltry haul ..."

Yes, that's the issue.

"...and to suggest that not much is coming later."

No, I don't think we are saying that...the issue is cost/benefit.

"...that is pretty much a call to shut it down."

The point I would stress here is that other smaller research programs, perhaps with much better cost/benefit ratios ARE being shut down right now. It's not anti-progress to question the wisdom of the approach we are taking.

WRT the small but growing number of targeted therapies. Welcome to my world. I teach about the development of imatinib to med and grad students, and when Todd Golub misrepresented Gleevec as a "genome inspired" treatment in his Nature rebuttal, I knew he was desperate. Gleevec and EGFR inhibitors are most definitely NOT the result of any cancer genomics efforts. The sole true success on this score would be BRAF in melanoma, which was discovered by high-throughput means. Targeting this mutation has some short-term benefit. And that's it. A lot of "promise" in the pipeline doesn't count for squat. I am part of the group that described the often-cited IDH1 mutations, and believe that this has been seriously overblown for PR purposes. There are no data showing that this mutation will turn out to be anything other than a curiosity.

The article that Dan summarized above cost millions of dollars in reagents, personnel and equipment time. Half of Africa could have been immunized for what it cost. HIV drugs for southern India. My challenge stands--while I appreciate the interesting genetics in that paper, it doesn't do #$% for patients.

Keith Robison

WRT Weinberg vs. cancer genomics -- any idea what the funding balance is between "traditional" cancer research & cancer genomics? NCI budget is apparently a bit shy of $5B, though that includes a lot of clinical studies (very expensive); I've seen $1B over a number of years cited as worldwide cancer genomics funding. Weinberg is not the first (nor the last) to claim that cancer genomics has yielded a paltry haul and to suggest that not much is coming later. Since you can't do cancer genomics on the cheap, that is pretty much a call to shut it down.

Where would you set it? Also, what I didn't touch on in my post is the issue of new money vs. existing money -- one could argue that some of the funds for cancer genomics would not be available for other cancer projects -- but one could argue the other side as well.

The importance for patients is that a small -- but growing -- number of cancer mutations can be targeted by existing or in development therapies. This is particularly true for kinase inhibitors. Knowing the census of kinases which are activated in actual cancers will guide us to which can be currently hit and which are worth targeting with future programs.

Michael T.

Great paper, Dan, and an awesome overview of the meat of it. Way better than Gray's piece. Thanks. I also followed the link to Keith Robison's blog and feel compelled to pipe up.

One of the main take-homes for me from this paper is that we are sort of screwed. By the time a metastatic cancer develops, the horse has left the barn and the cancer is composed of a heterogeneous population of cells with a frightening amount of diversity. This is consistent with what we see in the clinic--man, these things are hard to kill! Or more accurately, they are pretty easy to kill, but virtually impossible to eliminate. You can see why--whatever therapy you throw at this thing, it works for a bit, but the diversity of the tumor works as a massively parallel computer to figure out a solution around it. Evolution on steroids.

And to Keith, who said that people who agree with Weinberg (e.g. me) are trying to stifle cancer genomics before it can really take off--wow. No one disputes that cancer genomics is just starting to take off and that we will generate massive amounts of info in the near future. But hello, Weinberg's argument was that huge sums of money are being sucked into this enterprise and it is killing off smaller labs. That's a fact.

Dan, would you care to estimate how much money that Nature paper cost? Yes, the cost of sequencing is going down, but that's a bit of a half-truth, isn't it? The cost of analysis is skyrocketing.

And let's please add to the list of questions we can now begin to ask: what the $#%& does it matter to a patient with cancer? That "big breakthroughs are right around the corner" is not a new story line.

Keith Robison

Great paper!

There doesn't even need to be a fitness cost for the metastasis-driving mutations to be at low frequency in the primary. If you could actually both genotype & lineage trace every cell, the expectation is that most mutations will be in the newest cells. If the metastasis mutations are simply neutral for the primary, they won't be enriched and will be a minor component simply because they showed up late.
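A toy calculation makes the "showed up late" argument concrete. This is a deliberately idealized sketch, not a model of any real tumor: it assumes the population starts from one cell, every cell divides once per generation, nothing dies, and the mutation is perfectly neutral. The function names and generation numbers are made up for illustration.

```python
def neutral_clone_frequency(arrival_generation, final_generation):
    """Frequency of a neutral mutation that arose in a single cell.

    Assumes a population that starts from one cell and doubles every
    generation. Because mutant and non-mutant cells divide at the same
    rate, the mutant fraction stays fixed at 1 / 2**arrival_generation.
    """
    if not 0 <= arrival_generation <= final_generation:
        raise ValueError("need 0 <= arrival_generation <= final_generation")
    carriers = 2 ** (final_generation - arrival_generation)
    total_cells = 2 ** final_generation
    return carriers / total_cells


def fraction_of_divisions_in_last(k, final_generation):
    """Share of all cell divisions that happened in the last k generations.

    If each division has the same chance of producing a new mutation,
    this is also the expected share of mutations that arose that late.
    """
    if not 0 <= k <= final_generation:
        raise ValueError("need 0 <= k <= final_generation")
    total_divisions = 2 ** final_generation - 1
    late_divisions = 2 ** final_generation - 2 ** (final_generation - k)
    return late_divisions / total_divisions


# Hypothetical tumor grown for 30 doublings (about a billion cells).
for arrival in (0, 5, 10, 20):
    print(arrival, neutral_clone_frequency(arrival, 30))
print(fraction_of_divisions_in_last(3, 30))
```

Under these assumptions, a mutation that first appears ten doublings in is carried by only about one cell in a thousand, with no fitness penalty required, and the bulk of all mutation-generating divisions happen in the final few generations.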