I’m back and nearly recovered from the American Society of Human Genetics meeting in Orlando, Florida. The conference was well-covered on the #ASHG17 hashtag as usual, but I also compiled detailed notes that I’ll share here. The obvious theme of this meeting, from my point of view, was the rise of clinical genome sequencing in patients with inherited disease.
Genomes for breakfast with Kari Stefansson from deCODE
I heard a great talk by Kari Stefansson at one of the early-morning exhibitor events. He said that deCODE has genotyped 425k individuals, performed WGS on 50k, and RNA-seq on 10k. Of particular interest was their work on de novo mutations in families. Some background:
- Each one of us is born with ~70 de novo mutations
- 1/10 children are born with a LoF mutation in a gene
- 1/20 children are born with a LoF mutation in a gene expressed in the brain
De novo mutations shared by siblings
deCODE examined 1,010 sib pairs from 253 families:
- 447 autosomal de novo mutations shared by two or more siblings. (out of 17,710 per fam?, 2.5%)
- But he noted the use of strict filters: requiring no reads showing the mutation in a parent selects against parental mosaicism
- Age effect: the older the father, the fewer de novo mutations shared
- 70% of the diversity in de novo mutation rate is explained by paternal age. 2x mutations if 40 yo father vs 20 yo
Stefansson also relayed some interesting work on non-transmitted alleles in parents of children, finding that even when not passed on, many were significantly associated with the kids’ educational attainment, socioeconomic status, and other factors. He called it “genetic nurturing” and I love the concept. He also offered a few memorable quotes during his talk; my favorite was, “In human genetics, your competitor always becomes your collaborator.”
Clinical WGS: One Test to Rule Them All
Much of the ASHG conversation, of course, was driven by Illumina. At their lunch symposium, CSO Ryan Taft touted clinical WGS as “one test to rule them all”: a single assay for detecting SNPs, indels, repeat expansions (an area of active software development at the company, e.g. ExpansionHunter), and large structural/copy number variants. Although I’m personally dubious about the ultimate ability of short-read sequencing to fully characterize large repeat expansions like fragile X and C9orf72, Taft sounded optimistic. He also shared some vignettes from the iHope network, an initiative to provide clinical WGS pro bono to families who can’t afford genetic testing. Their 2017 cohort included 81 cases; I took some quick numbers down (note, not
- 62 trios, 14 duos, 5 quads
- 7% were already likely positives (VUS that later got bumped)
- 7% reached a partial explanation (some but not all of disease)
- 21 cases with likely causal variants; many of these involved CNVs
- Clinically significant variant types: 48% SNVs, 4% indels (1-4 bp), 31% CNVs (19 kb-15 Mbp), 4% gross abnormalities, 11% multiple variant types, 2% LOH/UPD
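As a quick sanity check on the numbers I jotted down, here is a minimal Python sketch that tallies the family structures and the variant-type percentages. The variable names are my own, and the figures are simply the ones from my notes above, so treat this as bookkeeping rather than anything Illumina presented.

```python
# Figures copied from the notes above (iHope 2017 cohort); names are illustrative.
family_structures = {"trios": 62, "duos": 14, "quads": 5}
variant_type_pct = {
    "SNVs": 48,
    "indels (1-4 bp)": 4,
    "CNVs": 31,
    "gross abnormalities": 4,
    "multiple variant types": 11,
    "LOH/UPD": 2,
}
likely_causal_cases = 21


def percent(part, whole):
    """Return part/whole as a percentage rounded to one decimal place."""
    return round(100 * part / whole, 1)


total_cases = sum(family_structures.values())
pct_total = sum(variant_type_pct.values())
likely_causal_pct = percent(likely_causal_cases, total_cases)

print(total_cases)        # 81, matching the stated cohort size
print(pct_total)          # 100, so the variant-type breakdown is complete
print(likely_causal_pct)  # 25.9
```

So the 21 cases with likely causal variants work out to roughly a quarter of the cohort.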
Taft described one case of a de novo deletion on 19q13 (KMT2B) in a 9-year-old male. Obviously such an alteration would have been detected by an array at a significantly lower cost. Taft said that the patient had gotten an array, but the results were “not available” at the time of referral, when they decided to proceed with sequencing. Even so, WGS certainly does have an advantage over clinical arrays for detecting kilobase-scale CNVs. The challenge, of course, is the cost.
The clinical benefit of WGS in 300 families
Peter Bauer from Centogene gave a nice summary of clinical WGS in 300 families. Their interpretation strategy, it must be said, is very coding-centric: they focus on coding regions +/- 10 bp, examining deep intronic variants only if they are already described as pathogenic, in an established candidate gene, or in a region of homozygosity. He also highlighted the effect of previous testing on clinical WGS yield: they saw a 26% diagnostic rate for cases without previous WES testing, compared to 18% for cases that had been through WES. Interestingly, 55% of their diagnosed cases had a positive family history. I interpreted that as an illustration of the power of family-based sequencing with multiple affected relatives. A breakdown of pathogenic variants:
- 29% splice site (wow!)
- 25% missense variants
- 22% nonsense variants
- 22% indels (20% frameshift, 2% inframe)
- 2% large deletions
Note the contrast with Illumina’s numbers, which featured CNVs much more prominently.
Mendelian Disease Updates
Here are some highlights across a number of talks about clinical/research sequencing in rare inherited disorders.
Posey on behalf of CMGs on clinical impact of gene discovery
Jennifer Posey gave a nice report on progress at the Centers for Mendelian Genomics, from the Baylor perspective. She reminded us that the CMGs post their candidate genes on a website, many before publication. Baylor alone has discovered 300+ disease genes; 14% of these have already led to a diagnosis in their clinical lab (accounting for 3-4% of positive reports). She also mentioned that clinically negative WES cases are referred to the research lab for further evaluation, and 51% (yes, HALF) of these ultimately get a candidate diagnosis. I asked a question about this, because the rescue rate seemed almost too good to be true. She said that many of those rescues were achieved after additional family samples were sequenced, which allowed a more definitive interpretation of a VUS. In other cases, new disease gene publications were the source.
Rare and Undiagnosed Diseases in Pediatrics Initiative in Japan
Japan’s initiative has a 32% diagnostic rate, 77.4% autosomal dominant (presumably de novo), 12% X-linked, 9.1% autosomal recessive. I think it’s valuable to include these numbers because they show similar trends to what we’ve seen in predominantly European ancestry cohorts: a 1-in-3 solve rate, and a majority of positive clinical reports due to de novo mutations.
The contribution of rare recessive coding variants to severe developmental disorders in DDD
Hilary Martin gave a nice talk about the DDD project, which has enrolled over 10,000 cases and already yielded some valuable insights into the genetics of developmental disorders. For example, de novo mutations account for 40-45% of cases. Their analysis framework for recessive variants suggests that parents show a lower frequency of biallelic variants than expected by chance, presumably because lethal or disease-causing biallelic pairs are not found in healthy parents. There are twice as many recessive genes as dominant ones, but even so, recessive genes account for only a fraction (~5%) of cases in their mostly-European-ancestry cohort.
A Review of Clinical Exome and Genome Sequencing
Stephen Kingsmore from the Rady Institute presented a meta-analysis of exome/genome sequencing in 19,715 infants/children from 75 studies. It’s an impressive dataset, and I think the results are thus very compelling:
- Diagnostic sensitivity: 0.40 for WGS, 0.35 for WES, 0.09 for chromosomal microarray.
- Between 2013 and 2017, the rate of diagnosis increased 16% each year.
- Rate of consanguinity in study is inversely proportional to rate of pathogenic de novo mutations
- Odds of Dx by trio testing were TWICE that for singleton testing
- The diagnostic rate was higher at hospital labs (0.41) than reference labs (0.28?)
- In six 2017 studies, odds of Dx by WGS/WES were 8.3x higher than that for CMA
Kingsmore found only 2 small studies which compared diagnostic sensitivity of WES versus WGS; both reported no significant difference. He said “the literature needs to catch up,” which I take to mean that he thinks WGS has a clear diagnostic advantage.
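For readers who want to see how an odds comparison like the one above is computed, here is a minimal Python sketch. It assumes the pooled "diagnostic sensitivity" values can be read as diagnostic rates (probabilities), which is my interpretation and not something stated in the talk. The function names are mine.

```python
def odds(p):
    """Convert a probability (e.g. a diagnostic rate) into odds."""
    return p / (1.0 - p)


def odds_ratio(p_a, p_b):
    """Odds of diagnosis with test A relative to test B."""
    return odds(p_a) / odds(p_b)


# Pooled values from the notes above, treated here as diagnostic rates (an assumption).
pooled = {"WGS": 0.40, "WES": 0.35, "CMA": 0.09}

wgs_vs_cma = odds_ratio(pooled["WGS"], pooled["CMA"])
wes_vs_cma = odds_ratio(pooled["WES"], pooled["CMA"])

print(f"WGS vs CMA odds ratio: {wgs_vs_cma:.2f}")
print(f"WES vs CMA odds ratio: {wes_vs_cma:.2f}")
```

With these pooled inputs the ratios come out around 6.7 and 5.4. The 8.3x figure was reported for six 2017 studies specifically, so it should not be expected to match a calculation on the overall pooled values.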
Automated reanalysis of clinical exome data in a fee-for-service clinical Dx lab
One of my favorite talks was from Sam Baker from CHOP on their pipeline for automated re-analysis of clinical exomes which were reported as negative (80% of their first 300 probands in the lab). They have an approach for mining weekly PubMed abstracts (from new papers) to see if they inform past exome cases. He described “Clinical correlation” (manual assessment of gene-disease phenotype overlap) as the most important and most time-consuming step. They have a
Phenotype terms retrieved from chart review were used with genotype data. Automated correlation comes from 3 filters:
1.) Patient phenotype to gene-disease match: compares gene symbols and phenotype terms in PubMed abstracts/OMIM entries
2.) Filters based on variant characteristics, quality, MAF, effect
3.) Filter based on segregation / inheritance
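To make the three filters concrete, here is a rough Python sketch of how such a cascade might look. To be clear, this is not CHOP's pipeline: the `Variant` fields, the thresholds, the set of "damaging" effects, and the toy gene names are all hypothetical choices of mine for illustration.

```python
from dataclasses import dataclass

# Effects treated as potentially damaging; an assumption for illustration only.
DAMAGING_EFFECTS = {"missense", "nonsense", "frameshift", "splice_site"}


@dataclass
class Variant:
    gene: str
    effect: str             # predicted consequence
    maf: float              # minor allele frequency in a reference population
    quality: float          # variant call quality score
    fits_inheritance: bool  # consistent with segregation in the family


def phenotype_overlap(variant, patient_terms, gene_terms):
    """Filter 1: does the gene's literature-derived term set share a term with the patient?"""
    return bool(patient_terms & gene_terms.get(variant.gene, set()))


def passes_variant_filters(variant, max_maf=0.01, min_quality=30.0):
    """Filter 2: effect, frequency, and quality (thresholds are hypothetical)."""
    return (
        variant.effect in DAMAGING_EFFECTS
        and variant.maf <= max_maf
        and variant.quality >= min_quality
    )


def reanalysis_candidates(variants, patient_terms, gene_terms):
    """Apply filters 1-3 in sequence and keep the variants that pass all of them."""
    return [
        v for v in variants
        if phenotype_overlap(v, patient_terms, gene_terms)
        and passes_variant_filters(v)
        and v.fits_inheritance  # Filter 3: segregation / inheritance
    ]


# Toy data with made-up gene names.
patient_terms = {"seizures", "developmental delay"}
gene_terms = {"GENE_A": {"seizures", "hypotonia"}, "GENE_B": {"cardiomyopathy"}}
variants = [
    Variant("GENE_A", "nonsense", 0.0, 60.0, True),     # passes all three filters
    Variant("GENE_A", "synonymous", 0.0, 60.0, True),   # fails the effect filter
    Variant("GENE_B", "missense", 0.0001, 55.0, True),  # no phenotype overlap
    Variant("GENE_A", "missense", 0.0, 50.0, False),    # fails segregation
]

hits = reanalysis_candidates(variants, patient_terms, gene_terms)
print([(v.gene, v.effect) for v in hits])  # [('GENE_A', 'nonsense')]
```

In a real setting, the gene-to-term mapping would be refreshed from new abstracts and OMIM entries on a schedule, and whatever survives the filters would still go to a human for clinical correlation.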
Their reanalysis of 240 negative exomes revealed 27 novel diagnoses:
- 17 new disease genes
- 6 classification updates of previous VUSes
- 1 new phenotype described for a known disease gene
- 3 candidate genes that don’t meet the lab’s diagnostic criteria, but which they believe explain the case
The diagnostic yield went from 20% in the initial analysis to 29% with automated reanalysis. Their pipelines have really whittled down the monthly re-analysis workload to make it feasible (it’s usually 1-5 variants per case per month).
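The yield numbers hang together if you assume the denominator is the lab's first 300 probands, which is my reading of the talk. A short Python sketch of the arithmetic, with variable names of my own choosing:

```python
# Figures from the notes above; the 300-proband denominator is my assumption.
probands = 300
negative_fraction = 0.80
new_diagnoses = 27

negative_cases = round(probands * negative_fraction)  # exomes initially reported as negative
initial_diagnoses = probands - negative_cases         # cases solved on the first pass

initial_yield = initial_diagnoses / probands
updated_yield = (initial_diagnoses + new_diagnoses) / probands
rescue_rate = new_diagnoses / negative_cases          # share of negatives solved on reanalysis

print(negative_cases)           # 240, matching the number of reanalyzed exomes
print(f"{initial_yield:.0%}")   # 20%
print(f"{updated_yield:.0%}")   # 29%
print(f"{rescue_rate:.2%}")     # 11.25%
```

In other words, about one in nine previously negative exomes picked up a diagnosis through reanalysis.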