Sanger Sequencing Necessity in Clinical Labs

Exome sequencing continues to displace traditional panel-based approaches to genetic diagnosis. The maturity and efficiency of current exome kits, together with the ever-moving target of bona fide disease genes, make a strong argument for sequencing the exons of all genes, rather than the ones currently known to be associated with a specific phenotype. Indeed, many clinical laboratories now offer “panel” tests that are actually whole-exome sequencing, with reports limited to a specific set of genes.

I’ve already written about some of the things clinicians want from NGS panel tests, including the nebulous notion of “complete coverage.” Some laboratories achieve this by running NGS first, and then filling in under-covered regions by Sanger (capillary) sequencing. This is a laborious and costly step, because the regions sufficiently covered by NGS vary between individuals and sequencing runs. I suspect the practice is going away somewhat rapidly.

That raises the possibility that Sanger sequencing will be discontinued entirely in clinical labs. I’m sure that many labs would like to mothball their 3730xl instruments. They take up a lot of space in the lab (though not as much as some instruments named after oceans) and also tend to break, requiring expensive service contracts to keep them in operation.

Their final grip on the clinical laboratory might be the independent verification of sequence variants that are deemed pathogenic or likely-pathogenic. I don’t like the term gold standard that’s often bandied about as the reasoning for this. Sanger sequencing has its own issues — such as allele dropout and the difficulty of resolving indels — that affect its accuracy. Nevertheless, it is a common practice to use Sanger to independently verify variants that will go on a clinical report.

The Argument Against Sanger Confirmation

As NGS technologies and analysis pipelines improve, some are asking if this, too, is no longer necessary. In July 2015, Baudhuin et al. published a study on the necessity of confirming variants in NGS panels by Sanger sequencing. They examined the concordance between NGS and Sanger sequencing for 919 variants detected by sequencing a 177-gene panel in 77 samples. For the 919 variants (797 SNVs and 122 indels) with both NGS and Sanger data, the concordance was 100%. They concluded:

Confirmatory analysis by Sanger sequencing of SNVs detected via capture-based NGS testing that meets appropriate quality thresholds is unnecessarily redundant.

That’s a fairly bold statement, though the qualifier, of course, is “appropriate quality thresholds.” So what did these authors apply? Well, their average sequencing depth per sample was not disclosed, but they required that SNV calls have:

Variant quality score >20
>100x sequence depth
Flanking region mean base quality >15
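
To make those three cutoffs concrete, here is a minimal sketch of how such a filter might look. The class and field names are hypothetical (they are not from the paper or from any particular variant caller); only the threshold values come from the list above.

```python
from dataclasses import dataclass


@dataclass
class SnvCall:
    """Hypothetical container for the three metrics listed above."""
    variant_quality: float          # variant quality score from the caller
    depth: int                      # sequence depth at the variant site
    flank_mean_base_quality: float  # mean base quality of the flanking region


def passes_thresholds(call, min_quality=20, min_depth=100, min_flank_bq=15):
    """Return True if an SNV call exceeds all three cutoffs.

    Comparisons are strict (">"), matching how the thresholds are
    written above; a call at exactly 100x depth does not pass.
    """
    return (
        call.variant_quality > min_quality
        and call.depth > min_depth
        and call.flank_mean_base_quality > min_flank_bq
    )


good = SnvCall(variant_quality=50, depth=250, flank_mean_base_quality=30)
borderline = SnvCall(variant_quality=50, depth=100, flank_mean_base_quality=30)
print(passes_thresholds(good), passes_thresholds(borderline))
```

Note that this only covers the numeric thresholds. It says nothing about where the variant sits in the genome, which turns out to matter.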

Yet the authors also included over 100 indels in their comparison, and still reported 100% concordance. This is a rather incredible result, given the difficulty of predicting and representing indels, so I delved into the method by which indels were compared between NGS and Sanger data:

For the purposes of this study, if Sanger analysis confirmed the presence of one of the major indels called in a cluster of indel variants reported by NGS, we considered the Sanger and NGS to be concordant for the presence of the indel. In addition, for the purposes of this study, if a cluster of variant calls at an indel site included a falsely called SNV (such as the example provided earlier in this paragraph), we did not include this as a discordant SNV in our data.

In other words, if they predicted an indel in the region, and Sanger confirmed one or more indels, that counted as a match. Now I understand how they got to 100%. The authors admit that:

If the genomic location of an indel variant is still uncertain, Sanger confirmatory analysis is indeed appropriate.

Given that indel variant positions are usually uncertain, this is a nice catch-all. There’s also the tidbit about SNVs near indel positions (a common source of artifactual SNV calls) being excluded from concordance calculations. And later, the authors mention that SNV calls from pseudogenes are beyond the scope of this comparison. So basically, the conclusion is that if SNV calls:

Have high genotype quality and flanking base quality scores
Have >100x sequence depth
Are not near indels
Are not in any of those pesky pseudogene regions

… then they don’t need to be independently confirmed by Sanger. Sure, I’ll buy that.

The Argument for Sanger Confirmation

A year later, a group from Ambry Genetics published a paper in the same journal arguing that Sanger confirmation is still required for maximum sensitivity in clinical genetic tests. Their study examines the results from a slightly smaller (47-gene) hereditary cancer panel, but a much larger sample set: 7,845 sequence variants from 20,000 panel tests that were independently confirmed by Sanger sequencing. Of these, 98.7% were concordant between NGS and Sanger sequencing; 1.3% were identified as NGS false-positives, located mainly in complex genomic regions (A/T-rich regions, G/C-rich regions, homopolymer stretches, and pseudogene regions).

They conclude that NGS panel tests yield largely accurate results, but still have a false-positive rate above 1%. By adjusting their analysis pipelines, they can get that false-positive rate down to zero, but they lose about 2.2% of true variants. Thus, they argue that slightly less stringent analysis with Sanger confirmation is the best option.
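
To put those percentages in terms of actual variants, here is a rough back-of-the-envelope calculation. The counts are back-calculated from the rounded percentages quoted above, so they are approximations rather than the paper's reported numbers, and I am assuming the 2.2% loss applies to the Sanger-confirmed (true) variants.

```python
# Figures quoted above from the Ambry study
total_variants = 7845      # variants sent for Sanger confirmation
fp_rate = 0.013            # fraction identified as NGS false positives
lost_true_rate = 0.022     # true variants lost under zero-FP filtering

# Approximate counts, derived from rounded percentages (an assumption)
false_positives = round(total_variants * fp_rate)
true_variants = total_variants - false_positives
missed_if_strict = round(true_variants * lost_true_rate)

print(f"NGS false positives caught by Sanger: ~{false_positives}")
print(f"Confirmed true variants:              ~{true_variants}")
print(f"True variants lost by strict filters: ~{missed_if_strict}")
```

Under those assumptions, that works out to roughly 102 false positives caught by Sanger, versus roughly 170 real variants missed if the filters are tightened instead. That is the trade-off the authors are weighing.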

The Balanced Approach: Selective Confirmation of Certain Variants

My position is somewhere in the middle. I appreciate that Sanger is a time-consuming and expensive addition to any NGS-based test. This problem will only be exacerbated as we broaden genetic testing to encompass most or all known genes. I do believe that a known variant that has sufficient sequencing coverage, variant allele frequency, and quality metrics does not need independent confirmation. However, some categories of sequence variants — private variants, de novo mutations, and variant calls in problematic regions — should probably be confirmed if they’re going on the clinical report.
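
As a sketch of what that selective policy could look like in practice, something like the following would do. Everything here is illustrative: the field names are hypothetical, the depth and quality defaults are simply borrowed from the Baudhuin thresholds, and the allele frequency cutoff is a placeholder I picked for the example, not a recommendation.

```python
from dataclasses import dataclass


@dataclass
class Variant:
    """Hypothetical summary of a variant headed for a clinical report."""
    is_known: bool               # previously observed (i.e. not private)
    is_de_novo: bool             # absent from both parents
    in_problematic_region: bool  # e.g. homopolymer, pseudogene, GC-rich
    depth: int                   # sequencing depth at the site
    vaf: float                   # variant allele frequency, 0.0 to 1.0
    quality: float               # variant quality score


def needs_confirmation(v, min_depth=100, min_vaf=0.3, min_quality=20):
    """Return True if the variant should get orthogonal confirmation.

    The default cutoffs are placeholders for illustration only.
    """
    # Private, de novo, or problematic-region calls are always confirmed.
    if not v.is_known or v.is_de_novo or v.in_problematic_region:
        return True
    # Known variants skip confirmation only if every metric is solid.
    well_supported = (
        v.depth > min_depth
        and v.vaf >= min_vaf
        and v.quality > min_quality
    )
    return not well_supported


known_good = Variant(True, False, False, depth=300, vaf=0.48, quality=60)
private_call = Variant(False, False, False, depth=300, vaf=0.48, quality=60)
print(needs_confirmation(known_good), needs_confirmation(private_call))
```

Each lab would have to set and validate its own cutoffs, of course.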

That confirmation doesn’t need to be Sanger sequencing. In my opinion, a second NGS-based assay using a different enrichment method (e.g. custom capture or digital-droplet PCR) is probably good enough.